Interpretability vs Explainability
- Interpretability - is the extent to which cause and effect can be observed within a system. Put another way, it is the extent to which you can predict what will happen given a change in input or algorithmic parameters. It is being able to look at an algorithm and say: yes, I can see what is happening here.
- Explainability - is the extent to which the internal mechanics of a machine or deep learning system can be explained in human terms.
Interpretability is about being able to discern the mechanics without necessarily knowing why. Explainability is being able to quite literally explain what is happening.
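As a rough illustration of interpretability in the sense above (predicting the effect of a change in input), consider a small linear model. The weights, bias, and input values here are made up for the example; this is a sketch, not a claim about any particular system.

```python
import numpy as np

# Hypothetical linear model: these weights and bias are illustrative values.
weights = np.array([2.0, -0.5])
bias = 1.0

def predict(x):
    """Return the model output for a single input vector x."""
    return float(np.dot(weights, x) + bias)

x = np.array([3.0, 4.0])
baseline = predict(x)

# Because the model is linear, we can say in advance what a change in
# input will do: raising feature 0 by 1 shifts the output by weights[0].
x_changed = x + np.array([1.0, 0.0])
observed_delta = predict(x_changed) - baseline
expected_delta = weights[0] * 1.0

print(observed_delta, expected_delta)
```

The model is interpretable because the observed change matches the change read directly off the weights. With a deep network there is usually no such shortcut, which is where the techniques listed in the next section come in.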
Interpretability & Explainability - Techniques and Methods
- Algorithmic Generalization
- Pay Attention to Feature Importance
- Leave One Column Out (LOCO)
- Permutation Impact/Importance (PI)
- Deep Learning Important Features (DeepLIFT)
- Layer-wise Relevance Propagation
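Two of the techniques listed above, Leave One Column Out and Permutation Importance, can be sketched in a few lines. The following is a minimal sketch under stated assumptions: the data is synthetic, the model is ordinary least squares, and the helper names (`fit`, `mse`, `loco_importance`, `permutation_importance`) are hypothetical, chosen only for this example. LOCO retrains the model without one column and measures how much the test error grows; permutation importance keeps the trained model and shuffles one column of the test data instead.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): y depends strongly on
# column 0, weakly on column 1, and not at all on column 2.
n = 500
X = rng.normal(size=(n, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=n)

X_train, X_test = X[:400], X[400:]
y_train, y_test = y[:400], y[400:]

def fit(X, y):
    """Ordinary least squares with an intercept; returns the coefficients."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def mse(coef, X, y):
    """Mean squared error of the fitted coefficients on (X, y)."""
    A = np.column_stack([X, np.ones(len(X))])
    return float(np.mean((A @ coef - y) ** 2))

full_coef = fit(X_train, y_train)
base_error = mse(full_coef, X_test, y_test)

def loco_importance(j):
    """Retrain without column j; importance is the increase in test error."""
    keep = [k for k in range(X.shape[1]) if k != j]
    coef = fit(X_train[:, keep], y_train)
    return mse(coef, X_test[:, keep], y_test) - base_error

def permutation_importance(j, n_repeats=10):
    """Shuffle column j of the test set without retraining;
    importance is the mean increase in test error over the repeats."""
    increases = []
    for _ in range(n_repeats):
        X_perm = X_test.copy()
        X_perm[:, j] = rng.permutation(X_perm[:, j])
        increases.append(mse(full_coef, X_perm, y_test) - base_error)
    return float(np.mean(increases))

loco = [loco_importance(j) for j in range(3)]
perm = [permutation_importance(j) for j in range(3)]

print("LOCO:", loco)
print("Permutation:", perm)
```

Both methods should rank column 0 well above column 1, with column 2 close to zero. LOCO costs one retraining per feature, while permutation importance reuses the trained model, which is the usual reason to prefer it for expensive models.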
Interpretability & Explainability - Problems
- Interpretability and explainability add an extra step to the development process.