interpretml / interpret

Fit interpretable models. Explain blackbox machine learning.


Top Related Projects

  • SHAP (22,455 stars): A game theoretic approach to explain the output of any machine learning model.

  • LIME (11,502 stars): Explaining the predictions of any machine learning classifier.

  • Alibi (2,371 stars): Algorithms for explaining machine learning models.

  • responsible-ai-toolbox: A suite of tools providing model and data exploration and assessment user interfaces and libraries that enable a better understanding of AI systems, helping developers and stakeholders develop and monitor AI more responsibly and take better data-driven actions.

Quick Overview

InterpretML is an open-source Python package that provides a unified framework for machine learning interpretability. It offers a collection of state-of-the-art machine learning interpretability techniques, allowing users to explain and understand the behavior of their models across various domains and model types.

Pros

  • Supports a wide range of interpretability methods, including SHAP, LIME, and EBM
  • Compatible with popular machine learning frameworks like scikit-learn, XGBoost, and LightGBM
  • Provides both global and local explanations for model behavior
  • Offers interactive visualizations for easier interpretation of results

Cons

  • May have a steeper learning curve for users new to interpretability concepts
  • Some advanced features might require additional dependencies
  • Performance can be slower for very large datasets or complex models
  • Limited support for deep learning models compared to traditional machine learning algorithms

Code Examples

  1. Loading data and training an Explainable Boosting Machine (EBM):
from interpret.glassbox import ExplainableBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_breast_cancer

# Load data and split into train/test sets
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, test_size=0.2, random_state=0)

# Train an Explainable Boosting Machine
ebm = ExplainableBoostingClassifier()
ebm.fit(X_train, y_train)
  2. Generating global explanations:
from interpret import show

# Generate global explanations
global_explanation = ebm.explain_global()

# Visualize the explanations
show(global_explanation)
  3. Creating local explanations for individual predictions:
# Generate local explanations for the first test instance
local_explanation = ebm.explain_local(X_test[:1], y_test[:1])

# Visualize the local explanations
show(local_explanation)

Getting Started

To get started with InterpretML, follow these steps:

  1. Install the package:
pip install interpret
  2. Import the necessary modules and load your data:
from interpret.glassbox import ExplainableBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris

data = load_iris()
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, test_size=0.2, random_state=0)
  3. Train a model and generate explanations:
ebm = ExplainableBoostingClassifier()
ebm.fit(X_train, y_train)

global_explanation = ebm.explain_global()
local_explanation = ebm.explain_local(X_test[:5], y_test[:5])
  4. Visualize the explanations:
from interpret import show

show(global_explanation)
show(local_explanation)
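
If you are working in a notebook and prefer the plots rendered inline rather than in the interactive dashboard, interpret also lets you swap the visualization provider before calling show (the same calls appear later in the responsible-ai-toolbox comparison):

from interpret import set_visualize_provider
from interpret.provider import InlineProvider

# Render explanation visualizations inline (e.g., inside a Jupyter notebook)
set_visualize_provider(InlineProvider())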

Competitor Comparisons

SHAP (22,455 stars)

A game theoretic approach to explain the output of any machine learning model.

Pros of SHAP

  • More focused on SHAP (SHapley Additive exPlanations) values, providing deeper insights into feature importance
  • Offers a wider range of visualization options for SHAP values
  • Supports more machine learning frameworks and model types

Cons of SHAP

  • Less comprehensive in terms of overall model interpretability techniques
  • May have a steeper learning curve for users new to SHAP concepts
  • Can be computationally intensive for large datasets or complex models

Code Comparison

SHAP example:

import shap
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
shap.summary_plot(shap_values, X)

Interpret example:

from interpret import show
from interpret.glassbox import ExplainableBoostingClassifier

ebm = ExplainableBoostingClassifier()
ebm.fit(X_train, y_train)
ebm_global = ebm.explain_global()
show(ebm_global)

Both libraries offer powerful tools for model interpretation, but SHAP focuses more on SHAP values and their visualizations, while Interpret provides a broader range of interpretability techniques. SHAP may be preferred for in-depth feature importance analysis, while Interpret offers a more comprehensive toolkit for overall model explanation.

LIME (11,502 stars)

Lime: Explaining the predictions of any machine learning classifier

Pros of LIME

  • Simpler and more focused on a single interpretability technique
  • Widely adopted and well-established in the ML community
  • Easier to integrate into existing projects due to its lightweight nature

Cons of LIME

  • Limited to local interpretability, lacking global explanations
  • Less comprehensive feature set compared to Interpret
  • May require additional libraries for certain model types or visualizations

Code Comparison

LIME:

from lime import lime_tabular
explainer = lime_tabular.LimeTabularExplainer(X_train)
exp = explainer.explain_instance(X_test[0], clf.predict_proba)

Interpret:

from interpret import show
from interpret.glassbox import ExplainableBoostingClassifier

ebm = ExplainableBoostingClassifier()
ebm.fit(X_train, y_train)
ebm_global = ebm.explain_global()
show(ebm_global)

Summary

LIME focuses on local interpretability, offering a straightforward approach for explaining individual predictions. It's widely adopted and easy to integrate but lacks global explanations. Interpret provides a more comprehensive suite of interpretability tools, including both local and global explanations, but may have a steeper learning curve. The choice between the two depends on the specific needs of the project and the desired depth of model interpretability.

Alibi (2,371 stars)

Algorithms for explaining machine learning models

Pros of Alibi

  • Broader range of explainability methods, including counterfactual explanations and anchor explanations
  • Better support for NLP and image data
  • More active development and frequent updates

Cons of Alibi

  • Steeper learning curve due to more complex API
  • Less integration with popular ML frameworks like scikit-learn
  • Fewer built-in visualizations compared to Interpret

Code Comparison

Alibi example:

from alibi.explainers import AnchorTabular

explainer = AnchorTabular(predict_fn, feature_names)
explainer.fit(X_train)                      # fit the explainer on training data
explanation = explainer.explain(X_test[0])  # explain a single instance

Interpret example:

from interpret.glassbox import ExplainableBoostingClassifier

ebm = ExplainableBoostingClassifier()
ebm.fit(X, y)
ebm_global = ebm.explain_global()

Both libraries offer powerful explainability tools, but Alibi provides a wider range of methods and better support for diverse data types. Interpret, on the other hand, offers a more user-friendly API and tighter integration with common ML workflows. The choice between the two depends on the specific use case and the user's familiarity with explainable AI concepts.

responsible-ai-toolbox

Responsible AI Toolbox is a suite of tools providing model and data exploration and assessment user interfaces and libraries that enable a better understanding of AI systems. These interfaces and libraries empower developers and stakeholders of AI systems to develop and monitor AI more responsibly, and take better data-driven actions.

Pros of responsible-ai-toolbox

  • More comprehensive suite of tools for responsible AI, including fairness assessment and causal inference
  • Stronger integration with Azure Machine Learning and other Microsoft services
  • More active development and frequent updates

Cons of responsible-ai-toolbox

  • Steeper learning curve due to broader scope and more complex features
  • Potentially heavier resource requirements for some components
  • More tightly coupled with Microsoft ecosystem, which may limit flexibility

Code Comparison

interpret:

from interpret import set_visualize_provider
from interpret.provider import InlineProvider
set_visualize_provider(InlineProvider())

from interpret.glassbox import ExplainableBoostingClassifier
ebm = ExplainableBoostingClassifier()
ebm.fit(X_train, y_train)

responsible-ai-toolbox:

from raiwidgets import ResponsibleAIDashboard
from responsibleai import RAIInsights

rai_insights = RAIInsights(model=model,
                           train=X_train,
                           test=X_test,
                           target_column=target_column,
                           task_type='classification')

ResponsibleAIDashboard(rai_insights)

Both libraries offer model interpretability tools, but responsible-ai-toolbox provides a more extensive set of features for responsible AI practices. interpret focuses primarily on model explanations, while responsible-ai-toolbox includes additional capabilities like fairness assessment and error analysis. The code examples demonstrate the different approaches: interpret offers a more straightforward API for model explanations, while responsible-ai-toolbox provides a dashboard-based interface for comprehensive AI analysis.


README

InterpretML


In the beginning machines learned in darkness, and data scientists struggled in the void to explain them.

Let there be light.

InterpretML is an open-source package that incorporates state-of-the-art machine learning interpretability techniques under one roof. With this package, you can train interpretable glassbox models and explain blackbox systems. InterpretML helps you understand your model's global behavior, or understand the reasons behind individual predictions.

Interpretability is essential for:

  • Model debugging - Why did my model make this mistake?
  • Feature Engineering - How can I improve my model?
  • Detecting fairness issues - Does my model discriminate?
  • Human-AI cooperation - How can I understand and trust the model's decisions?
  • Regulatory compliance - Does my model satisfy legal requirements?
  • High-risk applications - Healthcare, finance, judicial, ...

Installation

Python 3.7+ | Linux, Mac, Windows

pip install interpret
# OR
conda install -c conda-forge interpret

Introducing the Explainable Boosting Machine (EBM)

EBM is an interpretable model developed at Microsoft Research*. It uses modern machine learning techniques like bagging, gradient boosting, and automatic interaction detection to breathe new life into traditional GAMs (Generalized Additive Models). This makes EBMs as accurate as state-of-the-art techniques like random forests and gradient boosted trees. However, unlike these blackbox models, EBMs produce exact explanations and are editable by domain experts.
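
Concretely, an EBM is an additive model in the GAM/GA2M sense of the papers cited below: each feature, and each selected feature pair, gets its own learned shape function, and the prediction is their sum passed through a link function g:

g(\mathbb{E}[y]) = \beta_0 + \sum_i f_i(x_i) + \sum_{(i,j)} f_{ij}(x_i, x_j)

Because every f_i and f_{ij} can be plotted (and edited) directly, the per-feature contributions reported in the explanations are exact rather than approximated.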

| Dataset / AUROC | Domain   | Logistic Regression | Random Forest | XGBoost   | Explainable Boosting Machine |
|-----------------|----------|---------------------|---------------|-----------|------------------------------|
| Adult Income    | Finance  | .907±.003           | .903±.002     | .927±.001 | .928±.002                    |
| Heart Disease   | Medical  | .895±.030           | .890±.008     | .851±.018 | .898±.013                    |
| Breast Cancer   | Medical  | .995±.005           | .992±.009     | .992±.010 | .995±.006                    |
| Telecom Churn   | Business | .849±.005           | .824±.004     | .828±.010 | .852±.006                    |
| Credit Fraud    | Security | .979±.002           | .950±.007     | .981±.003 | .981±.003                    |

Notebook for reproducing table

Supported Techniques

| Interpretability Technique   | Type               |
|------------------------------|--------------------|
| Explainable Boosting         | glassbox model     |
| APLR                         | glassbox model     |
| Decision Tree                | glassbox model     |
| Decision Rule List           | glassbox model     |
| Linear/Logistic Regression   | glassbox model     |
| SHAP Kernel Explainer        | blackbox explainer |
| LIME                         | blackbox explainer |
| Morris Sensitivity Analysis  | blackbox explainer |
| Partial Dependence           | blackbox explainer |
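
The blackbox explainers in the table share the same explain/show workflow as the glassbox models. Here is a rough sketch using the LIME wrapper, assuming an already-fitted scikit-learn style classifier named model and the train/test splits from the earlier examples; the constructor arguments reflect recent interpret releases and may differ slightly in older versions:

from interpret.blackbox import LimeTabular
from interpret import show

# `model` is an assumed, already-fitted classifier exposing predict_proba;
# LimeTabular wraps it and perturbs instances using the X_train background data.
lime = LimeTabular(model, X_train)
show(lime.explain_local(X_test[:5], y_test[:5]))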

Train a glassbox model

Let's fit an Explainable Boosting Machine:

from interpret.glassbox import ExplainableBoostingClassifier

ebm = ExplainableBoostingClassifier()
ebm.fit(X_train, y_train)

# or substitute with LogisticRegression, DecisionTreeClassifier, RuleListClassifier, ...
# EBM supports pandas dataframes, numpy arrays, and handles "string" data natively.
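
For example, the native string handling means a mixed pandas DataFrame can be passed to fit directly, without manual encoding. A toy sketch (the data below is invented purely to illustrate the API):

import pandas as pd
from interpret.glassbox import ExplainableBoostingClassifier

# One numeric and one string (categorical) column; EBM bins/encodes both itself
X = pd.DataFrame({
    "age": [25, 32, 47, 51, 38, 29, 44, 36],
    "employment": ["private", "gov", "private", "self", "gov", "private", "self", "gov"],
})
y = [0, 0, 1, 1, 1, 0, 1, 0]

ebm = ExplainableBoostingClassifier()
ebm.fit(X, y)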

Understand the model

from interpret import show

ebm_global = ebm.explain_global()
show(ebm_global)

Global Explanation Image


Understand individual predictions

ebm_local = ebm.explain_local(X_test, y_test)
show(ebm_local)

Local Explanation Image


And if you have multiple model explanations, compare them

show([logistic_regression_global, decision_tree_global])

Dashboard Image
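
The two explanation objects passed to show above can come from any pair of glassbox models trained on the same split. A minimal sketch using a logistic regression and an EBM, reusing X_train and y_train from earlier (substitute any other glassbox models):

from interpret.glassbox import LogisticRegression, ExplainableBoostingClassifier
from interpret import show

lr = LogisticRegression()
lr.fit(X_train, y_train)

ebm = ExplainableBoostingClassifier()
ebm.fit(X_train, y_train)

# Passing a list of explanations renders them side by side for comparison
show([lr.explain_global(), ebm.explain_global()])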


If you need to keep your data private, use Differentially Private EBMs (see DP-EBMs)

from interpret.privacy import DPExplainableBoostingClassifier, DPExplainableBoostingRegressor

dp_ebm = DPExplainableBoostingClassifier(epsilon=1, delta=1e-5) # Specify privacy parameters
dp_ebm.fit(X_train, y_train)

show(dp_ebm.explain_global()) # Identical function calls to standard EBMs


For more information, see the documentation.


EBMs include pairwise interactions by default. For 3-way interactions and higher see this notebook: https://interpret.ml/docs/python/examples/custom-interactions.html
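
As a hedged sketch of the relevant knob (the interactions parameter; the semantics below follow recent interpret releases, and the linked notebook is the authoritative reference):

from interpret.glassbox import ExplainableBoostingClassifier

# An integer asks the EBM to auto-detect up to that many pairwise terms;
# explicit tuples of feature indices request specific pairs or, in recent
# releases, higher-order terms such as the 3-way interaction below.
ebm_auto = ExplainableBoostingClassifier(interactions=10)
ebm_manual = ExplainableBoostingClassifier(interactions=[(0, 1), (0, 2, 3)])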


Interpret EBMs can be fit on datasets with 100 million samples in several hours. For larger workloads consider using distributed EBMs on Azure SynapseML: classification EBMs and regression EBMs



Acknowledgements

InterpretML was originally created by (equal contributions): Samuel Jenkins, Harsha Nori, Paul Koch, and Rich Caruana

EBMs are a fast derivative of GA2M, invented by: Yin Lou, Rich Caruana, Johannes Gehrke, and Giles Hooker

Many people have supported us along the way. Check out ACKNOWLEDGEMENTS.md!

We also build on top of many great packages. Please check them out!

plotly | dash | scikit-learn | lime | shap | salib | skope-rules | treeinterpreter | gevent | joblib | pytest | jupyter

Citations

InterpretML
"InterpretML: A Unified Framework for Machine Learning Interpretability" (H. Nori, S. Jenkins, P. Koch, and R. Caruana 2019)
@article{nori2019interpretml,
  title={InterpretML: A Unified Framework for Machine Learning Interpretability},
  author={Nori, Harsha and Jenkins, Samuel and Koch, Paul and Caruana, Rich},
  journal={arXiv preprint arXiv:1909.09223},
  year={2019}
}
    
Paper link

Explainable Boosting
"Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission" (R. Caruana, Y. Lou, J. Gehrke, P. Koch, M. Sturm, and N. Elhadad 2015)
@inproceedings{caruana2015intelligible,
  title={Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission},
  author={Caruana, Rich and Lou, Yin and Gehrke, Johannes and Koch, Paul and Sturm, Marc and Elhadad, Noemie},
  booktitle={Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
  pages={1721--1730},
  year={2015},
  organization={ACM}
}
    
Paper link
"Accurate intelligible models with pairwise interactions" (Y. Lou, R. Caruana, J. Gehrke, and G. Hooker 2013)
@inproceedings{lou2013accurate,
  title={Accurate intelligible models with pairwise interactions},
  author={Lou, Yin and Caruana, Rich and Gehrke, Johannes and Hooker, Giles},
  booktitle={Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining},
  pages={623--631},
  year={2013},
  organization={ACM}
}
    
Paper link
"Intelligible models for classification and regression" (Y. Lou, R. Caruana, and J. Gehrke 2012)
@inproceedings{lou2012intelligible,
  title={Intelligible models for classification and regression},
  author={Lou, Yin and Caruana, Rich and Gehrke, Johannes},
  booktitle={Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining},
  pages={150--158},
  year={2012},
  organization={ACM}
}
    
Paper link
"Interpretability, Then What? Editing Machine Learning Models to Reflect Human Knowledge and Values" (Zijie J. Wang, Alex Kale, Harsha Nori, Peter Stella, Mark E. Nunnally, Duen Horng Chau, Mihaela Vorvoreanu, Jennifer Wortman Vaughan, Rich Caruana 2022)
@article{wang2022interpretability,
  title={Interpretability, Then What? Editing Machine Learning Models to Reflect Human Knowledge and Values},
  author={Wang, Zijie J and Kale, Alex and Nori, Harsha and Stella, Peter and Nunnally, Mark E and Chau, Duen Horng and Vorvoreanu, Mihaela and Vaughan, Jennifer Wortman and Caruana, Rich},
  journal={arXiv preprint arXiv:2206.15465},
  year={2022}
}
    
Paper link
"Axiomatic Interpretability for Multiclass Additive Models" (X. Zhang, S. Tan, P. Koch, Y. Lou, U. Chajewska, and R. Caruana 2019)
@inproceedings{zhang2019axiomatic,
  title={Axiomatic Interpretability for Multiclass Additive Models},
  author={Zhang, Xuezhou and Tan, Sarah and Koch, Paul and Lou, Yin and Chajewska, Urszula and Caruana, Rich},
  booktitle={Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery \& Data Mining},
  pages={226--234},
  year={2019},
  organization={ACM}
}
    
Paper link
"Distill-and-compare: auditing black-box models using transparent model distillation" (S. Tan, R. Caruana, G. Hooker, and Y. Lou 2018)
@inproceedings{tan2018distill,
  title={Distill-and-compare: auditing black-box models using transparent model distillation},
  author={Tan, Sarah and Caruana, Rich and Hooker, Giles and Lou, Yin},
  booktitle={Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society},
  pages={303--310},
  year={2018},
  organization={ACM}
}
    
Paper link
"Purifying Interaction Effects with the Functional ANOVA: An Efficient Algorithm for Recovering Identifiable Additive Models" (B. Lengerich, S. Tan, C. Chang, G. Hooker, R. Caruana 2019)
@article{lengerich2019purifying,
  title={Purifying Interaction Effects with the Functional ANOVA: An Efficient Algorithm for Recovering Identifiable Additive Models},
  author={Lengerich, Benjamin and Tan, Sarah and Chang, Chun-Hao and Hooker, Giles and Caruana, Rich},
  journal={arXiv preprint arXiv:1911.04974},
  year={2019}
}
    
Paper link
"Interpreting Interpretability: Understanding Data Scientists' Use of Interpretability Tools for Machine Learning" (H. Kaur, H. Nori, S. Jenkins, R. Caruana, H. Wallach, J. Wortman Vaughan 2020)
@inproceedings{kaur2020interpreting,
  title={Interpreting Interpretability: Understanding Data Scientists' Use of Interpretability Tools for Machine Learning},
  author={Kaur, Harmanpreet and Nori, Harsha and Jenkins, Samuel and Caruana, Rich and Wallach, Hanna and Wortman Vaughan, Jennifer},
  booktitle={Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems},
  pages={1--14},
  year={2020}
}
    
Paper link
"How Interpretable and Trustworthy are GAMs?" (C. Chang, S. Tan, B. Lengerich, A. Goldenberg, R. Caruana 2020)
@article{chang2020interpretable,
  title={How Interpretable and Trustworthy are GAMs?},
  author={Chang, Chun-Hao and Tan, Sarah and Lengerich, Ben and Goldenberg, Anna and Caruana, Rich},
  journal={arXiv preprint arXiv:2006.06466},
  year={2020}
}
    
Paper link

Differential Privacy
"Accuracy, Interpretability, and Differential Privacy via Explainable Boosting" (H. Nori, R. Caruana, Z. Bu, J. Shen, J. Kulkarni 2021)
@inproceedings{pmlr-v139-nori21a,
  title = 	 {Accuracy, Interpretability, and Differential Privacy via Explainable Boosting},
  author =       {Nori, Harsha and Caruana, Rich and Bu, Zhiqi and Shen, Judy Hanwen and Kulkarni, Janardhan},
  booktitle = 	 {Proceedings of the 38th International Conference on Machine Learning},
  pages = 	 {8227--8237},
  year = 	 {2021},
  volume = 	 {139},
  series = 	 {Proceedings of Machine Learning Research},
  publisher =    {PMLR}
}
    
Paper link

LIME
"Why should i trust you?: Explaining the predictions of any classifier" (M. T. Ribeiro, S. Singh, and C. Guestrin 2016)
@inproceedings{ribeiro2016should,
  title={Why should i trust you?: Explaining the predictions of any classifier},
  author={Ribeiro, Marco Tulio and Singh, Sameer and Guestrin, Carlos},
  booktitle={Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining},
  pages={1135--1144},
  year={2016},
  organization={ACM}
}
    
Paper link

SHAP
"A Unified Approach to Interpreting Model Predictions" (S. M. Lundberg and S.-I. Lee 2017)
@incollection{NIPS2017_7062,
 title = {A Unified Approach to Interpreting Model Predictions},
 author = {Lundberg, Scott M and Lee, Su-In},
 booktitle = {Advances in Neural Information Processing Systems 30},
 editor = {I. Guyon and U. V. Luxburg and S. Bengio and H. Wallach and R. Fergus and S. Vishwanathan and R. Garnett},
 pages = {4765--4774},
 year = {2017},
 publisher = {Curran Associates, Inc.},
 url = {https://papers.nips.cc/paper/7062-a-unified-approach-to-interpreting-model-predictions.pdf}
}
    
Paper link
"Consistent individualized feature attribution for tree ensembles" (Lundberg, Scott M and Erion, Gabriel G and Lee, Su-In 2018)
@article{lundberg2018consistent,
  title={Consistent individualized feature attribution for tree ensembles},
  author={Lundberg, Scott M and Erion, Gabriel G and Lee, Su-In},
  journal={arXiv preprint arXiv:1802.03888},
  year={2018}
}
    
Paper link
"Explainable machine-learning predictions for the prevention of hypoxaemia during surgery" (S. M. Lundberg et al. 2018)
@article{lundberg2018explainable,
  title={Explainable machine-learning predictions for the prevention of hypoxaemia during surgery},
  author={Lundberg, Scott M and Nair, Bala and Vavilala, Monica S and Horibe, Mayumi and Eisses, Michael J and Adams, Trevor and Liston, David E and Low, Daniel King-Wai and Newman, Shu-Fang and Kim, Jerry and others},
  journal={Nature Biomedical Engineering},
  volume={2},
  number={10},
  pages={749},
  year={2018},
  publisher={Nature Publishing Group}
}
    
Paper link

Sensitivity Analysis
"SALib: An open-source Python library for Sensitivity Analysis" (J. D. Herman and W. Usher 2017)
@article{herman2017salib,
  title={SALib: An open-source Python library for Sensitivity Analysis.},
  author={Herman, Jonathan D and Usher, Will},
  journal={J. Open Source Software},
  volume={2},
  number={9},
  pages={97},
  year={2017}
}
    
Paper link
"Factorial sampling plans for preliminary computational experiments" (M. D. Morris 1991)
@article{morris1991factorial,
  title={Factorial sampling plans for preliminary computational experiments},
  author={Morris, Max D},
  journal={Technometrics},
  volume={33},
  number={2},
  pages={161--174},
  year={1991},
  publisher={Taylor \& Francis Group}
}
    
Paper link

Partial Dependence
"Greedy function approximation: a gradient boosting machine" (J. H. Friedman 2001)
@article{friedman2001greedy,
  title={Greedy function approximation: a gradient boosting machine},
  author={Friedman, Jerome H},
  journal={Annals of statistics},
  pages={1189--1232},
  year={2001},
  publisher={JSTOR}
}
    
Paper link

Open Source Software
"Scikit-learn: Machine learning in Python" (F. Pedregosa et al. 2011)
@article{pedregosa2011scikit,
  title={Scikit-learn: Machine learning in Python},
  author={Pedregosa, Fabian and Varoquaux, Ga{\"e}l and Gramfort, Alexandre and Michel, Vincent and Thirion, Bertrand and Grisel, Olivier and Blondel, Mathieu and Prettenhofer, Peter and Weiss, Ron and Dubourg, Vincent and others},
  journal={Journal of machine learning research},
  volume={12},
  number={Oct},
  pages={2825--2830},
  year={2011}
}
    
Paper link
"Collaborative data science" (Plotly Technologies Inc. 2015)
@online{plotly, 
  author = {Plotly Technologies Inc.}, 
  title = {Collaborative data science}, 
  publisher = {Plotly Technologies Inc.}, 
  address = {Montreal, QC}, 
  year = {2015}, 
  url = {https://plot.ly}
}
    
Link
"Joblib: running python function as pipeline jobs" (G. Varoquaux and O. Grisel 2009)
@article{varoquaux2009joblib,
  title={Joblib: running python function as pipeline jobs},
  author={Varoquaux, Ga{\"e}l and Grisel, O},
  journal={packages. python. org/joblib},
  year={2009}
}
    
Link

Videos

External links

Papers that use or compare EBMs

Books that cover EBMs

External tools

Contact us

There are multiple ways to get in touch:

If a tree fell in your random forest, would anyone notice?