
microsoft / responsible-ai-toolbox

Responsible AI Toolbox is a suite of tools providing model and data exploration and assessment user interfaces and libraries that enable a better understanding of AI systems. These interfaces and libraries empower developers and stakeholders of AI systems to develop and monitor AI more responsibly, and take better data-driven actions.


Top Related Projects

  • explainerdashboard: Quickly build Explainable AI dashboards that show the inner workings of so-called "blackbox" machine learning models.
  • shap: A game theoretic approach to explain the output of any machine learning model.
  • interpret: Fit interpretable models. Explain blackbox machine learning.
  • lime: Explaining the predictions of any machine learning classifier.
  • xai: An eXplainability toolbox for machine learning.
  • AIX360: Interpretability and explainability of data and machine learning models.

Quick Overview

The Responsible AI Toolbox is an open-source project by Microsoft that provides a set of tools for implementing responsible AI practices. It offers components for model interpretability, fairness assessment, error analysis, and causal inference, helping developers and data scientists build more transparent, accountable, and ethical AI systems.

Pros

  • Comprehensive suite of tools for various aspects of responsible AI
  • Integration with popular machine learning frameworks like scikit-learn and PyTorch
  • Extensive documentation and examples for easy adoption
  • Active development and support from Microsoft

Cons

  • Steep learning curve for users new to responsible AI concepts
  • Limited support for some advanced machine learning models
  • Primarily focused on tabular data, with less support for other data types
  • May require additional computational resources for large-scale projects

Code Examples

  1. Loading data, training a model, and launching the Responsible AI dashboard:
from raiwidgets import ResponsibleAIDashboard
from responsibleai import RAIInsights
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
import pandas as pd

# Load data
data = pd.read_csv("your_dataset.csv")
X = data.drop("target", axis=1)
y = data["target"]

# Split data and train model
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = RandomForestClassifier().fit(X_train, y_train)

# RAIInsights expects train/test DataFrames that still include the target column
train_data = X_train.assign(target=y_train)
test_data = X_test.assign(target=y_test)

# Compute Responsible AI insights and launch the dashboard
rai_insights = RAIInsights(model, train_data, test_data, "target",
                           task_type="classification", categorical_features=[])
rai_insights.compute()
ResponsibleAIDashboard(rai_insights)
  2. Performing a fairness assessment:
from fairlearn.metrics import MetricFrame, selection_rate

# Calculate the selection rate for each group of a sensitive feature
sr = MetricFrame(
    metrics=selection_rate,
    y_true=y_test,
    y_pred=model.predict(X_test),
    sensitive_features=X_test["sensitive_attribute"],
)

print(sr.by_group)
  3. Explaining model predictions:
from interpret import set_visualize_provider, show
from interpret.provider import InlineProvider
from interpret.blackbox import ShapKernel

set_visualize_provider(InlineProvider())

# Create a SHAP kernel explainer around the model's prediction function
explainer = ShapKernel(model.predict_proba, X_train)

# Explain a sample of test predictions and visualize the result
local_explanation = explainer.explain_local(X_test[:20], y_test[:20])
show(local_explanation)
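
The MetricFrame used in the fairness example also accepts a dictionary of metrics, which makes it easy to compare performance across groups; a short sketch, assuming the same variables as in example 2:

from sklearn.metrics import accuracy_score
from fairlearn.metrics import MetricFrame, selection_rate

# Disaggregate accuracy and selection rate by the same sensitive feature
mf = MetricFrame(
    metrics={"accuracy": accuracy_score, "selection_rate": selection_rate},
    y_true=y_test,
    y_pred=model.predict(X_test),
    sensitive_features=X_test["sensitive_attribute"],
)

print(mf.by_group)      # per-group values for each metric
print(mf.difference())  # largest between-group gap for each metric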

Getting Started

To get started with the Responsible AI Toolbox:

  1. Install the package:

    pip install raiwidgets
    
  2. Import the necessary modules:

    from raiwidgets import ResponsibleAIDashboard
    from responsibleai import RAIInsights
    from sklearn.model_selection import train_test_split
    from sklearn.ensemble import RandomForestClassifier
    import pandas as pd

  3. Load your data, split it, and train a model:

    data = pd.read_csv("your_dataset.csv")
    train_data, test_data = train_test_split(data, test_size=0.2, random_state=42)
    model = RandomForestClassifier().fit(
        train_data.drop("target", axis=1), train_data["target"]
    )

  4. Build the RAIInsights object and launch the Responsible AI dashboard:

    rai_insights = RAIInsights(model, train_data, test_data, "target",
                               task_type="classification", categorical_features=[])
    rai_insights.compute()
    ResponsibleAIDashboard(rai_insights)

This will open an interactive dashboard where you can explore various aspects of your model's performance and fairness.

Competitor Comparisons

Quickly build Explainable AI dashboards that show the inner workings of so-called "blackbox" machine learning models.

Pros of explainerdashboard

  • More focused on creating interactive dashboards for model explanations
  • Easier to set up and use for quick model insights
  • Supports a wider range of machine learning models out-of-the-box

Cons of explainerdashboard

  • Less comprehensive in terms of responsible AI practices
  • Fewer advanced features for bias detection and mitigation
  • Limited integration with enterprise-level AI development workflows

Code Comparison

explainerdashboard:

from explainerdashboard import ClassifierExplainer, ExplainerDashboard
explainer = ClassifierExplainer(model, X, y)
ExplainerDashboard(explainer).run()

responsible-ai-toolbox:

from raiwidgets import ResponsibleAIDashboard
from responsibleai import RAIInsights

rai_insights = RAIInsights(model, train_data, test_data, target_column, task_type='classification')
rai_insights.compute()
ResponsibleAIDashboard(rai_insights)

Both libraries offer easy-to-use interfaces for creating dashboards, but responsible-ai-toolbox provides more comprehensive tools for responsible AI practices, while explainerdashboard focuses on quick and interactive model explanations.


A game theoretic approach to explain the output of any machine learning model.

Pros of SHAP

  • More focused on model interpretability and feature importance
  • Wider range of supported model types and frameworks
  • More established project with a larger community and ecosystem

Cons of SHAP

  • Less comprehensive in terms of responsible AI practices
  • Narrower scope, primarily focused on explanations rather than broader AI ethics
  • May require more manual integration with other tools for a complete responsible AI workflow

Code Comparison

SHAP:

import shap
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
shap.summary_plot(shap_values, X)

Responsible AI Toolbox:

from raiwidgets import ExplanationDashboard
ExplanationDashboard(global_explanation, model, dataset, true_y, features)

Summary

SHAP is a specialized library for model interpretability, offering deep insights into feature importance across various model types. The Responsible AI Toolbox provides a more comprehensive suite of tools for ethical AI development, including interpretability, fairness assessment, and error analysis. SHAP may be preferred for in-depth explanations, while the Responsible AI Toolbox offers a broader approach to responsible AI practices.

Fit interpretable models. Explain blackbox machine learning.

Pros of Interpret

  • More focused on model interpretability and explanation techniques
  • Supports a wider range of interpretability methods (e.g., SHAP, LIME, EBM)
  • Lightweight and easier to integrate into existing ML workflows

Cons of Interpret

  • Less comprehensive in terms of responsible AI features
  • Fewer tools for fairness assessment and mitigation
  • Limited support for error analysis and model debugging

Code Comparison

Interpret:

from interpret import show
from interpret.glassbox import ExplainableBoostingClassifier

ebm = ExplainableBoostingClassifier()
ebm.fit(X_train, y_train)

show(ebm.explain_global())

Responsible AI Toolbox:

from raiwidgets import ResponsibleAIDashboard
from responsibleai import RAIInsights

rai_insights = RAIInsights(model, train_data, test_data, target_column, task_type='classification')
rai_insights.compute()

ResponsibleAIDashboard(rai_insights)

Interpret focuses on providing interpretability methods, while Responsible AI Toolbox offers a more comprehensive suite of responsible AI tools, including fairness assessment, error analysis, and model explanations. Interpret is more suitable for users primarily interested in model interpretability, while Responsible AI Toolbox is better for those seeking a broader range of responsible AI features.


Lime: Explaining the predictions of any machine learning classifier

Pros of LIME

  • Lightweight and focused on a single explainability technique
  • Easier to integrate into existing projects due to its simplicity
  • Well-established and widely used in the ML community

Cons of LIME

  • Limited to local interpretability, lacking global model explanations
  • Fewer features compared to comprehensive toolboxes
  • May require additional libraries for a complete explainability solution

Code Comparison

LIME:

from lime import lime_tabular
explainer = lime_tabular.LimeTabularExplainer(X_train)
exp = explainer.explain_instance(X_test[0], clf.predict_proba)

Responsible AI Toolbox:

from raiwidgets import ExplanationDashboard
ExplanationDashboard(global_explanation, model, dataset, true_y, features)

The LIME code focuses on explaining a single instance, while the Responsible AI Toolbox provides a more comprehensive dashboard for model explanations.

Summary

LIME is a specialized tool for local interpretability, offering simplicity and ease of integration. The Responsible AI Toolbox, on the other hand, provides a more comprehensive suite of tools for responsible AI practices, including fairness, interpretability, and error analysis. While LIME excels in its specific use case, the Responsible AI Toolbox offers a broader range of features for a more holistic approach to AI development and deployment.


XAI - An eXplainability toolbox for machine learning

Pros of xai

  • Lightweight and focused on interpretability techniques
  • Supports multiple programming languages (Python, R, Java)
  • Includes a variety of XAI methods like LIME, SHAP, and counterfactual explanations

Cons of xai

  • Less comprehensive than Responsible AI Toolbox in terms of fairness and bias mitigation
  • Smaller community and fewer updates compared to Microsoft's offering
  • Limited integration with popular machine learning frameworks

Code Comparison

xai:

from xai import explain
explanation = explain(model, X, method='lime')

Responsible AI Toolbox:

from raiwidgets import ExplanationDashboard
ExplanationDashboard(global_explanation, dataset, true_y, features)

The xai library focuses on generating explanations with a simple API, while Responsible AI Toolbox provides interactive dashboards for exploring model behavior and fairness metrics.


Interpretability and explainability of data and machine learning models

Pros of AIX360

  • Broader focus on various aspects of AI explainability, including pre-model, in-model, and post-model explanations
  • Includes a diverse set of algorithms for different explainability tasks
  • Provides educational resources and tutorials for understanding AI explainability concepts

Cons of AIX360

  • Less integrated with popular machine learning frameworks compared to Responsible AI Toolbox
  • Fewer interactive visualization tools for non-technical users
  • Less emphasis on fairness assessment and mitigation techniques

Code Comparison

AIX360:

from aix360.algorithms.contrastive import CEMExplainer
explainer = CEMExplainer(model)
explanation = explainer.explain_instance(x, num_samples=1000)

Responsible AI Toolbox:

from raiwidgets import ExplanationDashboard
ExplanationDashboard(global_explanation, local_explanation, dataset, true_y, features)

The AIX360 code focuses on generating explanations using specific algorithms, while Responsible AI Toolbox provides interactive dashboards for visualizing explanations and model insights.


License: MIT

Python packages (PyPI): raiwidgets, responsibleai, erroranalysis, raiutils, rai_test_utils
npm package: model-assessment

Responsible AI Toolbox

Responsible AI is an approach to assessing, developing, and deploying AI systems in a safe, trustworthy, and ethical manner, enabling responsible decisions and actions.

Responsible AI Toolbox is a suite of tools providing a collection of model and data exploration and assessment user interfaces and libraries that enable a better understanding of AI systems. These interfaces and libraries empower developers and stakeholders of AI systems to develop and monitor AI more responsibly, and take better data-driven actions.

(Figure: Responsible AI Toolbox overview)

The Toolbox consists of four repositories:

 

Responsible-AI-Toolbox Repository (this repository)

This repository contains four visualization widgets for model assessment and decision making:
  1. Responsible AI dashboard, a single pane of glass bringing together several mature Responsible AI tools from the toolbox for a holistic responsible assessment and debugging of models and for making informed business decisions. With this dashboard, you can identify model errors, diagnose why those errors are happening, and mitigate them. Moreover, the causal decision-making capabilities provide actionable insights to your stakeholders and customers.
  2. Error Analysis dashboard, for identifying model errors and discovering cohorts of data for which the model underperforms.
  3. Interpretability dashboard, for understanding model predictions. This dashboard is powered by InterpretML.
  4. Fairness dashboard, for understanding a model’s fairness issues using various group-fairness metrics across sensitive features and cohorts. This dashboard is powered by Fairlearn.

Responsible-AI-Toolbox-Mitigations Repository

The Responsible AI Mitigations Library helps AI practitioners explore different measurements and mitigation steps that may be most appropriate when the model underperforms for a given data cohort. The library currently has three modules:
  1. DataProcessing, which offers mitigation techniques for improving model performance for specific cohorts.
  2. DataBalanceAnalysis, which provides metrics for diagnosing errors that originate from data imbalance either in class labels or feature values.
  3. Cohort, which provides classes for handling and managing cohorts, allowing the creation of custom pipelines for each cohort through an easy and intuitive interface. The module also provides techniques for learning different decoupled estimators (models) for different cohorts and combining them in a way that optimizes different definitions of group fairness.

Responsible-AI-Tracker Repository

Responsible AI Toolbox Tracker is a JupyterLab extension for managing, tracking, and comparing results of machine learning experiments for model improvement. Using this extension, users can view models, code, and visualization artifacts within the same framework, enabling fast model iteration and evaluation. Main functionalities include:
  1. Managing and linking model improvement artifacts
  2. Disaggregated model evaluation and comparisons
  3. Integration with the Responsible AI Mitigations library
  4. Integration with mlflow

Responsible-AI-Toolbox-GenBit Repository

The Responsible AI Gender Bias (GenBit) Library helps AI practitioners measure gender bias in Natural Language Processing (NLP) datasets. The main goal of GenBit is to analyze your text corpora and compute metrics that give insights into the gender bias present in a corpus.
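
As an illustration of the GenBit workflow, here is a minimal sketch based on the usage shown in the GenBit repository; the class and argument names are assumptions and should be verified against the installed genbit package:

from genbit.genbit_metrics import GenBitMetrics

# Create a metrics object for English text (constructor arguments assumed from the GenBit docs)
genbit = GenBitMetrics("en", context_window=5, distance_weight=0.95, percentile_cutoff=80)

# Add the corpus as a list of raw (untokenized) strings
corpus = [
    "She is a doctor and he is a nurse.",
    "The engineer presented her design review.",
]
genbit.add_data(corpus, tokenized=False)

# Compute gender-bias metrics over everything added so far
metrics = genbit.get_metrics(output_statistics=True, output_word_list=True)
print(metrics)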

Introducing Responsible AI dashboard

Responsible AI dashboard is a single pane of glass, enabling you to easily flow through different stages of model debugging and decision-making. This customizable experience can be taken in a multitude of directions, from analyzing the model or data holistically, to conducting a deep dive or comparison on cohorts of interest, to explaining and perturbing model predictions for individual instances, and to informing users on business decisions and actions.

(Figure: Responsible AI dashboard)

To achieve these capabilities, the dashboard integrates ideas and technologies from several open-source toolkits in the following areas:

  • Error Analysis powered by Error Analysis, which identifies cohorts of data with higher error rate than the overall benchmark. These discrepancies might occur when the system or model underperforms for specific demographic groups or infrequently observed input conditions in the training data.

  • Fairness Assessment powered by Fairlearn, which identifies which groups of people may be disproportionately negatively impacted by an AI system and in what ways.

  • Model Interpretability powered by InterpretML, which explains blackbox models, helping users understand their model's global behavior, or the reasons behind individual predictions.

  • Counterfactual Analysis powered by DiCE, which shows feature-perturbed versions of the same datapoint that would have received a different prediction outcome, e.g., Taylor's loan was rejected by the model, but it would have been approved if their income had been higher by $10,000.

  • Causal Analysis powered by EconML, which focuses on answering What If-style questions to apply data-driven decision-making – how would revenue be affected if a corporation pursues a new pricing strategy? Would a new medication improve a patient’s condition, all else equal?

  • Data Balance powered by Responsible AI, which helps users gain an overall understanding of their data, identify features receiving the positive outcome more than others, and visualize feature distributions.
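
These capabilities correspond to optional components of the RAIInsights object that backs the dashboard. A minimal sketch of enabling several of them before launching the dashboard, assuming a trained classifier and train/test DataFrames that include a "target" column (the treatment feature name is a placeholder):

from raiwidgets import ResponsibleAIDashboard
from responsibleai import RAIInsights

# Build the insights object for a classification task
rai_insights = RAIInsights(model, train_data, test_data, "target",
                           task_type="classification")

# Opt into the analyses to compute
rai_insights.explainer.add()                                      # model interpretability
rai_insights.error_analysis.add()                                 # error analysis
rai_insights.counterfactual.add(total_CFs=10,
                                desired_class="opposite")         # counterfactual analysis
rai_insights.causal.add(treatment_features=["example_feature"])   # causal analysis (placeholder feature)

# Run the computations and launch the dashboard
rai_insights.compute()
ResponsibleAIDashboard(rai_insights)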

Responsible AI dashboard is designed to achieve the following goals:

  • To help further accelerate engineering processes in machine learning by enabling practitioners to design customizable workflows and tailor Responsible AI dashboards that best fit their model assessment and data-driven decision-making scenarios.
  • To help model developers create end-to-end, fluid debugging experiences and navigate seamlessly through error identification and diagnosis by using interactive visualizations that identify errors, inspect the data, generate global and local model explanations, and potentially inspect problematic examples.
  • To help business stakeholders explore causal relationships in the data and make informed decisions in the real world.

This repository contains Jupyter notebooks with examples that showcase how to use this widget. Get started here.

Installation

Use the following pip command to install the Responsible AI Toolbox.

If running in Jupyter, make sure to restart the Jupyter kernel after installing.

pip install raiwidgets

Responsible AI dashboard Customization

The Responsible AI Toolbox’s strength lies in its customizability. It empowers users to design tailored, end-to-end model debugging and decision-making workflows that address their particular needs. Need some inspiration? Here are some examples of how Toolbox components can be put together to analyze scenarios in different ways:

Please note that model overview (including fairness analysis) and data explorer components are activated by default!  

  • Model Overview -> Error Analysis -> Data Explorer: to identify model errors and diagnose them by understanding the underlying data distribution
  • Model Overview -> Fairness Assessment -> Data Explorer: to identify model fairness issues and diagnose them by understanding the underlying data distribution
  • Model Overview -> Error Analysis -> Counterfactuals Analysis and What-If: to diagnose errors in individual instances with counterfactual analysis (the minimum change that leads to a different model prediction)
  • Model Overview -> Data Explorer -> Data Balance: to understand the root cause of errors and fairness issues introduced via data imbalances or lack of representation of a particular data cohort
  • Model Overview -> Interpretability: to diagnose model errors through understanding how the model has made its predictions
  • Data Explorer -> Causal Inference: to distinguish between correlations and causations in the data or decide the best treatments to apply to see a positive outcome
  • Interpretability -> Causal Inference: to learn whether the factors the model has used for decision making have any causal effect on the real-world outcome
  • Data Explorer -> Counterfactuals Analysis and What-If: to address customer questions about what they can do next time to get a different outcome from an AI
  • Data Explorer -> Data Balance: to gain an overall understanding of the data, identify features receiving the positive outcome more than others, and visualize feature distributions

Useful Links

Tabular Examples:

Text Examples:

Vision Examples:

Supported Models

This Responsible AI Toolbox API supports models that are trained on datasets in Python numpy.ndarray, pandas.DataFrame, iml.datatypes.DenseData, or scipy.sparse.csr_matrix format.

The explanation functions of Interpret-Community accept both models and pipelines as input as long as the model or pipeline implements a predict or predict_proba function that conforms to the Scikit convention. If not compatible, you can wrap your model's prediction function into a wrapper function that transforms the output into the format that is supported (predict or predict_proba of Scikit), and pass that wrapper function to your selected interpretability techniques.
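
A minimal sketch of such a wrapper, assuming an arbitrary prediction callable that returns class probabilities (the class name and helper below are illustrative, not part of the toolbox API):

import numpy as np

class PredictProbaWrapper:
    """Wraps an arbitrary probability-returning callable so it exposes the
    scikit-learn style predict / predict_proba interface that the explanation
    functions expect."""

    def __init__(self, proba_fn, classes):
        self._proba_fn = proba_fn            # callable returning per-class probabilities
        self.classes_ = np.asarray(classes)  # class labels in probability-column order

    def predict_proba(self, X):
        # Return a 2-D (n_samples, n_classes) array of probabilities
        return np.asarray(self._proba_fn(X))

    def predict(self, X):
        # Hard labels derived from the highest-probability class
        return self.classes_[np.argmax(self.predict_proba(X), axis=1)]

# Example: wrapped_model = PredictProbaWrapper(my_model_fn, classes=[0, 1])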

If a pipeline script is provided, the explanation function assumes that the running pipeline script returns a prediction. The repository also supports models trained via PyTorch, TensorFlow, and Keras deep learning frameworks.

Other Use Cases

Tools within the Responsible AI Toolbox can also be used with AI models offered as APIs by providers such as Azure Cognitive Services. To see example use cases, see the folders below:

Maintainers