How to Use SHAP and LIME for Model Interpretability
1. Introduction to SHAP and LIME
SHAP (SHapley Additive exPlanations) explains model predictions using Shapley values from cooperative game theory: it attributes each feature's contribution to a prediction in a consistent, theoretically grounded way.
LIME (Local Interpretable Model-agnostic Explanations) explains an individual prediction by fitting a simple, interpretable (usually linear) surrogate model that approximates the black-box model in the neighbourhood of that prediction.
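In formula terms, a SHAP explanation decomposes a single prediction additively:

f(x) = \phi_0 + \sum_{i=1}^{M} \phi_i

where f(x) is the model's prediction for instance x, \phi_0 is the base value (the expected model output), and \phi_i is the SHAP value of feature i out of M features.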
2. When to Use SHAP and LIME
| Criteria | SHAP | LIME |
| --- | --- | --- |
| Theoretical soundness | ✅ Strong foundation (Shapley values) | ❌ Heuristic-based |
| Speed | ❌ Can be slow (fast for tree models, slow with KernelExplainer) | ✅ Faster (approximate) |
| Local explanations | ✅ Yes | ✅ Yes |
| Global interpretability | ✅ Yes (SHAP values can be aggregated) | ❌ Primarily local |
| Model agnostic | ✅ Yes (via KernelExplainer) | ✅ Yes |
3. How to Use SHAP
Installation
bash
pip install shap
Basic Usage (for tree-based models like XGBoost, LightGBM)
python
import shap
import xgboost

# Load a regression dataset and train a model
# (California housing; the old Boston housing dataset has been removed from shap)
X, y = shap.datasets.california()
model = xgboost.XGBRegressor().fit(X, y)

# Create a SHAP explainer (automatically uses the fast TreeExplainer for XGBoost)
explainer = shap.Explainer(model)

# Compute SHAP values
shap_values = explainer(X)

# Visualize SHAP values for a single prediction
shap.plots.waterfall(shap_values[0])

# Visualize a summary across all predictions
shap.plots.beeswarm(shap_values)
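The comparison table above notes that SHAP is model agnostic via KernelExplainer. Below is a minimal sketch of that workflow, assuming scikit-learn is installed; the k-nearest-neighbours regressor is just a stand-in for any non-tree model. KernelSHAP is much slower, so it is typically run on a small background sample and only a handful of rows.
python
import shap
from sklearn.neighbors import KNeighborsRegressor

# Any model with a predict function can be explained with KernelExplainer
X, y = shap.datasets.california()
model = KNeighborsRegressor().fit(X, y)

# A small background sample keeps the kernel estimation tractable
background = shap.sample(X, 50)
explainer = shap.KernelExplainer(model.predict, background)

# Explain only a few rows: KernelSHAP scales poorly with data size
shap_values = explainer.shap_values(X.iloc[:10], nsamples=100)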
4. How to Use LIME
Installation
bash
pip install lime
Basic Usage
python
import lime.lime_tabular
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris

# Load data and train a model
iris = load_iris()
X, y = iris.data, iris.target
model = RandomForestClassifier().fit(X, y)

# Create a LIME explainer for tabular data
explainer = lime.lime_tabular.LimeTabularExplainer(
    training_data=X,
    feature_names=iris.feature_names,
    class_names=iris.target_names,
    mode='classification'
)

# Explain a single prediction
i = 0
exp = explainer.explain_instance(X[i], model.predict_proba, num_features=4)

# Show the explanation (in a Jupyter notebook)
exp.show_in_notebook()
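If you are working outside a notebook, the same explanation object (exp from the block above) can be inspected as a plain list of feature weights or saved as a standalone HTML file; the file name below is just an example.
python
# Feature weights of the local surrogate model, as (feature, weight) pairs
print(exp.as_list())

# Save the interactive explanation to a standalone HTML file
exp.save_to_file('lime_explanation.html')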
5. Visualizations
SHAP:
waterfall plot: Shows feature contribution for a single prediction.
beeswarm plot: Shows distribution of SHAP values for all predictions.
bar plot: Shows mean absolute SHAP values per feature.
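As a quick sketch, the bar plot can be produced directly from the shap_values object computed in section 3:
python
# Mean absolute SHAP value per feature, shown as a global importance ranking
shap.plots.bar(shap_values)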
LIME:
Interactive HTML output showing the predicted class probabilities and the weight of each feature in the local surrogate model.
6. Tips and Best Practices
Use SHAP for consistency and accuracy, especially with tree-based models.
Use LIME when speed is more important, or for quick prototyping.
Combine with domain knowledge for better trust in the model.
Always use explanations to validate and debug models, not as a substitute for understanding the data or the model itself.
7. Conclusion
Both SHAP and LIME help make black-box models more interpretable by explaining why a prediction was made. Use them to:
Build trust with stakeholders.
Debug model decisions.
Identify biased or unexpected behavior.