Modern ML models are black boxes — you can call them and get a number, but you can't easily see why. For high-stakes decisions (loans, medical diagnoses, parole), 'because the neural network said so' is not an acceptable answer. Explainability tools open the box just enough to attribute each prediction to specific input features. SHAP and LIME are the two most widely used techniques.
SHAP — SHapley Additive exPlanations
SHAP borrows from cooperative game theory. Imagine each feature is a player in a game, and the model's prediction is the payout. Shapley values fairly distribute the payout among all players by averaging each player's marginal contribution across every possible coalition. SHAP applies this idea to ML predictions: each feature gets a value representing how much it pushed the prediction up or down from the baseline.
Final prediction = baseline (average model output over the training set)
+ Σ_{features} shap_value(feature)
Properties:
✓ Additive — values sum exactly to (prediction − baseline)
✓ Consistent — if a feature contributes more in one model than another, its SHAP value never decreases
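✓ Globally and locally meaningful
SHAP values are the unique additive attribution that satisfies three axioms: local accuracy (the additive property above), missingness (a feature absent from the input gets zero attribution), and consistency.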
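To make the coalition averaging concrete, here is a minimal brute-force sketch of Shapley values for a three-feature toy model. The model, the random background data, and the helper names (coalition_value, shapley_values) are all assumptions made for illustration; they are not the shap library's API. The sketch also assumes that "removing" a feature means replacing it with values drawn from a background set, which is one common convention among several. Enumerating every coalition is exponential in the number of features, so this is only practical for tiny inputs.

```python
from itertools import combinations
from math import factorial

import numpy as np


def model(X):
    # Hypothetical toy model: feature 0 is purely additive, features 1 and 2 interact.
    return 2.0 * X[:, 0] + X[:, 1] * X[:, 2] - 0.5 * X[:, 2]


def coalition_value(model, x, background, subset):
    """Mean model output when only the features in `subset` take x's values;
    all other features keep their background values."""
    idx = np.array(list(subset), dtype=int)
    mixed = background.copy()
    mixed[:, idx] = x[idx]
    return float(model(mixed).mean())


def shapley_values(model, x, background):
    """Exact Shapley values by enumerating every coalition (exponential cost)."""
    n = len(x)
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for a coalition S of this size: |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                gain = (coalition_value(model, x, background, subset + (i,))
                        - coalition_value(model, x, background, subset))
                phi[i] += weight * gain
    return phi


rng = np.random.default_rng(0)
background = rng.normal(size=(200, 3))   # stand-in for a training set
x = np.array([1.0, 2.0, -1.0])           # the input being explained

baseline = coalition_value(model, x, background, ())  # average model output
phi = shapley_values(model, x, background)
prediction = model(x[None, :])[0]                     # 0.5 for this x

print("baseline:      ", baseline)
print("shap values:   ", phi)
print("baseline + sum:", baseline + phi.sum())
print("prediction:    ", prediction)
```

The last two printed numbers should agree up to floating-point error, which is the additive property in action. Because feature 0 enters the toy model without interactions, its value reduces to 2 · (x₀ − mean of the background's feature 0), regardless of the coalition.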
LIME — Local Interpretable Model-agnostic Explanations
LIME takes a different approach: don't try to explain the whole model; explain one prediction at a time. To explain a single input x, LIME generates many perturbed versions of x (drop features, flip pixels, mask words), runs them through the black-box model, and fits a simple linear model to the resulting predictions, weighting each perturbed sample by its proximity to x. The linear model's coefficients are the local explanation: 'in this neighbourhood of input space, the model behaves approximately like ŷ = w₁·x₁ + w₂·x₂ + ⋯'.
LIME minimises:
L(g) = Σ_{x' near x} π(x') · ( f(x') − g(x') )² + Ω(g)
f = black-box model
g = simple linear surrogate (the explanation)
π = locality kernel (closer perturbations weigh more)
Ω = complexity penalty (prefer sparse explanations)
Fit a small linear model that mimics f near x. Read its weights as the explanation.
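The objective above can be sketched in a few lines of NumPy for tabular inputs. This is a simplified illustration, not the lime package's implementation: the black_box function, the Gaussian perturbation scheme, the kernel width, and the lime_explain helper are assumptions chosen for the example, and a small ridge (L2) penalty stands in for Ω rather than a sparsity-inducing one.

```python
import numpy as np


def black_box(X):
    # Hypothetical nonlinear model standing in for f.
    return np.sin(X[:, 0]) + X[:, 1] ** 2


def lime_explain(f, x, n_samples=2000, scale=0.3, kernel_width=0.75,
                 ridge=1e-3, seed=0):
    """Fit a weighted linear surrogate g around x; return (intercept, weights)."""
    rng = np.random.default_rng(seed)

    # 1. Perturb: sample points x' in a neighbourhood of x.
    X_pert = x + rng.normal(scale=scale, size=(n_samples, x.size))

    # 2. Query the black box on every perturbed point.
    y = f(X_pert)

    # 3. Locality kernel pi(x'): closer perturbations get larger weights.
    dist = np.linalg.norm(X_pert - x, axis=1)
    pi = np.exp(-(dist ** 2) / kernel_width ** 2)

    # 4. Weighted ridge regression for g(x') = b + w . (x' - x).
    Z = np.hstack([np.ones((n_samples, 1)), X_pert - x])
    penalty = ridge * np.eye(Z.shape[1])
    penalty[0, 0] = 0.0  # leave the intercept unpenalised
    A = Z.T @ (pi[:, None] * Z) + penalty
    b = Z.T @ (pi * y)
    coef = np.linalg.solve(A, b)
    return coef[0], coef[1:]


x = np.array([0.0, 1.0])
intercept, weights = lime_explain(black_box, x)
print("local weights:", weights)
```

For this toy model the weights should land near the local gradient of f at x, roughly (cos 0, 2·1) = (1, 2), with small deviations caused by curvature inside the sampled neighbourhood and by sampling noise.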
SHAP vs LIME — when to use which
- SHAP: theoretically grounded, globally consistent, computationally expensive on large models. Use for audits, regulatory reporting, comparing two models, and detecting global bias.
- LIME: faster, model-agnostic by default, but explanations can be unstable across runs (depends on random perturbation sampling). Use for ad-hoc debugging of single predictions.
- Tree models: TreeSHAP computes exact SHAP values in polynomial time instead of enumerating every coalition, so the cost is negligible; use SHAP.
- Image models: both have visualisation variants — SHAP image plots vs LIME superpixel masks.
- Text models: SHAP highlights token contributions; LIME highlights word-level importance.
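The run-to-run instability of LIME mentioned in the list can be checked empirically. The sketch below repeats a LIME-style weighted linear fit under different random seeds and measures how much the weights move. The model, the perturbation scale, and the helper names (local_weights, spread_across_seeds) are hypothetical and chosen only for illustration.

```python
import numpy as np


def black_box(X):
    # Hypothetical nonlinear model, used only for illustration.
    return np.tanh(3.0 * X[:, 0]) + X[:, 0] * X[:, 1]


def local_weights(f, x, n_samples, seed, scale=0.5, kernel_width=0.75):
    """LIME-style weighted least-squares fit around x; returns the feature weights."""
    rng = np.random.default_rng(seed)
    X_pert = x + rng.normal(scale=scale, size=(n_samples, x.size))
    dist = np.linalg.norm(X_pert - x, axis=1)
    # Scaling rows by sqrt(pi) turns weighted least squares into ordinary least squares.
    sqrt_pi = np.sqrt(np.exp(-(dist ** 2) / kernel_width ** 2))
    Z = np.hstack([np.ones((n_samples, 1)), X_pert - x])
    coef, *_ = np.linalg.lstsq(sqrt_pi[:, None] * Z, sqrt_pi * f(X_pert), rcond=None)
    return coef[1:]  # drop the intercept


def spread_across_seeds(n_samples, n_runs=30):
    """Per-feature standard deviation of the weights over repeated runs."""
    x = np.array([0.2, -1.0])
    runs = np.array([local_weights(black_box, x, n_samples, seed)
                     for seed in range(n_runs)])
    return runs.std(axis=0)


spread_small = spread_across_seeds(n_samples=50)
spread_large = spread_across_seeds(n_samples=5000)
print("weight spread with   50 samples:", spread_small)
print("weight spread with 5000 samples:", spread_large)
```

In this setup the spread shrinks as the number of perturbation samples grows, so one practical mitigation is to raise the sample count or fix the seed when explanations need to be reproducible, at the cost of more black-box queries.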