Learn Explainability (SHAP & LIME)

Last updated: 2026-05-13.

Many modern ML models behave like black boxes: you can call them and get a number, but you can't easily see why. For high-stakes decisions (loans, medical diagnoses, parole), 'because the neural network said so' is not an acceptable answer. Explainability tools open the box just enough to attribute each prediction to specific input features. SHAP and LIME are two of the most widely used techniques.

SHAP — SHapley Additive exPlanations

SHAP borrows from cooperative game theory. Imagine each feature is a player in a game, and the model's prediction is the payout. Shapley values fairly distribute the payout among all players by averaging each player's marginal contribution across every possible coalition. SHAP applies this idea to ML predictions: each feature gets a value representing how much it pushed the prediction up or down from the baseline.
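The coalition averaging described above can be sketched in a few lines of plain Python. This is a minimal illustration, not how the SHAP library computes its values: the feature names and the `toy_payout` function are hypothetical stand-ins for "the model's output when only these features are known".

```python
from itertools import combinations
from math import factorial

def shapley_values(players, value):
    """Exact Shapley values by enumerating every coalition.

    `value` maps a frozenset of players to that coalition's payout.
    """
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for size in range(n):
            # Shapley weight for a coalition of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                # Marginal contribution of p when it joins coalition s.
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi

# Hypothetical payout function (an assumption made for illustration only).
def toy_payout(coalition):
    payout = 0.0
    if "income" in coalition:
        payout += 10.0
    if "age" in coalition:
        payout += 4.0
    if "income" in coalition and "debt" in coalition:
        payout -= 6.0  # interaction: debt only matters together with income
    return payout

features = ["income", "age", "debt"]
phi = shapley_values(features, toy_payout)
print(phi)
```

In this toy game the interaction penalty of 6 is split equally between `income` and `debt`, so the values come out at roughly 7, 4 and −3, and they sum to the full-coalition payout of 8. The loop visits every subset of the other players, which is why exact computation becomes impractical as the number of features grows.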

Final prediction = baseline (average model output over a background dataset, typically the training set)
                  + Σ_{features}  shap_value(feature)

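Properties:
  ✓ Local accuracy (additivity): values sum exactly to (prediction − baseline)
  ✓ Missingness: a feature that is absent from the input gets a SHAP value of zero
  ✓ Consistency: if a model is changed so that a feature's marginal contribution grows or stays the same, its SHAP value never decreases

Shapley values are the unique additive attribution that satisfies these three axioms. Because the values are computed per prediction, they explain individual outputs, and averaging their absolute values over a dataset gives a global feature ranking.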
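The additivity property is easiest to check on a linear model, where (assuming independent features) the SHAP value of feature j has the closed form w_j · (x_j − mean_j). The sketch below uses made-up weights and a made-up three-row background set; none of these numbers come from the lesson.

```python
# Hypothetical linear model f(x) = bias + sum(w_j * x_j); all numbers are invented.
weights = [2.0, -1.0, 0.5]
bias = 3.0

def predict(x):
    return bias + sum(w * xi for w, xi in zip(weights, x))

# Small made-up background set that defines the baseline.
background = [
    [1.0, 0.0, 2.0],
    [3.0, 2.0, 4.0],
    [2.0, 4.0, 0.0],
]

n = len(background)
feature_means = [sum(row[j] for row in background) / n for j in range(len(weights))]
baseline = sum(predict(row) for row in background) / n  # average prediction

def linear_shap(x):
    # Closed form for linear models under a feature-independence assumption:
    # each feature's attribution is its weight times its deviation from the mean.
    return [w * (xi - m) for w, xi, m in zip(weights, x, feature_means)]

x = [4.0, 1.0, 2.0]
shap_values = linear_shap(x)
reconstructed = baseline + sum(shap_values)

print("baseline:     ", baseline)
print("SHAP values:  ", shap_values)
print("reconstructed:", reconstructed, "vs prediction:", predict(x))
```

The third feature sits exactly at its background mean, so its attribution is zero even though its weight is not. That matches the reading of a SHAP value as a push away from the baseline rather than a measure of how much the model relies on a feature in general.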

LIME — Local Interpretable Model-agnostic Explanations

LIME takes a different approach: don't try to explain the whole model; explain one prediction at a time. To explain a single input x, LIME generates many perturbed versions of x (drop features, flip pixels, mask words), runs them through the black-box model, and fits a simple linear model to the black-box outputs, giving more weight to perturbations that stay close to x. The linear model's coefficients are the local explanation: 'in this neighbourhood of input space, the model behaves approximately like ŷ = w₀ + w₁·x₁ + w₂·x₂ + ⋯'.

LIME minimises:

  L(g) = Σ_{x' near x} π(x') · ( f(x') − g(x') )²  +  Ω(g)

  f = black-box model
  g = simple linear surrogate (the explanation)
  π = locality kernel (closer perturbations weigh more)
  Ω = complexity penalty (prefer sparse explanations)

Fit a small linear model that mimics f near x. Read its weights as the explanation.
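The objective above can be sketched with NumPy alone. This is a simplified illustration rather than the `lime` package's implementation: it assumes Gaussian perturbations for tabular inputs, an exponential kernel for π, and a small ridge penalty standing in for Ω (the original method favours sparse explanations instead). The function name `lime_explain`, its parameters, and the `black_box` model are all hypothetical.

```python
import numpy as np

def lime_explain(f, x, n_samples=500, scale=0.5, kernel_width=1.0,
                 alpha=1e-3, seed=0):
    """Fit a proximity-weighted ridge surrogate g to the black box f around x.

    Returns (weights, intercept) of the local linear model.
    """
    rng = np.random.default_rng(seed)
    # 1. Perturb x with Gaussian noise (one simple choice for tabular data).
    X = x + rng.normal(0.0, scale, size=(n_samples, x.size))
    y = f(X)
    # 2. Locality kernel pi: closer perturbations get larger weights.
    dist = np.linalg.norm(X - x, axis=1)
    pi = np.exp(-(dist ** 2) / kernel_width ** 2)
    # 3. Weighted ridge regression on features centred at x, plus an intercept.
    A = np.hstack([X - x, np.ones((n_samples, 1))])
    reg = alpha * np.eye(A.shape[1])
    reg[-1, -1] = 0.0  # do not penalise the intercept
    coef = np.linalg.solve(A.T @ (pi[:, None] * A) + reg, A.T @ (pi * y))
    return coef[:-1], coef[-1]

# Hypothetical nonlinear black box (an assumption for illustration).
def black_box(X):
    return np.sin(X[:, 0]) + X[:, 1] ** 2

x0 = np.array([0.0, 1.0])
weights, intercept = lime_explain(black_box, x0, scale=0.1)
print("local weights:", weights)
```

At `x0` the true gradient of this black box is (1, 2), so with a small perturbation scale the surrogate's weights should land close to those values. Widening `scale` or `kernel_width` makes the surrogate average over a larger region, and the weights then drift away from the local slope.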

SHAP vs LIME — when to use which

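  • SHAP: theoretically grounded, and attributions are comparable across predictions and models because they share one baseline and one set of axioms. Exact Shapley values need a number of model evaluations that grows exponentially with the feature count, so model-agnostic variants such as KernelSHAP rely on sampling and can be slow on large models. Use for audits, regulatory reporting, comparing two models, and detecting global bias.
  • LIME: usually faster and model-agnostic by design, but explanations can change between runs because they depend on random perturbation sampling and on the kernel width. Use for ad-hoc debugging of single predictions.
  • Tree models: TreeSHAP computes exact SHAP values for tree ensembles in polynomial time, so the cost is usually negligible. Use SHAP.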
  • Image models: both have visualisation variants — SHAP image plots vs LIME superpixel masks.
  • Text models: SHAP highlights token contributions; LIME highlights word-level importance.
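The run-to-run instability noted for LIME in the list above can be seen with a small experiment. This sketch uses a simplified local surrogate (weighted least squares on Gaussian perturbations) rather than the `lime` package; `local_weights`, `black_box`, and the chosen sample count are assumptions made for illustration.

```python
import numpy as np

def local_weights(f, x, seed, n_samples=30, scale=1.0):
    """Weighted least-squares surrogate around x; returns the feature weights."""
    rng = np.random.default_rng(seed)
    X = x + rng.normal(0.0, scale, size=(n_samples, x.size))
    pi = np.exp(-np.sum((X - x) ** 2, axis=1))        # locality kernel
    A = np.hstack([X - x, np.ones((n_samples, 1))])   # features + intercept
    sw = np.sqrt(pi)[:, None]                         # sqrt weights for lstsq
    coef, *_ = np.linalg.lstsq(sw * A, sw[:, 0] * f(X), rcond=None)
    return coef[:-1]

# Hypothetical nonlinear black box (an assumption for illustration).
def black_box(X):
    return np.sin(3 * X[:, 0]) + X[:, 0] * X[:, 1]

x0 = np.array([0.5, 0.5])

# Same input, same model, twenty different random seeds.
runs = np.array([local_weights(black_box, x0, seed=s) for s in range(20)])
spread = runs.std(axis=0)  # per-feature standard deviation across seeds

# Fixing the seed makes the explanation reproducible.
same_seed_a = local_weights(black_box, x0, seed=0)
same_seed_b = local_weights(black_box, x0, seed=0)

print("spread across seeds:", spread)
print("reproducible with fixed seed:", np.array_equal(same_seed_a, same_seed_b))
```

Fixing the seed, or drawing more perturbations so that sampling noise averages out, are the usual ways to get explanations that can be reproduced in a report.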

Practice questions

  1. What does a SHAP value for a feature represent?
  2. What is LIME's core idea?
  3. When would you prefer SHAP over LIME?
  4. Why are SHAP/LIME explanations not causal?
