Model Evaluation Metrics in AI - 3D Visual Lesson

A free visual AI and machine learning lesson with an interactive 3D visualization, plain-English theory, and quiz.

Simple theory: Evaluation metrics are numbers that tell you how a model is performing. Different metrics expose different mistakes, so accuracy alone is often not enough.

Your model is 99% accurate. Sounds great. But your dataset is 99% negative class. A model that always predicts 'negative' is 99% accurate and completely useless. Accuracy is a lie on imbalanced datasets. Precision, recall, F1, and the confusion matrix tell you what's actually happening.
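The scenario above can be reproduced in a few lines. This is a minimal sketch on a made-up dataset (990 negatives, 10 positives, chosen only to match the 99% figure); the variable names are illustrative and not part of the lesson.

```python
# Hypothetical dataset: 990 negatives (0) and 10 positives (1), i.e. 99% negative.
y_true = [0] * 990 + [1] * 10

# A "model" that ignores its input and always predicts the negative class.
y_pred = [0] * len(y_true)

# Accuracy: fraction of predictions that match the true label.
correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)

# Recall: fraction of actual positives that the model found.
true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
actual_positives = sum(y_true)
recall = true_positives / actual_positives

print(f"accuracy = {accuracy:.2f}")  # accuracy = 0.99
print(f"recall   = {recall:.2f}")    # recall   = 0.00
```

The accuracy looks excellent, yet the model never finds a single positive case, which is exactly what recall exposes.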

The confusion matrix

For binary classification, every prediction falls into one of four cells:

True Positive (TP): predicted positive, actually positive.
True Negative (TN): predicted negative, actually negative.
False Positive (FP): predicted positive, actually negative (false alarm).
False Negative (FN): predicted negative, actually positive (missed case).

All metrics are derived from these four numbers.

Precision = TP / (TP + FP)
Recall    = TP / (TP + FN)
F1 Score  = 2 × (Precision × Recall) / (Precision + Recall)
Accuracy  = (TP + TN) / (TP + TN + FP + FN)

TP = True Positive   FP = False Positive
TN = True Negative   FN = False Negative

All four metrics come from the same four numbers in the confusion matrix
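As a rough illustration, the four formulas can be computed directly from the four counts. The function names and the sample labels below are hypothetical, and returning 0.0 when a denominator is zero is an assumption made for this sketch (libraries differ on how they handle that case).

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return tp, tn, fp, fn


def compute_metrics(tp, tn, fp, fn):
    """Apply the four formulas; 0.0 on an empty denominator is an assumption."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Made-up example: 4 actual positives, 6 actual negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(tp, tn, fp, fn)  # 3 5 1 1
print(compute_metrics(tp, tn, fp, fn))
```

Here one positive is missed (FN = 1) and one negative is flagged (FP = 1), so precision and recall are both 3/4 while accuracy is 8/10.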

Choosing the right metric

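Spam filter: prioritise precision. A false positive sends a legitimate email to spam, so false alarms must be rare.
Cancer screening: prioritise recall. A false negative is a missed case, which is far more costly than a false alarm that a follow-up test can rule out.
Credit scoring: use AUC-ROC, which measures how well the model ranks positives above negatives across all thresholds.
Imbalanced classes: prefer F1 or AUC-ROC over accuracy, which mostly reflects the majority class.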
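The trade-off behind these choices shows up when the decision threshold moves. The sketch below uses invented scores and labels; `precision_recall_at` and `auc_roc` are hypothetical helper names, and the AUC is computed with the pairwise definition (the fraction of positive/negative pairs in which the positive gets the higher score, with ties counted as half), not with any particular library's implementation.

```python
# Hypothetical model scores (higher = more likely positive) and true labels.
labels = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]
scores = [0.95, 0.85, 0.80, 0.70, 0.55, 0.45, 0.40, 0.30, 0.20, 0.10]


def precision_recall_at(threshold):
    """Predict positive when score >= threshold, then compute precision and recall."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(y == 1 and p == 1 for y, p in zip(labels, preds))
    fp = sum(y == 0 and p == 1 for y, p in zip(labels, preds))
    fn = sum(y == 1 and p == 0 for y, p in zip(labels, preds))
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # 0.0 here is an assumption
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def auc_roc():
    """Pairwise AUC: chance that a random positive outscores a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# A high threshold favours precision; a low threshold favours recall.
for threshold in (0.75, 0.50, 0.25):
    precision, recall = precision_recall_at(threshold)
    print(f"threshold={threshold:.2f}  precision={precision:.2f}  recall={recall:.2f}")

print(f"AUC-ROC = {auc_roc():.3f}")  # AUC-ROC = 0.875
```

Lowering the threshold from 0.75 to 0.25 raises recall from 0.50 to 1.00 while precision drops from 0.67 to 0.50. AUC-ROC summarises ranking quality without committing to any single threshold.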
