Learn Diffusion Models

A free visual AI and machine learning lesson with an interactive 3D visualization, plain-English theory, and quiz.

Diffusion models are the engine behind DALL-E, Stable Diffusion, Midjourney, and Sora. The idea is counterintuitive: gradually destroy an image by adding noise, and train a model to reverse that process one step at a time. Once trained, you can start from pure noise and run the reverse process to generate a new image. Unlike GANs, training is stable. Unlike plain VAEs, which tend to produce blurry samples, outputs can be photorealistic. The cost: generating one image takes many forward passes through the network, from a few dozen with modern samplers up to a thousand in the original formulation (vs. one for GANs).

Forward process — simple, no learning

Add Gaussian noise at each step:
  x_t = √(1 − β_t) · x_{t−1} + √β_t · ε   where ε ~ N(0, I) is drawn fresh at each step and β_t is the noise variance at step t

After T = 1000 steps with small β_t, x_T is essentially pure noise.
Closed form: x_t = √(α̃_t) · x_0 + √(1 − α̃_t) · ε   where α̃_t = (1 − β_1) · (1 − β_2) · ... · (1 − β_t), so any x_t can be sampled directly from x_0.
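
The closed form can be sketched in a few lines of NumPy. This is a minimal illustration, not the lesson's own code: the linear schedule endpoints (1e-4 to 0.02), the 8x8 stand-in image, and the names linear_beta_schedule, alpha_bar, and q_sample are all assumptions made for the example. Here α̃_t is taken to be the running product of (1 − β_s) for s = 1..t.

```python
import numpy as np

def linear_beta_schedule(T=1000, beta_start=1e-4, beta_end=0.02):
    # Linearly spaced noise variances (endpoint values are assumed for illustration).
    return np.linspace(beta_start, beta_end, T)

def alpha_bar(betas):
    # Running product of (1 - beta_s): the coefficient written as α̃_t in the text.
    return np.cumprod(1.0 - betas)

def q_sample(x0, t, abar, rng):
    # Jump straight from x_0 to x_t (t is 1-indexed) using the closed form.
    eps = rng.standard_normal(x0.shape)
    a = abar[t - 1]
    x_t = np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps
    return x_t, eps

rng = np.random.default_rng(0)
betas = linear_beta_schedule()
abar = alpha_bar(betas)

x0 = rng.uniform(-1.0, 1.0, size=(8, 8))   # stand-in for a normalized image
x_T, _ = q_sample(x0, t=1000, abar=abar, rng=rng)

# abar[-1] is tiny, so x_T keeps almost nothing of x0 and is nearly pure noise.
print(abar[0], abar[-1])
```

Because the jump to any step is a single expression, training never has to simulate the chain step by step.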

Forward = mechanical · no learning needed

Reverse process — train a UNet

Train a network ε_θ(x_t, t) to predict the noise that was added at step t.

  Loss = || ε − ε_θ(x_t, t) ||²
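
One way to picture a single training example is sketched below. It assumes a linear β schedule and uses zero_model, a hypothetical placeholder that stands in for the UNet; noise_prediction_loss is likewise an illustrative name. The sketch averages the squared error over pixels, which differs from the squared norm above only by a constant factor.

```python
import numpy as np

def noise_prediction_loss(eps_model, x0, abar, rng):
    # Pick a random step t in 1..T for this training example.
    T = len(abar)
    t = int(rng.integers(1, T + 1))
    # Build x_t from x_0 with the closed form, remembering the noise that was used.
    eps = rng.standard_normal(x0.shape)
    a = abar[t - 1]
    x_t = np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps
    # Mean squared error between the true noise and the model's prediction.
    eps_hat = eps_model(x_t, t)
    return float(np.mean((eps - eps_hat) ** 2))

def zero_model(x_t, t):
    # Hypothetical placeholder for the network: always predicts "no noise".
    return np.zeros_like(x_t)

rng = np.random.default_rng(0)
abar = np.cumprod(1.0 - np.linspace(1e-4, 0.02, 1000))  # assumed linear schedule
x0 = rng.uniform(-1.0, 1.0, size=(16, 16))              # stand-in for an image

loss = noise_prediction_loss(zero_model, x0, abar, rng)
print(loss)
```

In real training, eps_model would be a neural network and this loss would be minimized by gradient descent over many images and random steps.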

At sampling time:
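  • Start: x_T ~ N(0, I)   (pure noise)
  • For t = T, T−1, ..., 1:
      predicted noise = ε_θ(x_t, t)
      x_{t−1} = (x_t − (β_t / √(1 − α̃_t)) · ε_θ(x_t, t)) / √(1 − β_t)  +  σ_t · z,   z ~ N(0, I)
      (σ_t² = β_t is a common choice; no noise is added at the final step t = 1)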
  • Output: x_0   (sampled image)
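
The sampling loop above can be sketched as follows, using the standard DDPM update with σ_t² = β_t. Everything named here is illustrative: ddpm_sample is a hypothetical helper, the linear schedule is an assumption, and oracle_eps replaces the trained network with the exact noise for a toy dataset in which every image equals one fixed target.

```python
import numpy as np

def ddpm_sample(eps_model, shape, betas, rng):
    alphas = 1.0 - betas
    abar = np.cumprod(alphas)
    T = len(betas)
    x = rng.standard_normal(shape)            # x_T: pure noise
    for t in range(T, 0, -1):                 # t = T, T-1, ..., 1
        i = t - 1                             # 0-based index into the schedule
        eps_hat = eps_model(x, t)
        # Subtract the scaled predicted noise, then undo the forward scaling.
        mean = (x - betas[i] / np.sqrt(1.0 - abar[i]) * eps_hat) / np.sqrt(alphas[i])
        # Add fresh Gaussian noise on every step except the last one.
        z = rng.standard_normal(shape) if t > 1 else np.zeros(shape)
        x = mean + np.sqrt(betas[i]) * z
    return x

betas = np.linspace(1e-4, 0.02, 1000)         # assumed linear schedule
abar = np.cumprod(1.0 - betas)
target = np.full((4, 4), 0.5)                 # toy "dataset": one constant image

def oracle_eps(x_t, t):
    # Hypothetical stand-in for a trained network: the exact noise
    # when every training image equals `target`.
    a = abar[t - 1]
    return (x_t - np.sqrt(a) * target) / np.sqrt(1.0 - a)

sample = ddpm_sample(oracle_eps, target.shape, betas, np.random.default_rng(0))
print(np.abs(sample - target).max())
```

With a perfect noise predictor for this one-image dataset, the loop lands back on the target; a real model only approximates the noise, so its samples vary with the seed.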

Neural net: usually a UNet with attention layers + time conditioning.
T = 1000 in the original DDPM paper; modern samplers (DDIM, DPM-Solver) reach comparable quality in roughly 20-50 steps.
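
To illustrate how a faster sampler can skip steps, here is a minimal sketch of deterministic DDIM sampling (the η = 0 variant) over an evenly spaced subset of the training steps. The function name ddim_sample, the even spacing, the linear schedule, and the oracle_eps stand-in for a trained network are assumptions made for the example.

```python
import numpy as np

def ddim_sample(eps_model, shape, abar, num_steps, rng):
    T = len(abar)
    # Evenly spaced subset of the training steps, from T down to 1.
    steps = np.linspace(T, 1, num_steps).round().astype(int)
    x = rng.standard_normal(shape)            # start from pure noise
    for k, t in enumerate(steps):
        a_t = abar[t - 1]
        # Cumulative coefficient at the next, less noisy step; 1.0 means no noise left.
        a_prev = abar[steps[k + 1] - 1] if k + 1 < len(steps) else 1.0
        eps_hat = eps_model(x, int(t))
        # Estimate the clean image, then re-noise it to the next step's level.
        x0_hat = (x - np.sqrt(1.0 - a_t) * eps_hat) / np.sqrt(a_t)
        x = np.sqrt(a_prev) * x0_hat + np.sqrt(1.0 - a_prev) * eps_hat
    return x

abar = np.cumprod(1.0 - np.linspace(1e-4, 0.02, 1000))  # assumed linear schedule
target = np.full((4, 4), 0.5)                            # toy one-image "dataset"

def oracle_eps(x_t, t):
    # Hypothetical stand-in for a trained network (exact noise for `target`).
    a = abar[t - 1]
    return (x_t - np.sqrt(a) * target) / np.sqrt(1.0 - a)

sample = ddim_sample(oracle_eps, target.shape, abar, num_steps=50,
                     rng=np.random.default_rng(0))
print(np.abs(sample - target).max())
```

The loop calls the network 50 times instead of 1000, which is where the speedup comes from; the model itself is unchanged.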

Predict the noise · subtract a bit · iterate

Why diffusion works so well

Stable training: just predict noise — no two-player game like GANs.
High fidelity: each step makes only a small correction, so no single prediction has to produce the whole image at once.
Diversity: different noise seeds → different images.
Conditioning: add text, class, or image guidance to ε_θ → DALL-E, Stable Diffusion.
Slow inference: each image needs one forward pass through the UNet per sampling step (mitigated by latent diffusion + faster solvers).
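
The conditioning point can be made concrete with classifier-free guidance, one widely used way to steer ε_θ with a text or class condition. The lesson does not name this technique, so treat the sketch as an assumption: guided_eps, toy_model, and the three-argument model signature are hypothetical.

```python
import numpy as np

def guided_eps(eps_model, x_t, t, cond, guidance_scale):
    # Two evaluations per step: one ignoring the condition, one using it.
    eps_uncond = eps_model(x_t, t, None)
    eps_cond = eps_model(x_t, t, cond)
    # Move from the unconditional prediction toward (and past) the conditional one.
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

def toy_model(x_t, t, cond):
    # Hypothetical stand-in for a conditional network: the condition shifts the output.
    shift = 0.0 if cond is None else float(cond)
    return x_t - shift

x_t = np.zeros((2, 2))
eps = guided_eps(toy_model, x_t, t=10, cond=1.0, guidance_scale=2.0)
print(eps)
```

A scale of 1 reproduces the plain conditional prediction, and larger values follow the condition more strongly. The two evaluations per step also add to the inference cost noted above.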
