Generative AI - Advanced - 18 min

Learn GANs — Generator vs Discriminator


Last updated: 2026-05-13.

A Generative Adversarial Network is a two-player game. The Generator takes random noise and produces fake images. The Discriminator looks at real and fake images and tries to tell them apart. The two are trained together in alternating steps: the Generator improves to fool the Discriminator, and the Discriminator improves to catch the Generator. At the ideal equilibrium, the Generator produces images the Discriminator cannot distinguish from real ones.

Math

Generator G: noise z → fake image G(z)
Discriminator D: image → P(real)

Minimax loss:
  min_G max_D  E[log D(x_real)] + E[log(1 − D(G(z)))]
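
To make the objective concrete, here is a minimal sketch that estimates the value inside the minimax from a handful of discriminator outputs. The helper name `gan_value` and the probabilities fed to it are made up for illustration; they are not part of the lesson.

```python
import math

def gan_value(d_real, d_fake):
    """Sample estimate of E[log D(x_real)] + E[log(1 - D(G(z)))].

    d_real: D's outputs P(real) on real images.
    d_fake: D's outputs P(real) on generated images.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# Illustrative (invented) discriminator outputs.
confident = gan_value([0.9, 0.9], [0.1, 0.1])  # D separates real from fake: about -0.21
clueless = gan_value([0.5, 0.5], [0.5, 0.5])   # D cannot tell them apart: about -1.39
fooled = gan_value([0.1, 0.1], [0.9, 0.9])     # D gets everything wrong: about -4.61
```

D pushes this value up (towards 0), while G pushes it down by making D assign high P(real) to fakes.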

In practice, alternate updates:
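  • Update D to maximise log D(x_real) + log(1 − D(G(z)))   (D thinks real is real, fake is fake)
  • Update G to maximise log D(G(z))   (G wants D to think fake is real). This "non-saturating" form replaces minimising log(1 − D(G(z))), whose gradient shrinks towards zero while D still rejects fakes confidently.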
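
The alternating updates can be sketched on a toy 1-D problem using only the standard library. Everything specific here is an assumption chosen for illustration: the "images" are single numbers drawn from a Gaussian with mean 3.0, the generator is the linear map G(z) = a·z + b, the discriminator is a logistic unit D(x) = sigmoid(w·x + c), and gradients are written out by hand rather than taken from a framework.

```python
import math
import random

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def d_objective(w, c, real, fake):
    """What D maximises: E[log D(x_real)] + E[log(1 - D(G(z)))]."""
    return (mean(math.log(sigmoid(w * x + c)) for x in real)
            + mean(math.log(1.0 - sigmoid(w * g + c)) for g in fake))

def g_objective(w, c, fake):
    """What G maximises (non-saturating form): E[log D(G(z))]."""
    return mean(math.log(sigmoid(w * g + c)) for g in fake)

def d_step(w, c, real, fake, lr):
    """One gradient-ascent step on D's parameters (w, c)."""
    # d/ds log sigmoid(s) = 1 - sigmoid(s); d/ds log(1 - sigmoid(s)) = -sigmoid(s)
    grad_w = (mean((1.0 - sigmoid(w * x + c)) * x for x in real)
              + mean(-sigmoid(w * g + c) * g for g in fake))
    grad_c = (mean(1.0 - sigmoid(w * x + c) for x in real)
              + mean(-sigmoid(w * g + c) for g in fake))
    return w + lr * grad_w, c + lr * grad_c

def g_step(a, b, w, c, noise, lr):
    """One gradient-ascent step on G's parameters (a, b), with D held fixed."""
    # Chain rule through the fake sample g = a*z + b.
    upstream = [(1.0 - sigmoid(w * (a * z + b) + c)) * w for z in noise]
    grad_a = mean(u * z for u, z in zip(upstream, noise))
    grad_b = mean(upstream)
    return a + lr * grad_a, b + lr * grad_b

def train(steps=200, batch=64, lr=0.05, seed=0):
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator starts by emitting plain noise
    w, c = 0.0, 0.0   # discriminator starts by outputting 0.5 everywhere
    for _ in range(steps):
        real = [rng.gauss(3.0, 0.5) for _ in range(batch)]   # assumed "real" data
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [a * z + b for z in noise]
        w, c = d_step(w, c, real, fake, lr)    # update D with G frozen
        a, b = g_step(a, b, w, c, noise, lr)   # update G with D frozen
    return a, b, w, c
```

Each step freezes one player while the other moves. This toy setup makes no promise of convergence: even simple GANs like this one can oscillate around the target rather than settle on it.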

Nash equilibrium: D outputs 0.5 for every input, and G generates samples from the real data distribution.
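
The equilibrium can be checked numerically. The sketch below relies on a standard result not derived in this lesson: for a fixed generator, the best possible discriminator is D*(x) = p_data(x) / (p_data(x) + p_g(x)). The 1-D Gaussian densities and helper names are assumptions chosen for illustration.

```python
import math

def gaussian_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def optimal_d(x, p_data, p_g):
    """Best-response discriminator for fixed densities p_data and p_g."""
    return p_data(x) / (p_data(x) + p_g(x))

# Assumed toy densities: real data ~ N(3, 0.5).
p_data = lambda x: gaussian_pdf(x, 3.0, 0.5)
p_g_off = lambda x: gaussian_pdf(x, 0.0, 1.0)      # generator far from the data
p_g_matched = lambda x: gaussian_pdf(x, 3.0, 0.5)  # generator matches the data

grid = [i * 0.5 for i in range(13)]  # x = 0.0, 0.5, ..., 6.0
d_off = [optimal_d(x, p_data, p_g_off) for x in grid]
d_matched = [optimal_d(x, p_data, p_g_matched) for x in grid]

# With D = 0.5 everywhere, the minimax value is log 0.5 + log 0.5 = -log 4.
value_at_equilibrium = math.log(0.5) + math.log(1.0 - 0.5)
```

While the generator is off target, D* is close to 1 near the real data and close to 0 near the fakes. Once the two densities coincide, D* is exactly 0.5 at every point, so D has no signal left to exploit.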

Two networks · opposing losses · iteratively trained

Why GANs are tricky

  • Mode collapse: generator finds one type of output that fools D, then produces only that. Diversity collapses.
  • Training instability: gradients can vanish or explode, and the two networks can oscillate or diverge instead of converging. Tricks (Wasserstein loss, gradient penalty, spectral normalisation) help.
  • No likelihood: can't directly evaluate the probability of an image — only generate samples.
  • Hyperparameter sensitivity: learning rate, batch size, architecture all matter a lot.
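
Of the stabilising tricks listed above, spectral normalisation is the easiest to show in isolation: divide a discriminator weight matrix by its largest singular value so the layer cannot stretch its input by more than a factor of 1. The sketch below estimates that singular value by power iteration using plain Python lists. The helper names, the fixed starting vector and the 2×2 example matrix are assumptions for illustration; library implementations typically use a random start and run one iteration per training step.

```python
import math

def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def normalise(vector):
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

def spectral_norm(weight, iterations=50):
    """Estimate the largest singular value of `weight` by power iteration."""
    weight_t = transpose(weight)
    # Fixed start keeps the example deterministic; a random start is more usual.
    v = normalise([1.0] * len(weight[0]))
    for _ in range(iterations):
        u = normalise(matvec(weight, v))    # left singular vector estimate
        v = normalise(matvec(weight_t, u))  # right singular vector estimate
    return sum(a * b for a, b in zip(u, matvec(weight, v)))  # u^T W v

def spectrally_normalise(weight, iterations=50):
    """Return weight / sigma_max so the largest singular value becomes 1."""
    sigma = spectral_norm(weight, iterations)
    return [[x / sigma for x in row] for row in weight]

weight = [[3.0, 0.0], [0.0, 1.0]]          # singular values are 3 and 1
normalised = spectrally_normalise(weight)  # approximately [[1, 0], [0, 1/3]]
```

Capping each layer this way limits how sharply D's output can change with its input, which is one way to keep the gradients passed to G from blowing up.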

Notable GAN milestones

  • DCGAN (2015): among the first convolutional GANs to train stably and produce coherent 64×64 images of faces and bedrooms.
  • Progressive GAN (2017): grow resolution during training, hit 1024×1024 photorealism.
  • StyleGAN (2018-2020): per-layer style injection produces controllable, lifelike faces (thispersondoesnotexist.com).
  • BigGAN (2018): class-conditional generation of photorealistic ImageNet images, achieved by scaling up model and batch size.
  • After ~2022, diffusion models took over for high-quality generation, but GANs remain relevant for fast inference and style-consistent generation.

Practice questions

  1. What is a GAN's two-player game?
  2. What is mode collapse and why does it happen?
  3. What does it mean when D outputs 0.5 on every input?
  4. Why have diffusion models largely replaced GANs for high-quality image generation?
