An autoencoder is a neural network with an hourglass shape: it squeezes the input through a narrow bottleneck and then tries to rebuild the input from what passes through. Because the bottleneck (the latent code) is much smaller than the input, it has to retain the information that matters most for reconstruction. After training, that latent code is a compact, learned representation you can use for compression, denoising, anomaly detection, or as input to other models.
Architecture
encoder f: x → z (input → latent)
decoder g: z → x̂ (latent → reconstruction)
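Loss = ||x − x̂||² (squared reconstruction error; averaged over components and examples it is the mean squared error. Other losses can be used depending on the task.)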
Latent dim ≪ input dim. e.g.:
• Image 28×28 = 784 pixels → latent z = 16 numbers (49× compression).
• The bottleneck FORCES the network to discard noise and keep only structure.
Squeeze + rebuild · loss = how well it rebuilds
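The encoder, decoder, and loss above can be sketched in a few lines. This is a minimal illustration, not a trained model: it assumes purely linear maps with small random weights, uses only the Python standard library, and the helper names (make_weights, matvec, encode, decode, reconstruction_loss) are hypothetical. It shows the 784 → 16 → 784 shapes and how the reconstruction loss is computed.

```python
import random

INPUT_DIM = 784   # 28 x 28 image, flattened
LATENT_DIM = 16   # bottleneck size


def make_weights(rows, cols, rng, scale=0.01):
    """Small random weight matrix stored as a list of rows (assumed init)."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]


def encode(W_enc, x):
    """Encoder f: x -> z (input -> latent)."""
    return matvec(W_enc, x)


def decode(W_dec, z):
    """Decoder g: z -> x_hat (latent -> reconstruction)."""
    return matvec(W_dec, z)


def reconstruction_loss(x, x_hat):
    """Squared L2 norm ||x - x_hat||^2, summed over components."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


rng = random.Random(0)
W_enc = make_weights(LATENT_DIM, INPUT_DIM, rng)   # 16 x 784
W_dec = make_weights(INPUT_DIM, LATENT_DIM, rng)   # 784 x 16

x = [rng.random() for _ in range(INPUT_DIM)]       # stand-in for a flattened image
z = encode(W_enc, x)
x_hat = decode(W_dec, z)
loss = reconstruction_loss(x, x_hat)               # large here: weights are untrained

print(len(x), len(z), len(x_hat))   # prints: 784 16 784
print(INPUT_DIM / LATENT_DIM)       # prints: 49.0
```

Training would adjust W_enc and W_dec to reduce this loss over a dataset. A practical model would also add nonlinearities and use a numerical library rather than nested lists.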
What you can do with the latent z
- Compression: z is a learned summary of x — much smaller, often more meaningful than raw bytes.
- Denoising autoencoder: train with noisy inputs and clean targets, so the model learns to remove noise.
- Anomaly detection: inputs unlike the training data tend to reconstruct poorly (high loss), so flag any input whose reconstruction error exceeds a chosen threshold.
- Pretraining: use z as input features to a downstream classifier.
- Generative use: sample z from the latent distribution and decode it to generate new x. A standard autoencoder's latent space is not trained to be smooth, so randomly sampled z often decode to unrealistic outputs. VAEs (next chapter) address this.
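As an illustration of the anomaly-detection use, here is a small sketch of thresholding on reconstruction error. It assumes a hand-fixed linear autoencoder on 2-D points rather than a trained network: the 1-D latent is the projection onto the direction (1, 1)/√2, so points near the line y = x count as "normal". The sample points, the threshold value, and the function names are all made up for the example.

```python
import math

# Hypothetical fixed linear autoencoder for 2-D points. The latent z is the
# projection onto U, so points near the line y = x reconstruct well and
# points far from it reconstruct poorly.
U = (1 / math.sqrt(2), 1 / math.sqrt(2))


def encode(x):
    """Encoder: 2-D point -> 1-D latent."""
    return x[0] * U[0] + x[1] * U[1]


def decode(z):
    """Decoder: 1-D latent -> 2-D reconstruction."""
    return (z * U[0], z * U[1])


def reconstruction_error(x):
    """Squared error ||x - x_hat||^2 for a single point."""
    x_hat = decode(encode(x))
    return (x[0] - x_hat[0]) ** 2 + (x[1] - x_hat[1]) ** 2


def flag_anomalies(points, threshold):
    """Return the points whose reconstruction error exceeds the threshold."""
    return [p for p in points if reconstruction_error(p) > threshold]


points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.0), (1.0, -2.0)]
THRESHOLD = 0.5  # assumed value; in practice chosen from errors on normal data

anomalies = flag_anomalies(points, THRESHOLD)
print(anomalies)  # prints: [(1.0, -2.0)]
```

The first three points lie close to y = x and have errors of about 0.005 or less, while (1.0, -2.0) has an error of about 4.5 and is flagged. With a trained autoencoder the same pattern applies: compute the per-input loss and compare it with a threshold.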