Live Inference

Run DistilGPT-2 entirely in your browser. Watch every stage of the inference pipeline you studied in the course — tokenization, forward pass, logits, softmax, and autoregressive generation — on real model weights. No server, no API key.

DistilGPT-2 is a tiny model (82M parameters, 6 layers), used here for demonstration. Its predictions show the mechanism, not the capability of modern LLMs. Everything you see here maps to the concepts from Modules 1–3.
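
To make the stages concrete before you run the demo, here is a minimal Python sketch of the same pipeline: tokenization, forward pass, logits, softmax, and greedy autoregressive generation. It is not the code this page runs. The six-word vocabulary, the whitespace tokenizer, and the `toy_forward` function are hypothetical stand-ins for DistilGPT-2's byte-level BPE tokenizer and transformer layers, chosen only so the loop is small enough to read in one pass.

```python
import math

# Hypothetical toy vocabulary (assumption). The real model uses a
# byte-level BPE vocabulary with roughly 50k entries.
VOCAB = ["the", "cat", "sat", "on", "mat", "."]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}


def tokenize(text):
    """Stage 1: map text to token ids (whitespace split, toy vocab only)."""
    return [TOKEN_TO_ID[word] for word in text.split()]


def toy_forward(token_ids):
    """Stage 2: stand-in for the forward pass (assumption, not a transformer).

    Returns one logit per vocabulary entry. It simply favors the entry
    that follows the last token in VOCAB order.
    """
    favored = (token_ids[-1] + 1) % len(VOCAB)
    return [3.0 if i == favored else 0.0 for i in range(len(VOCAB))]


def softmax(logits):
    """Stage 3: turn logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def generate(prompt, max_new_tokens):
    """Stage 4: greedy autoregressive loop.

    Each step feeds the whole sequence back in, picks the most probable
    next token, and appends it.
    """
    ids = tokenize(prompt)
    for _ in range(max_new_tokens):
        probs = softmax(toy_forward(ids))
        next_id = max(range(len(probs)), key=probs.__getitem__)
        ids.append(next_id)
    return " ".join(VOCAB[i] for i in ids)


if __name__ == "__main__":
    print(generate("the cat", 3))
```

The loop has the same shape whatever sits behind the forward pass: swapping `toy_forward` for a real model changes the logits, not the control flow. Greedy selection is used here for simplicity; sampling from `probs` instead would be a one-line change.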