Re: Implementation [01]: A Deep Dive into NanoGPT
- Author: Shuqi Wang
Re: Implementation Series — Episode 01
Welcome to my open notebook. In this series, I am rebuilding the most influential models in AI history to prepare for my MPhil research. No black boxes, just code and first principles.
Overview
We start where many modern LLM journeys begin: Andrej Karpathy's nanoGPT.
Its README describes it as "the simplest, fastest repository for training/finetuning medium-sized GPTs." But merely running it isn't enough. In this post, we first walk through the theory and then build the model from scratch, following Karpathy's implementation.
Thanks for reading. Stay curious!