Saguaro

Re: Implementation [01]: A Deep Dive into NanoGPT

Authors
  • Shuqi Wang

Re: Implementation Series, Episode 01

Welcome to my open notebook. In this series, I am rebuilding the most influential models in AI history to prepare for my MPhil research. No black boxes, just code and first principles.

Overview

We start where many modern LLM journeys begin: Andrej Karpathy's nanoGPT.

The project bills itself as the simplest, fastest repository for training and fine-tuning medium-sized GPTs. But merely running it isn't enough. In this post, we will first walk through the theory and then build the model from scratch, following Karpathy's implementation.

Thanks for reading. Stay curious!