Building upon our basic GPT, we now implement a Sparse Mixture of Experts (MoE) architecture. This allows us to scale up model capacity (parameters) without proportionally increasing computational cost (FLOPs) during inference.
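To make the capacity/FLOPs trade-off concrete, here is a minimal, dependency-free sketch of top-k expert routing — the core mechanism of a sparse MoE layer. All names (`moe_forward`, the toy experts, the gate weights) are hypothetical illustrations, not the notebook's actual implementation; a real layer would use learned gate and expert networks.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(token, experts, gate_weights, top_k=2):
    """Route one token through the top_k highest-scoring experts.

    Only the selected experts run, so per-token FLOPs stay roughly
    constant even as the total number of experts (parameters) grows.
    """
    # Router: a linear gate scores each expert for this token.
    scores = [sum(w * x for w, x in zip(wv, token)) for wv in gate_weights]
    # Keep only the top_k experts (sparse activation).
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    probs = softmax([scores[i] for i in chosen])
    # Weighted sum of the selected experts' outputs.
    out = [0.0] * len(token)
    for p, i in zip(probs, chosen):
        y = experts[i](token)
        out = [o + p * yi for o, yi in zip(out, y)]
    return out, chosen

# Toy experts: elementwise scalings standing in for per-expert FFN blocks.
experts = [lambda x, s=s: [s * v for v in x] for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[0.1, 0.2], [0.3, 0.1], [0.0, 0.5], [0.2, 0.2]]
out, chosen = moe_forward([1.0, 1.0], experts, gate_weights, top_k=2)
```

With four experts but `top_k=2`, each token only pays for two expert forward passes — the remaining parameters sit idle for that token, which is exactly how MoE decouples parameter count from inference cost.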
My open notebook for mastering AI building blocks.
Ep.01 covers the pretraining phase of a decoder-only, character-level GPT, built from first principles.