The Ultimate Step-By-Step LLM Engineering Projects Roadmap (2026 Edition)
Build a tokenizer
Learn embeddings
Implement RoPE / ALiBi
Hand-wire attention
Build MHA
Build a Transformer block
Train a mini-former
Compare objectives
Build sampling
Speculative decoding
KV cache
MQA / GQA / MLA
Long context
FlashAttention
Hardware budgets
Toy MoE
Sparse model trade-offs
State-space / linear attention
Diffusion language models
Data pipelines
Synthetic data
Scaling laws
SFT / DPO / RLHF / GRPO
Quantization
Serving stacks
Eval harnesses
RAG
Tool use / agents
Vision-language adapters
Interpretability
Red-team suite
Full capstone model system
One request: choose an open-source AI lab when you make it
Open source is where humanity gets to keep the tools
DM me when you've made it ;)