Research papers every LLM engineer must read:
Attention Is All You Need
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
GPT-3: Language Models are Few-Shot Learners
Scaling Laws for Neural Language Models
Chinchilla: Training Compute-Optimal Large Language Models
InstructGPT: Training Language Models to Follow Instructions with Human Feedback
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
LoRA: Low-Rank Adaptation of Large Language Models
LLaMA: Open and Efficient Foundation Language Models
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
DPO: Direct Preference Optimization