Ahmad

Wanna replace Anthropic/OpenAI? START WITH THIS

The bible for running LLMs locally is now free to read online

Covers what to use on:

  • Laptop / edge / odd hardware

  • Mac-first workflows

  • Single RTX GPUs

  • 2-4+ NVIDIA / CUDA GPUs

  • General production serving

  • Long-context / MoE / routing

  • NVIDIA max performance

  • Cluster orchestration

Software

  • llama.cpp

  • MLX / MLX-LM

  • ExLlamaV2

  • ExLlamaV3

  • vLLM

  • SGLang

  • TensorRT-LLM

  • NVIDIA Dynamo

You should read this, and if you can't right now, you most definitely wanna bookmark it for later

Open source & local AI FTW
