I cut my Claude Code bill by 70% with 3 GitHub repos
Tested 10 repos that claim to save tokens. Ran each one for a full day on real projects
Most were just hype. 3 actually changed my workflow
Here's what they do and how to set each one up:
- RTK (Rust Token Killer)
The problem: the output of every terminal command you run ("npm install", "git log", test suites, builds) gets dumped raw into Claude's context. You're paying for Claude to read thousands of lines of noise
The fix: RTK is a CLI proxy that sits between your terminal and Claude. It filters output before it hits context
How to set it up:
- install the binary (one command, zero dependencies)
- it auto-detects Claude Code sessions
- runs silently in the background
- no config needed, works out of the box
My result: 60-90% reduction on terminal-heavy sessions
Best for: developers who run builds, tests, or installs frequently during Claude sessions
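To make the idea concrete, here is a minimal sketch of what "filtering output before it hits context" can look like. This is not RTK's code or its actual rules. The `KEEP` pattern, the `filter_output` name and the `tail` parameter are all made up for illustration: keep lines that look like problems, keep the last few lines, and replace the rest with a one-line count.

```python
import re

# Hypothetical pattern for lines worth keeping; not taken from RTK.
KEEP = re.compile(r"error|fail|warn|exception", re.IGNORECASE)


def filter_output(raw: str, tail: int = 5) -> str:
    """Keep problem-looking lines plus the last `tail` lines of command output."""
    lines = raw.splitlines()
    # Indices of lines that match the "looks like a problem" pattern.
    keep_idx = {i for i, line in enumerate(lines) if KEEP.search(line)}
    # Always keep the end of the output, where summaries usually live.
    keep_idx.update(range(max(0, len(lines) - tail), len(lines)))
    kept = [lines[i] for i in sorted(keep_idx)]
    dropped = len(lines) - len(kept)
    if dropped:
        kept.append(f"[{dropped} lines omitted]")
    return "\n".join(kept)
```

On a 102-line install log with one failing test, this would pass through the failure, the final line and the omission note instead of all 102 lines.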
- Caveman Claude
The problem: Claude responds like a teacher by default: full explanations, caveats, context. Helpful when learning, but expensive when shipping
The fix: one line in your CLAUDE.md that rewrites Claude's output behavior to be compressed and terse
How to set it up:
- clone the repo
- copy the prompt snippet into your project's CLAUDE.md
- DONE! Claude immediately responds in compressed mode
- Toggle it off anytime by removing the line
My result: 65-75% fewer output tokens with identical code quality
I compared outputs side by side for a full day: same code, same logic, far fewer tokens
Best for: experienced developers or Claude Code users who don't need Claude to explain what it's doing
- Context Mode
The problem: if you use MCP tools (Playwright, GitHub API, browser, any external tool), their raw responses silently flood your context. A single page read can eat thousands of tokens
The fix: Context Mode sandboxes all tool output into SQLite and passes only clean summaries into your conversation
How to set it up:
- install as a Claude Code plugin
- it auto-intercepts MCP tool responses
- raw data goes to local SQLite
- Claude only sees the summary
My result: 98% context reduction on tool-heavy workflows
Best for: anyone running Playwright, browser tools, API calls, or multiple MCP servers
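The "raw data to SQLite, summary to Claude" pattern can be sketched in a few lines. This is an illustration of the general idea only, not Context Mode's implementation. The table name `tool_output`, the functions `store_and_summarize` and `fetch_raw`, and the truncated-preview "summary" are all assumptions.

```python
import sqlite3


def store_and_summarize(conn, tool, payload, preview_chars=200):
    """Store the full tool response locally and return only a short summary."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tool_output "
        "(id INTEGER PRIMARY KEY, tool TEXT, payload TEXT)"
    )
    cur = conn.execute(
        "INSERT INTO tool_output (tool, payload) VALUES (?, ?)",
        (tool, payload),
    )
    conn.commit()
    row_id = cur.lastrowid
    # Stand-in for a real summary: size plus a short preview of the payload.
    preview = payload[:preview_chars]
    return f"[{tool} #{row_id}] {len(payload)} chars stored; preview: {preview}"


def fetch_raw(conn, row_id):
    """Retrieve the full payload later, only if it is actually needed."""
    row = conn.execute(
        "SELECT payload FROM tool_output WHERE id = ?", (row_id,)
    ).fetchone()
    return row[0] if row else None
```

Only the returned string would enter the conversation. The full payload stays on disk and can be looked up by id.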
How to stack all 3:
- RTK cleans the input (terminal noise)
- Caveman cleans the output (verbose responses)
- Context Mode cleans the tools (MCP bloat)
Each fixes a different layer and together they compound
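A rough way to reason about the stacking: because each tool touches a different slice of your token usage, the overall saving is the sum of each slice's share times the reduction on that slice. The shares below are hypothetical numbers for illustration, not measurements. The reductions reuse the figures quoted above (the low end where a range was given). It also ignores that input and output tokens are priced differently.

```python
def combined_saving(layers):
    """layers maps name -> (share of total tokens, reduction on that share)."""
    total_share = sum(share for share, _ in layers.values())
    if total_share > 1 + 1e-9:
        raise ValueError("shares cannot add up to more than 100% of usage")
    return sum(share * reduction for share, reduction in layers.values())


# Hypothetical split of a session's tokens; the remaining 10% is untouched.
EXAMPLE_LAYERS = {
    "terminal output": (0.40, 0.60),  # RTK
    "responses": (0.20, 0.65),        # Caveman
    "tool results": (0.30, 0.98),     # Context Mode
}

if __name__ == "__main__":
    print(f"{combined_saving(EXAMPLE_LAYERS):.0%}")  # prints 66%
```

Plug in your own split: if most of your usage is in one layer, that tool alone gets you most of the way.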
You don't need all 3, just pick based on your workflow:
heavy terminal output? → start with RTK
output too verbose? → start with Caveman
lots of MCP tools? → start with Context Mode
Quick test right now: run /context in a fresh Claude Code session and check how much context is already used before you type anything
If it's over 20%, you're leaking tokens
All repo links in the comments
Save it.
RTK: https://github.com/rtk-ai/rtk May 5