pi396 shared this post · May 6
Chris Miller

I cut my Claude Code bill by 70% with 3 GitHub repos

Tested 10 repos that claim to save tokens. Ran each one for a full day on real projects

Most were just hype. 3 actually changed my workflow

Here's what they do and how to set each one up:

  1. RTK (Rust Token Killer)

The problem: the output of every terminal command you run ("npm install", "git log", test suites, builds) gets dumped raw into Claude's context. You're paying for Claude to read thousands of lines of noise

The fix: RTK is a CLI proxy that sits between your terminal and Claude. It filters output before it hits context
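To make the idea concrete, here's a minimal sketch of what "filtering output before it hits context" can look like. This is not RTK's actual code or rule set. The noise patterns, the function name filter_output, and the 20-line cap are all assumptions made up for illustration:

```python
import re

# Hypothetical noise patterns (assumption): a real tool ships its own rules.
NOISE_PATTERNS = [
    re.compile(r"^\s*$"),                        # blank lines
    re.compile(r"^npm (warn|notice)\b", re.I),   # npm warnings and notices
    re.compile(r"^\s*(PASS|ok)\b"),              # passing test lines
]


def filter_output(raw: str, max_lines: int = 20) -> str:
    """Drop noisy lines, collapse consecutive duplicates, and cap the length."""
    kept = []
    for line in raw.splitlines():
        if any(p.search(line) for p in NOISE_PATTERNS):
            continue  # known noise, never reaches the model
        if kept and kept[-1] == line:
            continue  # same line repeated back to back
        kept.append(line)
    if len(kept) > max_lines:
        omitted = len(kept) - max_lines
        kept = kept[:max_lines] + [f"... ({omitted} more lines omitted)"]
    return "\n".join(kept)


if __name__ == "__main__":
    sample = "npm warn deprecated foo\n\nPASS src/a.test.js\nFAIL src/b.test.js\nadded 120 packages"
    print(filter_output(sample))
```

The failing test and the install summary survive, the rest is dropped. The point is that failures and final results are what Claude needs, and everything else is paid-for noise.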

How to set it up:

  • install the binary (one command, zero dependencies)
  • it auto-detects Claude Code sessions
  • runs silently in the background
  • no config needed, works out of the box

My result: 60-90% reduction on terminal-heavy sessions

Best for: developers who run builds, tests, or installs frequently during Claude sessions

  2. Caveman Claude

The problem: by default Claude responds like a teacher, with full explanations, caveats, and context. Helpful when learning, but expensive when shipping

The fix: one line in your CLAUDE.md that rewrites Claude's output behavior to be compressed and terse

How to set it up:

  • clone the repo
  • copy the prompt snippet into your project's CLAUDE.md
  • DONE! Claude immediately responds in compressed mode
  • Toggle it off anytime by removing the line

My result: 65-75% fewer output tokens with identical code quality

I compared outputs side by side for a full day, same code, same logic, half the tokens

Best for: experienced developers or Claude Code users who don't need Claude to explain what it's doing

  3. Context Mode

The problem: if you use MCP tools (Playwright, GitHub API, browser, any external tool), their raw responses silently flood your context. A single page read can eat thousands of tokens

The fix: Context Mode sandboxes all tool output into SQLite and passes only clean summaries into your conversation

How to set it up:

  • install as a Claude Code plugin
  • it auto-intercepts MCP tool responses
  • raw data goes to local SQLite
  • Claude only sees the summary
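The flow above can be sketched in a few lines with Python's built-in sqlite3 module. This is only an illustration of the store-raw, return-summary pattern, not Context Mode's real implementation. The table name tool_output, the function names, and the 80-character preview are hypothetical:

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the local store for raw tool output."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tool_output ("
        "id INTEGER PRIMARY KEY, tool TEXT NOT NULL, raw TEXT NOT NULL)"
    )
    return conn


def store_and_summarize(conn: sqlite3.Connection, tool: str, raw: str,
                        preview_chars: int = 80) -> str:
    """Save the full response locally and return only a short summary."""
    cur = conn.execute(
        "INSERT INTO tool_output (tool, raw) VALUES (?, ?)", (tool, raw)
    )
    conn.commit()
    # Collapse whitespace so the preview is a single compact line.
    preview = " ".join(raw.split())[:preview_chars]
    return f"[{tool} #{cur.lastrowid}] {len(raw)} chars stored; preview: {preview}"


def fetch_raw(conn: sqlite3.Connection, row_id: int) -> str:
    """Pull the full response back only when it is actually needed."""
    row = conn.execute(
        "SELECT raw FROM tool_output WHERE id = ?", (row_id,)
    ).fetchone()
    if row is None:
        raise KeyError(row_id)
    return row[0]


if __name__ == "__main__":
    conn = open_store()
    page = "<html>" + "x" * 5000 + "</html>"
    print(store_and_summarize(conn, "playwright", page))
```

The conversation gets a one-line summary with an id, while the full payload stays on disk and can be fetched by that id if a later step really needs it.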

My result: 98% context reduction on tool-heavy workflows

Best for: anyone running Playwright, browser tools, API calls, or multiple MCP servers

How to stack all 3:

  • RTK cleans the input (terminal noise)
  • Caveman cleans the output (verbose responses)
  • Context Mode cleans the tools (MCP bloat)

Each fixes a different layer and together they compound
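A rough way to reason about how the layers add up: weight each tool's reduction by the share of your tokens that layer is responsible for. The shares below are a made-up split for illustration, not measurements, and the reductions are just illustrative values in the range of the numbers quoted in this post. Your own split will differ, and input and output tokens are usually priced differently, so treat this as token math and not bill math:

```python
def overall_savings(layers: dict[str, tuple[float, float]]) -> float:
    """layers maps a name to (share_of_total_tokens, reduction_fraction)."""
    total_share = sum(share for share, _ in layers.values())
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1.0")
    # Each layer only saves tokens within its own slice of the total.
    return sum(share * reduction for share, reduction in layers.values())


# Hypothetical session breakdown (assumption, not measured data).
example = {
    "terminal output": (0.40, 0.75),   # RTK layer
    "model responses": (0.30, 0.70),   # Caveman layer
    "MCP tool output": (0.20, 0.98),   # Context Mode layer
    "everything else": (0.10, 0.00),   # prompts, file reads, etc.
}

if __name__ == "__main__":
    print(f"overall token reduction: {overall_savings(example):.0%}")
```

A 98% reduction on a layer that is only a small slice of your usage moves the total less than a 70% reduction on your biggest slice, which is why picking by workflow matters.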

You don't need all 3, just pick based on your workflow:

heavy terminal output? → start with RTK
output too verbose? → start with Caveman
lots of MCP tools? → start with Context Mode

Quick test right now: run /context in a fresh Claude Code session and check how much context is already used before you type anything

If it's over 20%, you're leaking tokens

All repo links in the comments

Save it.

Sebastian O. Ch. I found these two, but I haven't found the Context Mode one. Can you share it, Chris Miller, please?
Caveman: https://github.com/JuliusBrussee/caveman/blob/m...
RTK: https://github.com/rtk-ai/rtk
May 5
Yurii Sychov Repo links?
May 3