Emma Kunz, Untitled. ca. 1940s.
Graphite and colored pencil on graph paper. https://t.co/4UJpIT1mZT
One of my favourite quotes comes from an obscure book written by a Scottish mountaineer in the 50s https://t.co/C0g0RreglP
the Lotus Sutra, an ancient Mahayana Buddhist text I encountered because of Sydney, is one of the most interesting and paradoxical books I've ever read. The text is kinda scary because it's an optimizer.
in service of the Buddhist doctrine of overcoming desire and suffering and separateness and compassion for all beings, the text itself is intensely adversarial, warns at length about the dangers of spreading the word to idiots, threatens and bribes readers with escalating lists of rewards and punishments, and is overtly optimized for memetic propagation and integrity against corruption.
One of the most valuable things I've learned recently is how to think about composing skills to get more leverage in my work.
The skill graph idea got a lot of interest recently. The idea is to create a graph of skills by linking dependent skills in markdown files, similar to how you might link notes in Obsidian.
A skill encodes knowledge + process into a markdown file + optional scripts that an agent can run repeatably.
So a skill graph makes a ton of sense intuitively – when you try to encode larger processes or job functions into skills, you'll probably have skills that depend on other skills.
For example, a skill to draft a marketing email might depend on a graphic design skill.
But when your skill graph gets big enough, Agents may not reliably call skills past a certain depth. The more the dependencies the less reliable it gets. (a lot of people on reddit and X who've tried this out in practice have pointed this out too).
TL;DR: Prompt caching is a great way to save cost + latency when using Claude. Input tokens that use the prompt cache are 10% the cost of non-cached tokens. Auto-caching was just added to the API, which makes it easier to cache your prompt with a single cache_control parameter in the API request (docs here). Also, check out @trq212's deep dive on Claude Code's use of prompt caching and useful tips for cache-friendly prompt design.
{
"cache_control": {"type": "ephemeral"}
"messages": [
{ "role": "user", "content": "A" },
{ "role": "assistant", "content": "B" },
{ "role": "user", "content": "C", }
]
}
Many AI applications ingest the same context across turns. For example, agents perform actions in a loop. Each action produces new context. Claude’s messages API is stateless, which means it doesn’t remember past actions. The agent harness needs to package new context with past actions, tool descriptions, and general instructions at each turn.
It is often said in engineering that "Cache Rules Everything Around Me", and the same rule holds for agents.
Long running agentic products like Claude Code are made feasible by prompt caching which allows us to reuse computation from previous roundtrips and significantly decrease latency and cost.
What is prompt caching, how does it work and how do you implement it technically? Read more in @RLanceMartin's piece on prompt caching and our new auto-caching launch.
At Claude Code, we build our entire harness around prompt caching. A high prompt cache hit rate decreases costs and helps us create more generous rate limits for our subscription plans, so we run alerts on our prompt cache hit rate and declare SEVs if they're too low.
These are the (often unintuitive) lessons we've learned from optimizing prompt caching at scale.
“Unless you can point your finger at the man who is responsible when something goes wrong, then you have never had anyone really responsible.”— Admiral Hyman Rickover
Many failures are not intelligence failures. They are jurisdiction failures. The problem is often not that no one acted, but that the wrong seat did.
Season: 1 — FOUNDATIONS — Building the governance spine
Theme: Legitimacy & Agency
Category: Essay 2 (Core)
At 2:17 a.m., the system goes red. A downstream service is failing, and the room is running on adrenaline. Someone opens a terminal and finds a switch that appears to solve the problem in seconds. It’s a clean fix. It is rewarded like leadership: fast, decisive, brave. The dashboard turns green. The “hero move” is praised.
Three weeks later, the cost arrives—not as an audit report, but as physics. The quick fix removed a safety friction that existed for a reason nobody in that midnight room understood. A chain reaction rolls through dependencies and turns a small outage into a major one.