HERMES AGENT HAS 5 SYSTEMS
RUNNING UNDER THE HOOD.
UNDERSTAND THEM AND YOU USE
THE AGENT 10X BETTER.
In this video @_alejandroao explained:
- THE AGENT LOOP
every message triggers the same cycle:
→ you send a message
→ Hermes builds context
(SOUL.md + memory.md + user.md + skills
- tools + message history)
→ sends everything to the LLM
→ LLM decides: call a tool or respond
→ if tool call: execute, return result, loop back
→ if response: deliver to you
→ after response: memory update
(agent checks if anything is worth remembering,
writes to memory.md or user.md)
this loop is why Hermes gets better over time.
the memory update after every response
means the agent learns from every conversation.
- CONTEXT ASSEMBLY
what the LLM sees on every turn:
→ SOUL.md (your agent's personality and rules)
→ memory.md (facts the agent learned over time)
→ user.md (facts about you, auto-updated)
→ AGENTS.md and .hermes.md (project context files)
→ skill descriptions (loaded on demand)
→ tool schemas (available actions)
→ message history (current conversation)
if SOUL.md is empty, Hermes falls back
to a default system prompt.
write your own SOUL.md and the agent
becomes yours, not generic.
CONTEXT COMPRESSION:
conversations hit context limits.
Hermes handles this at two checkpoints:
preflight: before each turn.
if conversation exceeds 50% of context window,
compression fires. older messages get summarized.
last 20 messages stay intact (protect_last_n).
gateway auto-compression: between turns.
fires at 85%. more aggressive.
prevents API errors before the agent
even starts processing your message.
after compression, a new session lineage ID
is generated. the agent can trace back
to the original conversation through SQLite.
three things break prompt cache:
switching models mid-session,
changing memory files,
or changing context files.
- THE GATEWAY
the system that keeps Hermes reachable
on 27+ messaging platforms.
an async loop runs continuously.
listens for incoming messages from
Telegram, Discord, Slack, WhatsApp,
email, SMS, and every other adapter.
when a message arrives:
→ gateway identifies which session it belongs to
→ queries SQLite for the full message history
(session ID = platform prefix + chat ID)
→ builds the context from scratch
→ sends everything into the agent loop
→ delivers the response back to the platform
the gateway also runs the session manager.
when you send a message while the agent is busy:
→ default: queued for next turn
→ /steer: injected without interrupting
→ /interrupt: stops current work
without the gateway, Hermes is a CLI tool.
with the gateway, Hermes is an always-on agent
you reach from your phone.
- MEMORY (THREE LAYERS)
LAYER 1 — MARKDOWN FILES
SOUL.md (identity), memory.md (learned facts),
user.md (facts about you).
injected into context after the system prompt.
updated by the agent after every response.
LAYER 2 — SQLITE
full transcripts of every session stored locally.
FTS5 full-text search across all past conversations.
session lineage tracking across compressions.
the agent can recall what you discussed
weeks ago using /recall or session search.
LAYER 3 — EXTERNAL PROVIDERS (optional)
8 supported providers: Mem0, SuperMemory,
Honcho, Zep, and more.
each works differently (semantic search,
LLM extraction, similarity matching).
queried after the first message in each session.
the agent processes your topic first,
then checks external memory for related context
from past conversations.
not enabled by default.
enable for significantly better long-term recall.
- CRON ENGINE
a loop inside the gateway ticks every 60 seconds.
each tick checks ~/.hermes/cron/jobs.json
for scheduled tasks.
if a job is due:
→ fresh session (no chat history, no memory pollution)
→ execute the prompt with assigned tools
→ store the run output as markdown
in ~/.hermes/cron/output/[job-id]/
→ deliver result to your home messaging platform
cron does NOT use the send_message tool.
delivery happens at the system level, not the agent level.
a cron session cannot create more cron jobs.
prevents runaway loops.
WHY THIS MATTERS:
the agent loop teaches it.
the context assembly focuses it.
the gateway reaches it.
the memory remembers it.
the cron engine automates it.
five systems. one agent.
understanding how they connect
changes how you configure every level.
full 15 levels breakdown in the article 👇