MCP Memory Layer: How Claude Code Agents Use Persistent Memory
Claude Code agents, by default, know only what is in their current context window. They do not remember what they learned in yesterday's session, what architectural constraints were decided last sprint, or what debugging approach worked the last time this class of error appeared.
The Model Context Protocol (MCP) changes that. By exposing memory operations as MCP tools, you give Claude Code agents the ability to read from and write to a persistent memory store at runtime — turning each agent session from a cold start into a warm, context-aware continuation of prior work.
This guide explains how MCP memory tools work, how to configure the Claude Code + MCP setup, and what patterns produce the best results in production.
What MCP Enables
MCP is a protocol for connecting AI agents to external capabilities — databases, APIs, file systems, and memory stores. From the agent's perspective, these capabilities appear as tools it can call during a session.
For memory, this means the agent gets access to tools like:
memory_save— write a structured observation, correction, or learned factmemory_search— retrieve semantically relevant memory before starting a taskmemory_recall— fetch specific memory by ID or tag
These aren't magic. They're function calls. What makes them powerful is that they're called at the right moments in the agent's workflow — not just at session start, but throughout execution as the agent encounters relevant triggers.
The CLAUDE.md + MCP Setup Pattern
The standard pattern for giving Claude Code persistent memory uses two components:
1. CLAUDE.md — project instructions that load into every session. This is where you tell the agent what memory system is available and how to use it. A typical section looks like:
# Memory System
You have persistent memory via MCP. On session start:
1. Search memory for context relevant to the current task
2. Load any previously saved user preferences or project state
Write to memory when:
- You learn a user preference or working style
- You make an architecture decision with lasting implications
- You receive a correction that should change future behavior
- You observe a pattern that will be useful across sessions
2. MCP configuration — the server definition that registers memory tools with Claude Code. In Claude Code's MCP config:
{
"mcpServers": {
"memory": {
"command": "node",
"args": ["/path/to/memory-server/index.js"],
"env": { "MEMORY_BACKEND": "memroos" }
}
}
}
With both in place, the agent knows what memory tools exist and when to use them.
Session Startup Memory Load
The most important memory pattern is loading context before the agent starts work. This is the difference between an agent that starts every session with a blank slate and one that immediately has the context it needs.
A well-designed startup sequence looks like this:
- Agent reads CLAUDE.md and learns about the memory system
- Agent calls
memory_searchwith the current task description - Memory system returns relevant past observations, decisions, and corrections
- Agent incorporates this context before taking any action
The key insight is that memory retrieval is semantic, not exact. The agent searches by meaning, not by keyword. A query for "authentication implementation" might surface a memory tagged "JWT expiry decision" because the embedding similarity is high.
Write Patterns That Work
Not everything should be written to memory. Agents that write too aggressively create noise; agents that never write are stateless. The patterns that work well:
On learning. When the agent discovers something non-obvious about the codebase, user preferences, or workflow — something that isn't in the code and wouldn't be obvious to a future session.
On correction. When the user corrects the agent's behavior. This is the highest-value write: "don't do X, do Y instead" should become a memory so the correction persists.
On decision. When an architectural or strategic decision is made. The decision itself may be in the code, but the reasoning and the alternatives considered typically aren't.
On pattern recognition. When the agent notices a recurring pattern — a user's preferred code style, a team convention, a deployment constraint — that will affect future work.
Memory Types in an MCP Context
Different memory types serve different purposes in a Claude Code workflow:
| Type | Example | Retrieval Trigger | |------|---------|------------------| | Episodic | "User preferred tabs over spaces in past session" | Similar file editing task | | Semantic | "This codebase uses Repository pattern for DB access" | Any DB-related work | | Procedural | "Deployment requires running db:migrate before restart" | Deploy-related tasks | | Declarative | "Auth service is owned by team A, don't modify without review" | Auth-touching changes |
Each type has a different write frequency, retrieval pattern, and decay rate. Episodic memories become less relevant over time; semantic and declarative memories may be durable indefinitely.
What to Look for in an MCP Memory Backend
When evaluating MCP memory servers, these are the capabilities that distinguish production-grade implementations from demos:
Typed memory tiers. A flat key-value store isn't memory — it's a cache. Look for distinct tiers (episodic, semantic, procedural, declarative) with different retrieval and decay behavior.
Per-agent scoping. Memory written by one agent shouldn't automatically be visible to all agents. Proper scoping prevents cross-agent contamination.
Audit trail. Every write should log the agent identity, timestamp, and source trigger. This is essential for debugging unexpected agent behavior.
Governance controls. For team deployments, operators should be able to review, edit, and approve memory before it becomes durable.
MemroOS implements all of this as a native MCP server, with Claude Code integration documented and tested. Agents connect with a single MCP config block and get governed, typed, persistent memory in return.