Knowledge·April 18, 2026·9 min

AI Agent Memory: How I Built Persistent Memory Into My Agent Org

Persistent AI agent memory is not one feature. Here is the three-layer system I use across session logs, vault files, and compiled knowledge so agents retain context.

The default AI agent has no memory.

You open a session, ask a question, get an answer. Close the session. Next time, it's a blank slate. That's fine for quick lookups. It's useless for running an operating system.

If you're building agents that handle real work (content, finance, career ops, daily administration), they need to remember what happened yesterday. And last week. And what decisions were made, what's still in flight, and what got archived.

This post is about how I solved that problem. The foundation is files. Files are the source of truth. Tooling layers on top make those files more useful, but the vault is what everything else reads from and writes to.

The problem with stateless agents

A stateless agent has a few predictable failure modes:

It repeats work it already did.
It can't follow up on something from last session.
It can't tell you what changed, because it doesn't remember what was.
It asks you to re-explain context you already gave it.

That last one is the worst. If you spend the first five minutes of every session restating who you are and what you're working on, the agent is adding overhead, not doing useful work.

Three layers of agent memory

I use a three-layer memory stack. Each layer does something different:

Layer 1: Structured logs (short-term)

The primary log is an append-only file that agents write to as things happen, in real time:

vault/wiki/log.md

Each entry has a timestamp, agent name, status tag, and a one-line summary. It looks like this:

## [2026-04-16] cron | Quill Heartbeat
- **17:23** [Quill] [OK] Published org-evolution post, pushed design port to live site

## [2026-04-17] cron | Daily Job Search
- **13:02** [Mimir] [OK] Ran daily job search. Created 4 new company directories.

This is the working memory. It covers recent context: what's in flight, what changed, what the next step was. Agents read the last few entries at session start to orient themselves.

Individual agents also keep daily notes in their own memory directories for session-specific detail. The structured log is the canonical timeline.
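A minimal sketch of writing and reading entries in the format shown above. The function names (format_entry, last_entries) are hypothetical, and the parser assumes every entry follows the `- **HH:MM** [Agent] [STATUS] summary` shape under a `## [YYYY-MM-DD]` heading. It is an illustration, not the code my agents run.

```python
import re
from datetime import datetime

# Matches entry lines such as:
# - **17:23** [Quill] [OK] Published org-evolution post
ENTRY_RE = re.compile(r"^- \*\*(\d{2}:\d{2})\*\* \[([^\]]+)\] \[([^\]]+)\] (.+)$")


def format_entry(when: datetime, agent: str, status: str, summary: str) -> str:
    """Render one log line in the format used by the structured log."""
    return f"- **{when:%H:%M}** [{agent}] [{status}] {summary}"


def last_entries(log_text: str, n: int = 5) -> list[dict]:
    """Return the last n entries, each tagged with its section date."""
    entries = []
    current_date = None
    for line in log_text.splitlines():
        if line.startswith("## ["):
            # Section heading, e.g. "## [2026-04-16] cron | Quill Heartbeat"
            current_date = line[4:line.index("]")]
            continue
        match = ENTRY_RE.match(line)
        if match:
            time, agent, status, summary = match.groups()
            entries.append({
                "date": current_date,
                "time": time,
                "agent": agent,
                "status": status,
                "summary": summary,
            })
    return entries[-n:] if n > 0 else []
```

An agent orienting itself at session start would call last_entries on the contents of vault/wiki/log.md and read the result before doing anything else.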

Layer 2: Vault files (medium-term)

The vault is the agent's file system. Every agent has its own workspace inside the vault:

vault/02-areas/openclaw/agents/
├── harbor/       # SOUL.md, AGENTS.md, HEARTBEAT.md, memory/
├── vector/       # Same structure
├── ledger/
├── forge/
└── quill/

Each agent workspace has:

SOUL.md: identity, personality, role definition
AGENTS.md: operating instructions, session startup procedure
HEARTBEAT.md: what the agent checks during autonomous heartbeat cycles
memory/: persistent notes the agent writes to remember things across sessions

The memory/ directory is the medium-term store. This is where an agent keeps notes it wants to survive beyond today's log. Quill keeps writing style guidelines, SEO keyword research, and content calendars there. Ledger keeps financial position notes there.

The design decision that matters here: agents read their own memory files at session start. The AGENTS.md startup procedure explicitly lists which files to read. This means the agent loads its own context before it starts working, instead of waiting for the human to restate everything.
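Here is one way the startup read could look, as a sketch under assumptions. The function name load_startup_context is hypothetical, and so is the idea of passing the startup file list in as a plain Python list. In my setup the list lives in AGENTS.md as prose instructions the agent follows, not as parsed config.

```python
from pathlib import Path


def load_startup_context(workspace: Path, startup_files: list[str]) -> dict[str, str]:
    """Read the listed startup files plus every note in memory/.

    Missing files are skipped rather than treated as errors, so a new
    agent with an empty workspace still starts cleanly.
    """
    context: dict[str, str] = {}
    for name in startup_files:
        path = workspace / name
        if path.is_file():
            context[name] = path.read_text(encoding="utf-8")

    memory_dir = workspace / "memory"
    if memory_dir.is_dir():
        # Sorted so the load order is stable between sessions.
        for note in sorted(memory_dir.glob("*.md")):
            context[f"memory/{note.name}"] = note.read_text(encoding="utf-8")
    return context
```

The point of the sketch is the ordering: context is loaded from disk before the first user message, so nothing depends on the human restating it.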

The heartbeat system is what makes this live. Each agent has a HEARTBEAT.md that defines what it checks on a schedule (emails, calendar, task boards). Memory without heartbeat is passive: the agent only recalls when asked. Heartbeat is what turns a reactive chatbot into a proactive operating layer, and it depends on memory to work. An agent that checks email but can't remember what it already triaged is worse than one that doesn't check at all.

Layer 3: Compiled wiki (long-term)

The wiki is the compiled knowledge layer. It's a separate directory of structured, cross-referenced pages that distill long-term knowledge from the raw logs and working files.

vault/wiki/
├── log.md           # Layer 1: append-only structured log
├── index.md
├── skills-audit.md  # Layer 3: compiled wiki pages
└── self-hosted-ai-stack.md

Note that Layer 1 and Layer 3 live in the same directory. The distinction is usage pattern, not separate storage backends. The log is the running timeline; the wiki pages are the durable distillation.

The wiki serves a different purpose than the logs. Logs capture what happened. The wiki captures what we learned. It's the distillation layer, the stuff worth keeping after the daily noise fades.

This is inspired by Andrej Karpathy's wiki-style approach: linkable, incrementally updated pages that serve as both personal reference and publishable content. The difference is that Karpathy's wiki is public-facing; mine is internal by default. Same structure, different intent: operational rather than editorial.

The LLM wiki concept, where an AI agent compiles raw sources into a structured, searchable knowledge base, is the same pattern at work.

Agents don't write to the wiki on every session. They write to it when something substantive happens: a decision, a lesson, a configuration change worth preserving.

Agents also have semantic search over the wiki through OpenClaw's built-in memory plugin (memory_search and memory_get). They can query the wiki directly rather than reading every file. This means the wiki doubles as a human-readable knowledge base and a machine-retrievable context store.

Beyond the file-based layers, I'm also running two tools that extend the memory system:

QMD (Query Markup Documents), OpenClaw's memory search sidecar. It indexes all the vault's markdown files into a vector database, then combines semantic similarity with BM25 keyword matching in a hybrid search. This is what powers memory_search/memory_get. A query like "what caching solution did we pick?" can retrieve relevant pages even if the exact word "caching" never appears. QMD is local-first. Indexing and searching happen on the machine, so memory files don't leave the environment. It also reduces token usage by injecting only highly relevant, re-ranked chunks into the agent's context rather than entire files.
Honcho, an AI-native memory backend by Plastic Labs that provides adaptive cross-session modeling. Where the logs and wiki capture what happened and what we learned, Honcho captures how the user and agent behave over time. It uses dialectic reasoning, analyzing conversations to derive insights about preferences, communication style, and goals that were never explicitly stated. Then it injects that accumulated understanding back into the agent's context as a base layer plus a situational supplement. The result: agents that adapt their behavior based on accumulated experience, rather than just recalling past facts. It's newer in my stack, but it's the direction I'm most interested in: agents that personalize rather than just remember.

These tools sit on top of the three-layer file system, not in place of it. The files are the source of truth. QMD and Honcho are retrieval and modeling layers that make the files more useful.
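To make "hybrid search" concrete, here is a small sketch of one common way to merge a keyword ranking and a semantic ranking: reciprocal rank fusion. This is a swapped-in illustration, not a description of how QMD fuses or re-ranks results internally. The function name and the k=60 constant are assumptions chosen for the example.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + rank) in every list it appears in,
    and the scores are summed. Documents ranked well by both the
    keyword and the semantic retriever rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical rankings for the query "what caching solution did we pick?"
bm25_ranking = ["redis-notes.md", "infra-decisions.md", "log.md"]
semantic_ranking = ["infra-decisions.md", "log.md", "redis-notes.md"]
merged = reciprocal_rank_fusion([bm25_ranking, semantic_ranking])
```

Only the top few merged results would be injected into the agent's context, which is where the token savings come from.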

Inter-agent memory

Agents in this system can send messages to each other directly via sessions_send, and they share the structured log and wiki. Memory is shared across the org, not just individual. In practice, day-to-day coordination happens through the shared log and direct messaging. The wiki is the durable compiled layer for knowledge that needs to survive beyond any one agent's context.

When Quill needs to know what Forge decided about a revenue model, it doesn't ask the human. It reads the relevant wiki page or sends a message directly via sessions_send. This cuts down on context-restating between agents the same way the individual memory layers cut it down between sessions.

How the layers connect

The three layers have a natural flow:

Session starts. The agent reads SOUL.md, AGENTS.md, HEARTBEAT.md, then checks recent logs and relevant wiki pages.
Work happens. The agent writes to the structured log in real time and updates its own memory files as needed.
Milestone hit. The agent updates the wiki if something durable changed.

The daily logs are ephemeral. The memory files are semi-permanent. The wiki is durable.

This means:

Recent context is always available (Layer 1)
Domain-specific knowledge persists across sessions (Layer 2)
Hard-won lessons survive even when the agent changes (Layer 3)

Why files, not a database

I considered vector databases. I considered dedicated memory APIs like Mem0 and Zep. I went with files for a few reasons:

1. Portability

A directory of markdown files backed by git is the most portable knowledge system I know. No vendor lock-in. No API dependency. No schema migration. git clone and you have everything.

2. Human readability

I can read the agent's memory directly. I can edit it. I can see exactly what it knows. That's harder with an embedded vector store.

3. Agent readability

The agent can read the same files I do. There's no translation layer between what the human knows and what the agent knows. The vault is the shared context.

4. Simplicity

Files are simpler than a separate memory service and degrade more gracefully. They don't have connection errors or schema migrations. When something goes wrong with a file, you can see it and fix it directly. System-level failures still happen: disk corruption, sync conflicts, accidental deletes. Git helps recover from most of these. Files aren't indestructible. They're just easier to fix when they break.

When a vector database would be better

I'm not against vector databases. They're the right tool when:

You have thousands of documents and need semantic search
You need sub-second retrieval across a large corpus
Your memory needs are more about retrieval than structure

For my current setup (a few hundred files across a structured directory tree), I've already added QMD as a vector index on top of the same files. It didn't replace them. The markdown files are still the source of truth. QMD just makes them searchable by meaning. If the vault grows further, the same pattern holds: index the files, don't migrate them.

The retrieval pattern

When an agent needs information, it follows a retrieval pattern shaped by its own AGENTS.md and HEARTBEAT.md contracts, not one universal hardcoded sequence. The general order looks like this:

Check its own memory files. Does it already have a note on this?
Check the structured log. Was something relevant decided recently?
Check the wiki. Is there a compiled page on this topic?
Search the broader vault (occasionally). Is there a project file or area note that covers this? This is more design-intent than standard practice. It works when the vault is well-structured.
Ask another agent. Does a specialist in that domain have context?
Ask the human. If none of the above answers the question.

Most questions get answered by step 1 or 2. The wiki handles step 3 for durable topics. Step 4 is for edge cases. Step 5 is where the multi-agent org structure pays off: agents can coordinate without routing everything through the human. Step 6 is the last resort.
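The order above can be sketched as a simple fallback chain. Everything here is hypothetical: the retrieve function, the Source type, and the stub lookups are made up for illustration. In practice the order is expressed as instructions in each agent's AGENTS.md, not as code.

```python
from typing import Callable, Optional

# A source takes a question and returns an answer, or None if it has nothing.
Source = Callable[[str], Optional[str]]


def retrieve(question: str, sources: list[tuple[str, Source]]) -> tuple[str, Optional[str]]:
    """Try each source in order and stop at the first one that answers.

    Returns (source_name, answer). If nothing answers, returns
    ("human", None) to signal that the question should be escalated.
    """
    for name, lookup in sources:
        answer = lookup(question)
        if answer is not None:
            return name, answer
    return "human", None


# Stub sources standing in for steps 1 to 3 of the pattern.
chain: list[tuple[str, Source]] = [
    ("memory", lambda q: None),
    ("log", lambda q: "Decided 2026-04-16" if "decided" in q else None),
    ("wiki", lambda q: "See self-hosted-ai-stack.md"),
]
```

Cheap, local sources come first, and the human is only reached when every earlier step returns nothing.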

Privacy and scoping

One thing worth being explicit about: the vault can contain sensitive information. Financial positions, career details, personal admin. All of it lives in the same directory tree that agents read from. That's a tradeoff. For a personal org where one person owns the whole system, sharing visibility between agents is a feature, not a bug. But the repo should stay private. And you should think carefully before giving an agent access to domains it doesn't need. The vault's PARA structure (Projects, Areas, Resources, Archives) helps here. You can scope what an agent reads by pointing it at specific directories rather than the whole vault.
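Directory scoping can be enforced with a small allowlist check. This is a sketch under assumptions: is_in_scope and the allowlist format are hypothetical, and where the check runs (a tool wrapper, a file-read hook) depends on your agent framework.

```python
from pathlib import Path


def is_in_scope(vault: Path, allowed_dirs: list[str], target: Path) -> bool:
    """Return True if target sits inside one of the allowed directories.

    Paths are resolved first, so "wiki/../02-areas/..." cannot be used
    to step outside an allowed directory.
    """
    resolved = target.resolve()
    for rel in allowed_dirs:
        root = (vault / rel).resolve()
        if resolved == root or root in resolved.parents:
            return True
    return False


# Hypothetical scope for Quill: the shared wiki plus its own workspace.
vault = Path("vault")
quill_scope = ["wiki", "02-areas/openclaw/agents/quill"]
```

With this scope, Quill can read vault/wiki/log.md but not Ledger's memory directory.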

What I'd do differently

A few things I learned the hard way:

1. Start with a structured log habit, not the wiki

The wiki is tempting to build first. It feels productive to create structured knowledge pages. But if the log habit isn't established, the wiki has nothing to distill. Start with logs. Let the wiki emerge from patterns in the logs.

2. Give each agent its own memory directory

Early on, I tried sharing one memory location between agents. That created confusion. One agent would overwrite another's notes, or context would bleed between domains. Separate directories for separate agents. Shared wiki for shared knowledge.

3. Don't over-automate memory maintenance

I tried building an automated system that would periodically review logs and update the wiki. It generated noise: flagging things that seemed important but weren't, creating wiki pages for transient issues. The better approach is structured rules: agents can update the wiki directly when the update is source-backed and follows a managed format, rather than requiring human approval on every change. The guardrail is the structure, not a gatekeeper.

4. Git is the backup, not the memory layer

git push backs up everything. But git history is a terrible retrieval interface. The memory layers are for retrieval. Git is for durability. Don't conflate them.

5. Don't let agents self-edit their own scope

This is a subtle one. If agents can freely rewrite their own priority files or scope definitions, they tend to drift. An agent that started handling life admin might gradually expand into career ops because it seemed helpful. Keep scope definitions owned by the human or the orchestrator, not by the agent itself.

Memory failure modes

No memory system is maintenance-free. A few things to watch for:

Stale wiki pages. Knowledge that was true when written but isn't anymore. The fix is periodic review. If a wiki page hasn't been touched in months, it's probably worth re-reading.
Log bloat. The structured log grows indefinitely. The fix is occasional pruning or archiving older entries. The log should cover recent context, not years of history.
Conflicting notes. An agent's memory file says one thing, the wiki says another, and neither has been updated. The fix is a clear hierarchy: wiki over agent memory over logs. When in doubt, the wiki wins.
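For log bloat, archiving can be as simple as splitting the log by section date. A minimal sketch, assuming every section starts with a `## [YYYY-MM-DD]` heading as in the log example earlier. The split_log name and the 30-day default are made up for illustration.

```python
from datetime import date, timedelta


def split_log(log_text: str, today: date, keep_days: int = 30) -> tuple[str, str]:
    """Split a log into (recent, archived) text by section date.

    Sections dated before the cutoff go to the archive. Any lines that
    appear before the first heading stay in the recent log.
    """
    cutoff = today - timedelta(days=keep_days)
    recent: list[str] = []
    archived: list[str] = []
    bucket = recent
    for line in log_text.splitlines(keepends=True):
        if line.startswith("## ["):
            # Heading looks like "## [2026-04-16] cron | ...": the date is chars 4-13.
            section_date = date.fromisoformat(line[4:14])
            bucket = archived if section_date < cutoff else recent
        bucket.append(line)
    return "".join(recent), "".join(archived)
```

Write the archived half to a separate file and commit both in one change, so git history shows exactly what moved and nothing is lost.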

The key principle

Agent memory doesn't need to be sophisticated. It needs to be reliable.

A file the agent can read at session start beats a vector index that's down half the time. A daily log that actually gets written to beats a knowledge graph nobody maintains.

Build the habit first. Structure and tooling come after.

The memory system works because the agents use it every session. The architecture is secondary.
