← Back to Writing

How I Turned My Obsidian Vault Into an AI Operating System

I turned an Obsidian vault into an AI operating system with specialist agents, markdown memory, search, routing, and documentation workflows I can audit.

Basalt knowledge vault with markdown cards connected by purple-white note threads.

How I Turned My Obsidian Vault Into an AI Operating System

Most people use Obsidian as a notes app. I use mine as an operating system.

Under the surface, it runs specialist AI agents, Discord-based routing, markdown memory, subagent delegation, and local model orchestration. The vault is where I think, where my agents work, and where the durable version of everything lives.

I wanted a persistent environment, not another chat window.

This post is a snapshot of that environment as it exists right now. Messy parts included.

The core idea

I wanted one source of truth that would still be useful if every AI tool disappeared tomorrow.

Which ruled out building my whole workflow inside a proprietary chat app or a hosted agent platform.

Markdown won.

Obsidian became the human interface. The vault became the durable file system. Everything else (agents, memory layers, model routing, chat surfaces) had to fit around that.

The rules that came out of that:

  • the vault is the source of truth
  • markdown is the long-term format
  • agents can operate inside the same workspace humans use
  • chat is an interface, not the database
  • specialized agents beat one giant context blob

Why this stack

I didn't choose this stack randomly. It's a blend of ideas that fit together better than they first appear.

PARA for operational clarity

Tiago Forte's PARA methodology gave me the backbone.

Projects, Areas, Resources, and Archive are simple enough to maintain, but strong enough to keep the vault usable as life and work expand. I wanted a structure that could hold active execution and long-term knowledge without turning into chaos.

This matters more once agents are involved. If the human layout is messy, the agent layer gets messy fast.

Zettelkasten for durable knowledge

PARA handles operational organization well, but it isn't enough on its own for idea development. Zettelkasten fills that gap.

I wanted a knowledge layer that rewarded linking, synthesis, and long-term accumulation. Not dumping everything into folders and forgetting it. That's why the vault isn't only a project manager. It's also a place where ideas compound.

Wiki-style thinking

Karpathy's influence is here too: the idea that a private wiki can become a real cognitive tool, not just a storage bin.

That framing pushed me toward a system where notes, docs, plans, and memory are part of the operating layer itself. I wanted the knowledge base alive and queryable, not AI pasted on top of a dead document archive.

Why markdown stayed non-negotiable

All of which led to the same conclusion: markdown had to stay at the center.

Markdown is portable, durable, inspectable, and easy for both humans and agents to work with. Obsidian gives me the interface, but the files remain mine.

Where QMD and MetaClaw fit

QMD and MetaClaw solve different problems.

QMD is about retrieval. If the vault is going to be the source of truth, I need search and recall that go beyond simple filename lookup or a tiny memory file.

MetaClaw is broader. It's part skills system, part memory and evolution layer, and part model-facing runtime. In practice that means it can inject skills, summarize sessions into reusable behaviors, maintain longer-term context, and support a more adaptive agent loop over time.

I don't want the system coupled to one static model path. I want a stack that can learn, evolve, and stay flexible without changing the vault underneath it.

So the stack is a composition:

  • PARA for structure
  • Zettelkasten for knowledge growth
  • wiki-style thinking for a living personal knowledge system
  • Obsidian for human usability
  • markdown for durability
  • QMD for retrieval
  • memory-wiki for compiled knowledge and provenance
  • Honcho for stronger adaptive cross-session modeling when needed
  • MetaClaw for skills, memory, and adaptive agent evolution
  • OpenClaw for orchestration and agent execution

That combination is what makes it coherent.

The stack

At a high level, the stack looks like this:

Discord
  ↓
OpenClaw gateway
  ↓
Mimir orchestrator
  ├─ Harbor   (life admin)
  ├─ Vector   (career)
  ├─ Ledger   (finance)
  └─ Forge    (agency and side projects)

Vault markdown
  ├─ projects
  ├─ areas
  ├─ resources
  ├─ archive
  ├─ journal
  ├─ memory
  ├─ templates
  ├─ wiki logs
  └─ agent homes

Memory and retrieval
  ├─ native memory/session files
  └─ QMD configuration for vault-wide search

Model/runtime layer
  ├─ OpenAI for primary orchestration and subagents
  └─ MetaClaw for skills, memory, and adaptive agent behavior

Each layer has a clear job.

Why Obsidian stayed at the center

Obsidian gives me the UX I want without taking ownership of the data.

This matters more once agents enter the picture.

If an agent can read and write the same markdown files I use every day, the system stays coherent. Notes, plans, logs, project docs, and agent instructions all live in the same environment. I don't need to constantly copy context from one app into another.

My vault structure is still basically human-readable:

  • 00-inbox/ for quick capture
  • 01-projects/ for active efforts
  • 02-areas/ for ongoing responsibilities
  • 03-resources/ for reference knowledge
  • 04-archive/ for inactive material
  • agents/ for specialist agent homes
  • memory/ for rolling recall and session artifacts
  • wiki/ for durable logs

The system is still useful even without the AI layer. An important design test.

From one assistant to a team

The biggest change was moving away from the idea of one general-purpose assistant.

Instead, I use a small team of specialists:

  • Mimir as the orchestrator
  • Harbor for life admin
  • Vector for job search and career work
  • Ledger for finance
  • Forge for agency work, automation, and side projects

Each agent has its own home directory, context files, operating rules, and tone. That keeps prompts smaller and responsibilities cleaner.

Instead of forcing one model to carry my entire life in a single thread, I split the problem by domain.

That alone made the system less brittle.

Why Discord replaced WhatsApp

Originally, I imagined WhatsApp as the main capture and control surface.

In practice, Discord turned out to be much better for how I actually operate day to day:

  • channels map naturally to specialist agents
  • threads are a great fit for delegated work
  • it's easier to keep active work separated by context
  • Discord feels more operational than personal messaging apps
  • native thread and session handling makes agent workflows cleaner

The live system is now Discord-first.

I can talk to the main orchestrator in one place, drop into specialist channels when I want to work directly with a domain agent, and let deeper tasks branch into their own threads or subagent sessions.

That turned the chat layer from a cluttered inbox into a control panel.

Subagents are the difference between "chatbot" and "system"

This was the biggest practical shift.

The main agent doesn't do every task itself.

It can spawn subagents with a smaller scope, lower-cost model defaults, and limited depth. Right now my setup uses a max spawn depth of 2, up to 5 child sessions per agent, and a lightweight model for delegated work.

That gives me a pattern closer to real delegation:

  • the orchestrator routes the work
  • the specialist agent owns the domain
  • subagents handle burst execution when something gets deep or messy
  • results come back up the chain in a cleaner form

Without subagents, everything turns into one giant conversation. With them, it can branch and return.

Memory is where things get real

Everyone wants an AI with memory. Very few setups make memory trustworthy.

My current approach is to keep memory grounded in files.

OpenClaw already gives me native memory and session artifacts. On top of that, I've been moving toward QMD for broader vault-wide retrieval across markdown files and transcripts.

That's the direction I want, because the useful context isn't only in a single MEMORY.md file. It's spread across project docs, plans, notes, operating instructions, and previous sessions.

Now there's a second layer on top of that: memory-wiki for compiled knowledge and provenance, and Honcho if I want stronger adaptive cross-session modeling later.

But the honest part: the architecture is still in transition. QMD is configured but not yet fully live. The designed architecture and the live architecture are often not the same thing. That's what real systems look like while they're being built.

Where MetaClaw fits

MetaClaw isn't my source of truth or the center of the stack. That's intentional. But it's also more than a simple proxy.

In my setup, MetaClaw sits in the model-facing path, but its value is that it can act as an adaptive layer for skills, memory, and optional self-improvement. It can inject skills into turns, summarize sessions into reusable capabilities, maintain longer-lived context, and create the conditions for the agent to evolve over time instead of staying static.

The vault is where durable knowledge and operating structure live. MetaClaw is one of the layers that helps the agent behave better on top of that foundation.

In my current setup, it's exposed through a local OpenAI-compatible provider pointed at GLM-5.1 through Fireworks/HuggingFace. The transport detail matters less than the adaptive behavior sitting behind it.

The pattern I trust most: durable knowledge underneath, adaptive agent behavior above it.

The practical payoff

All of this only matters if it changes how work gets done. In practice, it does.

Less context repetition. I don't re-explain my agency work, job search, finances, and life admin in one thread over and over.

Better domain separation. Career work stays with the career agent. Agency work stays with the agency agent. That reduces prompt drift and cross-contamination.

Durable outputs. What matters ends up in markdown files, logs, plans, and docs I own.

Better delegation. The system can branch into subagents instead of forcing every task through one conversation bottleneck.

And the part that interests me most: I think a lot of people are going to move from "using AI tools" to "running personal AI systems." The difference is huge. A tool answers prompts. A system has memory, routing, interfaces, persistence, specialization, and operational boundaries.

What I'd recommend if you want to build something similar

Keep it simpler than mine at first:

  • Start with markdown as the source of truth.
  • Use Obsidian or another local-first editor as the human interface.
  • Add one orchestrator before adding specialists.
  • Split into specialist agents only when you feel real domain overload.
  • Treat chat surfaces like Discord as interfaces, not storage.
  • Expect memory to be the hardest part.
  • Design for partial failure. Your live setup will always lag behind your ideal diagram.

That last point matters. The reason this system works isn't that it's perfectly finished. It's that the foundation is durable enough to survive changes in tools, providers, and model preferences.

Where this goes next

Tighten the QMD integration so vault-wide retrieval matches the intended design. Keep refining specialist agent boundaries. Publish more of the operating model in public. Turn this stack into repeatable client-facing systems where it makes sense.

That last part is a big reason I'm writing this here, on the new Mimir Works site. This blog documents real operating systems for knowledge work, agency work, and practical automation. Real setups, real tradeoffs, real architecture.

What stays true when the models change

The decision that mattered most wasn't choosing a model. It was choosing what stays true when the models change. For me, that's the vault.

Markdown files. Clear structure. Specialist roles. A chat layer on top. Memory that can be inspected. Systems that can evolve without trapping me inside a vendor's interface.

That's what turned a notes app into an AI operating system.

Read next: Building an AI Second Brain With OpenClaw and Why Specialist AI Agents Beat One Big Chat.

Some links on this site may be affiliate links. I only recommend tools I use. If you click through and make a purchase, I may earn a small commission at no extra cost to you.