How I Turned My Obsidian Vault Into an AI Operating System
Most people use Obsidian as a notes app.
I use mine as an operating system.
Under the surface, it is a live workspace for specialist AI agents, Discord-based routing, markdown memory, subagent delegation, and local model orchestration. The vault is where I think, where my agents work, and where the durable version of everything lives.
That is the system I wanted: not another chat window, but a persistent environment.
This post is a snapshot of that environment as it exists right now, messy parts included.
The core idea
I wanted one source of truth that would still be useful if every AI tool disappeared tomorrow.
That is the whole philosophy in one sentence.
That ruled out building my whole workflow inside a proprietary chat app or a hosted agent platform.
Markdown won.
Obsidian became the human interface. The vault became the durable file system. Everything else (agents, memory layers, model routing, chat surfaces) had to fit around that.
That led to a design with a few simple rules:
- the vault is the source of truth
- markdown is the long-term format
- agents can operate inside the same workspace humans use
- chat is an interface, not the database
- specialized agents beat one giant context blob
Why this stack
I did not choose this stack randomly.
It is really a blend of a few ideas that fit together better than they first appear.
PARA for operational clarity
Tiago Forte's PARA methodology gave me the backbone.
Projects, Areas, Resources, and Archive are simple enough to maintain, but strong enough to keep the vault usable as life and work expand. I wanted a structure that could hold active execution and long-term knowledge without turning into chaos.
That matters even more once agents are involved. If the human layout is messy, the agent layer gets messy fast.
Zettelkasten for durable knowledge
PARA handles operational organization well, but it is not enough on its own for idea development.
That is where the Zettelkasten influence comes in.
I wanted a knowledge layer that rewarded linking, synthesis, and long-term accumulation instead of dumping everything into folders and forgetting it. That is why the vault is not only a project manager. It is also a place where ideas can compound.
Karpathy-style wiki thinking
There is also a more personal influence in the stack: the idea that a private wiki or personal knowledge environment can become a real cognitive tool, not just a storage bin.
That framing pushed me toward a system where notes, docs, plans, and memory are part of the operating layer itself. I did not want AI pasted on top of a dead document archive. I wanted the knowledge base to feel alive and queryable.
Why markdown stayed non-negotiable
All of that led to the same conclusion: markdown had to stay at the center.
Markdown is portable, durable, inspectable, and easy for both humans and agents to work with. Obsidian gives me the interface, but the files remain mine.
Where QMD and MetaClaw fit
QMD and MetaClaw solve different problems.
QMD is about retrieval. If the vault is going to be the source of truth, I need search and recall that go beyond simple filename lookup or a tiny memory file.
MetaClaw is broader. It is not just a transport layer in front of the model. It is part skills system, part memory and evolution layer, and part model-facing runtime. In practice that means it can inject skills, summarize sessions into reusable behaviors, maintain longer-term context, and support a more adaptive agent loop over time.
That matters because I do not want the system coupled to one static model path. I want a stack that can learn, evolve, and stay flexible without changing the vault underneath it.
So the stack is really a composition:
- PARA for structure
- Zettelkasten for knowledge growth
- wiki-style thinking for a living personal knowledge system
- Obsidian for human usability
- markdown for durability
- QMD for retrieval
- memory-wiki for compiled knowledge and provenance
- Honcho for stronger adaptive cross-session modeling when needed
- MetaClaw for skills, memory, and adaptive agent evolution
- OpenClaw for orchestration and agent execution
That combination is what makes the system feel coherent.
The stack
At a high level, the stack looks like this:
```
Discord
  ↓
OpenClaw gateway
  ↓
Mimir orchestrator
  ├─ Harbor (life admin)
  ├─ Vector (career)
  ├─ Ledger (finance)
  └─ Forge (agency and side projects)

Vault markdown
  ├─ projects
  ├─ areas
  ├─ resources
  ├─ archive
  ├─ journal
  ├─ memory
  ├─ templates
  ├─ wiki logs
  └─ agent homes

Memory and retrieval
  ├─ native memory/session files
  └─ QMD configuration for vault-wide search

Model/runtime layer
  ├─ OpenAI for primary orchestration and subagents
  └─ MetaClaw for skills, memory, and adaptive agent behavior
```
It sounds elaborate, but each layer has a clear job.
Why Obsidian stayed at the center
Obsidian gives me the UX I want without taking ownership of the data.
That matters more once agents enter the picture.
If an agent can read and write the same markdown files I use every day, then the system starts to feel coherent. Notes, plans, logs, project docs, and agent instructions all live in the same environment. I do not need to constantly copy context from one app into another.
My vault structure is still basically human-readable:
- 00-inbox/ for quick capture
- 01-projects/ for active efforts
- 02-areas/ for ongoing responsibilities
- 03-resources/ for reference knowledge
- 04-archive/ for inactive material
- agents/ for specialist agent homes
- memory/ for rolling recall and session artifacts
- wiki/ for durable logs
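That layout is simple enough to bootstrap with a few lines of code. Here is a minimal sketch that creates the top-level folders above if they are missing; the folder names come from the post, and the function name is my own invention:

```python
from pathlib import Path

# Top-level vault layout from the post.
VAULT_DIRS = [
    "00-inbox", "01-projects", "02-areas", "03-resources",
    "04-archive", "agents", "memory", "wiki",
]

def scaffold_vault(root: str) -> list[str]:
    """Create any missing top-level folders and return the resulting layout."""
    base = Path(root)
    for name in VAULT_DIRS:
        (base / name).mkdir(parents=True, exist_ok=True)
    return sorted(p.name for p in base.iterdir() if p.is_dir())
```

Because it is idempotent, it doubles as a structure check: run it any time and the vault snaps back to the expected shape.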
That means the system is still useful even without the AI layer. That is an important design test.
From one assistant to a team
The biggest change was moving away from the idea of one general-purpose assistant.
Instead, I now use a small team of specialists:
- Mimir as the orchestrator
- Harbor for life admin
- Vector for job search and career work
- Ledger for finance
- Forge for agency work, automation, and side projects
Each agent has its own home directory, context files, operating rules, and tone. That keeps prompts smaller and responsibilities cleaner.
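One way to picture "its own home directory and context files" is a loader that concatenates whatever context files exist in an agent's home before a session starts. This is a sketch under assumptions: the file names (`AGENT.md`, `RULES.md`) are illustrative, not my actual layout:

```python
from pathlib import Path

def load_agent_context(home: str, files=("AGENT.md", "RULES.md")) -> str:
    """Concatenate whichever context files exist in an agent's home directory.

    The file names in `files` are illustrative placeholders.
    """
    parts = []
    for name in files:
        path = Path(home) / name
        if path.exists():
            parts.append(f"## {name}\n{path.read_text()}")
    return "\n\n".join(parts)
```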
Instead of forcing one model to carry my entire life in a single thread, I split the problem by domain.
That alone made the system feel less brittle.
Why Discord replaced WhatsApp
Originally, I imagined WhatsApp as the main capture and control surface.
In practice, Discord turned out to be much better for how I actually operate day to day.
A few reasons:
- channels map naturally to specialist agents
- threads are a great fit for delegated work
- it is easier to keep active work separated by context
- Discord feels more operational than personal messaging apps
- native thread and session handling makes agent workflows cleaner
So the live system is now Discord-first.
I can talk to the main orchestrator in one place, drop into specialist channels when I want to work directly with a domain agent, and let deeper tasks branch into their own threads or subagent sessions.
That turned the chat layer from a cluttered inbox into a control panel.
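The routing logic behind that control panel is almost embarrassingly small. A minimal sketch, independent of any Discord library; the channel names are assumptions, but the agent names are the ones from this post:

```python
# Example channel-to-agent routing table. The channel names are
# hypothetical; the agents mirror the team described above.
CHANNEL_AGENTS = {
    "life-admin": "Harbor",
    "career": "Vector",
    "finance": "Ledger",
    "forge": "Forge",
}

def route(channel: str, default: str = "Mimir") -> str:
    """Map a Discord channel to its specialist agent.

    Unknown channels fall back to the orchestrator.
    """
    return CHANNEL_AGENTS.get(channel, default)
```

Keeping the table declarative means adding a specialist is a one-line change, and everything unrouted still lands with the orchestrator.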
Subagents are the difference between "chatbot" and "system"
This is one of the biggest practical shifts.
The main agent is not expected to do every task itself.
It can spawn subagents with a smaller scope, lower-cost model defaults, and limited depth. Right now my setup uses a max spawn depth of 2, up to 5 child sessions per agent, and a lightweight model for delegated work.
That gives me a pattern that feels much closer to real delegation:
- the orchestrator routes the work
- the specialist agent owns the domain
- subagents handle burst execution when something gets deep or messy
- the result comes back up the chain in a cleaner form
Without subagents, everything turns into one giant conversation. With them, the system can branch and return.
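The spawn limits mentioned above (depth 2, five children per agent) are easy to express as guard rails in code. A minimal sketch, assuming a simple session tree; the `Session` class and its field names are my own illustration, not OpenClaw's API:

```python
from dataclasses import dataclass, field

# Limits taken from the post: max spawn depth 2, up to 5 children per agent.
MAX_DEPTH = 2
MAX_CHILDREN = 5

@dataclass
class Session:
    agent: str
    depth: int = 0
    children: list = field(default_factory=list)

    def spawn(self, agent: str) -> "Session":
        """Spawn a child session only if depth and fan-out limits allow it."""
        if self.depth >= MAX_DEPTH:
            raise RuntimeError("max spawn depth reached")
        if len(self.children) >= MAX_CHILDREN:
            raise RuntimeError("max child sessions reached")
        child = Session(agent=agent, depth=self.depth + 1)
        self.children.append(child)
        return child
```

The point of hard limits like these is that delegation can branch but never explode: a runaway loop hits the guard rail instead of your API bill.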
Memory is where things get real
Everyone wants an AI with memory. Very few setups make memory trustworthy.
My current approach is to keep memory grounded in files.
OpenClaw already gives me native memory and session artifacts. On top of that, I have been moving toward QMD for broader vault-wide retrieval across markdown files and transcripts.
That is the direction I want, because the useful context is not only in a single MEMORY.md file. It is spread across project docs, plans, notes, operating instructions, and previous sessions.
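To make "vault-wide retrieval" concrete, here is a deliberately naive stand-in for what QMD does: scan every markdown file under the vault and return matching lines. This is a sketch, not QMD itself, and real retrieval would rank and chunk rather than substring-match:

```python
from pathlib import Path

def search_vault(root: str, query: str) -> list[tuple[str, str]]:
    """Naive stand-in for QMD-style retrieval: scan every markdown file
    under root and return (relative path, matching line) pairs."""
    hits = []
    q = query.lower()
    for path in Path(root).rglob("*.md"):
        for line in path.read_text(errors="ignore").splitlines():
            if q in line.lower():
                hits.append((str(path.relative_to(root)), line.strip()))
    return hits
```

Even this crude version illustrates the design point: because everything is markdown on disk, retrieval can be swapped out without touching the vault.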
Now there is a second layer on top of that: memory-wiki for compiled knowledge and provenance, and Honcho if I want stronger adaptive cross-session modeling later.
But here is the honest part: the architecture is still in transition.
The config is set up for QMD. The live status still reports memory-core and vector search as unavailable. So right now the system reflects something I think more builders should talk about openly: the designed architecture and the live architecture are often not the same thing.
That is not failure. That is just what real systems look like while they are being built.
Where MetaClaw fits
MetaClaw is not my source of truth and it is not the center of the stack.
That is intentional.
But it is also more than a simple proxy.
In my setup, MetaClaw sits in the model-facing path, but its real value is that it can act as an adaptive layer for skills, memory, and optional self-improvement. It can inject skills into turns, summarize sessions into reusable capabilities, maintain longer-lived context, and create the conditions for the agent to evolve over time instead of staying static.
That makes it a different kind of component than the vault.
The vault is where durable knowledge and operating structure live. MetaClaw is one of the layers that helps the agent behave more intelligently on top of that foundation.
In my current setup, it is exposed through a local OpenAI-compatible provider and pointed at GLM-5.1 through Fireworks/HuggingFace, but that transport detail is not really the important part. The important part is the adaptive behavior sitting behind it.
That is the pattern I trust most: durable knowledge underneath, adaptive agent behavior above it.
The practical payoff
All of this only matters if it changes how work gets done.
In practice, it does.
A few things matter immediately:
1. Less context repetition
I do not have to re-explain my agency work, job search, finances, and life admin over and over in a single thread.
2. Better domain separation
Career work stays with the career agent. Agency work stays with the agency agent. That reduces prompt drift and weird cross-contamination.
3. Durable outputs
What matters ends up in markdown files, logs, plans, and docs I own.
4. Better delegation
The system can branch into subagents instead of forcing every task through one conversation bottleneck.
5. A blog-worthy operating model
This is the part that interests me most now.
I think a lot of people are going to move from "using AI tools" to "running personal AI systems." The difference is huge.
A tool answers prompts.
A system has memory, routing, interfaces, persistence, specialization, and operational boundaries.
What I would recommend if you want to build something similar
If you want to build your own version of this, I would keep it simpler than mine at first:
- Start with markdown as the source of truth.
- Use Obsidian or another local-first editor as the human interface.
- Add one orchestrator before adding a bunch of specialists.
- Split into specialist agents only when you feel real domain overload.
- Treat chat surfaces like Discord as interfaces, not storage.
- Expect memory to be the hardest part.
- Design for partial failure. Your live setup will always lag behind your ideal diagram.
That last point matters.
The reason this system feels promising is not that it is perfectly finished. It is that the foundation is durable enough to survive changes in tools, providers, and model preferences.
Where this goes next
The next stage is straightforward:
- tighten the QMD integration so vault-wide retrieval matches the intended design
- keep refining specialist agent boundaries
- publish more of the operating model in public
- turn this stack into repeatable client-facing systems where it makes sense
That last part is a big reason I am writing this here, on the new Mimir Works site.
This blog is not going to be about AI in the abstract. I want it to document real operating systems for knowledge work, agency work, and practical automation.
Not hype. Not vague thought pieces. Real setups, real tradeoffs, real architecture.
Final thought
The most important decision I made was not choosing a model.
It was choosing what stays true when the models change.
For me, that is the vault.
Markdown files. Clear structure. Specialist roles. A chat layer on top. Memory that can be inspected. Systems that can evolve without trapping me inside a vendor's interface.
That is what turned a notes app into an AI operating system.
And I think a lot more people are going to build their own version of this over the next few years.