Every developer using AI coding agents has experienced this:
You spend 20 minutes in session A teaching your agent a bug pattern — a global (`/g`) regex carries `lastIndex` state across calls, so `.test()` alternates between true and false in a loop. The agent gets it, fixes it, you’re happy.
Next day, new session. Same bug. The agent remembers nothing.
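For the curious, that bug is easy to reproduce. A minimal sketch (not the code from the original session):

```typescript
// A regex with the /g flag is stateful: each successful test() advances
// its internal lastIndex, and the next call resumes matching from there.
const re = /bug/g;

const results: boolean[] = [];
for (let i = 0; i < 4; i++) {
  results.push(re.test("bug")); // alternates: true, false, true, false
}

// Fix 1: drop the /g flag when you only need a yes/no answer.
// Fix 2: reset the state explicitly before each call.
re.lastIndex = 0;
const fixed = re.test("bug"); // true every time now
```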
AI agents have no memory. Every conversation starts on a blank slate.
This isn’t a bug in any particular product. It’s a structural flaw in the entire paradigm.
The Problem Isn’t Retrieval — It’s Structure
There are plenty of “AI memory” solutions out there. Most follow the same playbook: embedding → vector DB → semantic search → top-K retrieval.
I tried them. Three problems:
1. One-shot retrieval is fragile — cosine similarity only finds superficially similar things. “Global regex `lastIndex`” and “`.test()` alternating in loops” are two faces of the same problem, but embeddings may not connect them.
2. Not explainable — vector similarity tells you “these two things are 0.87 similar” but not why. The agent gets a pile of similar documents with no context to judge which is actually relevant.
3. Too much infrastructure — a coding agent’s memory system needs Qdrant/Pinecone/ChromaDB? For what is essentially “help me remember lessons learned,” that’s over-engineering.
What I needed wasn’t a search engine — it was a notebook.
Zettelkasten: A 300-Year-Old Solution
Niklas Luhmann was a German sociologist who wrote 70 books and 400 papers in his lifetime. His secret weapon was a card box system — the Zettelkasten.
The core rules are simple:
- One idea per card (atomic)
- Cards connected via explicit links (bidirectional)
- Find entry points via keyword index, explore context along links
This method has been validated for decades in human knowledge management. My key insight: it maps almost directly to AI agents.
Three equivalent substitutions make this work:
| Human Zettelkasten | Agent Zettelkasten |
|---|---|
| Human brain understands card meaning | LLM understands card meaning |
| Manually write bilinks `[[related card]]` | Agent auto-links after writing cards |
| Brain’s fuzzy memory: “I think there’s a card about this” | `memex search` CLI as entry point |
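To make this concrete, a card might look like the following. The filename, frontmatter fields, and link targets here are illustrative assumptions, not necessarily memex’s actual on-disk format:

```markdown
---
id: js-global-regex-lastindex-trap
tags: [javascript, regex, debugging]
---
# Global regex lastIndex trap

A `/g` regex object keeps `lastIndex` state across calls, so
`re.test(str)` inside a loop can alternate between true and false.
Fix: drop the `g` flag, or reset `re.lastIndex = 0` before each call.

Links: [[js-regex-flags]], [[stateful-api-pitfalls]]
```

One idea, one card, with explicit links an agent (or a human in Obsidian) can follow.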
The key trade-off: spend more LLM tokens for explainability and zero infrastructure.
LLMs generate multiple search keywords (like using a search engine) — stronger than vector similarity’s one-shot top-K because the LLM can adjust the next query based on previous results. Bilinks explicitly state “A and B are related because X” — more precise and human-readable than cosine similarity.
Tokens only get cheaper over time, so this trade-off becomes increasingly favorable.
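The retrieval loop described above can be sketched in a few lines. Everything here is a toy stand-in: the card shape, the `search` helper, and the hard-coded queries (which a real agent would generate with the LLM):

```typescript
// Iterative retrieval over a tiny in-memory card index (names hypothetical).
// Instead of one-shot top-K, the agent issues several keyword queries and
// follows explicit [[bilinks]] from each hit to pull in related cards.
type Card = { id: string; text: string; links: string[] };

const cards: Record<string, Card> = {
  "js-global-regex-lastindex-trap": {
    id: "js-global-regex-lastindex-trap",
    text: "Global /g regex keeps lastIndex state; test() alternates in loops.",
    links: ["stateful-api-pitfalls"],
  },
  "stateful-api-pitfalls": {
    id: "stateful-api-pitfalls",
    text: "APIs with hidden state surprise callers; prefer pure functions.",
    links: [],
  },
};

function search(keyword: string): Card[] {
  return Object.values(cards).filter((c) =>
    c.text.toLowerCase().includes(keyword.toLowerCase())
  );
}

// An LLM would propose and refine these queries; hard-coded for the sketch.
const queries = ["lastIndex", "regex"];
const found = new Set<string>();
for (const q of queries) {
  for (const hit of search(q)) {
    found.add(hit.id);
    // Explicit links pull in context that a keyword match alone would miss.
    hit.links.forEach((l) => found.add(l));
  }
}
```

Note that `stateful-api-pitfalls` never matches either keyword; it is reached only through the bilink, which is exactly the behavior cosine top-K cannot guarantee.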
Architecture: Two-Layer Separation
```
┌─────────────────────────────────┐
│ Smart Layer (Skills/Prompts)    │ ← LLM logic lives here
│   recall / retro / organize     │
├─────────────────────────────────┤
│ Protocol Layer (CLI)            │ ← Pure data ops, no LLM
│   search / read / write / links │
├─────────────────────────────────┤
│ Storage                         │
│   ~/.memex/cards/*.md           │ ← Plain markdown files
└─────────────────────────────────┘
```
The CLI is the data layer. Skills are the intelligence layer. The CLI has zero LLM dependencies — all LLM logic lives in skills.
Why not let skills manipulate files directly? The CLI exists not because the operations are complex, but because it’s the protocol between memory and agent. Claude Code calls the CLI through skills, Cursor calls it through MCP, humans call it directly from the terminal — all accessing the same ~/.memex/cards/ directory.
One memory store, accessible by any system. No lock-in.
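The protocol layer really is just file operations. A sketch of what a `search` operation could reduce to, using a temp directory as a stand-in for `~/.memex/cards/` (the card layout and function name are assumptions, not memex’s implementation):

```typescript
// Pure data ops over a cards directory: no LLM, no database.
import { mkdtempSync, writeFileSync, readdirSync, readFileSync } from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";

// Stand-in for ~/.memex/cards/ so the sketch is self-contained.
const cardsDir = mkdtempSync(join(tmpdir(), "memex-cards-"));
writeFileSync(
  join(cardsDir, "js-global-regex-lastindex-trap.md"),
  "# lastIndex trap\n\nGlobal /g regex alternates in loops. [[stateful-api-pitfalls]]\n"
);

// "search": a grep-style keyword scan over plain markdown files.
function searchCards(dir: string, keyword: string): string[] {
  return readdirSync(dir)
    .filter((f) => f.endsWith(".md"))
    .filter((f) =>
      readFileSync(join(dir, f), "utf8").toLowerCase().includes(keyword.toLowerCase())
    );
}

const hits = searchCards(cardsDir, "lastindex");
```

Because the storage is this dumb, anything that can read files can be a client.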
Workflow: Three Automated Moments
1. Before a task — Recall
When the agent receives a new task, it automatically searches for relevant memory cards. Not stuffing all history into the context window, but exploring along bilinks to pull only truly relevant context.
2. After a task — Retro
The agent automatically distills lessons from the current task into atomic cards. Not dumping the entire conversation, but extracting insight — a reusable unit of cognition.
3. Periodically — Organize
Detect orphan cards, discover missing links, merge duplicates. A knowledge network needs maintenance, just like code needs refactoring.
The entire process can be fully automatic (via hooks) or manually triggered (memex recall, memex retro).
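As one example of what the organize pass does mechanically, orphan detection is a pure graph check over the link syntax. A sketch, assuming `[[id]]` bilinks and an in-memory card map for brevity:

```typescript
// Find orphan cards: cards that neither link out nor are linked to.
const cards: Record<string, string> = {
  "recall-before-tasks": "Pull context first. See also [[atomic-cards]].",
  "atomic-cards": "One idea per card keeps retrieval precise.",
  "stale-note": "Nobody links here, and this links nowhere.",
};

const linkRe = /\[\[([^\]]+)\]\]/g;
const linkedTo = new Set<string>();
const linksOut = new Set<string>();
for (const [id, body] of Object.entries(cards)) {
  for (const m of body.matchAll(linkRe)) {
    linkedTo.add(m[1]);
    linksOut.add(id);
  }
}

const orphans = Object.keys(cards).filter(
  (id) => !linkedTo.has(id) && !linksOut.has(id)
);
// "stale-note" is the only orphan: it is a candidate for linking or merging.
```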
Cross-Platform: One MCP Server Covers All Clients
Distributing agent memory to different products requires distinguishing three layers:
- Tool protocol (MCP) — makes tools callable by the LLM. One MCP server covers Cursor, VS Code/Copilot, Windsurf, Codex.
- Instruction injection — teaches the LLM when to use tools and how to chain them. Key discovery: tool descriptions themselves are the most universal instruction carrier — every MCP client reads tool descriptions, no extra per-project config files needed. Zero config, works out of the box.
- Claude Code Plugin — the deepest integration. SessionStart hook for auto-recall, slash commands to trigger retro, skills to orchestrate the full workflow.
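To illustrate the “tool descriptions as instruction carrier” idea: an MCP tool declaration bundles its own usage instructions in the `description` field, which every MCP client surfaces to the LLM. The tool name, wording, and schema below are hypothetical, though the overall shape follows the MCP tool declaration format (`name`, `description`, `inputSchema`):

```typescript
// A tool description that teaches the LLM *when* and *how* to use the tool,
// with no per-project config file required.
const searchTool = {
  name: "memex_search",
  description:
    "Search the user's memory cards BEFORE starting a coding task. " +
    "Issue 2-3 short keyword queries rather than one long sentence, then " +
    "follow the [[links]] in each hit to pull in related context.",
  inputSchema: {
    type: "object",
    properties: {
      query: { type: "string", description: "A keyword query, not a sentence." },
    },
    required: ["query"],
  },
};
```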
Why Markdown
Because markdown is the lingua franca of the developer world.
- Open `~/.memex/cards/` in Obsidian to see the full knowledge graph
- Search with `grep`
- Sync to any device with `git`
- Edit with any editor
- If memex dies tomorrow, your knowledge survives
No lock-in. No vendor dependency. Your knowledge is yours.
What I Learned
Building memex refreshed several of my assumptions:
1. LLMs are better search engines than embeddings (for this use case)
Embeddings do one-shot semantic matching. LLMs do iterative exploration — search, read, judge, adjust query, search again. The latter is clearly stronger, at the cost of a few hundred extra tokens. With token prices dropping exponentially, this trade-off becomes more favorable over time.
2. Explicit links >> implicit similarity
The explicit link [[js-global-regex-lastindex-trap]] tells the agent: this card relates to a regex bug. That carries an order of magnitude more information than “cosine similarity 0.87.”
3. The best infrastructure is no infrastructure
No database to maintain, no embedding model to deploy, no vector store to pay for. The filesystem + git is the infrastructure.
Try It
```shell
# Claude Code
npx memex@latest

# VS Code / Cursor / Codex / Windsurf (MCP)
npx @anthropic/memex-mcp
```
GitHub: github.com/iamtouchskyer/memex
Web Viewer: memra.vercel.app
The structure of the next era won’t be discovered — it’ll be built.