Silicon Workforce S1E08: AI Ran for 8 Hours and Forgot Who It Was

The Pi-Math project. A 47-hour marathon session — 6 UI modules to migrate, 12 ticks in the OPC loop. $364 in token costs.

At tick 8, the review agent suddenly raised a finding: “This function’s error handling is incomplete.”

The problem: tick 3 had already fixed this exact issue. The same bug, the same fix.

The AI had forgotten.

A 500-Page Novel Compressed to Two Pages

(EP06 used a similar metaphor — a 500-page thesis compressed to two pages, to illustrate cost. This time it’s a novel, because the point is different: a thesis loses data, a novel loses foreshadowing.)

Imagine you’re reading a 500-page novel. You’ve read 300 pages, remembering all the characters, all the plot threads, all the foreshadowing.

Then someone walks over, takes away the 300 pages you’ve read, and hands you a two-page summary: “The protagonist is named Zhang San. He’s searching for a treasure. His friend died.”

You continue reading page 301. You know the protagonist is Zhang San, you know there’s a treasure, you know someone died. But you don’t remember — Zhang San refused a shortcut on page 47 because the path went through a place he didn’t want to face. You don’t remember his friend left one last sentence before dying, a sentence whose true meaning won’t be revealed until page 400.

This is context compaction (automated memory compression).

AI’s working memory (the context window) is finite. When a conversation gets too long, the system automatically compresses it: earlier content is summarized into a digest to make room for new content. The outline of the information survives, but the reasons behind decisions are lost.

In Pi-Math’s 2,170-minute session, context was compressed multiple times. After each compression, AI’s review quality noticeably declined — not because it got dumber, but because it lost the memory of “I already checked this issue earlier.”

Three Solutions That Didn’t Work

After discovering the problem, I tried three approaches.

Approach 1: Expand the context window. In theory, if the context is large enough, no compression is needed. But context windows have hard limits (200K tokens at the time), and bigger means more expensive and slower. A 47-hour session generates far more information than any context window can hold. This isn’t a solution, it’s procrastination.

Approach 2: Repeat key information in the prompt. At the start of each tick, stuff summaries of all previous ticks into the prompt. Problem: the summaries themselves keep growing. By tick 12, summaries consumed 40% of context, leaving even less room for actual work.

Approach 3: Let AI maintain its own memory. Require AI to write “what I learned this tick” notes at the end of each tick. Problem: these notes are AI-generated, and they too get compressed away when the context is compacted. Using AI to solve AI’s memory problem is like using forgetting to cure forgetting.

The Mechanical Bridge

The solution that actually worked didn’t rely on AI’s memory — it relied on the filesystem.

OPC uses a checkpoint mechanism: at the end of each tick, key state is written to files. The AI does not decide what gets written; the harness code mandates it. Each tick-N-summary.md contains:

What this tick did (code changes + test results)
What problems were found (with specific file:line references)
Which problems are fixed, which remain
What the next tick should focus on

When the next tick starts, the harness injects the most recent 3 tick summaries into the system prompt. No matter how context is compressed, these files physically exist on the filesystem — compression can’t touch them.
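The checkpoint step described above can be sketched roughly as follows. This is a minimal illustration in Python, not OPC's actual harness code: the `TickSummary` fields, the function names `write_checkpoint` and `load_recent_summaries`, and the Markdown layout are all assumptions made for the example. Only the `tick-N-summary.md` file name, the four content items, and the "most recent 3" window come from the description above.

```python
import re
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class TickSummary:
    """Hypothetical record of what the harness requires each tick to report."""
    tick: int
    done: list = field(default_factory=list)        # code changes + test results
    found: list = field(default_factory=list)       # problems, with file:line refs
    fixed: list = field(default_factory=list)
    remaining: list = field(default_factory=list)
    next_focus: list = field(default_factory=list)


def _section(title, items):
    """Render one heading plus a bullet per item, or a '(none)' placeholder."""
    lines = [f"## {title}"]
    lines += [f"- {item}" for item in items] or ["- (none)"]
    return "\n".join(lines)


def write_checkpoint(directory, summary):
    """Write tick-N-summary.md. The harness calls this, not the model."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"tick-{summary.tick}-summary.md"
    body = "\n\n".join([
        f"# Tick {summary.tick} summary",
        _section("What this tick did", summary.done),
        _section("Problems found", summary.found),
        _section("Fixed", summary.fixed),
        _section("Remaining", summary.remaining),
        _section("Next tick focus", summary.next_focus),
    ])
    path.write_text(body + "\n", encoding="utf-8")
    return path


def load_recent_summaries(directory, count=3):
    """Return the text of the `count` highest-numbered summaries, oldest first."""
    if count <= 0:
        return ""
    pattern = re.compile(r"^tick-(\d+)-summary\.md$")
    numbered = []
    for path in Path(directory).glob("tick-*-summary.md"):
        match = pattern.match(path.name)
        if match:
            numbered.append((int(match.group(1)), path))
    numbered.sort(key=lambda pair: pair[0])  # numeric order, so 10 follows 9
    recent = numbered[-count:]
    return "\n\n".join(p.read_text(encoding="utf-8") for _, p in recent)
```

Sorting by the parsed tick number rather than by file name matters once ticks reach double digits, and capping the window at a fixed count keeps the injected text from growing with the length of the session.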

The mechanical bridge: filesystem > AI memory

It’s like a diary. No matter how bad your memory is, as long as you wrote a diary entry yesterday, today you can open it and see “yesterday I fixed an error handling bug, specifically in src/lib/parser.ts:47.” The diary doesn’t rely on your brain for storage — it’s on paper.

The deeper design involves two hooks: PreCompact and PostCompact.

PreCompact fires just before context is about to be compressed. Its job is to extract the most important current information and write it to files — things like “currently fixing bug #23,” “already tried approaches A and B, both failed,” “next step should try approach C.”

PostCompact fires after context compression completes. Its job is to inject the files PreCompact saved back into the new context. The first message AI sees is “you were fixing bug #23, approaches A and B failed, next step is approach C.”

PreCompact writes, PostCompact reads. The filesystem is the bridge. AI’s memory breaks? No problem — the bridge is still there.
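The write-then-read pairing might look something like the sketch below. The hook names come from the text; how the harness registers and fires them is not shown in the source, so the functions `pre_compact` and `post_compact`, the file name `compact-state.json`, and the keys `current_task`, `tried`, and `next_step` are all hypothetical stand-ins.

```python
import json
from pathlib import Path

# Hypothetical location; the source does not name the file the hooks use.
STATE_FILE = Path(".harness") / "compact-state.json"


def pre_compact(state, path=STATE_FILE):
    """Called just before compaction: persist the working state to disk."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temp file first, then rename, so a crash mid-write
    # cannot leave a half-written state file behind.
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state, indent=2), encoding="utf-8")
    tmp.replace(path)


def post_compact(path=STATE_FILE):
    """Called after compaction: build the message injected into the new context."""
    path = Path(path)
    if not path.exists():
        return ""  # nothing was saved, so there is nothing to restore
    state = json.loads(path.read_text(encoding="utf-8"))
    tried = ", ".join(state.get("tried", [])) or "nothing yet"
    return (
        f"You were working on: {state.get('current_task', 'unknown')}. "
        f"Already tried and failed: {tried}. "
        f"Next step: {state.get('next_step', 'undecided')}."
    )
```

For example, saving `{"current_task": "fixing bug #23", "tried": ["approach A", "approach B"], "next_step": "try approach C"}` before compaction lets the restored message carry the same three facts, independent of whatever the compressed context happens to retain.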

Filesystem > AI Memory

This solution reveals a more general principle: in agent systems, the filesystem is the only reliable long-term memory.

AI’s context is temporary — it gets compressed, truncated, cleared. Git is semi-permanent — it records what changed but not why. Only state files explicitly written to the filesystem are fully controllable — you write what you want, it stores what you wrote, it doesn’t disappear until you delete it.

OPC’s .harness/ directory embodies this philosophy. flow-state.json records which node you’re on. progress.md records each node’s narrative. acceptance-criteria.md records acceptance standards. These files aren’t in AI’s context; they’re on disk, and they are read in only when each tick starts.
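As a rough sketch of that tick-start read, assuming nothing about the files beyond their names: the function `load_harness_state` and the shape of the returned dictionary are hypothetical, and the contents of flow-state.json are passed through as parsed JSON because the source does not describe its schema.

```python
import json
from pathlib import Path


def load_harness_state(harness_dir=".harness"):
    """Read the three state files from disk at the start of a tick.

    Missing files yield empty values instead of raising, so a first tick
    with no prior state still starts cleanly.
    """
    harness_dir = Path(harness_dir)

    def read_text(name):
        path = harness_dir / name
        return path.read_text(encoding="utf-8") if path.exists() else ""

    flow_raw = read_text("flow-state.json")
    return {
        "flow_state": json.loads(flow_raw) if flow_raw else {},
        "progress": read_text("progress.md"),
        "acceptance_criteria": read_text("acceptance-criteria.md"),
    }
```

Because the state is re-read from disk each tick, what the model sees at the start of a tick does not depend on how many times the context was compacted during the previous one.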

In larger, longer-running projects, this mechanism was validated repeatedly. After every context compression, AI could restore state from the filesystem and continue working. Decision details might be lost, but direction and progress never are.

Is it imperfect? Of course. “Why we chose this architecture over that one” — that kind of decision rationale is hard to preserve in structured files. But for execution-level information like “what I’m doing, how far along, what’s next” — the filesystem is reliable enough.

AI’s memory is like RAM: fast but volatile. The filesystem is like a hard drive: slower but persistent. The sound engineering approach is the same as in computer architecture: write important things to the hard drive.

