I Taught My AI Agents to Sleep. Now I'm Teaching Them to Wake Up.
A while back I built a memory system for AI agents that worked like sleep.
The idea came straight from neuroscience. While you sleep, your brain replays the day, decides what mattered, and moves the keepers from short-term storage into long-term knowledge. The rest fades. I built agents that did the same thing: they ran all day, logged what happened, and at rest a consolidation pass replayed those logs, scored what was important, and wrote it into a knowledge graph the whole fleet could share. The unimportant stuff decayed away.
It worked. My agents stopped re-learning the same things every morning. That system was Myelin v1, and I shipped it in the open.
Then I made a decision that surprised even me: I sunset it.
Why I walked away from working code
Myelin v1 did something real, and people used it. But I’d stopped supporting it some time ago, and I finally made it official — because everything I find interesting about agent memory now lives somewhere v1 couldn’t reach.
So v2 is a clean break. It’s closed-source, and it’s where all of my focus goes. Part of that is depth: the questions I’m chasing don’t fit inside a side project anymore. And part of it is honesty: some of this work is novel enough that I want to develop it properly before deciding what to say about it. I’d rather go quiet and go deep than narrate every step.
But I can tell you about the question that pulled me here, because the question is the interesting part.
The half of memory I forgot to build
Here’s the thing about the sleep model: sleep is only half of how memory works.
The brain doesn’t wait until you’re asleep to start remembering. The moment something happens, you encode it — and, crucially, you connect it to what you already know. You meet someone at a conference and within seconds they’re linked to the company they work for, the talk they just gave, the friend who introduced you. That binding happens while you’re awake, in real time, long before any sleep-time consolidation cleans it up.
This is Complementary Learning Systems theory — McClelland, McNaughton and O’Reilly laid it out in 1995, and Kumaran, Hassabis and McClelland updated it for AI in 2016. The brain runs two memory systems on purpose: a fast one (the hippocampus) that grabs new experiences instantly, and a slow one (the neocortex) that integrates them carefully over time. You need both. The fast system lets you remember what just happened. The slow system keeps that fast learning from overwriting everything you already knew.
My agents had the slow system. They had no fast one.
So a thing my agent figured out at 2pm wasn’t really known until a consolidation pass ran later. In the meantime it sat in a log, waiting. The agent had learned something and couldn’t use it yet. That’s not how memory is supposed to feel. That’s the gap I’m building into now: the awake path — encode the moment it happens, wire it to what’s already there, without corrupting the stable knowledge underneath.
That last part is the hard part, and it’s also straight from the neuroscience. Write new information too aggressively into a settled store and you get catastrophic interference — the new memory smears the old ones (McCloskey and Cohen named this in 1989). The brain’s whole two-system design exists to avoid exactly that. Which means the fix isn’t “write faster.” It’s “write fast somewhere separate, then integrate carefully.” The architecture is the answer, not the speed.
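To make “write fast somewhere separate, then integrate carefully” concrete, here is a minimal Python sketch. It is an illustration only, not how Myelin works: the class name `TwoStoreMemory`, its methods, and the promote-on-repetition rule (`min_repeats`) are all assumptions invented for this example.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class TwoStoreMemory:
    """Hypothetical fast/slow memory pair (illustrative, not Myelin's design)."""

    # Slow store: consolidated links, changed only by consolidate().
    stable: dict[str, set[str]] = field(default_factory=dict)
    # Fast store: append-only buffer of (subject, related) links.
    recent: list[tuple[str, str]] = field(default_factory=list)

    def encode(self, subject: str, related: str) -> None:
        """Awake path: record the link immediately, leaving `stable` untouched."""
        self.recent.append((subject, related))

    def recall(self, subject: str) -> set[str]:
        """Read from both stores, so a fresh link is usable right away."""
        links = set(self.stable.get(subject, set()))
        links.update(r for s, r in self.recent if s == subject)
        return links

    def consolidate(self, min_repeats: int = 2) -> None:
        """Slow path: promote links seen at least `min_repeats` times.

        Promotion only adds to `stable`; it never overwrites or deletes
        existing entries. The fast buffer is cleared afterwards.
        (The repetition threshold is an assumed scoring rule.)
        """
        for (subject, related), count in Counter(self.recent).items():
            if count >= min_repeats:
                self.stable.setdefault(subject, set()).add(related)
        self.recent.clear()


memory = TwoStoreMemory(stable={"alice": {"acme"}})
memory.encode("alice", "keynote")
memory.encode("alice", "keynote")
memory.encode("alice", "hallway chat")
print(sorted(memory.recall("alice")))  # fresh links are visible before consolidation
memory.consolidate()
print(sorted(memory.recall("alice")))  # only the repeated link was promoted
```

The point of the shape is that `encode` can be as aggressive as it likes, because nothing it writes can disturb the settled store until `consolidate` decides what earns a place there.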
The questions I’m actually sitting with
I’m not going to walk through how I’m solving these. But these are the open questions I find genuinely hard, and I think they’re the right questions for anyone serious about agent memory:
What’s the right unit of memory, and when do you write it? Every turn? Every tool call? Only the surprising ones? Write too much and you drown in noise. Write too little and you lose the thread.
Should an agent decide how hard to think on each turn? Brains do. The basal ganglia are often described as a fast, cheap gate that decides whether a situation deserves a reflex or real deliberation, so you don’t burn deep thought deciding whether to burn deep thought (Daw, Niv and Dayan, 2005; Shenhav, Botvinick and Cohen, 2013). Most agents think exactly as hard about “hi” as they do about a production incident. That’s wasteful, and it’s solvable.
How do you reconcile one source of truth with many specialized memory systems? You want a single, trustworthy record of what happened. You also want fast recall, a consolidated knowledge store, and reusable procedures — and those want different shapes. Is that one store or many? I have a strong opinion here. I’m keeping it to myself for now.
When does remembering how to do something graduate into a skill? There’s a difference between recalling that you fixed a bug and just knowing how to fix that class of bug without thinking. The brain moves skills from effortful recall into automatic procedure. Agents should too.
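On the question of how hard to think per turn, one way to picture a cheap gate is a check that runs before any expensive reasoning and costs almost nothing itself. The Python sketch below is a toy under stated assumptions, not a description of any real system: the `Effort` enum, the `URGENT_CUES` word list, and the `max_reflex_words` threshold are all hypothetical, and a real gate would likely use learned signals, not keywords.

```python
import re
from enum import Enum


class Effort(Enum):
    REFLEX = "reflex"          # answer directly, no deep reasoning
    DELIBERATE = "deliberate"  # spend real compute on this turn


# Hypothetical cue words that suggest the turn deserves deliberation.
URGENT_CUES = frozenset({"incident", "outage", "error", "failed", "rollback"})


def gate(message: str, max_reflex_words: int = 6) -> Effort:
    """Cheap pre-check: one pass over the words, no model call."""
    words = re.findall(r"[a-z']+", message.lower())
    if URGENT_CUES.intersection(words):
        return Effort.DELIBERATE
    if len(words) > max_reflex_words:
        return Effort.DELIBERATE
    return Effort.REFLEX


for text in ("hi", "Production incident: checkout API is down"):
    print(text, "->", gate(text).value)
```

The gate is deliberately dumber than the thing it routes to. If deciding how hard to think required deep thought, it would defeat its own purpose.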
Where I’m at
I’ve spent a long time on this now, and it’s the most interesting problem I’ve worked on — the place where neuroscience, systems design, and AI actually meet instead of just borrowing each other’s vocabulary. I’m reading the memory literature like it’s my job, because for the questions I care about, it kind of is.
If you’re building agentic memory, cognitive architectures, or long-horizon agent systems — or you’re hiring people who think about this — I’d genuinely like to compare notes. The best conversations I’ve had on this came from strangers who’d been staring at the same wall from a different side.
I taught my agents to sleep. Teaching them to wake up is turning out to be the harder, better problem.