Every serious agent project eventually hits the same wall. You have a capable model. You have tools. You have a context window that gets populated on each turn. And then the agent decides something in turn fourteen that directly contradicts the sensible thing it did in turn three, because turn fourteen cannot see turn three unless you explicitly carry it forward. The context window is not memory. It is a whiteboard that gets erased.
The standard answer is a vector database. Embed your history, store it, retrieve the most similar chunks at the start of each turn. This is the solution that most teams reach for, and I understand why. It is fast to implement, there are good hosted options, and it solves the literal retrieval problem.
What it does not solve is the structural memory problem. A vector store tells you what is semantically similar. It does not tell you what caused what, what was decided and why, what the agent learned that supersedes something it believed earlier, or how a series of events connect into a coherent episode. You can retrieve the fact that the agent mentioned a rate limit. You cannot easily query whether that rate limit caused a decision to switch APIs, or whether that decision was later corrected when the agent discovered the rate limit only applied to unauthenticated requests.
I built AgenticMemory to hold that structure.
The first design did not hold its own weight
My first implementation was a SQLite database with a normalized relational schema. A nodes table for cognitive events, an edges table for relationships. The schema was clean in the first week.
It did not stay clean, and every change meant another migration. But the real problem was not the migrations. It was that SQL is clumsy at graph traversal. “Starting from this decision, traverse CausedBy edges backward to the facts that grounded it, filtering for nodes that have not been superseded” — those queries became increasingly baroque as the edge type vocabulary grew. I eventually accepted that and started over.
The .amem format
The new format is binary. The file starts with a 4-byte magic sequence (AMEM), followed by a fixed 64-byte header. Then come node records, edge records, a compressed content block, and a feature matrix.
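To make that concrete, here is a rough sketch of what such a header could look like. The field names, widths, and ordering are my own guesses, not the actual .amem schema; the only details taken from the description above are the AMEM magic, the 64-byte header, and the four sections that follow it.

```rust
// Illustrative sketch only: everything here except the AMEM magic, the
// 64-byte header size, and the four sections named in the text is an
// assumption about how the fields might be laid out.
#[allow(dead_code)]
const AMEM_MAGIC: [u8; 4] = *b"AMEM";

#[allow(dead_code)]
#[repr(C)]
struct AmemHeader {
    version: u16,
    flags: u16,
    node_count: u32,
    edge_count: u32,
    feature_dim: u32,           // columns in the feature matrix
    node_section_offset: u64,   // byte offset of the node records
    edge_section_offset: u64,   // byte offset of the edge records
    content_block_offset: u64,  // byte offset of the LZ4-compressed content block
    feature_matrix_offset: u64, // byte offset of the contiguous feature matrix
    reserved: [u8; 16],         // pads the header to exactly 64 bytes
}

// Compile-time check that the header really is 64 bytes.
const _: () = assert!(std::mem::size_of::<AmemHeader>() == 64);

fn main() {
    println!("magic: {:?}", AMEM_MAGIC);
    println!("header size: {} bytes", std::mem::size_of::<AmemHeader>());
}
```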
The tradeoff: the schema is compiled in. Changing a field means a migration tool, not ALTER TABLE. What SQL could not give me was the O(1) seek behavior or the contiguous feature matrix layout. I made the tradeoff deliberately.
The EventTypes and why Correction is the interesting one
There are six EventTypes: Fact, Decision, Inference, Correction, Skill, and Episode.
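In the Rust core this would naturally be a small closed enum. A minimal sketch, assuming the types are stored as single-byte discriminants in the node records (the numeric values are my assumption, not the actual encoding):

```rust
// The six cognitive event types described in the text. The explicit u8
// discriminants are an assumption about how they might be stored in the
// binary node records.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
enum EventType {
    Fact = 0,
    Decision = 1,
    Inference = 2,
    Correction = 3,
    Skill = 4,
    Episode = 5,
}
```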
Correction is the one that shaped the whole design. When an agent learns that something it believed was wrong, the naive response is to delete the old node and replace it with the correct one. I rejected that early.
Deletion loses the history of what the agent believed and when it believed it. If an agent made a series of bad decisions grounded in a Fact that was later found to be wrong, you need the original Fact to still exist to reconstruct the belief state at the time of each decision.
The memory store is append-only in practice. Nothing gets deleted. The history is preserved. You pay in storage size, but cognitive event records are small and the LZ4 compression keeps the total footprint reasonable.
The seven EdgeTypes and what they make possible
CausedBy, Supports, Contradicts, Supersedes, RelatedTo, PartOf, and TemporalNext.
Without typed edges, you have a collection of nodes with unlabeled connections — traversal is meaningless. With them, you can ask structural questions about causality, contradiction, belief history, and time. These are the questions that matter when an agent is debugging its own reasoning.
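To make those structural questions concrete, here is a hedged sketch of the traversal described earlier: start at a decision, walk its CausedBy edges back to the grounding facts, and drop anything a later node has superseded. The graph representation, edge directions, and function names are mine, not the library's API.

```rust
use std::collections::{HashMap, HashSet};

// Illustrative in-memory form of the graph; in AgenticMemory these would be
// read from the .amem node and edge records. The edge directions assumed here
// (decision --CausedBy--> fact, new node --Supersedes--> old node) are my
// interpretation of the text.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum EdgeType {
    CausedBy,
    Supports,
    Contradicts,
    Supersedes,
    RelatedTo,
    PartOf,
    TemporalNext,
}

type NodeId = u64;

struct Graph {
    // edges[from] = list of (edge type, to)
    edges: HashMap<NodeId, Vec<(EdgeType, NodeId)>>,
}

impl Graph {
    /// Hypothetical query: from `decision`, follow CausedBy edges to the facts
    /// that grounded it, dropping any fact that some later node supersedes.
    fn grounding_facts(&self, decision: NodeId) -> Vec<NodeId> {
        // Every node that is the target of a Supersedes edge is out of date.
        let superseded: HashSet<NodeId> = self
            .edges
            .values()
            .flatten()
            .filter(|(ty, _)| *ty == EdgeType::Supersedes)
            .map(|(_, old)| *old)
            .collect();

        self.edges
            .get(&decision)
            .into_iter()
            .flatten()
            .filter(|(ty, _)| *ty == EdgeType::CausedBy)
            .map(|(_, fact)| *fact)
            .filter(|fact| !superseded.contains(fact))
            .collect()
    }
}

fn main() {
    let mut edges = HashMap::new();
    // Decision 10 was grounded in facts 1 and 3; node 2 later superseded fact 1.
    edges.insert(10, vec![(EdgeType::CausedBy, 1), (EdgeType::CausedBy, 3)]);
    edges.insert(2, vec![(EdgeType::Supersedes, 1)]);

    let graph = Graph { edges };
    assert_eq!(graph.grounding_facts(10), vec![3]);
}
```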
The decay formula and its known failure mode
Every node has a decay lambda. Confidence at retrieval time: base × exp(-λ × days_since_access) × log2(access_count + 1) / 10.
The access multiplier models rehearsal and retention — things accessed frequently decay more slowly. This is a heuristic borrowed loosely from spaced repetition research. The failure mode I have not solved: the things you forget first are the things that have not come up recently, which sometimes are unimportant and sometimes are exactly what you needed.
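For reference, a minimal sketch of that retrieval-time computation. The function and parameter names are my own, not the library's API.

```rust
// Hedged sketch of the retrieval-time confidence described above:
//   base * exp(-lambda * days_since_access) * log2(access_count + 1) / 10
fn effective_confidence(base: f64, lambda: f64, days_since_access: f64, access_count: u32) -> f64 {
    let time_decay = (-lambda * days_since_access).exp();
    let rehearsal = (f64::from(access_count) + 1.0).log2() / 10.0;
    base * time_decay * rehearsal
}

fn main() {
    // Same base confidence and same age; the heavily rehearsed node retains
    // far more confidence than the one accessed once.
    println!("{:.3}", effective_confidence(0.9, 0.05, 30.0, 1));
    println!("{:.3}", effective_confidence(0.9, 0.05, 30.0, 200));
}
```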
The Python layer and three days of memory boundary errors
The core library is written in Rust. The obvious path to a Python interface is a C FFI layer — compile with cdylib and staticlib targets, write Python bindings using ctypes. I took that path, and spent three days chasing crashes at the memory boundary.
The root cause was ownership. Rust’s borrow checker enforces memory ownership at compile time. When you expose a Rust type across an FFI boundary, you hand a raw pointer to a runtime that knows nothing about Rust’s ownership rules. Python’s garbage collector freed Python objects that held references to Rust allocations that Rust still owned. The result was segfaults that occurred minutes into sessions, nondeterministic in timing, essentially impossible to reproduce.
Subprocess delegation solved it completely. Each memory operation is a message exchange: the Python side sends a JSON command, the Rust process reads it, executes, writes a JSON response. Slower than direct FFI. For cognitive event workloads, the overhead is invisible.
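The protocol itself stays small. Here is a hedged sketch of what the Rust side of that loop could look like, assuming line-delimited JSON and the serde / serde_json crates; the command names and response shape are illustrative, not the actual wire format.

```rust
// Sketch of the Rust worker process: read one JSON command per line on stdin,
// execute it, write one JSON response per line on stdout.
// Assumes serde (with the derive feature) and serde_json as dependencies.
use std::io::{self, BufRead, Write};

use serde::{Deserialize, Serialize};

#[derive(Deserialize)]
#[serde(tag = "op", rename_all = "snake_case")]
enum Command {
    // Hypothetical operations; the real command set lives in the library.
    AddNode { event_type: String, content: String },
    Query { text: String, limit: usize },
}

#[derive(Serialize)]
struct Response {
    ok: bool,
    result: serde_json::Value,
}

fn main() -> io::Result<()> {
    let stdin = io::stdin();
    let mut stdout = io::stdout();

    for line in stdin.lock().lines() {
        let line = line?;
        let response = match serde_json::from_str::<Command>(&line) {
            Ok(Command::AddNode { event_type, content }) => Response {
                ok: true,
                result: serde_json::json!({ "added": { "event_type": event_type, "len": content.len() } }),
            },
            Ok(Command::Query { text, limit }) => Response {
                ok: true,
                result: serde_json::json!({ "query": text, "limit": limit, "hits": [] }),
            },
            Err(e) => Response {
                ok: false,
                result: serde_json::json!({ "error": e.to_string() }),
            },
        };
        writeln!(stdout, "{}", serde_json::to_string(&response).unwrap())?;
    }
    Ok(())
}
```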
What is not built yet
What this actually means for how long an agent can remember you
I want to be direct about something that does not get said clearly enough in conversations about AI memory.
Right now, when you open a new chat with Claude or GPT, it does not know who you are. It does not know what you talked about yesterday, what you have been building for the last six months, what you corrected it on last week, or how your thinking has evolved. Every session starts from zero. That is not a limitation of the model. It is a limitation of not having a proper place to store memory.
AgenticMemory is that place.
Not just what you said. What the agent decided because of what you said. What it learned from your corrections and never got wrong again. The causal chain from a conversation you had in January to a decision the agent made in September because of what you established in January.
People hear “remember everything forever” and think it requires a data center. It requires a USB stick.
The decay model means the agent is not burdened by noise. Old facts that stopped being relevant fade naturally. Decisions and corrections that you keep referencing stay strong. The memory behaves more like how you actually work — the things that matter surface, the things that do not matter quiet down.
And here is what makes this portable in a way that nothing current is: the .amem file does not belong to Claude, or GPT, or any specific provider. It belongs to you. Start with one model, switch to another — every agent picks up the same brain file and knows everything the previous ones learned about you. Your history with an agent should not be trapped inside the provider’s servers. It should live with you, travel with you, and survive every model upgrade and platform switch that will happen over the next decade.
The memory problem in AI agents is not a storage problem. It is a structure problem. Everyone who tried to solve it reached for a bigger whiteboard. What it actually needed was a proper filing system, with relationships between the files, a record of how each file was used, and a way to trace back through every decision to the facts that grounded it.
That is what this is. And it fits on a USB stick.
The memory format and query engine are in active development. More on the embedding integration when there is something worth writing about.