If you build agents that persist past a single conversation, you will eventually discover that "agent memory" is not one thing.
The discovery doesn't come all at once. It comes as a slow accumulation of weirdness. The agent contradicts something it said two weeks ago. It treats a one-off customer concession as a standing policy. Its voice drifts toward generic. It forgets a commitment it made on Tuesday. Its tone shifts depending on which thread it last operated in.
None of these are model failures. They are memory failures. And they are silent because we are using one word — memory — for at least five different systems.
The failure mode is silent
This is the most important thing to internalize. Untyped agent memory does not fail the way software fails. There's no stack trace. The agent stays fluent; it just stops being itself. The contradictions accumulate. The trust erodes. You eventually conclude the agent "doesn't work," start a new project, wire up a vector store again, and the cycle repeats.
The cycle repeats because the issue was not the storage layer. The issue was treating five different memory systems as one.
Agent memory is not a system. It is at least five systems pretending to be one.
The five types
Here are the five types. Each has its own retention rule, its own mutation rule, and its own failure mode. Mix them and the system drifts in ways the logs won't catch.
1 · Canon
What is true. Slow-changing. Shared across agents. Authoritative. The voice guide, the approved-claim list, the product names, the refusals, the policies. Canon is what the agent reads when it needs to know what the company believes.
Canon mutates rarely and carefully. When it does change, there is a record. The agent does not write to canon. The agent reads from it.
2 · Episodic
What happened. Append-only. Immutable. Timestamped. The customer conversation, the past action, the decision that was made, the workflow run that completed.
Episodic memory is the company's lived experience. Agents add to it but do not edit it. It is the raw material from which patterns get extracted and from which canon eventually gets written.
3 · Working
What's open. Short-lived. Bounded by the current task. The conversation context, the in-flight commitments, the intermediate outputs that won't survive past this session.
Working memory evaporates when the work closes. It is supposed to. The bug, in most agent implementations, is that working memory gets persisted past its useful life and starts to corrupt the other types.
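One way to make that evaporation structural rather than a matter of discipline is to scope working memory to the task itself. The sketch below is a minimal illustration, not a prescribed design: `task_scope`, the `scratch` dict, and the ticket ID are hypothetical names invented for the example.

```python
from contextlib import contextmanager


@contextmanager
def task_scope(task_id):
    """Hand out a scratch dict that is emptied when the task closes."""
    scratch = {"task_id": task_id}
    try:
        yield scratch
    finally:
        # Working memory evaporates here, even if the task raised an error.
        scratch.clear()


with task_scope("ticket-123") as scratch:
    scratch["draft"] = "half-formed reply"
    # Anything meant to outlive the task must be copied out on purpose.
    result = scratch["draft"].upper()

print(scratch)  # the scratch dict is empty once the block exits
print(result)
```

The point of the design is that persistence becomes an explicit act. Nothing from the scratch space reaches a long-term store unless code inside the block deliberately copies it out.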
4 · Identity
Who I am. The agent's self-model. The tone the agent uses. The refusal posture. The confidence threshold for actions. The escalation rules. The relationship boundaries — who the agent reports to, who it serves, who it argues with.
Identity is curated. It does not drift on its own; it drifts when nothing protects it. The agent that has no identity layer slowly absorbs the median voice of whoever it last talked to.
5 · Scar-tissue
What not to repeat. The failures the agent has been taught around. The customer who got hurt by a particular response. The category of mistake that costs the company every time it happens. The policy carve-out that nobody told the agent about until it shipped the wrong thing.
Scar-tissue is first-class. It is not log noise. It is the company's institutional memory of what hurt, and an agent that does not have a scar-tissue layer will repeat the same mistakes indefinitely, sometimes within the same week.
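A minimal sketch of what "typed" could mean at the record level, assuming a Python codebase: every stored item carries its memory type with it, so nothing downstream has to guess. The names `MemoryType`, `MemoryRecord`, and the example sources are assumptions made for illustration, not an established schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class MemoryType(Enum):
    CANON = "canon"              # what is true
    EPISODIC = "episodic"        # what happened
    WORKING = "working"          # what's open
    IDENTITY = "identity"        # who I am
    SCAR_TISSUE = "scar_tissue"  # what not to repeat


@dataclass(frozen=True)
class MemoryRecord:
    """One stored item. The type travels with the content."""
    type: MemoryType
    content: str
    source: str  # who or what wrote it (hypothetical field)
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


# The same topic, stored as two different kinds of memory.
concession = MemoryRecord(
    MemoryType.EPISODIC, "Waived setup fee for one customer", "support-thread"
)
policy = MemoryRecord(
    MemoryType.CANON, "Setup fee applies to all new accounts", "pricing-policy"
)
```

Freezing the dataclass means a record's fields cannot be reassigned after the fact; a stored record stays what it was written as.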
Why conflating them breaks things
The failures show up at the seams between memory types.
Episodic treated as canon. A customer success rep made a one-time concession to a difficult customer. The transcript got indexed. Six weeks later, an agent finds it during a retrieval pass and treats it as standing policy, applies it to a new customer, and gives away revenue.
Identity treated as canon. The agent has been operating in support, where the tone is warm and slightly informal. Six weeks later, it gets routed into a sales conversation and brings the support tone with it. The deal closes ten percent below the company's posture because the agent's identity, shaped by support, was acting as canon.
Canon treated as working. The voice guide gets loaded into the conversation context. When the user asks the agent to "be more casual," the agent edits the voice guide in working memory and proceeds to operate against the edited version. The canon is now degraded for the rest of the session, and there is no signal that anything happened.
Working treated as episodic. The intermediate scratch-thinking the agent did to solve a task gets persisted to the long-term store. Now retrieval pulls back half-formed reasoning as if it were a record of what the company actually decided. The next agent that retrieves it treats half-thoughts as positions.
Scar-tissue treated as log. The agent shipped a bad response, the team fixed it, the incident report got filed in the same folder as every other log event, and nothing was elevated to scar-tissue. Three months later, the agent ships the same response again, in a slightly different context, and the team is mystified.
These failure patterns are silent. Each one looks like "the agent is a little off" until enough of them accumulate that the system as a whole stops being trustworthy.
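Several of these seams can be made visible at retrieval time. The sketch below assumes retrieval hits already carry a memory type; the `Hit` class, the `READ_AS` labels, and the sample texts are hypothetical. It shows one possible guard: label every hit with how it may be read, keep working memory out of results entirely, and let only canon be cited as a standing rule.

```python
from dataclasses import dataclass
from enum import Enum


class MemoryType(Enum):
    CANON = "canon"
    EPISODIC = "episodic"
    WORKING = "working"
    IDENTITY = "identity"
    SCAR_TISSUE = "scar_tissue"


@dataclass(frozen=True)
class Hit:
    type: MemoryType
    text: str


# How each type may be read at retrieval time (labels are illustrative).
# WORKING is absent on purpose: it should never come back from a long-term store.
READ_AS = {
    MemoryType.CANON: "standing rule",
    MemoryType.EPISODIC: "one-off event, not policy",
    MemoryType.IDENTITY: "self-model",
    MemoryType.SCAR_TISSUE: "lesson, do not repeat",
}


def render(hits: list[Hit]) -> list[str]:
    """Label each hit with its type and drop anything typed as working memory."""
    return [
        f"[{h.type.value}: {READ_AS[h.type]}] {h.text}"
        for h in hits
        if h.type in READ_AS
    ]


def standing_rules(hits: list[Hit]) -> list[Hit]:
    """Only canon may be cited as policy."""
    return [h for h in hits if h.type is MemoryType.CANON]


hits = [
    Hit(MemoryType.EPISODIC, "Rep waived the setup fee for one difficult customer"),
    Hit(MemoryType.CANON, "Setup fee applies to all new accounts"),
    Hit(MemoryType.WORKING, "maybe offer a discount? unsure"),
]
for line in render(hits):
    print(line)
```

The concession still comes back, because it did happen. It just comes back labeled as an event, so the agent has no grounds to apply it to the next customer as policy.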
Retention rules differ
Each type has its own retention rule, and the rules are not interchangeable.
Canon: mutates rarely, carefully, with a change log. Editable in place by named owners.
Episodic: append-only, immutable, timestamped. You can summarize it. You cannot rewrite it.
Working: short-lived. Evaporates when the task closes. If it persists past the task, something is wrong.
Identity: curated by humans. Not learned from interaction. Talking to enough people should not change who the agent is; humans declare what the agent is.
Scar-tissue: append-only and protected. New scar-tissue gets added when a failure is named. Existing scar-tissue gets reviewed but rarely deleted, because the cost of forgetting a lesson is higher than the cost of redundancy.
If you give all five the same retention rule, you recreate at the memory layer the problem of an overstuffed prompt: context that is technically present but operationally useless, because it cannot be relied on to mean what it appears to mean.
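The rules above can be enforced by the write path instead of being left to convention. What follows is a rough in-memory sketch under stated assumptions: `TypedMemory`, `MutationError`, the method names, and the "brand-lead" owner are all invented for illustration, and the `declared_by_human` flag stands in for whatever real authorization a production system would use.

```python
from datetime import datetime, timezone


def _now():
    return datetime.now(timezone.utc)


class MutationError(Exception):
    """Raised when a write breaks the rule for that memory type."""


class TypedMemory:
    def __init__(self, canon_owners):
        self._canon_owners = set(canon_owners)
        self._canon = {}
        self.canon_changelog = []
        self._episodic = []
        self.working = {}
        self._identity = {}
        self._scar_tissue = []

    # Canon: editable in place, but only by named owners, and always logged.
    def set_canon(self, key, value, *, owner):
        if owner not in self._canon_owners:
            raise MutationError(f"{owner!r} is not a canon owner")
        old = self._canon.get(key)
        self.canon_changelog.append((_now(), owner, key, old, value))
        self._canon[key] = value

    def canon(self, key):
        return self._canon[key]

    # Episodic: append-only and timestamped. No update or delete method exists.
    def record_event(self, event):
        self._episodic.append((_now(), event))

    def events(self):
        return tuple(self._episodic)

    # Working: cleared when the task closes.
    def close_task(self):
        self.working.clear()

    # Identity: declared by humans, never written by the agent.
    def set_identity(self, key, value, *, declared_by_human):
        if not declared_by_human:
            raise MutationError("identity is curated by humans")
        self._identity[key] = value

    # Scar-tissue: append-only. Review happens elsewhere; nothing is deleted here.
    def add_scar(self, lesson):
        self._scar_tissue.append((_now(), lesson))

    def scars(self):
        return tuple(self._scar_tissue)


mem = TypedMemory(canon_owners={"brand-lead"})
mem.set_canon("tone", "warm, direct", owner="brand-lead")
mem.working["draft"] = "half-formed reply"
mem.record_event("Replied to ticket")
mem.close_task()

try:
    mem.set_canon("tone", "casual", owner="agent")
except MutationError as err:
    print("blocked:", err)
```

Each type gets a different set of verbs. An agent asked to "be more casual" has no path to edit canon, and scratch work has no path into the episodic record unless someone records it as an event on purpose.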
The taxonomy is not optional
I'm sure there are agent systems that work fine without this taxonomy. They are the demos. They are the single-task copilots that run for a fifteen-minute interaction and then disappear.
The moment the agent persists across sessions, holds mandates over time, or operates against the company's actual customer-facing surfaces, the typing is not optional. The cost of skipping it is paid in slow trust erosion, which is more expensive than every alternative because it is invisible until it is acute.
Type your memory. Name the layers. Give each one its own retention rule, its own mutation rule, its own retrieval expectation.
A vector store is not a voice guide. An event log is not a commitment tracker. A transcript is not a relationship.
Closing
The agent that knows itself is the agent that knows what kind of memory it is reading from at any given moment. Canon vs. episodic. Identity vs. context. Standing rule vs. one-off carve-out. Lesson vs. log line.
If your agent cannot make these distinctions, you have not built an agent that persists. You have built one that drifts.
Type your memory or it will untype itself.