Memory · May 19, 2026 · 7 min read

Reconcile-on-Write: How to Stop Agent Memory From Rotting

Naive agent memory rots into near-duplicates and stale facts. Matrix reconciles every semantic write — cosine-gated UPDATE/ADD with an optional LLM arbiter.

By Matrix Team

Give an agent a memory and walk away for a month. Come back and read what it wrote. You will not find a clean profile of each contact. You will find sediment: "prefers email," "likes being emailed," "wants email follow-ups," "asked to be contacted by email," all stored as separate facts, all retrieved together, each one nudging the model toward repeating itself. Somewhere in that pile is the fact that the contact changed their mind two weeks ago and now wants a phone call — and it's outvoted by four stale rows that agree with each other.

This is agent memory deduplication as a production problem, not a tidiness one. Memory that grows monotonically doesn't just waste space. It crowds recall with noise, dilutes the signal the model needs, and lets old facts win by sheer count. The fix isn't a periodic cleanup job. It's reconciling every write before it lands.

Why naive append rots

Most agent memory is append-only because append is trivial: embed the note, write a row, move on. The write path never looks at what's already there. Two forces then degrade it over time.

Near-duplicates accumulate. Conversation is repetitive. The same fact gets restated across calls, across channels, in slightly different words. Each restatement embeds to a slightly different vector and writes a new row. Now your top-k recall returns five rows that say the same thing — five slots of your context budget spent on one fact.

Stale facts never die. When a contact updates something — a new address, a changed preference, a corrected name — the old fact is still sitting there. Recall has no way to know the new row should win. Both come back. The model sees a contradiction and resolves it arbitrarily, or averages them into something wrong.

The naive system gets worse the more it's used, which is exactly backwards from what memory is supposed to do.
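
To make the crowding concrete, here is a small Python sketch. The 2-D vectors are hand-picked stand-ins for real embeddings and the `top_k` helper is hypothetical; neither comes from Matrix. It only shows how four restatements can fill a top-3 recall while the one current fact never surfaces.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Append-only store: every restatement became its own row.
# Vectors are illustrative toy values, not output of a real embedding model.
store = [
    ("prefers email",                  [1.00, 0.05]),
    ("likes being emailed",            [0.98, 0.10]),
    ("wants email follow-ups",         [0.97, 0.12]),
    ("asked to be contacted by email", [0.99, 0.08]),
    ("now wants a phone call",         [0.80, 0.60]),  # the current fact
]

def top_k(query_vec, k):
    """Return the texts of the k rows most similar to the query."""
    ranked = sorted(store, key=lambda row: cosine(query_vec, row[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# Toy query vector standing in for "how should I contact them?"
result = top_k([1.0, 0.1], k=3)
```

With these toy values every recall slot goes to a restatement of the stale preference, and the phone-call row is left out.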

The choke-point insight

You can't reconcile what you can't intercept. The first design decision in Matrix's memory is that all durable semantic writes go through exactly two doors:

  • MemoryExtractorService.newNote — the post-session extractor, a fire-and-forget LLM pass that distills each conversation into durable facts.
  • MemoryToolset.addNote — the live add_contact_note tool the agent calls mid-conversation.

Both are SEMANTIC writes — cross-session facts about a contact, the kind that survive past the current interaction. (Working and episodic memory have different lifecycles; see the four kinds of agent memory.) Because every semantic fact funnels through these two methods, there's a single seam to wire reconciliation into: MemoryReconciler. No fact reaches Neo4j without passing through it.
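
The shape of that seam can be sketched in a few lines of Python. The class names follow the article, but the method signatures, the `reconcile_and_write` name, and the in-memory list are assumptions made for illustration, not Matrix's actual code.

```python
class MemoryReconciler:
    """Hypothetical stand-in for the single seam in front of the store."""

    def __init__(self):
        self.written = []  # stands in for the graph store

    def reconcile_and_write(self, agent_id, contact_id, text):
        # A real reconciler would embed, compare, and decide here.
        self.written.append((agent_id, contact_id, text))


class MemoryExtractorService:
    """Post-session path: distilled facts enter through new_note."""

    def __init__(self, reconciler):
        self._reconciler = reconciler

    def new_note(self, agent_id, contact_id, text):
        self._reconciler.reconcile_and_write(agent_id, contact_id, text)


class MemoryToolset:
    """Live path: the agent's tool call enters through add_note."""

    def __init__(self, reconciler):
        self._reconciler = reconciler

    def add_note(self, agent_id, contact_id, text):
        self._reconciler.reconcile_and_write(agent_id, contact_id, text)


reconciler = MemoryReconciler()
MemoryExtractorService(reconciler).new_note("agent-1", "contact-9", "prefers email")
MemoryToolset(reconciler).add_note("agent-1", "contact-9", "moved to Pune")
```

Neither entry point holds a reference to the store itself, which is what makes the reconciler impossible to bypass in this sketch.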

The hybrid rule

Reconciliation is the LEARN step of the CoALA cognitive cycle that runs every agent in the platform. Before a semantic note is written, Matrix embeds it and compares it — by cosine similarity — to the nearest existing memory in scope (same agent, same contact). The comparison drives one of three decisions, gated by two thresholds:

similarity = cosine(new_note_embedding, nearest_existing_in_scope)

if similarity >= matrix.memory.reconcile-update:   # 0.95
    UPDATE   # supersede the old row, keep it for history,
             # write the new row linked via `supersedes`
elif similarity < matrix.memory.reconcile-add:     # 0.80
    ADD      # genuinely new fact — just write it
else:                                              # 0.80 .. 0.95
    decision = llm_arbiter(new_note, nearest)      # optional bean
    # arbiter returns ADD | UPDATE | SKIP
    # falls back to the midpoint rule when the bean is absent
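
A runnable Python rendering of the same gate might look like the following. It is a minimal sketch: the function names are hypothetical, the thresholds are the documented defaults, and treating an empty scope as ADD is an assumption based on the "nothing to reconcile against" case.

```python
import math

RECONCILE_UPDATE = 0.95  # default of matrix.memory.reconcile-update
RECONCILE_ADD = 0.80     # default of matrix.memory.reconcile-add

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def gate(similarity, update_at=RECONCILE_UPDATE, add_below=RECONCILE_ADD):
    """Map a similarity score to UPDATE, ADD, or AMBIGUOUS (the arbiter's band)."""
    if similarity >= update_at:
        return "UPDATE"
    if similarity < add_below:
        return "ADD"
    return "AMBIGUOUS"

def decide(new_vec, in_scope_vecs):
    """Gate a new note against its nearest neighbour in scope."""
    if not in_scope_vecs:
        return "ADD"  # assumption: an empty scope has nothing to reconcile against
    nearest = max(cosine(new_vec, v) for v in in_scope_vecs)
    return gate(nearest)
```

Note that both boundaries are inclusive on the upper side: exactly 0.95 is an UPDATE, and exactly 0.80 falls into the ambiguous band rather than ADD.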

Three regimes, each handling a distinct failure mode:

UPDATE — at or above 0.95

A cosine of 0.95+ means the new note is, for practical purposes, a restatement or a refinement of an existing fact. Matrix doesn't blindly overwrite and it doesn't append a duplicate. It supersedes: the old row is marked superseded (it stays in the graph for history), and the new row is written with a supersedes back-link to it. Recall excludes superseded rows, so the latest version wins while the audit trail survives. That supersede mechanic is its own deep topic — temporal validity, "let the latest fact win without losing history" — covered in the temporal-validity post.
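
As a rough illustration of the supersede mechanic, the Python sketch below uses an in-memory dictionary in place of the graph. The `Note` fields, the `Store` class, and its method names are hypothetical; only the behaviour (old row kept and flagged, new row back-linked, recall skipping superseded rows) is taken from the description above.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Note:
    id: int
    text: str
    supersedes: Optional[int] = None  # back-link to the row this one replaces
    superseded: bool = False          # True once a newer row has replaced it

class Store:
    def __init__(self):
        self.rows = {}
        self._next_id = 1

    def add(self, text, supersedes=None):
        note = Note(self._next_id, text, supersedes)
        self.rows[note.id] = note
        self._next_id += 1
        return note

    def update(self, old_id, text):
        """Supersede: flag the old row (kept for history), write a linked new row."""
        self.rows[old_id].superseded = True
        return self.add(text, supersedes=old_id)

    def recall(self):
        """Only live rows are candidates for recall."""
        return [n.text for n in self.rows.values() if not n.superseded]

    def history(self, note_id):
        """Walk the supersedes chain from newest to oldest."""
        chain = []
        current = self.rows.get(note_id)
        while current is not None:
            chain.append(current.text)
            current = self.rows.get(current.supersedes)
        return chain

store = Store()
old = store.add("prefers email")
new = store.update(old.id, "prefers phone calls")
```

Here `store.recall()` returns only the phone-call note, while `store.history(new.id)` still reaches the email preference it replaced.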

ADD — below 0.80

Below 0.80 cosine, the note is far enough from anything on file that it's almost certainly a new, distinct fact. There's nothing to reconcile against, so Matrix just writes it. This is the common case for a contact's first few facts and for genuinely new information.

The ambiguous band — 0.80 to 0.95

This is where naive systems quietly fail. The note is similar but not obviously the same — "lives in Pune" vs. "moved to Pune last year," or "interested in the senior role" vs. "wants a management track." Pure cosine can't tell a refinement from a coincidental phrasing overlap.

In this band Matrix can defer to an LLM arbiter (backed by VertexTextClient) that reads both notes and returns one of ADD, UPDATE, or SKIP — the last being the option a threshold alone can't express: "this adds nothing, don't write it at all."

The arbiter is optional, and that is a real architectural property, not a footnote: if the Vertex bean isn't configured, the reconciler doesn't hang or error. It falls back to the midpoint rule, which resolves the band with a simple split instead of a model call. You get cosine-gated reconciliation everywhere, and the smarter judgment in the gray zone only when you've wired the bean in.
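
The optional-arbiter pattern can be sketched in Python as below. Two things here are assumptions rather than documented behaviour: the arbiter is modelled as a plain callable, and the midpoint rule is interpreted as "UPDATE at or above the middle of the band, ADD below it". The article only says the fallback is a simple split.

```python
from typing import Callable, Optional

# Hypothetical arbiter type: reads both notes, returns "ADD", "UPDATE", or "SKIP".
Arbiter = Callable[[str, str], str]

def resolve_band(similarity: float,
                 new_note: str,
                 nearest: str,
                 arbiter: Optional[Arbiter] = None,
                 add_below: float = 0.80,
                 update_at: float = 0.95) -> str:
    """Decide a note whose similarity falls inside the ambiguous band."""
    if arbiter is not None:
        return arbiter(new_note, nearest)
    # No arbiter wired in: assumed midpoint split of the band.
    midpoint = (add_below + update_at) / 2
    return "UPDATE" if similarity >= midpoint else "ADD"

def always_skip(new_note: str, nearest: str) -> str:
    """Stub arbiter for the example; a real one would call an LLM."""
    return "SKIP"

with_arbiter = resolve_band(0.90, "moved to Pune last year", "lives in Pune", always_skip)
without_high = resolve_band(0.90, "moved to Pune last year", "lives in Pune")
without_low = resolve_band(0.82, "wants a management track", "interested in the senior role")
```

The fallback can only ever return ADD or UPDATE, which is why SKIP is available only when an arbiter is present.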

When reconcile is skipped entirely

One honest caveat: reconciliation needs embeddings to compare. With the noop embedding backend — the substring-fallback mode you might run locally without an embeddings provider — there are no vectors to compute cosine over, so reconcile is skipped and every note is an ADD. If you're testing dedup behavior, make sure you're on a real embedding backend, or you'll see the naive append behavior the feature exists to prevent.
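
A quick way to see the difference is to run the same two writes against both kinds of backend. The Python sketch below is deliberately simplified: the embedders are toy functions, the band and supersede handling are left out, and a `None` return is an assumed way of modelling "no vector available".

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def noop_embedder(text):
    """Models the noop backend: there is no vector to compare."""
    return None

def toy_embedder(text):
    """Letter-count vector; a crude stand-in for a real embedding model."""
    return [text.count(c) for c in "abcdefghijklmnopqrstuvwxyz"]

def write(store, text, embed, update_at=0.95):
    """Append a note and report the decision that was taken."""
    vec = embed(text)
    if vec is None:
        store.append((text, None))
        return "ADD"  # reconcile skipped: nothing to compute cosine over
    sims = [cosine(vec, v) for _, v in store if v is not None]
    decision = "UPDATE" if sims and max(sims) >= update_at else "ADD"
    store.append((text, vec))
    return decision

noop_store, real_store = [], []
noop_decisions = [write(noop_store, "prefers email", noop_embedder) for _ in range(2)]
real_decisions = [write(real_store, "prefers email", toy_embedder) for _ in range(2)]
```

With the noop embedder both writes come back as ADD; with a vector-producing embedder the repeat is recognised as an UPDATE.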

Config you actually tune

The two thresholds are the knobs. Defaults are deliberately conservative — UPDATE only when the system is very sure (0.95), the ambiguous band kept narrow (0.80–0.95):

Property                         Default   Effect
matrix.memory.reconcile-update   0.95      Cosine at/above this → UPDATE (supersede + add)
matrix.memory.reconcile-add      0.80      Cosine below this → ADD; in between, the arbiter (or midpoint) decides

Raise reconcile-update toward 1.0 and only near-exact restatements are superseded outright; notes that used to clear the bar drop into the ambiguous band, so expect fewer automatic merges and more potential duplicates. Lower it and you'll merge more aggressively (cleaner store, higher risk of collapsing two facts that only looked alike). Lower reconcile-add to widen the band from below and push more decisions to the arbiter. The right setting depends on how chatty and how factually dense your conversations are.
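
One way to reason about a threshold change before shipping it is to replay a sample of similarity scores through the gate. The Python sketch below does that with made-up scores; the `route` and `tally` helpers are hypothetical, and the sample values are illustrative rather than measured.

```python
def route(similarity, update_at, add_below):
    """Apply the two-threshold gate to one similarity score."""
    if add_below > update_at:
        raise ValueError("reconcile-add must not exceed reconcile-update")
    if similarity >= update_at:
        return "UPDATE"
    if similarity < add_below:
        return "ADD"
    return "AMBIGUOUS"

# Illustrative nearest-neighbour similarities for six incoming notes.
SAMPLE = [0.99, 0.96, 0.93, 0.88, 0.83, 0.78]

def tally(update_at, add_below):
    """Count how many sample notes land in each regime."""
    counts = {"UPDATE": 0, "AMBIGUOUS": 0, "ADD": 0}
    for s in SAMPLE:
        counts[route(s, update_at, add_below)] += 1
    return counts

defaults = tally(update_at=0.95, add_below=0.80)
stricter_update = tally(update_at=0.98, add_below=0.80)
wider_band = tally(update_at=0.95, add_below=0.75)
```

On this sample, raising reconcile-update moves one note from UPDATE into the band, and lowering reconcile-add moves one note from ADD into the band; in both cases the arbiter (or the midpoint fallback) ends up with more to decide.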

Why this makes recall better

Reconciliation isn't cosmetic — it directly feeds the second half of the memory system: ranking. Query-driven recall doesn't return raw vector hits. It re-ranks them by a weighted blend of three signals:

score = α·similarity + β·recency + γ·importance
# matrix.memory.rank-{similarity,recency,importance}
# defaults: 0.7 / 0.2 / 0.1   (importance defaults to 0.5 per row)

Now connect the two. In a naive append store, five near-duplicate rows all score high on similarity and crowd out a single more-relevant fact — the duplicates win by count. Recency can't help either, because the stale facts keep getting restated, so they look fresh. Reconcile-on-write removes that distortion at the source: one fact is one row, superseded rows are excluded from recall entirely, and the recency and importance terms get to do their job instead of being drowned by volume.
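
The weighted blend is easy to sketch in Python. The weights and the 0.5 importance default come from the article; how recency is normalised is not stated, so the exponential decay with a 30-day half-life below is purely an assumption, as are the `Hit` type and function names.

```python
from dataclasses import dataclass

@dataclass
class Hit:
    text: str
    similarity: float         # cosine against the query, 0..1
    age_days: float           # time since the row was written
    importance: float = 0.5   # per-row default from the article

def recency(age_days, half_life_days=30.0):
    """Assumed normalisation: 1.0 for a brand-new row, halving every half-life."""
    return 0.5 ** (age_days / half_life_days)

def score(hit, alpha=0.7, beta=0.2, gamma=0.1):
    """score = alpha*similarity + beta*recency + gamma*importance"""
    return alpha * hit.similarity + beta * recency(hit.age_days) + gamma * hit.importance

def rerank(hits):
    """Order vector hits by the blended score, best first."""
    return sorted(hits, key=score, reverse=True)

hits = [
    Hit("prefers email", similarity=0.90, age_days=60),
    Hit("prefers phone calls", similarity=0.88, age_days=0),
]
ranked = rerank(hits)
```

With one row per fact, the slightly less similar but much fresher note ranks first here, which is the behaviour the recency term is meant to produce once duplicates stop drowning it out.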

The upshot: dedup is a recall-quality feature, not a storage feature. Every duplicate you don't write is a context slot you don't waste and a vote you don't let a stale fact cast.

Takeaway

Append-only agent memory rots predictably: near-duplicates pile up, stale facts never die, and recall slowly fills with noise that drowns the signal. Matrix stops the rot at the write path. Every semantic fact passes through MemoryReconciler at one of two choke points, gets embedded and compared to the nearest existing memory in scope, and is then merged (UPDATE, ≥0.95), added (ADD, <0.80), or — in the gray band — judged by an optional LLM arbiter that can even decide to SKIP. Superseded rows stay for history but drop out of recall, so the latest fact wins, ranking weights mean something, and memory gets cleaner the more it's used instead of dirtier.

Reconciliation is half of the story. The other half is what happens to the row you superseded — kept, time-stamped, and queryable without polluting recall. That's temporal validity. And if you want the map of what kind of memory this even is, start with the four kinds of agent memory.


Build an agent that remembers without rotting. Spin up a workspace, attach the built-in memory tools to any agent, and watch reconcile-on-write keep its contact profiles clean — no cleanup job required. Read the full design in docs/COGNITIVE_CORE.md and ship your first agent from /admin/agents.
