Operation · Revise

Keep the store coherent as it grows.

Revise is a background operation MemState runs after ingest and query traffic. It reconciles contradictions and consolidates duplicates so the store stays a single source of truth. The agent never invokes it directly; it keeps sending observations and asking for context.

Revise is the consolidation operator on Dt: it merges topics with duplicate titles, transfers their edges, and keeps the graph aligned with what the agent has already written. It always runs after traffic and is never exposed as an extra agent API. It reads the same Pt object as ingest and forget. Context: GEM state.

What it does

Revise walks the store and looks for topics that refer to the same real-world thing, fields with conflicting values, or records that should be consolidated. It is meant to be a gentle background process, not a user-facing write path.

Consolidate duplicates

Merge topic pairs that represent the same thing so callers don't see two half-complete records.

Reconcile conflicts

Supersede older values with newer evidence without erasing the older values from history.

Preserve history

Every merge or supersession is recorded. Superseded values are still retrievable if asked for explicitly.

How it's triggered

  • Revise is not called directly by clients. It has no public HTTP route.
  • The reasoner triggers it after every ingest and on a periodic schedule, provided there are candidate topics to examine.
  • Today, reasoner runs are scheduled after /v1/ingest and /v1/query. UI and chat tool writes do not schedule revise work.
Revise runs in the background on the same FastAPI process — there is no separate worker or queue.
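
As an illustration of running maintenance in the same process with no worker or queue, here is a minimal asyncio sketch. Everything in it is hypothetical: `ReviseScheduler`, `run_revise`, and `handle_ingest` are stand-ins, not MemState's actual classes or routes, and the rule of skipping a run while one is already in flight is an assumption, not documented behavior.

```python
import asyncio
from typing import List, Optional


async def run_revise() -> List[str]:
    """Hypothetical stand-in for the maintenance pass; returns the actions taken."""
    await asyncio.sleep(0)  # yield to the event loop, as real graph I/O would
    return ["merged t2 into t1"]


class ReviseScheduler:
    """Schedules revise on the current event loop: no separate worker, no queue."""

    def __init__(self) -> None:
        self._task: Optional[asyncio.Task] = None
        self.results: List[List[str]] = []

    def schedule(self) -> bool:
        # Assumption: skip scheduling while a pass is still running,
        # so maintenance runs never pile up behind request traffic.
        if self._task is not None and not self._task.done():
            return False
        self._task = asyncio.create_task(self._run())
        return True

    async def _run(self) -> None:
        self.results.append(await run_revise())

    async def wait(self) -> None:
        """Block until the most recently scheduled pass has finished."""
        if self._task is not None:
            await self._task


async def handle_ingest(scheduler: ReviseScheduler, observation: dict) -> dict:
    """Hypothetical ingest handler: respond first, let revise run afterwards."""
    scheduler.schedule()
    return {"status": "accepted"}
```

The handler returns before the maintenance pass has run, which mirrors the idea that callers only ever see ingest and query, never revise itself.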

Flow

Reasoner event (ingest_complete / cron) → run_revise_duplicates (Executor) → find duplicates (title equality today) → merge_topics (edges + fields + delete) → ReasonerResult (revise_actions[])

Revise is an internal loop: find duplicates, merge, record the actions taken.
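
The loop can be sketched roughly as follows. This is not the real `Executor.run_revise_duplicates`: the function name, the `ReasonerResultSketch` type, the shape of each action record, and the choice to keep the first id of each pair are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple


@dataclass
class ReasonerResultSketch:
    """Hypothetical result object; only the revise_actions name comes from the docs."""
    revise_actions: List[Dict[str, str]] = field(default_factory=list)


def run_revise_pass(
    find_pairs: Callable[[], List[Tuple[str, str]]],
    merge: Callable[[str, str], None],
) -> ReasonerResultSketch:
    """Find duplicate pairs, merge each one, and record what was done."""
    result = ReasonerResultSketch()
    dropped: Set[str] = set()
    for id_a, id_b in find_pairs():
        # A topic dropped earlier in this pass no longer exists, so skip
        # any remaining pair that mentions it.
        if id_a in dropped or id_b in dropped:
            continue
        merge(id_a, id_b)  # assumption: id_a survives, id_b is dropped
        dropped.add(id_b)
        result.revise_actions.append(
            {"action": "merge", "kept": id_a, "dropped": id_b}
        )
    return result
```

Passing the finder and the merge function in as callables keeps the sketch independent of any particular graph store.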

Today's duplicate detection

The current duplicate heuristic is deliberately conservative.

  • Candidate pairs are topics with the same non-empty title.
  • Each pair is returned once, with stable ordering (id_a < id_b).
  • Each run is capped at 50 pairs so maintenance never monopolizes the process.
This is a starting point, not the final heuristic. Richer detection based on summary similarity, field overlap, and embedding proximity is on the roadmap.
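
The three rules above (same non-empty title, each pair once with id_a < id_b, at most 50 pairs per run) can be sketched in memory like this. The function below is a hypothetical stand-in for illustration, not the code in `memstate.store.graph_store`; it assumes topic ids are unique strings and treats only empty or missing titles as "empty".

```python
from collections import defaultdict
from itertools import combinations, islice
from typing import Dict, Iterable, Iterator, List, Optional, Tuple

MAX_PAIRS = 50  # per-run cap described above


def find_duplicate_title_pairs(
    topics: Iterable[Tuple[str, Optional[str]]],
    limit: int = MAX_PAIRS,
) -> List[Tuple[str, str]]:
    """Return up to `limit` (id_a, id_b) pairs that share a non-empty title."""
    by_title: Dict[str, List[str]] = defaultdict(list)
    for topic_id, title in topics:
        if title:  # skip empty or missing titles
            by_title[title].append(topic_id)

    def pairs() -> Iterator[Tuple[str, str]]:
        # Sorting titles and ids makes the output order stable across runs.
        for title in sorted(by_title):
            ids = sorted(by_title[title])
            # combinations over sorted ids yields each pair once, id_a < id_b.
            yield from combinations(ids, 2)

    return list(islice(pairs(), limit))
```

Because pairs are generated lazily and cut off with `islice`, a title shared by many topics cannot produce more than `limit` pairs of work in one run.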

What a merge does

Step | Behavior
Outgoing links | Rewritten to point from the surviving topic; old edges are removed.
Incoming links | Rewritten to point to the surviving topic; old edges are removed.
Fields | Fields that exist only on the dropped topic are copied over. When both topics have a field with the same name, the surviving topic's field is left untouched (fields are not merged field-by-field today).
Legacy chains | Legacy HAS_FIELD chains are re-pointed at the surviving topic when the field name is not already present.
Drop topic | Removed with DETACH DELETE once transfer completes.
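
To make the merge semantics concrete, here is a small in-memory sketch. It is not the real `merge_topics`, which operates on the graph store: `GraphSketch` and `merge_topics_sketch` are hypothetical, legacy HAS_FIELD chains are not modeled, and dropping edges that would become self-loops on the surviving topic is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

Edge = Tuple[str, str, str]  # (source id, relation, target id)


@dataclass
class GraphSketch:
    """Hypothetical in-memory stand-in for the graph store."""
    fields: Dict[str, Dict[str, str]] = field(default_factory=dict)
    edges: Set[Edge] = field(default_factory=set)


def merge_topics_sketch(g: GraphSketch, keep: str, drop: str) -> None:
    """Fold `drop` into `keep`: transfer edges, copy missing fields, delete."""
    rewritten: Set[Edge] = set()
    for src, rel, dst in g.edges:
        # Re-point both outgoing and incoming links at the surviving topic.
        new_src = keep if src == drop else src
        new_dst = keep if dst == drop else dst
        # Assumption: an edge between the two merged topics would become
        # a self-loop, so it is discarded rather than kept.
        if (src, dst) != (new_src, new_dst) and new_src == new_dst:
            continue
        rewritten.add((new_src, rel, new_dst))
    g.edges = rewritten

    # Copy only fields the survivor lacks; same-name fields stay untouched.
    survivor_fields = g.fields.setdefault(keep, {})
    for name, value in g.fields.get(drop, {}).items():
        survivor_fields.setdefault(name, value)

    # Drop the topic last, once every edge and field has been transferred.
    g.fields.pop(drop, None)
```

For example, merging a topic with `email` and `phone` into one that already has `email` and `city` keeps the survivor's `email`, keeps `city`, and gains `phone`.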

What's built today

Capability | Status
Duplicate detection by exact title. | Shipped
Graph-safe topic merge with edge transfer. | Shipped
Field history merge for same-name fields. | Planned
Semantic duplicate detection (summary + embedding). | Planned
Client-facing dry-run of revise decisions. | Planned
Conflict reconciliation across fields (not just duplicates). | Planned

Code map

  • memstate.reasoner.engine.Reasoner.run
  • memstate.core.executor.Executor.run_revise_duplicates
  • memstate.store.graph_store.find_duplicate_titles
  • memstate.store.graph_store.merge_topics
