One process. One store. Three access surfaces.
MemState runs as a single FastAPI service with an embedded Kuzu graph. All three API surfaces share the same in-process store, so state is immediately consistent across product endpoints, UI tools, and the agent chat loop. Agent paths stay limited to observations and context queries; revise and forget run inside the reasoner, not in the agent integration.
Under the GEM state split M_t = (D_t, S_t, P_t), this architecture keeps one durable D_t in Kuzu, evaluates S_t (the semantic, structural, and temporal query stages) in the executor over that same store, and applies P_t from the in-process Policies object plus environment settings.
Topology
MemState is deployed as one service. There is no separate worker, no external database tier, and no message queue. Everything — routing, write semantics, background maintenance, and persistence — lives in one process.
Single service
FastAPI + Uvicorn. Launched via python -m memstate.api.cli. One process owns the socket and the graph handle.
Embedded store
Kuzu is embedded and opened from MEMSTATE_KUZU_PATH. There is no external DB to operate.
In-process maintenance
Reasoner work runs on FastAPI BackgroundTasks — no separate queue or worker.
Shared singletons
All three API surfaces resolve the same Executor and the same store handle via dependency injection.
Components
| Component | Role | Called by |
|---|---|---|
| `api.main` | Route layer, health endpoints, auth wiring, router mounting. | HTTP clients |
| `api.deps` | Cached providers for settings, executor, reasoner, and the shared store. | FastAPI dependency injection |
| `core.Executor` | Write and read semantics: how observations become graph updates, retrieval stages, salience reinforcement (placement may be explicit on the wire today). | `/v1/ingest`, `/v1/query` |
| `reasoner.Reasoner` | Background maintenance: revise duplicates, forget by low salience. | Post-response background tasks |
| `store.GraphStore` | Kuzu facade for topic, field, edge, and vector operations. | Executor, UI API, tool runner |
| `llm.MemoryToolRunner` | Runs LLM tool calls against the same shared store. | `/api/llm/chat`, MCP server |
Request lifecycles
Ingest
Client -> FastAPI route -> Executor.ingest
-> GraphStore writes (topic / fields / edges / embedding)
-> HTTP response returned
-> BackgroundTasks callback -> Reasoner.run("ingest_complete")
-> run_forget_low_salience (if policy threshold met)
-> run_revise_duplicates (duplicate-title heuristic)
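The ingest flow above can be sketched in plain Python. This is a minimal model, not MemState's code: the `TOPICS` dict, the `FORGET_BELOW` threshold, the "keep the most salient duplicate" rule, and the `handle_ingest` helper are all assumptions made for illustration. Only the step names (`run_forget_low_salience`, `run_revise_duplicates`, the `"ingest_complete"` event) come from the lifecycle itself.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for graph state: topic id -> record.
TOPICS: Dict[str, dict] = {}
FORGET_BELOW = 0.2  # assumed policy threshold, not a real MemState default


def ingest(topic_id: str, title: str, salience: float) -> dict:
    # Write path: the response body is built only after the write has landed.
    TOPICS[topic_id] = {"title": title, "salience": salience}
    return {"status": "ok", "topic_id": topic_id}


def run_forget_low_salience() -> List[str]:
    # Drop every topic whose salience is below the assumed threshold.
    doomed = [tid for tid, t in TOPICS.items() if t["salience"] < FORGET_BELOW]
    for tid in doomed:
        del TOPICS[tid]
    return doomed


def run_revise_duplicates() -> List[str]:
    # Duplicate-title heuristic (assumed form): per normalized title,
    # keep the most salient topic and remove the rest.
    best: Dict[str, str] = {}
    removed: List[str] = []
    for tid, topic in list(TOPICS.items()):
        key = topic["title"].strip().lower()
        if key not in best:
            best[key] = tid
        elif topic["salience"] > TOPICS[best[key]]["salience"]:
            removed.append(best[key])
            best[key] = tid
        else:
            removed.append(tid)
    for tid in removed:
        del TOPICS[tid]
    return removed


def reasoner_run(event: str) -> None:
    # Same order as the lifecycle above: forget first, then revise.
    run_forget_low_salience()
    run_revise_duplicates()


def handle_ingest(
    topic_id: str, title: str, salience: float
) -> Tuple[dict, List[Callable[[], None]]]:
    response = ingest(topic_id, title, salience)
    # Maintenance is returned as deferred work, not executed here;
    # the caller runs it only after the response has been sent.
    return response, [lambda: reasoner_run("ingest_complete")]
```

In a FastAPI route the deferral step would typically be `BackgroundTasks.add_task`, which runs the callback after the response is sent. The sketch returns the deferred callables instead so the ordering is visible without a web framework.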
Retrieve
Client -> FastAPI route -> Executor.query
-> embed query text
-> pick candidate topics (vector KNN or cosine fallback)
-> optional structural expansion (neighbors + fields)
-> optional temporal expansion (history when explain=true)
-> salience bump + topic-history append per touched topic
-> HTTP response returned
-> BackgroundTasks callback -> Reasoner.run("query_complete")
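The retrieval stages can be illustrated with a small in-memory model. Everything here is a sketch under stated assumptions: the module-level dicts stand in for the graph, the query arrives already embedded, candidate selection uses only the brute-force cosine path, and the `0.1` salience increment is invented for the example.

```python
import math
from typing import Dict, List, Optional

# Hypothetical stand-ins for graph data: embeddings, edges, salience, history.
EMBEDDINGS: Dict[str, List[float]] = {
    "billing": [1.0, 0.0],
    "invoices": [0.9, 0.1],
    "hiking": [0.0, 1.0],
}
NEIGHBORS: Dict[str, List[str]] = {
    "billing": ["invoices"],
    "invoices": [],
    "hiking": [],
}
SALIENCE: Dict[str, float] = {name: 1.0 for name in EMBEDDINGS}
HISTORY: Dict[str, List[str]] = {name: [] for name in EMBEDDINGS}


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def query(
    query_vec: List[float], k: int = 1, expand: bool = True, explain: bool = False
) -> dict:
    # Semantic stage: rank all topics by cosine similarity (the fallback path).
    ranked = sorted(
        EMBEDDINGS, key=lambda t: cosine(query_vec, EMBEDDINGS[t]), reverse=True
    )
    touched = ranked[:k]

    # Structural stage (optional): add direct neighbors of each candidate.
    if expand:
        for topic in list(touched):
            touched += [n for n in NEIGHBORS[topic] if n not in touched]

    # Temporal stage (optional): include prior history only when explain=True.
    history: Optional[Dict[str, List[str]]] = None
    if explain:
        history = {t: list(HISTORY[t]) for t in touched}

    # Reinforcement: bump salience and append a history entry per touched topic.
    for topic in touched:
        SALIENCE[topic] += 0.1  # assumed increment
        HISTORY[topic].append("query")

    result: dict = {"topics": touched}
    if history is not None:
        result["history"] = history
    return result
```

Calling `query([1.0, 0.0], k=1)` selects `billing` as the nearest topic and pulls in `invoices` through structural expansion; both topics are then reinforced, while `hiking` is left untouched.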
Shared-singleton dependency model
Every API surface resolves the Executor through one dependency function. The Executor holds the shared
store; the store holds one Kuzu handle per configured database path. That means writes from the chat
loop are instantly visible to /v1/query without any cross-process sync.
One dependency cache guarantees every surface uses the same Executor and the same Kuzu handle.
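One common way to get this behavior in Python is to cache the provider functions with `functools.lru_cache`. The sketch below assumes that pattern; the class bodies, the fallback path `./memstate.kuzu`, and the two surface functions are hypothetical stand-ins, and a plain dict replaces the Kuzu handle.

```python
import os
from functools import lru_cache
from typing import Dict, Optional


class GraphStore:
    """Stand-in for the Kuzu-backed store; a dict replaces the graph handle."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.topics: Dict[str, str] = {}


class Executor:
    """Stand-in executor that holds the shared store."""

    def __init__(self, store: GraphStore) -> None:
        self.store = store


@lru_cache(maxsize=None)
def get_store(path: str) -> GraphStore:
    # Cached per path: each configured database path gets exactly one handle.
    return GraphStore(path)


@lru_cache(maxsize=1)
def get_executor() -> Executor:
    # The fallback path is an assumption made for this sketch.
    path = os.environ.get("MEMSTATE_KUZU_PATH", "./memstate.kuzu")
    return Executor(get_store(path))


def chat_surface_write(title: str, body: str) -> None:
    # Hypothetical write coming from the agent chat loop.
    get_executor().store.topics[title] = body


def product_surface_read(title: str) -> Optional[str]:
    # Hypothetical read coming from a product endpoint.
    return get_executor().store.topics.get(title)


chat_surface_write("deploy", "single process")
print(product_surface_read("deploy"))  # prints "single process"
```

With FastAPI, a cached provider like `get_executor` would be passed to `Depends(...)` in each route, so every surface receives the same instance. Note that `get_store` takes the path as a required argument: `lru_cache` keys on the call arguments, so a default parameter and an explicit one with the same value would produce two separate cache entries.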
Error behavior
- Validation errors: `ValueError` becomes HTTP 400 through a global exception handler.
- Auth errors: optional API key mismatch becomes HTTP 401 from `verify_api_key`.
- Graph health failures: `/health/graph` returns 503 with a path and operator hint.
- Background failures: a reasoner error does not affect the already-sent ingest or query response.
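The status mapping and the background isolation can be modeled without a web framework. This is an illustrative sketch only: `AuthError`, `GraphUnavailable`, `to_http`, `run_background`, the response bodies, and the 500 fallback are hypothetical names and choices, not MemState's actual handlers.

```python
import logging
from typing import Callable, Tuple

logger = logging.getLogger("memstate.sketch")


class AuthError(Exception):
    """Hypothetical exception for an API key mismatch."""


class GraphUnavailable(Exception):
    """Hypothetical exception for a failed graph health check."""


def to_http(exc: Exception) -> Tuple[int, dict]:
    # Map an exception to (status code, response body).
    if isinstance(exc, ValueError):
        return 400, {"detail": str(exc)}
    if isinstance(exc, AuthError):
        return 401, {"detail": "invalid API key"}
    if isinstance(exc, GraphUnavailable):
        return 503, {"detail": str(exc)}
    # Assumed fallback for anything not listed above.
    return 500, {"detail": "internal error"}


def run_background(task: Callable[[], None]) -> bool:
    # The response has already been sent, so a failure is logged
    # and reported as False rather than raised to the caller.
    try:
        task()
        return True
    except Exception:
        logger.exception("background maintenance failed")
        return False
```

In FastAPI the first mapping would usually be registered with `@app.exception_handler(ValueError)`. The `run_background` wrapper shows why a reasoner failure cannot change a response: by the time the task runs, the status code and body are already fixed.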