Architecture

One process. One store. Three access surfaces.

MemState runs as a single FastAPI service with an embedded Kuzu graph. All three API surfaces share the same in-process store, so state is immediately consistent across product endpoints, UI tools, and the agent chat loop. Agent paths stay limited to observations and context queries; revise and forget run inside the reasoner, not in the agent integration.

Under the GEM state split M_t = (D_t, S_t, P_t), this architecture keeps one durable D_t in Kuzu, evaluates S_t (semantic / structural / temporal query stages) in the executor over that same store, and applies P_t from the in-process Policies object plus environment settings.

Topology

MemState is deployed as one service. There is no separate worker, no external database tier, and no message queue. Everything — routing, write semantics, background maintenance, and persistence — lives in one process.

Single service

FastAPI + Uvicorn. Launched via python -m memstate.api.cli. One process owns the socket and the graph handle.

Embedded store

Kuzu is embedded and opened from MEMSTATE_KUZU_PATH. There is no external DB to operate.

In-process maintenance

Reasoner work runs on FastAPI BackgroundTasks — no separate queue or worker.

Shared singletons

All three API surfaces resolve the same Executor and the same store handle via dependency injection.

Components

| Component | Role | Called by |
| --- | --- | --- |
| api.main | Route layer, health endpoints, auth wiring, router mounting. | HTTP clients |
| api.deps | Cached providers for settings, executor, reasoner, and the shared store. | FastAPI dependency injection |
| core.Executor | Write and read semantics: how observations become graph updates, retrieval stages, salience reinforcement (placement may be explicit on the wire today). | /v1/ingest, /v1/query |
| reasoner.Reasoner | Background maintenance: revise duplicates, forget by low salience. | Post-response background tasks |
| store.GraphStore | Kuzu facade for topic, field, edge, and vector operations. | Executor, UI API, tool runner |
| llm.MemoryToolRunner | Runs LLM tool calls against the same shared store. | /api/llm/chat, MCP server |

Request lifecycles

Ingest

Client -> FastAPI route -> Executor.ingest
  -> GraphStore writes (topic / fields / edges / embedding)
  -> HTTP response returned
  -> BackgroundTasks callback -> Reasoner.run("ingest_complete")
      -> run_forget_low_salience  (if policy threshold met)
      -> run_revise_duplicates    (duplicate-title heuristic)
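
The ordering above (write, respond, then maintenance) can be illustrated with a stdlib-only sketch. `MiniBackgroundTasks`, `FakeExecutor`, `FakeReasoner`, and `handle_request` are hypothetical stand-ins invented for this example, not MemState classes; only the `add_task(func, *args)` shape mirrors FastAPI's `BackgroundTasks`.

```python
from typing import Any, Callable


class MiniBackgroundTasks:
    """Hypothetical stand-in for FastAPI's BackgroundTasks: queue now, run later."""

    def __init__(self) -> None:
        self._tasks: list[tuple[Callable[..., Any], tuple[Any, ...]]] = []

    def add_task(self, func: Callable[..., Any], *args: Any) -> None:
        self._tasks.append((func, args))

    def run_all(self) -> None:
        # A framework would run queued tasks only after the response is sent.
        for func, args in self._tasks:
            func(*args)


events: list[str] = []  # records the order in which steps happen


class FakeExecutor:
    def ingest(self, observation: dict) -> dict:
        events.append("store_write")  # stands in for the graph writes
        return {"status": "ok", "topic": observation["topic"]}


class FakeReasoner:
    def run(self, trigger: str) -> None:
        events.append(f"reasoner:{trigger}")


def ingest_route(observation, executor, reasoner, background):
    result = executor.ingest(observation)                  # synchronous writes
    background.add_task(reasoner.run, "ingest_complete")   # deferred maintenance
    return result


def handle_request(observation: dict) -> dict:
    background = MiniBackgroundTasks()
    response = ingest_route(observation, FakeExecutor(), FakeReasoner(), background)
    events.append("response_sent")  # the client has its answer at this point
    background.run_all()            # maintenance happens afterwards
    return response
```

Calling `handle_request({"topic": "billing"})` records the store write first, then the response, then the reasoner trigger, which is the property the lifecycle describes.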

Retrieve

Client -> FastAPI route -> Executor.query
  -> embed query text
  -> pick candidate topics (vector KNN or cosine fallback)
  -> optional structural expansion (neighbors + fields)
  -> optional temporal expansion (history when explain=true)
  -> salience bump + topic-history append per touched topic
  -> HTTP response returned
  -> BackgroundTasks callback -> Reasoner.run("query_complete")
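
As a rough illustration of the staged read path, the sketch below runs the semantic, structural, and temporal stages over toy in-memory dictionaries. Everything here is an assumption made for the example: the data, the function names, the `structural` flag, and the 0.1 salience increment are invented, the query is assumed to be embedded already, and only the brute-force cosine path is shown (not vector KNN).

```python
import math

# Toy in-memory data; MemState keeps the real state in Kuzu. Names are illustrative.
EMBEDDINGS = {"a": [1.0, 0.0], "b": [0.8, 0.6], "c": [0.0, 1.0]}
NEIGHBORS = {"a": ["c"], "b": [], "c": ["a"]}
HISTORY = {"a": ["created"], "b": ["created"], "c": ["created"]}
SALIENCE = {"a": 1.0, "b": 1.0, "c": 1.0}


def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def query(query_vec, k=1, structural=False, explain=False):
    # Semantic stage: rank every topic by cosine similarity, keep the top k.
    ranked = sorted(EMBEDDINGS, key=lambda t: cosine(query_vec, EMBEDDINGS[t]),
                    reverse=True)
    topics = ranked[:k]
    # Structural stage: optionally add graph neighbors of the candidates.
    if structural:
        for t in list(topics):
            for n in NEIGHBORS[t]:
                if n not in topics:
                    topics.append(n)
    result = {"topics": topics}
    # Temporal stage: attach history only when an explanation is requested.
    if explain:
        result["history"] = {t: list(HISTORY[t]) for t in topics}
    # Reads have side effects: bump salience and append to each topic's history.
    for t in topics:
        SALIENCE[t] += 0.1  # arbitrary increment chosen for the sketch
        HISTORY[t].append("retrieved")
    return result
```

The point of the sketch is that a query is not a pure read: every touched topic leaves with higher salience and one more history entry.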

Shared-singleton dependency model

Every API surface resolves the Executor through one dependency function. The Executor holds the shared store; the store holds one Kuzu handle per configured database path. That means writes from the chat loop are instantly visible to /v1/query without any cross-process sync.

get_executor()  (@lru_cache)
  -> Executor  (shared store, policy + embedder)
      -> GraphStore  (shared instance)
          -> KuzuGraph  (one handle per path)
get_graph_store()  (reuses executor.store)

One dependency cache guarantees every surface uses the same Executor and the same Kuzu handle.
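
A minimal sketch of this caching pattern, using only `functools.lru_cache` from the standard library. The `Executor` and `GraphStore` classes below are simplified placeholders, and the path is an arbitrary string that is never opened; the real providers in api.deps may differ in detail.

```python
from functools import lru_cache


class GraphStore:
    """Placeholder for the shared store; keeps toy state in a dict."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.topics: dict[str, str] = {}


class Executor:
    """Placeholder: owns the one store that every surface uses."""

    def __init__(self, store: GraphStore) -> None:
        self.store = store


@lru_cache(maxsize=1)
def get_executor() -> Executor:
    # Built on the first call; every later call returns the same object.
    return Executor(GraphStore(path="/tmp/memstate-demo"))  # illustrative path


def get_graph_store() -> GraphStore:
    # Reuse the executor's store instead of opening a second handle.
    return get_executor().store
```

Because both providers resolve to the same objects, a write made through `get_graph_store()` is visible through `get_executor().store` with no synchronization step, which is the behavior the chat loop and /v1/query rely on.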

Error behavior

  • Validation errors: a ValueError becomes HTTP 400 through a global exception handler.
  • Auth errors: an optional API key mismatch becomes HTTP 401 from verify_api_key.
  • Graph health failures: /health/graph returns 503 with a path and operator hint.
  • Background failures: a reasoner error does not affect the already-sent ingest or query response.
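
The validation, auth, and background cases can be sketched in plain Python as follows. This is an illustration of the mapping only: `to_status`, `run_background`, `AuthError`, and the demo handlers are hypothetical names, and the real service wires these through FastAPI exception handlers and `verify_api_key` rather than a wrapper function. The 503 health case is not covered here.

```python
import logging

logger = logging.getLogger("memstate.sketch")


class AuthError(Exception):
    """Hypothetical exception standing in for an API key mismatch."""


def to_status(handler, *args):
    """Run a handler and translate known failures into (status_code, body)."""
    try:
        return 200, handler(*args)
    except ValueError as exc:   # validation problems
        return 400, {"detail": str(exc)}
    except AuthError as exc:    # API key mismatch
        return 401, {"detail": str(exc)}


def run_background(task, *args) -> bool:
    """Run post-response work; log failures instead of propagating them."""
    try:
        task(*args)
        return True
    except Exception:
        logger.exception("background task failed")
        return False


def ingest(payload):
    # Demo handler: rejects payloads without a "text" field.
    if "text" not in payload:
        raise ValueError("text is required")
    return {"status": "ok"}


def failing_reasoner(trigger):
    # Demo background task that always fails.
    raise RuntimeError("maintenance failed")
```

Here `run_background(failing_reasoner, "ingest_complete")` returns `False` and logs the traceback, while the response computed earlier by `to_status` is unaffected.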
