Architecture - HTTP, UI, and LLM stack

The HTTP layer is intentionally thin. It validates inputs, applies optional API-key auth, and delegates all graph behavior to shared internal services. Static UI and JSON APIs are served from the same FastAPI app instance.

Route families and ownership

Surface	Owner module	Notes
`/v1/ingest`, `/v1/query`	`memstate.api.main`	High-level memory semantics through `Executor`.
`/api/ui/*`	`memstate.api.ui_api`	Low-level topic/field/edge CRUD and graph snapshot payloads.
`/api/llm/chat`	`memstate.llm.chat_api`	Intent routing, provider calls, tool loops, optional Study pipeline; also exposes `/api/llm/transcribe` for Groq Whisper speech-to-text.
`/ui/*`	`StaticFiles` mount in `api.main`	Dev graph explorer frontend assets.
`/health*`	`api.main`	Operational liveness and graph-open check.

Auth behavior

If MEMSTATE_API_KEY is unset: all routes are open.
If set: protected routes require either X-API-Key or Authorization: Bearer <key>.
Updates to system context (PUT /api/ui/system-context) are gated separately: the request must carry an X-Admin-Key header matching MEMSTATE_ADMIN_KEY, which falls back to MEMSTATE_API_KEY.
Health endpoints and static UI remain reachable without auth for operational usability.
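
The rules above can be sketched as a small header check. This is a minimal illustration using only the standard library, not the project's `verify_api_key`: the function names `extract_key`, `is_authorized`, and `is_admin` are hypothetical, and treating the admin route as open when neither key is set is an assumption the text does not confirm.

```python
import hmac
from typing import Mapping, Optional


def extract_key(headers: Mapping[str, str]) -> Optional[str]:
    """Return the key from X-API-Key or Authorization: Bearer <key>, if present."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    if "x-api-key" in lowered:
        return lowered["x-api-key"].strip()
    scheme, _, token = lowered.get("authorization", "").partition(" ")
    if scheme.lower() == "bearer" and token.strip():
        return token.strip()
    return None


def _matches(supplied: Optional[str], expected: str) -> bool:
    # Constant-time comparison avoids leaking key contents through timing.
    return supplied is not None and hmac.compare_digest(
        supplied.encode(), expected.encode()
    )


def is_authorized(headers: Mapping[str, str], env: Mapping[str, str]) -> bool:
    """Protected-route check: open when MEMSTATE_API_KEY is unset."""
    expected = env.get("MEMSTATE_API_KEY")
    if not expected:
        return True
    return _matches(extract_key(headers), expected)


def is_admin(headers: Mapping[str, str], env: Mapping[str, str]) -> bool:
    """Admin check for system-context updates via X-Admin-Key."""
    expected = env.get("MEMSTATE_ADMIN_KEY") or env.get("MEMSTATE_API_KEY")
    if not expected:
        return True  # assumption: no keys configured means open, as for other routes
    lowered = {name.lower(): value for name, value in headers.items()}
    return _matches(lowered.get("x-admin-key"), expected)
```

Passing the environment as a mapping (for example `os.environ`) keeps the check easy to exercise without touching process state.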

LLM chat pipeline

POST /api/llm/chat
  -> normalize and clip dialogue turns
  -> resolve provider/model/base URL and tool round budget
  -> classify latest-turn intent: query | ingest | both
  -> choose tool set for route
  -> run provider chat loop (Ollama or Groq) with MemoryToolRunner
  -> optional Study mode (long ingest):
       Phase A sandbox ingest (study topic_kind)
       Phase B integrate with full memory
  -> return reply, tool log, routing metadata

Shared-state guarantee: LLM tools call the same GraphStore instance used by UI and /v1. There is no separate memory cache for chat.
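
The first stages of the pipeline above (clip turns, classify intent, choose a tool set) can be sketched as follows. Everything here is illustrative: the limits `MAX_TURNS` and `MAX_CHARS`, the tool names in `TOOLS_BY_ROUTE`, and the keyword-based `classify_intent` are hypothetical stand-ins, and the actual classifier and tool registry in `memstate.llm.chat_api` may look quite different. The provider loop and Study mode are omitted.

```python
from typing import Dict, List, Optional

MAX_TURNS = 8      # hypothetical cap on retained dialogue turns
MAX_CHARS = 2000   # hypothetical cap on characters per turn

# Hypothetical tool names; the real tool set is defined by the project.
TOOLS_BY_ROUTE: Dict[str, List[str]] = {
    "query": ["search_topics", "get_topic"],
    "ingest": ["upsert_topic", "add_edge"],
}
TOOLS_BY_ROUTE["both"] = TOOLS_BY_ROUTE["query"] + TOOLS_BY_ROUTE["ingest"]


def clip_turns(turns: List[dict], max_turns: int = MAX_TURNS,
               max_chars: int = MAX_CHARS) -> List[dict]:
    """Drop empty turns, truncate long ones, keep only the most recent."""
    cleaned = []
    for turn in turns:
        content = str(turn.get("content", "")).strip()
        if not content:
            continue
        cleaned.append({"role": turn.get("role", "user"),
                        "content": content[:max_chars]})
    return cleaned[-max_turns:]


def classify_intent(text: str) -> str:
    """Toy keyword classifier returning query, ingest, or both."""
    lowered = text.lower()
    wants_ingest = any(w in lowered for w in ("remember", "note that", "save"))
    wants_query = "?" in lowered or any(
        w in lowered for w in ("what", "who", "when", "recall")
    )
    if wants_ingest and wants_query:
        return "both"
    return "ingest" if wants_ingest else "query"


def plan_chat(turns: List[dict], route_override: Optional[str] = None) -> dict:
    """Resolve the route for the latest user turn and pick its tool set."""
    clipped = clip_turns(turns)
    latest_user = next(
        (t["content"] for t in reversed(clipped) if t["role"] == "user"), ""
    )
    route = route_override or classify_intent(latest_user)
    return {"route": route, "tools": TOOLS_BY_ROUTE[route], "turns": clipped}
```

Keeping route selection separate from the provider loop means the same plan can be handed to either an Ollama or a Groq chat loop.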

LLM and MCP usage (visual)

Both LLM chat and MCP use MemoryToolRunner and the same graph model; HTTP chat adds intent routing and provider orchestration.

LLM vs MCP behavior in current code

Aspect	HTTP LLM chat	MCP server
Entry module	`memstate.llm.chat_api`	`memstate.llm.mcp_server`
Intent routing	Yes (`query|ingest|both`) via classifier/override.	No built-in classifier at server layer; client chooses tools directly.
Tool execution	`MemoryToolRunner(store, chat_route=...)`	`MemoryToolRunner(get_store())` in MCP tool handlers.
Provider orchestration	Ollama or Groq chat loops, retries, Study two-phase flow.	None; MCP exposes local tools over stdio transport.
State backend	GraphStore -> Kuzu	GraphStore -> Kuzu

Relation to the four operations: the client-facing ingest and query operations are the /v1 routes. LLM and MCP paths execute tool-level store operations directly, so the internal revision and forget operations are not triggered automatically; they run only when the reasoner is invoked through the /v1 callback path.

Operational note: the API server and the MCP server usually run as separate processes. If both point to the same MEMSTATE_KUZU_PATH, file-lock conflicts can occur when the two processes have the database open at the same time.

UI graph payload contract

GET /api/ui/graph and LLM graph tools both rely on build_ui_graph_snapshot() and return the same shape:

nodes[]: id, label/title, topic metadata, field summaries, computed community.
edges[]: structural RELATED edges and synthetic field-ref edges.
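
To make the shape above concrete, here is one way a client might type the payload and sanity-check it. The key names (`label`, `meta`, `fields`, `community`, `source`, `target`, `kind`) and the helper `dangling_edges` are assumptions for illustration; the authoritative keys are whatever build_ui_graph_snapshot() actually emits.

```python
from typing import Any, Dict, List, TypedDict


class GraphNode(TypedDict):
    id: str
    label: str                    # assumed key for the label/title
    meta: Dict[str, Any]          # assumed key for topic metadata
    fields: List[Dict[str, Any]]  # assumed key for field summaries
    community: int                # computed community id


class GraphEdge(TypedDict):
    source: str
    target: str
    kind: str  # assumed values: "related" (structural) or "field_ref" (synthetic)


class GraphSnapshot(TypedDict):
    nodes: List[GraphNode]
    edges: List[GraphEdge]


def dangling_edges(snapshot: GraphSnapshot) -> List[GraphEdge]:
    """Return edges whose source or target is not among the snapshot's nodes."""
    node_ids = {node["id"] for node in snapshot["nodes"]}
    return [
        edge for edge in snapshot["edges"]
        if edge["source"] not in node_ids or edge["target"] not in node_ids
    ]


example: GraphSnapshot = {
    "nodes": [
        {"id": "t1", "label": "Ana", "meta": {}, "fields": [], "community": 0},
        {"id": "t2", "label": "Tea", "meta": {}, "fields": [], "community": 0},
    ],
    "edges": [
        {"source": "t1", "target": "t2", "kind": "related"},
        {"source": "t1", "target": "t9", "kind": "field_ref"},
    ],
}
```

A check like `dangling_edges(example)` is useful in a frontend or test because synthetic field-ref edges are derived rather than stored, so they are the likeliest place for a reference to a missing node to show up.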

Error translation at HTTP boundary

Error source	Status mapping	Typical case
`ValueError` in core/store	400	Invalid placement payload or missing required topic id.
`verify_api_key`	401	Missing or incorrect API key header.
Upstream LLM HTTP errors	502 / 503	Provider timeout, invalid credentials, unreachable host.
Graph open failure on health check	503	Locked or inaccessible Kuzu path.
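
The table above amounts to a single translation function at the HTTP boundary. The sketch below is a standard-library illustration, not the project's handler: the exception classes `AuthError`, `UpstreamLLMError`, and `GraphOpenError` are hypothetical, the split between 502 and 503 for upstream errors is an assumption (the table lists both without saying which case gets which), and the 500 fallback is also assumed.

```python
from typing import Tuple


class AuthError(Exception):
    """Hypothetical: raised when the API key is missing or incorrect."""


class UpstreamLLMError(Exception):
    """Hypothetical: raised when the LLM provider call fails."""

    def __init__(self, message: str, unavailable: bool = False) -> None:
        super().__init__(message)
        # True for timeouts or unreachable hosts, False for bad responses.
        self.unavailable = unavailable


class GraphOpenError(Exception):
    """Hypothetical: raised when the Kuzu path is locked or inaccessible."""


def to_http_status(exc: Exception) -> Tuple[int, str]:
    """Map an internal exception to an (HTTP status, detail) pair."""
    if isinstance(exc, ValueError):
        return 400, str(exc)
    if isinstance(exc, AuthError):
        return 401, "missing or incorrect API key"
    if isinstance(exc, UpstreamLLMError):
        # Assumed split: 503 when the provider is unavailable, otherwise 502.
        return (503 if exc.unavailable else 502), str(exc)
    if isinstance(exc, GraphOpenError):
        return 503, str(exc)
    return 500, "internal error"  # assumed fallback, not listed in the table
```

Centralizing the mapping keeps core and store code free of HTTP concerns: those layers raise plain exceptions and only the boundary decides on status codes.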

Cross-links