The write path into memory.
Ingest is how observations become durable memory. MemState applies placement, merge,
and revision rules inside the executor and store — the agent loop should not orchestrate
revise or forget. (This reference build may still require explicit placement on the wire;
see How it works and the payload tables below.)
In the GEM decomposition M_t = (D_t, S_t, P_t),
ingest is the operator that advances D_t from an observation:
placement picks one of three graph primitives (new_topic, extend_topic,
version_field), then the executor commits field revisions, edges, and refreshed embeddings under
the active P_t caps and thresholds.
How it works
- Client calls POST /v1/ingest. The body carries an observation payload (fields, edges, text). Placement may be explicit in this build or derived later as routing matures.
- Executor validates and applies MemState rules. It interprets the payload, resolves the target topic when needed, and executes the write path the policy layer selects.
- Primitives run against the store. A new topic node is created, fields are appended with full revision metadata, and typed edges are merged.
- The embedding is refreshed. Any topic whose text changed gets a new vector so semantic search stays in sync.
- Background maintenance is scheduled. The reasoner is notified that ingest completed, which may trigger revision or forget work later.
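As a rough illustration of the first step, the sketch below builds a POST /v1/ingest request with the standard library. The base URL, the `X-API-Key` header name, and the shape of each item in `fields` are assumptions made for the example, not confirmed parts of the MemState API.

```python
import json
import urllib.request


def build_ingest_request(base_url, payload, api_key=None):
    """Build (but do not send) a POST /v1/ingest request."""
    headers = {"Content-Type": "application/json"}
    if api_key is not None:
        # Assumption: the header name is hypothetical; check the deployment's auth scheme.
        headers["X-API-Key"] = api_key
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/ingest",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


payload = {
    "placement": "new_topic",
    "title": "Espresso machine",
    "summary": "Kitchen appliance bought in March.",
    "fields": [{"name": "brand", "value": "Acme"}],  # item shape is assumed
    "edges": [],
}
request = build_ingest_request("http://localhost:8000", payload)
# To send it against a running server: urllib.request.urlopen(request)
```

The request is only constructed here, so the snippet runs without a server.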
Placement modes
These are the three write shapes MemState can apply to materialize an observation.
Today the client often supplies placement explicitly; long term, MemState is expected to
choose among them from the observation alone.
new_topic
Create a brand-new topic with its fields and edges. Use this for genuinely new things.
extend_topic
Add fields or edges to an existing topic. Use this when you're enriching something you already know.
version_field
Append a new revision to exactly one field on an existing topic. Use this when a value changed.
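While placement is still supplied by the client, a caller might pick the mode with a small rule like the one below. This is a hypothetical client-side helper, not MemState's routing logic; `known_fields` and the `name` key on field items are assumptions.

```python
def choose_placement(topic_id, fields, edges, known_fields=frozenset()):
    """Pick a placement string for an observation (hypothetical client-side rule)."""
    if topic_id is None:
        # Nothing to attach to, so this is a genuinely new thing.
        return "new_topic"
    single_known_field = (
        len(fields) == 1 and not edges and fields[0]["name"] in known_fields
    )
    if single_known_field:
        # Exactly one existing field changed value.
        return "version_field"
    # Otherwise enrich the existing topic with new fields or edges.
    return "extend_topic"


print(choose_placement(None, [{"name": "brand"}], []))
print(choose_placement("t1", [{"name": "brand"}], [], known_fields={"brand"}))
print(choose_placement("t1", [{"name": "color"}], [], known_fields={"brand"}))
```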
Operation lifecycle map
The client-facing operations are ingest and query. The internal operations, revision and forget, are triggered by reasoner policy events.
Access and trigger
- HTTP route: POST /v1/ingest in memstate.api.main.
- SDK route: MemoryClient.ingest() in memstate.client.
- Auth: guarded by API key only when MEMSTATE_API_KEY is configured.
Current LLM chat and MCP flows do not call /v1/ingest; they use MemoryToolRunner direct tool operations on
GraphStore.
Ingest mode differentiation (current code)
| Mode | Entry surface | When it is used | Core path |
|---|---|---|---|
| Structured ingest | POST /v1/ingest | Client/SDK sends explicit placement payload. | api.main.ingest -> Executor.ingest -> GraphStore. |
| Conversational ingest (normal / small) | POST /api/llm/chat | LLM route resolves to ingest/both, and Study condition is false. | chat_api.llm_chat -> run_ollama_chat or run_groq_chat with tools_for_intent_route. |
| Conversational ingest (Study) | POST /api/llm/chat | LLM route resolves to ingest/both, and Study condition is true for long message ingest. | chat_api._chat_study_ingest phase A (sandbox) + phase B (integration). |
Router decision: normal vs Study ingest
In chat_api.llm_chat, Study is selected only when all three checks pass:
body.study_ingest == true AND route in {"ingest","both"} AND
len(last_user_message) > settings.chat_chunk_threshold_chars.
Decision logic is implemented in chat_api._should_use_study() and evaluated after intent routing.
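The three checks can be written as a single predicate. The function below is a standalone sketch of that condition, not the body of chat_api._should_use_study(); its signature is an assumption, and the threshold is passed in rather than read from settings.

```python
def should_use_study(study_ingest, route, last_user_message, chunk_threshold_chars=10000):
    """Return True only when all three Study conditions hold."""
    return bool(
        study_ingest
        and route in {"ingest", "both"}
        and len(last_user_message) > chunk_threshold_chars
    )


long_message = "x" * 10001
print(should_use_study(True, "ingest", long_message))   # all three checks pass
print(should_use_study(True, "query", long_message))    # route is not ingest/both
print(should_use_study(True, "both", "short message"))  # message is under the threshold
```

Note that the length comparison is strict, so a message of exactly the threshold length stays on the normal path.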
Settings that influence the decision
| Setting | Default | Effect |
|---|---|---|
| study_ingest (request body) | true | Master on/off switch for Study mode in this request. |
| MEMSTATE_CHAT_CHUNK_THRESHOLD_CHARS | 10000 | Minimum last-user-message length to activate Study path. |
| intent_override (request body) | unset | Can force route to ingest/both, affecting Study eligibility. |
| MEMSTATE_CHAT_INTENT_TURNS | 8 | How much prior dialogue is considered for intent classification. |
How each conversational ingest mode works
A) Normal/small-message ingest path
/api/llm/chat
-> route resolved as ingest|both
-> _should_use_study == false
-> MemoryToolRunner(store, chat_route=route)
-> tool set = tools_for_intent_route(route)
-> run_ollama_chat or run_groq_chat
-> return reply + tool_log + route metadata
This path uses normal tool budgets (max_tool_rounds from request or MEMSTATE_CHAT_MAX_TOOL_ROUNDS, default 32) and does not create a
Study session topic kind.
B) Study ingest path (long message)
Study uses two LLM/tool phases: phase A constrained to session topics, then phase B integrates with broader memory.
Study-specific controls and constraints
- Phase A tool budget: max(base_tool_rounds, MEMSTATE_CHAT_CHUNK_PER_SEGMENT_TOOL_ROUNDS) (default minimum 72).
- Phase A runner is created with study_session_kind=study:<session_id> and catalog context.
- Phase A forbids the full graph tool (memory_graph_snapshot) and restricts linking to same study session topics.
- Phase B returns to the normal route tool set and can connect Study topics to existing memory.
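A minimal sketch of the two phase A constraints follows, assuming that 72 is the default of the per-segment setting and that tool sets are plain lists of tool names. Apart from memory_graph_snapshot, the tool names are hypothetical.

```python
FORBIDDEN_IN_PHASE_A = {"memory_graph_snapshot"}


def phase_a_tool_rounds(base_tool_rounds, per_segment_tool_rounds=72):
    """Phase A never gets fewer rounds than the per-segment setting."""
    return max(base_tool_rounds, per_segment_tool_rounds)


def phase_a_tools(route_tools):
    """Drop the full-graph tool from the route's tool set, keeping order."""
    return [name for name in route_tools if name not in FORBIDDEN_IN_PHASE_A]


print(phase_a_tool_rounds(32))   # base budget below the per-segment floor
print(phase_a_tool_rounds(100))  # base budget above the floor is kept
print(phase_a_tools(["memory_create_topic", "memory_graph_snapshot", "memory_link"]))
```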
Objective in code
Ingest funnels every write through the three placement primitives (new_topic, extend_topic, version_field) so all client writes pass one
semantics layer before touching GraphStore.
Visual flow (current)
Ingest applies a validated placement transition, writes via GraphStore, returns response, then schedules internal maintenance.
Input contract (current implementation)
| Field | Used by ingest | Notes |
|---|---|---|
| placement | Yes | Required: new_topic, extend_topic, or version_field. |
| topic_id | Conditional | Required for extend_topic and version_field. |
| title, summary | Yes | Used for create and embedding text generation. |
| fields[] | Yes | Each field becomes a history append operation. |
| edges[] | Yes | Each edge becomes RELATED merge. |
| suggest_similar | Optional | If true, computes top-k similar topic ids before placement writes. |
Placement behavior details
| Placement | Write behavior | Validation and constraints |
|---|---|---|
| new_topic | Generate UUID topic id, create topic row, append each provided field history item, add each provided edge. | No topic_id required from client; default title fallback is untitled. |
| extend_topic | Append/update fields and edges on an existing topic, then refresh topic embedding from current title+summary. | Fails if topic_id missing or topic does not exist. |
| version_field | Append exactly one new field history entry to one field, then refresh embedding. | Requires existing topic and exactly one item in fields[]. |
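The validation rules in the table can be sketched as a single check that runs before any write. This is an illustrative stand-in for the executor's checks, operating on a plain dict and a set of known topic ids. Only the "unknown placement" and "topic not found" messages come from this page; the other messages are assumptions.

```python
import uuid

PLACEMENTS = {"new_topic", "extend_topic", "version_field"}


def validate_ingest(req, existing_topic_ids):
    """Return the topic id to write to, or raise ValueError."""
    placement = req.get("placement")
    if placement not in PLACEMENTS:
        raise ValueError("unknown placement")
    if placement == "new_topic":
        # The client does not supply an id; one is generated.
        return str(uuid.uuid4())
    topic_id = req.get("topic_id")
    if not topic_id:
        raise ValueError("topic_id required")  # assumed message
    if topic_id not in existing_topic_ids:
        raise ValueError("topic not found")
    if placement == "version_field" and len(req.get("fields", [])) != 1:
        raise ValueError("version_field requires exactly one field")  # assumed message
    return topic_id


existing = {"t1"}
print(validate_ingest({"placement": "extend_topic", "topic_id": "t1"}, existing))
```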
Execution sequence
Client -> POST /v1/ingest -> Executor.ingest(req)
-> optional similar-topic pass (KNN then cosine fallback; k=8)
-> apply placement transition
-> GraphStore.create_topic / append_field_history / add_related_edge
-> update_topic_embedding for extend/version paths
-> return IngestResponse(topic_id, applied[], version_ids{}, similar_topic_ids[])
-> FastAPI BackgroundTasks schedules reasoner.run("ingest_complete")
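For the optional similar-topic pass, the cosine fallback can be pictured as a brute-force scan over stored topic vectors. The sketch below assumes embeddings are plain lists of floats held in a dict keyed by topic id. It does not reproduce the KNN index or the store's actual query path.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similar_topic_ids(query_vec, topic_vecs, k=8):
    """Return up to k topic ids ordered by descending cosine similarity."""
    scored = sorted(
        topic_vecs.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [topic_id for topic_id, _ in scored[:k]]


vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
print(similar_topic_ids([1.0, 0.0], vectors, k=2))
```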
Side effects and policy coupling
- Each field write creates a FieldHistoryEntry (UUID + timestamp + provenance metadata).
- Field history length is capped by Policies.max_field_history (default 500).
- Reference behavior uses the REF_UNCHANGED sentinel so omitted refs are preserved on append.
- Ingest completion always enqueues internal maintenance (ingest_complete reasoner event).
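The append-side effects can be illustrated with a small in-memory history list. This is a sketch under assumptions: entries are plain dicts rather than FieldHistoryEntry objects, the sentinel is a local stand-in for REF_UNCHANGED, and the cap is enforced by dropping the oldest entries, which this page does not confirm.

```python
import uuid
from datetime import datetime, timezone

REF_UNCHANGED = object()  # local stand-in for the real sentinel


def append_field_history(history, value, ref=REF_UNCHANGED, max_field_history=500):
    """Append one revision, carrying the previous ref forward when none is given."""
    if ref is REF_UNCHANGED:
        ref = history[-1]["ref"] if history else None
    entry = {
        "version_id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "value": value,
        "ref": ref,
    }
    history.append(entry)
    # Assumption: when the cap is exceeded, the oldest entries are dropped.
    overflow = len(history) - max_field_history
    if overflow > 0:
        del history[:overflow]
    return entry


history = []
append_field_history(history, "Acme", ref="doc-1")
append_field_history(history, "Acme Pro")  # ref omitted, so "doc-1" is preserved
print([(e["value"], e["ref"]) for e in history])
```

Using a dedicated sentinel rather than None lets a caller distinguish "leave the ref alone" from "clear the ref".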
Failure modes
| Condition | Raised in code | HTTP result |
|---|---|---|
| Unknown placement | ValueError("unknown placement") | 400 |
| Missing required topic_id | ValueError | 400 |
| Topic not found for extend/version | ValueError("topic not found") | 400 |
| Unauthorized key | HTTPException(401) | 401 |
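The ValueError rows amount to a translation step at the API boundary. The framework-free sketch below shows that idea only. `run_ingest` and `fake_ingest` are hypothetical names, and the `detail` key in the error body is an assumption rather than the confirmed response shape.

```python
def run_ingest(ingest_fn, req):
    """Call an ingest function and translate ValueError into a 400-style result."""
    try:
        return 200, ingest_fn(req)
    except ValueError as exc:
        return 400, {"detail": str(exc)}


def fake_ingest(req):
    # Stand-in for the real executor; it only checks the placement name.
    if req.get("placement") not in {"new_topic", "extend_topic", "version_field"}:
        raise ValueError("unknown placement")
    return {"topic_id": "t1"}


status, body = run_ingest(fake_ingest, {"placement": "merge"})
print(status, body)
```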
Code map
- memstate.api.main.ingest
- memstate.core.executor.Executor.ingest
- memstate.store.graph_store.append_field_history
- memstate.reasoner.engine.Reasoner.run (event ingest_complete)