Operation · Ingest

The write path into memory.

Ingest is how observations become durable memory. MemState applies placement, merge, and revision rules inside the executor and store — the agent loop should not orchestrate revise or forget. (This reference build may still require explicit placement on the wire; see How it works and the payload tables below.)

In the GEM decomposition M_t = (D_t, S_t, P_t), ingest is the operator that advances D_t from an observation: placement picks one of three graph primitives (new_topic, extend_topic, version_field), then the executor commits field revisions, edges, and refreshed embeddings under the active P_t caps and thresholds.

How it works

  1. Client calls POST /v1/ingest. The body carries an observation payload (fields, edges, text). Placement may be explicit in this build or derived later as routing matures.
  2. Executor validates and applies MemState rules. It interprets the payload, resolves the target topic when needed, and executes the write path the policy layer selects.
  3. Primitives run against the store. Depending on the placement, a new topic node is created, fields are appended with full revision metadata, and typed edges are merged.
  4. The embedding is refreshed. Any topic whose text changed gets a new vector so semantic search stays in sync.
  5. Background maintenance is scheduled. The reasoner is notified that ingest completed, which may trigger revision or forget work later.
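
Step 1 above can be sketched with the standard library alone. This is a minimal illustration, not the SDK: the base URL, the X-API-Key header name, and the name/value shape of each fields[] item are assumptions that this page does not confirm. The request is built but not sent.

```python
import json
import urllib.request


def build_ingest_request(base_url, payload, api_key=None):
    """Build (but do not send) a POST /v1/ingest request."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Header name is an assumption; check the server's auth configuration.
        headers["X-API-Key"] = api_key
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/ingest",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


payload = {
    "placement": "new_topic",
    "title": "Project Atlas",
    "summary": "Internal migration project.",
    "fields": [{"name": "status", "value": "planning"}],  # item shape assumed
    "edges": [],
}
req = build_ingest_request("http://localhost:8000", payload)
# urllib.request.urlopen(req) would send it to a running server.
```

In practice MemoryClient.ingest() in memstate.client is the supported SDK entry point; the sketch only shows what travels on the wire.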

Placement modes

These are the three write shapes MemState can apply to materialize an observation. Today the client often supplies placement explicitly; long term, MemState is expected to choose among them from the observation alone.

new_topic

Create a brand-new topic with its fields and edges. Use this for genuinely new things.

extend_topic

Add fields or edges to an existing topic. Use this when you're enriching something you already know.

version_field

Append a new revision to exactly one field on an existing topic. Use this when a value changed.

Operation lifecycle map

Client routes
  ingest   -> /v1/ingest
  query    -> /v1/query
Background policy dispatcher: Reasoner.run(event)
  revision -> run_revise_duplicates()
  forget   -> run_forget_low_salience()

Client-facing operations are ingest and query. Internal operations revision and forget are triggered by reasoner policy events.

Access and trigger

  • HTTP route: POST /v1/ingest in memstate.api.main.
  • SDK route: MemoryClient.ingest() in memstate.client.
  • Auth: guarded by API key only when MEMSTATE_API_KEY is configured.
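
The conditional auth rule in the last bullet can be sketched as follows. This is a hedged illustration of "guard only when a key is configured": the function name is hypothetical, and how the real route reads the presented key (header, query parameter) is not stated on this page.

```python
import hmac
import os


def is_authorized(presented_key, env=os.environ):
    """Return True when the request may proceed (hypothetical helper)."""
    expected = env.get("MEMSTATE_API_KEY")
    if not expected:
        # No key configured: the guard is disabled and every request passes.
        return True
    if presented_key is None:
        return False
    # Constant-time comparison avoids leaking key prefixes through timing.
    return hmac.compare_digest(presented_key.encode(), expected.encode())
```

A failed check corresponds to the 401 row in the failure-mode table.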

Current LLM chat and MCP flows do not call /v1/ingest; they use MemoryToolRunner direct tool operations on GraphStore.

Ingest mode differentiation (current code)

Mode | Entry surface | When it is used | Core path
Structured ingest | POST /v1/ingest | Client/SDK sends explicit placement payload. | api.main.ingest -> Executor.ingest -> GraphStore.
Conversational ingest (normal / small) | POST /api/llm/chat | LLM route resolves to ingest/both, and Study condition is false. | chat_api.llm_chat -> run_ollama_chat|run_groq_chat with tools_for_intent_route.
Conversational ingest (Study) | POST /api/llm/chat | LLM route resolves to ingest/both, and Study condition is true for long message ingest. | chat_api._chat_study_ingest phase A (sandbox) + phase B (integration).

Router decision: normal vs Study ingest

In chat_api.llm_chat, Study is selected only when all three checks pass: body.study_ingest == true AND route in {"ingest","both"} AND len(last_user_message) > settings.chat_chunk_threshold_chars.

POST /api/llm/chat -> chat_api.llm_chat
  -> prepare dialogue (last message must be user)
  -> resolve route (intent_override or classifier; routing clip cap: 12,000 chars)
  -> _should_use_study?
       study_ingest == true
       route is ingest|both
       last_user_len > threshold
  -> true:  Study ingest path, _chat_study_ingest() (phase A + phase B)
  -> false: Normal ingest path, run_ollama_chat or run_groq_chat

Decision logic is implemented in chat_api._should_use_study() and evaluated after intent routing.
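
The three-way check can be written as a single predicate. The sketch below mirrors the condition stated above; the function name and its flat argument list are illustrative, not the signature of chat_api._should_use_study().

```python
def should_use_study_sketch(study_ingest, route, last_user_message,
                            chunk_threshold_chars=10000):
    """Study is chosen only when all three checks pass."""
    return (
        study_ingest is True
        and route in {"ingest", "both"}
        # Strictly greater than: a message exactly at the threshold stays normal.
        and len(last_user_message) > chunk_threshold_chars
    )
```

Because the comparison is strict, a message of exactly 10000 characters under the default threshold still takes the normal path.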

Settings that influence the decision

Setting | Default | Effect
study_ingest (request body) | true | Master on/off switch for Study mode in this request.
MEMSTATE_CHAT_CHUNK_THRESHOLD_CHARS | 10000 | Minimum last-user-message length to activate Study path.
intent_override (request body) | unset | Can force route to ingest/both, affecting Study eligibility.
MEMSTATE_CHAT_INTENT_TURNS | 8 | How much prior dialogue is considered for intent classification.
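
As an illustration of how the two environment-backed settings might be read with their documented defaults, here is a small loader. The ChatSettings class and the chat_intent_turns attribute name are assumptions; only chat_chunk_threshold_chars appears as a settings attribute on this page.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ChatSettings:
    chat_chunk_threshold_chars: int = 10000  # documented default
    chat_intent_turns: int = 8               # documented default; attribute name assumed


def load_chat_settings(env=os.environ):
    """Read overrides from the environment, falling back to the defaults."""
    defaults = ChatSettings()
    return ChatSettings(
        chat_chunk_threshold_chars=int(
            env.get("MEMSTATE_CHAT_CHUNK_THRESHOLD_CHARS",
                    defaults.chat_chunk_threshold_chars)
        ),
        chat_intent_turns=int(
            env.get("MEMSTATE_CHAT_INTENT_TURNS", defaults.chat_intent_turns)
        ),
    )
```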

How each conversational ingest mode works

A) Normal/small-message ingest path

/api/llm/chat
  -> route resolved as ingest|both
  -> _should_use_study == false
  -> MemoryToolRunner(store, chat_route=route)
  -> tool set = tools_for_intent_route(route)
  -> run_ollama_chat or run_groq_chat
  -> return reply + tool_log + route metadata

This path uses normal tool budgets (max_tool_rounds from request or MEMSTATE_CHAT_MAX_TOOL_ROUNDS, default 32) and does not create a Study session topic kind.

B) Study ingest path (long message)

Long user text and route ingest|both
  -> Phase A: sandbox ingest
       build_study_hierarchy()
       topic_kind = study:<session_id>
       tools_for_study_phase_a()
  -> optional delay (study_phase_delay_seconds)
  -> Phase B: integration
       MemoryToolRunner without study lock
       tools_for_intent_route(route)
       link/merge into main memory
  -> response includes study_ingest=true, study_phases=2

Study uses two LLM/tool phases: phase A constrained to session topics, then phase B integrates with broader memory.

Study-specific controls and constraints

  • Phase A tool budget: max(base_tool_rounds, MEMSTATE_CHAT_CHUNK_PER_SEGMENT_TOOL_ROUNDS), so the budget never drops below the per-segment setting (default minimum 72).
  • Phase A runner is created with study_session_kind=study:<session_id> and catalog context.
  • Phase A forbids the full-graph tool (memory_graph_snapshot) and restricts linking to topics in the same Study session.
  • Phase B returns to normal route tool set and can connect Study topics to existing memory.
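
The budget arithmetic and the phase A link restriction can be sketched like this. The helper names are hypothetical, and the sketch assumes that "default min 72" means MEMSTATE_CHAT_CHUNK_PER_SEGMENT_TOOL_ROUNDS defaults to 72; the real runner may enforce the link rule differently.

```python
import os


def base_tool_rounds(request_max_tool_rounds=None, env=os.environ):
    """Request value wins; otherwise MEMSTATE_CHAT_MAX_TOOL_ROUNDS (default 32)."""
    if request_max_tool_rounds is not None:
        return request_max_tool_rounds
    return int(env.get("MEMSTATE_CHAT_MAX_TOOL_ROUNDS", 32))


def phase_a_tool_rounds(base_rounds, env=os.environ):
    """Phase A budget: the larger of the base budget and the per-segment floor."""
    per_segment = int(env.get("MEMSTATE_CHAT_CHUNK_PER_SEGMENT_TOOL_ROUNDS", 72))
    return max(base_rounds, per_segment)


def phase_a_link_allowed(source_kind, target_kind, session_id):
    """In phase A, both ends of a link must belong to the same Study session."""
    session_kind = f"study:{session_id}"
    return source_kind == session_kind and target_kind == session_kind
```

With the defaults, a normal chat gets 32 rounds while phase A of a Study ingest gets 72.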

Objective in code

Objective: apply controlled write transitions (new_topic, extend_topic, version_field) so all client writes pass one semantics layer before touching GraphStore.

Visual flow (current)

Client/SDK -> POST /v1/ingest -> api.main.ingest -> Executor.ingest(req)
  -> placement branch: new_topic | extend_topic | version_field
  -> GraphStore topic/field/edge writes
  -> BackgroundTasks -> reasoner.run("ingest_complete")

Ingest applies a validated placement transition, writes via GraphStore, returns response, then schedules internal maintenance.

Input contract (current implementation)

Field | Used by ingest | Notes
placement | Yes | Required: new_topic, extend_topic, or version_field.
topic_id | Conditional | Required for extend_topic and version_field.
title, summary | Yes | Used for create and embedding text generation.
fields[] | Yes | Each field becomes a history append operation.
edges[] | Yes | Each edge becomes a RELATED merge.
suggest_similar | Optional | If true, computes top-k similar topic ids before placement writes.
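
The contract above can be expressed as a small request model plus a validator. This is a sketch under assumptions: the class name and field types are illustrative rather than the real request schema, and only the "unknown placement" message is taken from this page; the other messages are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Optional

PLACEMENTS = {"new_topic", "extend_topic", "version_field"}


@dataclass
class IngestRequestSketch:
    placement: str
    topic_id: Optional[str] = None
    title: str = ""
    summary: str = ""
    fields: list = field(default_factory=list)
    edges: list = field(default_factory=list)
    suggest_similar: bool = False


def validate(req):
    """Raise ValueError when the payload violates the documented contract."""
    if req.placement not in PLACEMENTS:
        raise ValueError("unknown placement")
    if req.placement in {"extend_topic", "version_field"} and not req.topic_id:
        raise ValueError("topic_id required")  # message text assumed
    if req.placement == "version_field" and len(req.fields) != 1:
        raise ValueError("version_field requires exactly one field")  # assumed
```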

Placement behavior details

Placement | Write behavior | Validation and constraints
new_topic | Generate UUID topic id, create topic row, append each provided field history item, add each provided edge. | No topic_id required from client; default title fallback is untitled.
extend_topic | Append/update fields and edges on an existing topic, then refresh topic embedding from current title+summary. | Fails if topic_id missing or topic does not exist.
version_field | Append exactly one new field history entry to one field, then refresh embedding. | Requires existing topic and exactly one item in fields[].
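
To make the three write shapes concrete, here is a toy in-memory version of the placement branch. It is not GraphStore: ToyStore, the (name, value) tuple shape for fields, and plain string edge targets are all assumptions, and embedding refresh is omitted.

```python
import uuid


class ToyStore:
    """Stand-in for the real store: topic_id -> topic dict."""

    def __init__(self):
        self.topics = {}


def apply_placement(store, placement, topic_id=None, title="", summary="",
                    fields=(), edges=()):
    if placement == "new_topic":
        topic_id = str(uuid.uuid4())  # the id is generated, never client-supplied
        store.topics[topic_id] = {
            "title": title or "untitled",  # documented title fallback
            "summary": summary,
            "fields": {},
            "edges": set(),
        }
    elif placement in ("extend_topic", "version_field"):
        if not topic_id:
            raise ValueError("topic_id required")
        if topic_id not in store.topics:
            raise ValueError("topic not found")
        if placement == "version_field" and len(fields) != 1:
            raise ValueError("version_field requires exactly one field")
    else:
        raise ValueError("unknown placement")

    topic = store.topics[topic_id]
    for name, value in fields:
        # Every write appends to the field's history; nothing is overwritten.
        topic["fields"].setdefault(name, []).append(value)
    if placement != "version_field":
        topic["edges"].update(edges)  # set union stands in for the RELATED merge
    return topic_id
```

For example, a new_topic call followed by a version_field call on the same field leaves two history entries for that field, oldest first.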

Execution sequence

Client -> POST /v1/ingest -> Executor.ingest(req)
  -> optional similar-topic pass (KNN then cosine fallback; k=8)
  -> apply placement transition
      -> GraphStore.create_topic / append_field_history / add_related_edge
      -> update_topic_embedding for extend/version paths
  -> return IngestResponse(topic_id, applied[], version_ids{}, similar_topic_ids[])
  -> FastAPI BackgroundTasks schedules reasoner.run("ingest_complete")
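
The optional similar-topic pass can be approximated with a brute-force cosine ranking. The sketch covers only the cosine fallback with k=8; the KNN index mentioned above is not modeled, and the function names and the dict of topic vectors are assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similar_topic_ids(query_vec, topic_vecs, k=8):
    """Return up to k topic ids, most similar first."""
    ranked = sorted(
        topic_vecs.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [topic_id for topic_id, _ in ranked[:k]]
```

A linear scan like this is fine for small stores; it is presumably why an index-backed KNN is tried first.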

Side effects and policy coupling

  • Each field write creates a FieldHistoryEntry (UUID + timestamp + provenance metadata).
  • Field history length is capped by Policies.max_field_history (default 500).
  • Reference behavior uses REF_UNCHANGED sentinel so omitted refs are preserved on append.
  • Ingest completion always enqueues internal maintenance (ingest_complete reasoner event).
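
The history cap and the REF_UNCHANGED sentinel can be illustrated together. The entry layout, the use of object() as the sentinel, and trimming the oldest entries when the cap is exceeded are assumptions made for this sketch; the page only states that a cap and a sentinel exist.

```python
import uuid
from datetime import datetime, timezone

REF_UNCHANGED = object()  # sentinel: "caller did not pass a ref"


def append_field_history(history, value, ref=REF_UNCHANGED, max_field_history=500):
    """Append one entry to a field's history list and return its version id."""
    if ref is REF_UNCHANGED:
        # Omitted ref: carry forward the previous entry's ref, if any.
        ref = history[-1]["ref"] if history else None
    history.append({
        "version_id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "value": value,
        "ref": ref,
    })
    if len(history) > max_field_history:
        # Assumed trimming policy: drop the oldest entries beyond the cap.
        del history[: len(history) - max_field_history]
    return history[-1]["version_id"]
```

A sentinel is needed because None is itself a meaningful value here: passing ref=None clears the reference, while omitting the argument preserves it.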

Failure modes

Condition | Raised in code | HTTP result
Unknown placement | ValueError("unknown placement") | 400
Missing required topic_id | ValueError | 400
Topic not found for extend/version | ValueError("topic not found") | 400
Unauthorized key | HTTPException(401) | 401
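
The ValueError-to-400 mapping in the table can be sketched without a web framework. HTTPErrorSketch is a stand-in for the framework's HTTP exception, and the wrapper name is hypothetical; the real route handler may structure this differently.

```python
class HTTPErrorSketch(Exception):
    """Stand-in for a framework HTTP exception (status code plus detail)."""

    def __init__(self, status_code, detail):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def call_with_http_mapping(handler, *args, **kwargs):
    """Run handler and translate executor ValueErrors into HTTP 400."""
    try:
        return handler(*args, **kwargs)
    except ValueError as exc:
        # All three validation failures in the table surface as 400.
        raise HTTPErrorSketch(400, str(exc)) from exc
```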

Code map

  • memstate.api.main.ingest
  • memstate.core.executor.Executor.ingest
  • memstate.store.graph_store.append_field_history
  • memstate.reasoner.engine.Reasoner.run (event ingest_complete)
