Policies, thresholds, and runtime knobs.
MemState exposes two layers of configuration: policies that govern core behavior like history length and forgetting, and settings driven by environment variables for runtime tuning.
Core policies
Policies are the defaults the executor and reasoner consult on every operation. They are code-level defaults today; overriding them means constructing a custom policy object in integration code or tests.
| Policy | Default | Role |
|---|---|---|
| max_field_history | 500 | Maximum revisions retained per field. Oldest revisions are trimmed beyond the cap. |
| embedding_dimension | 384 | Expected topic embedding width. |
| query_salience_bump | 0.1 | Amount added to a topic's salience every time retrieval touches it. |
| forget_salience_threshold | 0.05 | Topics whose salience drops below this can be archived by the forget operator. |
| topic_count_soft_limit | 500 | Pressure point that makes the reasoner run forget more aggressively. |
| max_topics_for_forget_scan | 10,000 | Upper bound on the scan size used by the forget flow. |
A declarative runtime policy engine is on the roadmap.
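As an illustration of overriding a default in a test, the sketch below mirrors the table above in a frozen dataclass. The `Policies` class shape and the `trim_history` helper are assumptions made for this example; only the field names and default values come from the table, and the real constructor may differ.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Policies:
    """Hypothetical stand-in that mirrors the documented defaults."""
    max_field_history: int = 500
    embedding_dimension: int = 384
    query_salience_bump: float = 0.1
    forget_salience_threshold: float = 0.05
    topic_count_soft_limit: int = 500
    max_topics_for_forget_scan: int = 10_000


def trim_history(revisions: list, policies: Policies) -> list:
    """Keep only the newest revisions, assuming the list is ordered oldest first."""
    cap = policies.max_field_history
    if cap <= 0:
        return []
    return revisions[-cap:]


# Override one default for a test; every other field keeps its default.
test_policies = replace(Policies(), max_field_history=3)
trimmed = trim_history(["v1", "v2", "v3", "v4", "v5"], test_policies)
```

With a cap of 3, the two oldest revisions are dropped and the three newest remain. A frozen dataclass is used here so that a policy object cannot be mutated after an operation has started consulting it.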
Forgetting is a ladder
Forgetting is not a single on/off action. It is a progression: reduce influence, detach from the active working set, and finally archive. Each stage is non-destructive — the underlying data stays on disk and can still be reached by an explicit query.
Forgetting attenuates content by relevance, never deletes it. Today MemState implements the archive stage; summarize and detach are planned.
Paper-aligned threshold ladder
The research formulation uses three salience thresholds to order the ladder: summarize (compress older
material while keeping the topic hot), detach (remove from default working set), and archive (exclude from
routine scans while retaining durable history). The reference build does not yet expose three separate
constants; the table below maps names to intent and to what exists in Policies today.
| Formal name | Ladder stage | In this codebase |
|---|---|---|
| θsummary (summarize) | Compress older field/topic material while the unit stays active. | Planned; history is trimmed only by max_field_history, not a dedicated summarize operator. |
| θremove (detach) | Drop from default retrieval expansion while keeping the row queryable explicitly. | Partially reflected by salience decay and archival flags; dedicated detach threshold not split out yet. |
| θarchive (archive) | Lowest salience band: mark archived, damp salience further, skip in routine scans. | forget_salience_threshold in Policies gates Executor.run_forget_low_salience (topics below the threshold are archived and salience is scaled down). |
| Stage | What happens to the topic | Status |
|---|---|---|
| Summarize | Older history is compressed into summaries; the topic remains fully active. | Planned |
| Detach | Low-salience fields or connectors are removed from the active working set; they stay reachable via explicit lookup. | Planned |
| Archive | The topic is marked archived and excluded from default active scans and retrieval candidates. History stays on disk. | Shipped |
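The archive stage can be sketched as follows. This is a minimal illustration, not the body of Executor.run_forget_low_salience: the `Topic` class, the function names, and the `damping` factor of 0.5 are assumptions. The source only states that topics below the threshold are archived and that their salience is scaled down.

```python
from dataclasses import dataclass


@dataclass
class Topic:
    """Hypothetical in-memory view of a topic row."""
    topic_id: str
    salience: float
    archived: bool = False


def forget_low_salience(topics, threshold=0.05, damping=0.5, max_scan=10_000):
    """Archive topics whose salience is strictly below the threshold.

    Nothing is deleted: the topic is flagged and its salience is scaled down.
    The damping factor is an assumed value, not a documented constant.
    """
    archived_ids = []
    for topic in topics[:max_scan]:  # bounded like max_topics_for_forget_scan
        if topic.archived:
            continue
        if topic.salience < threshold:
            topic.archived = True
            topic.salience *= damping
            archived_ids.append(topic.topic_id)
    return archived_ids


def active_topics(topics):
    """Default scans skip archived topics; explicit lookups could still reach them."""
    return [t for t in topics if not t.archived]


topics = [Topic("a", 0.50), Topic("b", 0.02), Topic("c", 0.05)]
archived = forget_low_salience(topics)
still_active = [t.topic_id for t in active_topics(topics)]
```

In this run only topic "b" is archived. Topic "c" sits exactly at the threshold and stays active, because the comparison assumed here is strictly "below".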
Retrieval stages
POST /v1/query accepts any subset of three stages. You can turn any of them off to trade
breadth for latency, or leave them all on (the default).
| Stage | What it does | Effect on the response |
|---|---|---|
| semantic | Embeds the query and uses KNN (or cosine fallback) over topic embeddings to decide which topic ids to hydrate. | Defines the candidate set; each bundle is topic memory (fields, summary, neighbors as requested). A similarity field may be present for diagnostics when this stage ran; the payload is still structured context, not a score-first API. |
| structural | For every candidate, pulls its typed neighbors and the requested fields. | Populates neighbors[] and field entries. |
| temporal | Enables full value history on each returned field when combined with explain=true. | Returns the full revision stack rather than only the current value. |
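A request body that selects a subset of stages might be assembled as in the sketch below. The JSON key names `query` and `stages`, and the placement of `top_k` and `explain` in the body, are assumptions for illustration; the source confirms only the endpoint, the three stage names, and that `explain=true` and `top_k` exist.

```python
import json

ALL_STAGES = ("semantic", "structural", "temporal")


def build_query_body(query, stages=ALL_STAGES, top_k=5, explain=False):
    """Build a hypothetical POST /v1/query body with a validated stage subset."""
    unknown = set(stages) - set(ALL_STAGES)
    if unknown:
        raise ValueError(f"unknown stages: {sorted(unknown)}")
    # Keep a canonical order and drop duplicates.
    selected = [s for s in ALL_STAGES if s in stages]
    return {"query": query, "stages": selected, "top_k": top_k, "explain": explain}


# Skip the temporal stage to trade history depth for latency.
body = build_query_body("billing address", stages=["structural", "semantic"])
encoded = json.dumps(body)
```

The encoded string would then be sent as the body of POST /v1/query with whatever HTTP client the integration already uses.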
Runtime settings (environment variables)
| Variable | Default | Role |
|---|---|---|
| MEMSTATE_QUERY_FIELD_SALIENCE_BUMP | 0.1 | Per-field salience bump on query-routed reads. |
| MEMSTATE_FIELD_SALIENCE_MAX | 10.0 | Upper bound for field salience values. |
| MEMSTATE_CHAT_INTENT_TURNS | 8 | Dialogue turns included during intent classification. |
| MEMSTATE_CHAT_MAX_TOOL_ROUNDS | 32 | Upper bound on chat tool-call iterations. |
| MEMSTATE_CHAT_CHUNK_THRESHOLD_CHARS | 10000 | Minimum last-user-message length that activates the Study-mode ingest flow. |
| MEMSTATE_CHAT_CHUNK_PER_SEGMENT_TOOL_ROUNDS | 72 | Tool budget for each Study phase-A segment (range 8–256). |
| MEMSTATE_STUDY_PHASE_DELAY_SECONDS | 8.0 | Pause between Study phase A and B to smooth LLM rate limits. |
| MEMSTATE_GROQ_RATE_LIMIT_MAX_RETRIES | 20 | Max retries on Groq 429 responses. |
| MEMSTATE_GROQ_RATE_LIMIT_BACKOFF_CAP_SECONDS | 120.0 | Per-retry sleep cap for Groq backoff. |
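One way such settings could be read is sketched below. The helper names are hypothetical, and several behaviors are assumptions rather than documented facts: falling back to the default on malformed input, clamping out-of-range values into the 8–256 range instead of rejecting them, and a doubling backoff that starts at one second. Only the variable names, defaults, range, and cap come from the table.

```python
import os


def env_float(name, default, environ=None):
    """Read a float setting; fall back to the default when unset or malformed."""
    env = os.environ if environ is None else environ
    raw = env.get(name, "").strip()
    if not raw:
        return default
    try:
        return float(raw)
    except ValueError:
        return default


def env_int(name, default, lo=None, hi=None, environ=None):
    """Read an int setting and clamp it into [lo, hi] when bounds are given."""
    env = os.environ if environ is None else environ
    raw = env.get(name, "").strip()
    try:
        value = int(raw) if raw else default
    except ValueError:
        value = default
    if lo is not None:
        value = max(lo, value)
    if hi is not None:
        value = min(hi, value)
    return value


def backoff_schedule(max_retries, cap_seconds, base_seconds=1.0):
    """Assumed doubling backoff; each sleep is capped at cap_seconds."""
    return [min(cap_seconds, base_seconds * 2 ** attempt) for attempt in range(max_retries)]


# A plain dict stands in for os.environ so the example is deterministic.
fake_env = {"MEMSTATE_CHAT_CHUNK_PER_SEGMENT_TOOL_ROUNDS": "1000"}
tool_rounds = env_int(
    "MEMSTATE_CHAT_CHUNK_PER_SEGMENT_TOOL_ROUNDS", 72, lo=8, hi=256, environ=fake_env
)
phase_delay = env_float("MEMSTATE_STUDY_PHASE_DELAY_SECONDS", 8.0, environ=fake_env)
delays = backoff_schedule(
    env_int("MEMSTATE_GROQ_RATE_LIMIT_MAX_RETRIES", 20, environ=fake_env),
    env_float("MEMSTATE_GROQ_RATE_LIMIT_BACKOFF_CAP_SECONDS", 120.0, environ=fake_env),
)
```

Here the oversized tool budget is clamped to 256, the unset variables fall back to their table defaults, and under the assumed doubling schedule the sleep reaches the 120-second cap on the eighth retry.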
Practical limits
- top_k bounds the number of candidate topics considered and returned by retrieval.
- version_field ingest requires exactly one field and a valid topic_id.
- Large json values are allowed but affect retrieval and API response latency.
- Archived topics are retained on disk. Forgetting is always non-destructive.
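The version_field constraint could be checked on the client before sending an ingest call, as in this sketch. The payload shape (a `topic_id` string plus a `fields` mapping) and the meaning of a "valid" topic_id (non-empty string) are assumptions; the actual request schema is not described here.

```python
def validate_version_field_ingest(payload: dict) -> list:
    """Return a list of problems; an empty list means the payload passes."""
    errors = []
    topic_id = payload.get("topic_id")
    # Assumed validity rule: topic_id must be a non-empty string.
    if not isinstance(topic_id, str) or not topic_id.strip():
        errors.append("topic_id must be a non-empty string")
    fields = payload.get("fields")
    # The documented limit: exactly one field per version_field ingest.
    if not isinstance(fields, dict) or len(fields) != 1:
        errors.append("exactly one field is required")
    return errors


ok = validate_version_field_ingest({"topic_id": "t-1", "fields": {"email": "a@b.c"}})
bad = validate_version_field_ingest({"topic_id": "", "fields": {"a": 1, "b": 2}})
```

Collecting all problems in a list, rather than raising on the first one, lets a caller report every issue with a payload in a single pass.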