Ingesting memories

Ingest is the write path. You send conversation messages; the server runs LLM-based extraction to pull out facts (and, when relevant, artifacts and episodes), embeds each one, and stores them in your org’s vector index.

The mental model

Ingest is asynchronous by default. Extraction is LLM-bound, typically 3–10 seconds, so the API returns a job immediately and does the work in the background. Your code either polls the job until it reaches a terminal status or opts into sync mode.

┌──────────┐                              ┌───────────────┐
│  Client  │  POST /v1/memories  ──────►  │   Memory API  │
│          │  ◄────  IngestJob (pending)  │  (returns 1s) │
└──────────┘                              └───────────────┘
                                                  │
                                                  │  extraction (3–10s)
                                                  ▼
                                          status: succeeded
                                          result.memories_created: [...]

Required fields

Every ingest needs:

messages: array of { role, content }. An empty array returns 400.
user_id: keys the per-user session namespace.
conv_id: anchors every extracted memory to a conversation (for replay, export, and bulk retract).

Optional fields:

agent_id, app_id
group_ids: tags the extracted memories to shared groups. Memories judged personal (e.g. health or finances) are never group-tagged.
timestamp_format: a strptime format for parsing dated turns on the batch path.
extract_artifacts: defaults to true. Pass false to skip the artifact-extraction stage, the most expensive part of the pipeline.
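The required-field rules above can be checked client-side before a request is sent. The following is a minimal sketch, not part of the SDK: the `IngestBody` interface and the `validateIngestBody` helper are hypothetical names introduced for illustration, and the server remains the source of truth for validation.

```typescript
// Hypothetical request shape, limited to the fields named in this section.
interface Message {
  role: string;
  content: string;
}

interface IngestBody {
  messages: Message[];
  user_id: string;
  conv_id: string;
}

// Returns a list of human-readable problems; an empty list means the
// required fields are present. This mirrors the documented rules only.
function validateIngestBody(body: Partial<IngestBody>): string[] {
  const problems: string[] = [];
  if (!Array.isArray(body.messages) || body.messages.length === 0) {
    problems.push('messages must be a non-empty array');
  }
  if (!body.user_id) problems.push('user_id is required');
  if (!body.conv_id) problems.push('conv_id is required');
  return problems;
}
```

Running a check like this first avoids a round trip that would only come back as a 400.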

Async ingest (default)

const job = await client.memories.ingest({
  messages: [
    { role: 'user', content: 'My favorite food is pad see ew.' },
    { role: 'assistant', content: 'Noted — Thai food.' },
  ],
  user_id: 'alice',
  conv_id: 'conv_2026_05_16',
});

// pollUntilDone handles exponential backoff (500ms → 5s) and timeout.
const done = await client.memories.jobs.pollUntilDone(job.id);

if (done.status === 'failed') {
  throw new Error(`Ingest failed: ${done.error?.message}`);
}

console.log('Created', done.result?.memories_created.length, 'memories');
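For readers who want to see what a polling loop with backoff involves, here is a rough sketch. It is not the SDK's implementation of pollUntilDone: the `IngestJob` shape, the `fetchJob` callback, the doubling factor, and the 60-second default timeout are all assumptions; only the 500 ms starting delay, the 5 s cap, and the status names come from this page.

```typescript
// Hypothetical job shape; the status names are the ones used on this page.
type JobStatus = 'pending' | 'running' | 'succeeded' | 'failed';

interface IngestJob {
  id: string;
  status: JobStatus;
}

// Delay before the next poll (attempt is 0-based): doubles from 500 ms and
// is capped at 5 s. The doubling factor is an assumption.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 5000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Polls until the job is terminal or the cumulative wait would exceed
// timeoutMs. `fetchJob` stands in for whatever call returns the job by id.
async function pollJob(
  fetchJob: (id: string) => Promise<IngestJob>,
  id: string,
  timeoutMs = 60000,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<IngestJob> {
  let waitedMs = 0;
  for (let attempt = 0; ; attempt++) {
    const job = await fetchJob(id);
    if (job.status === 'succeeded' || job.status === 'failed') return job;
    const delay = backoffDelayMs(attempt);
    if (waitedMs + delay > timeoutMs) {
      throw new Error(`Timed out waiting for job ${id}`);
    }
    await sleep(delay);
    waitedMs += delay;
  }
}
```

Injecting `sleep` keeps the loop testable without real timers.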

Sync ingest (`wait: true`)

Useful for demos, one-shot scripts, or any code where you want the result inline:

const job = await client.memories.ingest(
  {
    messages: [{ role: 'user', content: 'I am vegetarian.' }],
    user_id: 'alice',
    conv_id: 'conv_2026_05_16',
  },
  { wait: true },
);

if (job.status === 'succeeded') {
  console.log('Inline result:', job.result?.memories_created);
} else if (job.status === 'failed') {
  console.error('Extraction failed:', job.error);
} else {
  // Sync budget elapsed (30s) — fell back to async; poll job.id as above.
  console.log('Polling required:', job.id);
}

The server holds the connection for up to 30 seconds. If extraction finishes in that window the response is terminal (succeeded or failed). If the budget elapses, you get a pending/running job back and have to poll — same as async mode.

Use sync mode for interactive demos and CLI tools; use async mode for production agent loops where you want to dispatch ingest and continue working.
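Because a sync request can still come back non-terminal, callers may want one code path that handles both outcomes. The sketch below illustrates that idea under stated assumptions: `IngestJob`, `isTerminal`, and `ingestAndSettle` are hypothetical names, and the two callbacks stand in for the sync ingest call and a poller such as pollUntilDone.

```typescript
// Hypothetical job shape; the status names are the ones used on this page.
type JobStatus = 'pending' | 'running' | 'succeeded' | 'failed';

interface IngestJob {
  id: string;
  status: JobStatus;
}

// A job is terminal once extraction has either succeeded or failed.
function isTerminal(job: IngestJob): boolean {
  return job.status === 'succeeded' || job.status === 'failed';
}

// Runs a sync ingest and, if the sync budget elapsed first, falls back to
// polling. Both callbacks are supplied by the caller.
async function ingestAndSettle(
  ingestSync: () => Promise<IngestJob>,
  poll: (id: string) => Promise<IngestJob>,
): Promise<IngestJob> {
  const job = await ingestSync();
  return isTerminal(job) ? job : poll(job.id);
}
```

With this shape the calling code only ever sees a terminal job, whichever mode produced it.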

What gets extracted

You pass messages; you don’t pre-decide what’s a fact vs an artifact vs an episode. The server’s extraction pipeline decides:

Type	Triggered when
Fact	The default. A semantic claim in a turn (“User likes X”, “User works at Y”).
Artifact	The conversation references a structured object — a doc, code snippet, summary — that’s worth storing standalone. Extracted by default; pass `extract_artifacts: false` to skip this stage.
Episode	A stretch of turns gets summarized into a session-level memory. Server-driven; no client knob.

The result.memories_created array tells you what landed; each entry is a thin reference ({id, type, text}). For the full row, call client.memories.get(id).
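As an illustration of working with the thin references, the sketch below tallies them by type. The `MemoryRef` interface follows the {id, type, text} shape described above, but the lowercase literal values for `type` are an assumption; check an actual response before relying on them.

```typescript
// Assumed literal values for the three memory types named in the table.
type MemoryType = 'fact' | 'artifact' | 'episode';

// Thin reference as described above: {id, type, text}.
interface MemoryRef {
  id: string;
  type: MemoryType;
  text: string;
}

// Counts how many references of each type an ingest produced.
function countByType(refs: MemoryRef[]): Record<MemoryType, number> {
  const counts: Record<MemoryType, number> = { fact: 0, artifact: 0, episode: 0 };
  for (const ref of refs) {
    counts[ref.type] += 1;
  }
  return counts;
}
```

A tally like this is handy for logging what an ingest produced without fetching each full row.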

Tagging memories to groups

Pass group_ids to associate this ingest with one or more groups — shared tagging targets you register up front (see Groups). At extraction time a classifier tags each extracted memory: prompted groups get the memories their prompt matches, and catch-all groups (registered without a prompt) get every shareable memory. Other members of the group can then surface those memories with a group search.

await client.memories.ingest({
  messages: [
    { role: 'user', content: "When I'm in Tokyo I always stay near Shibuya station." },
    { role: 'assistant', content: 'Noted.' },
  ],
  user_id: 'alice',
  conv_id: 'conv_2026_05_16',
  group_ids: ['grp_tokyo2026'],
});

The classifier tags each extracted memory with the subset of group_ids it belongs to — a memory can land in several groups, one, or none. Untagged extraction still happens as usual; tagging is additive.
Memories judged personal (private/sensitive: health, family matters, finances, credentials) are never group-tagged — they stay in the author’s personal scope, fully retrievable there. See the personal gate.
Unknown or archived ids are soft-skipped — they never fail the ingest, and come back in result.ignored_group_ids so you can prune stale ids client-side.
Up to 20 group ids per ingest; more returns 422.

Groups are how you share memory across users. A fact Alice ingests with group_ids: ['grp_tokyo2026'] becomes visible to every member of that group via group search — without exposing her untagged personal memories.
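The two client-side chores mentioned above, pruning stale ids reported in result.ignored_group_ids and staying under the 20-id cap, could be handled with small helpers like these. This is a sketch: `pruneGroupIds` and `assertGroupIdLimit` are hypothetical names, not SDK functions.

```typescript
// Limit taken from this page: more than 20 group ids returns 422.
const MAX_GROUP_IDS = 20;

// Drops ids the server reported as ignored (unknown or archived),
// preserving the order of the remaining ids.
function pruneGroupIds(current: string[], ignored: string[]): string[] {
  const stale = new Set(ignored);
  return current.filter((id) => !stale.has(id));
}

// Throws before sending a request that would exceed the documented cap.
function assertGroupIdLimit(groupIds: string[]): void {
  if (groupIds.length > MAX_GROUP_IDS) {
    throw new Error(
      `Too many group ids: ${groupIds.length} (max ${MAX_GROUP_IDS})`,
    );
  }
}
```

For example, after a job succeeds you might store `pruneGroupIds(savedIds, done.result?.ignored_group_ids ?? [])` for the next ingest.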

Failure modes

Extraction can fail because of an upstream LLM error, content that yields no extractable facts, or rate limits. The job lands in status: "failed" with an error.code and error.message. The server does not auto-retry; to retry, submit the same body again. Common failure codes:

Code	Meaning
`ingest_failed`	Generic extraction error; check `error.message`
`rate_limit_exceeded`	Org quota hit; wait and retry
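Since retries are left to the client, one possible policy is sketched below. Everything about the policy is an assumption made for illustration: the `retryDecision` helper, the limit of three retries, the doubling wait starting at one second for rate limits, and the choice to resubmit immediately on a generic failure. Only the two error codes come from the table above.

```typescript
// Error shape as described above: a code and a message.
interface JobError {
  code: string;
  message: string;
}

interface RetryDecision {
  retry: boolean;
  waitMs: number;
}

// Decides whether to resubmit the same body. `attempt` is the number of
// retries already made (0 for the first failure).
function retryDecision(
  error: JobError,
  attempt: number,
  maxRetries = 3,
): RetryDecision {
  if (attempt >= maxRetries) return { retry: false, waitMs: 0 };
  if (error.code === 'rate_limit_exceeded') {
    // Assumed wait schedule: 1 s, 2 s, 4 s.
    return { retry: true, waitMs: 1000 * 2 ** attempt };
  }
  if (error.code === 'ingest_failed') {
    // Generic failure: resubmit right away; inspect error.message if it recurs.
    return { retry: true, waitMs: 0 };
  }
  // Unknown codes are not retried in this sketch.
  return { retry: false, waitMs: 0 };
}
```

Content that yields no extractable facts will likely keep failing, so capping the number of retries matters more than the exact wait schedule.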
