Ingest is the write path. You send conversation messages; the server runs LLM-based extraction to pull out facts (and, when relevant, artifacts and episodes), embeds each one, and stores them in your org’s vector index.

The mental model

Ingest is asynchronous by default. Extraction is LLM-bound — typically 3–10 seconds — so the API returns a job immediately and does the work in the background. Your code polls or opts into sync mode.
┌──────────┐                              ┌───────────────┐
│  Client  │  POST /v1/memories  ──────►  │   Memory API  │
│          │  ◄────  IngestJob (pending)  │  (returns 1s) │
└──────────┘                              └───────────────┘
                                                  │
                                                  │  extraction (3–10s)
                                                  ▼
                                          status: succeeded
                                          result.memories_created: [...]
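
The lifecycle in the diagram can be modeled as a small status type. This is a minimal sketch: the four status strings are the ones named in this guide, while `JobStatus` and `isTerminal` are hypothetical names, not SDK exports, and the API may define further statuses.

```typescript
// Statuses mentioned in this guide; the real API may define more.
type JobStatus = 'pending' | 'running' | 'succeeded' | 'failed';

// A job is terminal once extraction has either succeeded or failed.
// Non-terminal jobs still need polling.
function isTerminal(status: JobStatus): boolean {
  return status === 'succeeded' || status === 'failed';
}
```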

Required fields

Every ingest needs:
  • messages — array of { role, content }. Empty array → 400.
  • user_id — keys the per-user session namespace
  • conv_id — anchors every extracted memory to a conversation (for replay, export, bulk retract)
Optional:
  • agent_id, app_id
  • group_ids: tags the extracted memories to shared groups
  • timestamp_format: a strptime format for parsing dated turns on the batch path
  • extract_artifacts: defaults to true; pass false to skip the artifact-extraction stage, the most expensive part of the pipeline
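A client-side pre-flight check can catch the documented rejections before a request is sent. The sketch below is illustrative only: `IngestRequest` and `validateIngestRequest` are hypothetical names inferred from the field list above, and the SDK's real types may differ. The 20-id cap is the per-ingest limit described under "Tagging memories to groups".

```typescript
// Hypothetical request shape, inferred from the field list in this guide.
interface Message {
  role: string;
  content: string;
}

interface IngestRequest {
  messages: Message[];
  user_id: string;
  conv_id: string;
  agent_id?: string;
  app_id?: string;
  group_ids?: string[];
  timestamp_format?: string;
  extract_artifacts?: boolean;
}

const MAX_GROUP_IDS = 20; // documented per-ingest limit

// Returns a list of problems; an empty list means the body passes these local checks.
function validateIngestRequest(req: IngestRequest): string[] {
  const problems: string[] = [];
  if (req.messages.length === 0) {
    problems.push('messages must not be empty (the server returns 400)');
  }
  if (!req.user_id) problems.push('user_id is required');
  if (!req.conv_id) problems.push('conv_id is required');
  const groupCount = req.group_ids ? req.group_ids.length : 0;
  if (groupCount > MAX_GROUP_IDS) {
    problems.push('at most ' + MAX_GROUP_IDS + ' group_ids (the server returns 422)');
  }
  return problems;
}
```

The server remains the source of truth; a check like this only saves a round trip for the cases this guide documents.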

Async ingest (default)

const job = await client.memories.ingest({
  messages: [
    { role: 'user', content: 'My favorite food is pad see ew.' },
    { role: 'assistant', content: 'Noted — Thai food.' },
  ],
  user_id: 'alice',
  conv_id: 'conv_2026_05_16',
});

// pollUntilDone handles exponential backoff (500ms → 5s) and timeout.
const done = await client.memories.jobs.pollUntilDone(job.id);

if (done.status === 'failed') {
  throw new Error(`Ingest failed: ${done.error?.message}`);
}

console.log('Created', done.result?.memories_created.length, 'memories');
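
The comment in the example above says pollUntilDone backs off from 500ms to 5s. This guide does not give the exact schedule, so the sketch below assumes the delay doubles on each poll and is capped at 5 seconds. `backoffDelayMs` and `totalWaitMs` are hypothetical helpers for reasoning about poll timing, not the SDK's implementation.

```typescript
// Assumed schedule: 500ms, doubling per poll, capped at 5000ms.
// `attempt` is 0-based: attempt 0 is the wait before the first poll.
function backoffDelayMs(attempt: number, minMs = 500, maxMs = 5000): number {
  return Math.min(minMs * Math.pow(2, attempt), maxMs);
}

// Total time spent sleeping across the first `polls` polls under that schedule.
function totalWaitMs(polls: number): number {
  let total = 0;
  for (let i = 0; i < polls; i++) {
    total += backoffDelayMs(i);
  }
  return total;
}
```

Under these assumptions, the waits grow as 500, 1000, 2000, 4000, 5000, 5000, ... milliseconds, so a typical 3–10 second extraction would settle within the first four or five polls.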

Sync ingest (wait: true)

Useful for demos, one-shot scripts, or any code where you want the result inline:
const job = await client.memories.ingest(
  {
    messages: [{ role: 'user', content: 'I am vegetarian.' }],
    user_id: 'alice',
    conv_id: 'conv_2026_05_16',
  },
  { wait: true },
);

if (job.status === 'succeeded') {
  console.log('Inline result:', job.result?.memories_created);
} else if (job.status === 'failed') {
  console.error('Extraction failed:', job.error);
} else {
  // Sync budget elapsed (30s) — fell back to async; poll job.id as above.
  console.log('Polling required:', job.id);
}
The server holds the connection for up to 30 seconds. If extraction finishes in that window the response is terminal (succeeded or failed). If the budget elapses, you get a pending/running job back and have to poll — same as async mode.
Use sync mode for interactive demos and CLI tools; use async mode for production agent loops where you want to dispatch ingest and continue working.
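The three outcomes of a sync call can be folded into one helper that always resolves to a terminal job. This is a sketch under stated assumptions: `MemoriesLike` is a hypothetical structural type that mirrors only the two calls used in this guide (`ingest` with `{ wait: true }` and `jobs.pollUntilDone`), and `ingestAndSettle` is an illustrative name.

```typescript
// Hypothetical minimal shapes; the SDK's real types are richer.
interface Job {
  id: string;
  status: 'pending' | 'running' | 'succeeded' | 'failed';
}

interface MemoriesLike {
  ingest(body: object, opts?: { wait?: boolean }): Promise<Job>;
  jobs: { pollUntilDone(id: string): Promise<Job> };
}

// Try sync mode first; if the 30s budget elapsed and the job is still
// pending or running, fall back to polling the same job id.
async function ingestAndSettle(memories: MemoriesLike, body: object): Promise<Job> {
  const job = await memories.ingest(body, { wait: true });
  if (job.status === 'succeeded' || job.status === 'failed') {
    return job; // terminal inside the sync window
  }
  return memories.jobs.pollUntilDone(job.id);
}
```

With the client from the examples above, the call would look like `await ingestAndSettle(client.memories, body)`, assuming `client.memories` satisfies this shape.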

What gets extracted

You pass messages; you don’t pre-decide what’s a fact vs an artifact vs an episode. The server’s extraction pipeline decides:
| Type | Triggered when |
| --- | --- |
| Fact | The default. A semantic claim in a turn (“User likes X”, “User works at Y”). |
| Artifact | The conversation references a structured object (a doc, code snippet, or summary) that’s worth storing standalone. Extracted by default; pass extract_artifacts: false to skip this stage. |
| Episode | A stretch of turns gets summarized into a session-level memory. Server-driven; no client knob. |
The result.memories_created array tells you what landed; each entry is a thin reference ({id, type, text}). For the full row, call client.memories.get(id).
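To see what a single ingest produced, the thin references can be tallied by type. The sketch below assumes the `{id, type, text}` entry shape described above; the lowercase type strings in the sample data are a guess, so check the API reference for the actual values. `MemoryRef` and `countByType` are hypothetical names.

```typescript
// Assumed entry shape, based on the {id, type, text} reference described above.
interface MemoryRef {
  id: string;
  type: string; // e.g. 'fact', 'artifact', 'episode' (exact strings are an assumption)
  text: string;
}

// Tally created memories per type, for logging or metrics.
function countByType(refs: MemoryRef[]): { [type: string]: number } {
  const counts: { [type: string]: number } = {};
  for (const ref of refs) {
    counts[ref.type] = (counts[ref.type] || 0) + 1;
  }
  return counts;
}

// Illustrative data only, not real API output.
const sampleRefs: MemoryRef[] = [
  { id: 'mem_1', type: 'fact', text: 'User is vegetarian.' },
  { id: 'mem_2', type: 'fact', text: 'User likes pad see ew.' },
  { id: 'mem_3', type: 'episode', text: 'User discussed food preferences.' },
];
const sampleCounts = countByType(sampleRefs);
```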

Tagging memories to groups

Pass group_ids to associate this ingest with one or more groups — shared tagging targets you register up front (see Groups). At extraction time a classifier reads each group’s prompt and tags the extracted memories that belong to it. Other members of the group can then surface those memories with a group search.
await client.memories.ingest({
  messages: [
    { role: 'user', content: "When I'm in Tokyo I always stay near Shibuya station." },
    { role: 'assistant', content: 'Noted.' },
  ],
  user_id: 'alice',
  conv_id: 'conv_2026_05_16',
  group_ids: ['grp_tokyo2026'],
});
  • The classifier tags each extracted memory with the subset of group_ids it belongs to — a memory can land in several groups, one, or none. Untagged extraction still happens as usual; tagging is additive.
  • Unknown or archived ids are soft-skipped — they never fail the ingest, and come back in result.ignored_group_ids so you can prune stale ids client-side.
  • Up to 20 group ids per ingest; more returns 422.
Groups are how you share memory across users. A fact Alice ingests with group_ids: ['grp_tokyo2026'] becomes visible to every member of that group via group search — without exposing her untagged personal memories.
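Because unknown or archived ids come back in result.ignored_group_ids, a client can drop them before the next ingest. A minimal sketch, assuming that field is a plain array of id strings; `pruneGroupIds` is a hypothetical helper and the ids in the usage lines are made up.

```typescript
// Remove ids the server reported as ignored (unknown or archived).
function pruneGroupIds(requested: string[], ignored: string[]): string[] {
  return requested.filter((id) => ignored.indexOf(id) === -1);
}

// Illustrative usage with made-up ids.
const requestedIds = ['grp_tokyo2026', 'grp_archived_trip'];
const ignoredIds = ['grp_archived_trip']; // as if read from result.ignored_group_ids
const nextGroupIds = pruneGroupIds(requestedIds, ignoredIds);
```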

Failure modes

Extraction can fail for several reasons: an upstream LLM error, content that yields no extractable facts, or a rate limit. The job lands in status: "failed" with an error.code and error.message. To retry, submit the same body again; the server does not retry automatically. Common failure codes:
| Code | Meaning |
| --- | --- |
| ingest_failed | Generic extraction error; check error.message |
| rate_limit_exceeded | Org quota hit; wait and retry |
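Since retries are left to the client, a bounded retry loop is one way to handle failed jobs. The sketch below is not SDK code: `ingestWithRetry`, `SettledJob`, the attempt cap, and the one-second rate-limit wait are all assumptions, because this guide does not specify how long to wait. `submit` stands for any function that sends the same body and resolves to a terminal job.

```typescript
// Hypothetical minimal shapes for a terminal job.
interface JobError {
  code: string;
  message: string;
}

interface SettledJob {
  status: 'succeeded' | 'failed';
  error?: JobError;
}

// Resubmit the same body up to maxAttempts times in total.
// On rate_limit_exceeded, pause before the next attempt.
async function ingestWithRetry(
  submit: () => Promise<SettledJob>,
  maxAttempts = 3,
  rateLimitWaitMs = 1000, // assumed wait; tune to your org quota
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<SettledJob> {
  let job = await submit();
  for (let attempt = 1; attempt < maxAttempts && job.status === 'failed'; attempt++) {
    if (job.error && job.error.code === 'rate_limit_exceeded') {
      await sleep(rateLimitWaitMs);
    }
    job = await submit();
  }
  return job; // may still be failed once attempts are exhausted
}
```

Keeping the attempt cap small matters here: content that yields no extractable facts is likely to fail the same way on every retry.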

See also

  • Searching memories — query what you just ingested
  • API Reference → Memories → Ingest — full request/response schemas