POST /v1/memories/search
TypeScript SDK
import { MemoryClient } from '@xtraceai/memory';

const client = new MemoryClient({
  apiKey: process.env.XTRACE_API_KEY!,
  orgId:  process.env.XTRACE_ORG_ID!,
});

const results = await client.memories.search({
  query: 'what does the user like to eat?',
  filters: { user_id: 'alice' },
  limit: 10,
});

for (const m of results.data) {
  console.log(m.score?.toFixed(2), '·', m.text);
}

Example response
{
  "object": "search",
  "data": [
    {
      "id": "<string>",
      "text": "<string>",
      "object": "memory",
      "user_id": "<string>",
      "agent_id": "<string>",
      "conv_id": "<string>",
      "app_id": "<string>",
      "group_ids": [
        "<string>"
      ],
      "categories": [
        "<string>"
      ],
      "score": 123,
      "created_at": "2023-11-07T05:31:56Z",
      "updated_at": "2023-11-07T05:31:56Z",
      "details": {
        "fact_type": "<string>",
        "status": "<string>",
        "supersedes": "<string>",
        "source_role": "<string>",
        "episode_id": "<string>",
        "artifact_id": "<string>",
        "artifact_ids": [
          "<string>"
        ],
        "source_event_ids": [
          "<string>"
        ]
      }
    }
  ],
  "context": "<string>",
  "stage_timings": {},
  "context_selection_applied": false
}

Authorizations

x-api-key
string
header
required

Long-lived org API key. Alternative: Authorization: Bearer <key>.

X-Org-Id
string
header
required

Required alongside the API key (no key→org reverse index).
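
For callers that reach the endpoint without the SDK, the two required headers could be assembled as in the sketch below. `buildAuthHeaders` and `buildBearerHeaders` are hypothetical helper names, not part of any SDK, and the bearer variant assumes X-Org-Id is still sent alongside the Authorization header.

```typescript
// Hypothetical helper: builds the headers this page lists as required.
// Header names (x-api-key, X-Org-Id) come from the Authorizations section.
function buildAuthHeaders(apiKey: string, orgId: string): Record<string, string> {
  if (apiKey.length === 0 || orgId.length === 0) {
    throw new Error('Both the API key and the org id are required');
  }
  return {
    'x-api-key': apiKey,
    'X-Org-Id': orgId,
    'Content-Type': 'application/json',
  };
}

// Alternative form: send the key as "Authorization: Bearer <key>" instead.
// Assumption: X-Org-Id is still required in this form.
function buildBearerHeaders(apiKey: string, orgId: string): Record<string, string> {
  const headers = buildAuthHeaders(apiKey, orgId);
  delete headers['x-api-key'];
  headers['Authorization'] = `Bearer ${apiKey}`;
  return headers;
}
```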

Body

application/json

POST /v1/memories/search — agentic memory search.

The pipeline has two phases:

  1. Retrieve: embed the query, fetch per-corpus vector candidates, optionally rerank. Output: a ranked set of Memory rows.
  2. Compose (only when mode='compose'): an LLM context-selection step picks the most relevant subset of the candidates and weaves them into a markdown block ready to drop into an LLM prompt.

mode toggles whether phase 2 runs:

  • "compose" (default) → data populated with rows and context populated with the assembled markdown. The dominant use case (memory search → LLM prompt) gets the prompt-ready blob without an opt-in.
  • "retrieve" → data populated, context null. Skips the LLM compose step (cheaper, faster); for callers building their own UI or doing custom downstream processing.

Scope is enforced server-side by the per-request store (QdrantVectorDB._scope_must) and follows "scope by what you pass": every scope axis you supply ANDs; an axis you omit is left unconstrained. org_id always comes from auth. At least one of user_id / agent_id / app_id / group_ids must be supplied — an unscoped org-wide search is rejected. user_id is optional here (unlike ingest, where it is required): omit it and pass group_ids to read a shared group across users. No filter DSL — the four scope axes are the only narrowing.
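
The "at least one scope axis" rule can be checked client-side before sending a request, as in this minimal sketch. The `SearchScope` type and `hasScope` helper are illustrative names only. Treating an empty string or an empty array as "not supplied" is an assumption; the server's exact handling of those values is not stated here.

```typescript
// The four scope axes named above. Field names come from the request body.
interface SearchScope {
  user_id?: string | null;
  agent_id?: string | null;
  app_id?: string | null;
  group_ids?: string[];
}

// True when at least one scope axis is supplied.
// Assumption: empty strings and empty arrays count as "not supplied".
function hasScope(scope: SearchScope): boolean {
  const hasId = (v?: string | null): boolean => typeof v === 'string' && v.length > 0;
  return (
    hasId(scope.user_id) ||
    hasId(scope.agent_id) ||
    hasId(scope.app_id) ||
    (scope.group_ids?.length ?? 0) > 0
  );
}
```

A caller might run `hasScope` before issuing the request and fail fast, rather than wait for the server to reject an unscoped org-wide search.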

Legacy compatibility (undocumented in the public spec, kept so existing SDK installations don't break):

  • filters: {user_id: "X", ...} — accepted; user_id is lifted out of filters if absent at the body root. Other filter keys are silently dropped (the new shape doesn't support a general filter DSL).
  • mode: "rows" → translated to "retrieve".
  • mode: "context" → translated to "compose".
  • include: ["context_prompt"] → sets mode="compose" and drops the value. "full_content" is dropped silently.

These translations are intentionally invisible on the wire — the public spec only advertises the new shape — and will be removed once consumers have migrated.
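
The translations listed above can be modeled client-side as follows. This is an illustration of the described behavior, not the server's shim; `normalizeLegacyBody` is a hypothetical name, and removing `include` entirely when no values remain is an assumption.

```typescript
type Body = Record<string, unknown>;

// Client-side model of the legacy translations described above.
function normalizeLegacyBody(input: Body): Body {
  const { filters, ...body } = input;

  // filters.user_id is lifted only when user_id is absent at the body root.
  if (filters !== null && typeof filters === 'object') {
    const legacyUser = (filters as Body)['user_id'];
    if (body['user_id'] == null && typeof legacyUser === 'string') {
      body['user_id'] = legacyUser;
    }
    // Every other filter key is dropped.
  }

  // Old mode names map onto the new ones.
  if (body['mode'] === 'rows') body['mode'] = 'retrieve';
  if (body['mode'] === 'context') body['mode'] = 'compose';

  // "context_prompt" selects compose; both legacy include values are dropped.
  if (Array.isArray(body['include'])) {
    const include = body['include'] as unknown[];
    if (include.includes('context_prompt')) body['mode'] = 'compose';
    const kept = include.filter((v) => v !== 'context_prompt' && v !== 'full_content');
    // Assumption: an emptied include is removed so the default applies.
    if (kept.length > 0) body['include'] = kept;
    else delete body['include'];
  }
  return body;
}
```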

query
string
required

Natural-language query text. Embedded server-side; xmem's pipeline ranks and selects.

Required string length: 1 - 4000
Example:

"who likes thai food?"

user_id
string | null

Scope key. When supplied, baked into the per-request store as an AND pin so reads are tenant-isolated at the Qdrant filter layer (QdrantVectorDB._scope_must). Optional on search (unlike ingest): omit it and pass group_ids to read a shared group across users. At least one of user_id / agent_id / app_id / group_ids is required. The compat shim also accepts it inside legacy filters.user_id and lifts it before field validation runs, so old SDKs that send the pre-#68 wire shape don't 422.

Example:

"user-123"

mode
enum<string>
default:compose

Pipeline depth selector.

  • compose (default) — vector retrieval plus an LLM context-selection step that picks the most relevant subset, then assembles it into a markdown block. data carries the selected rows; context carries the assembled markdown. One LLM call.
  • retrieve — vector retrieval only. No LLM, no agent, cheaper and faster. data carries the raw ranked candidates (the unfiltered set); context is null.

data is populated in both modes; under compose it's the LLM-selected subset, under retrieve it's the raw candidate set.

Available options: retrieve, compose

include
enum<string>[]

Which corpora to search. Subset to restrict — e.g. ["fact"] for facts-only. Default is all three.

Available options: fact, artifact, episode

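
As a sketch of how query, mode, and include combine in one request body, the hypothetical builder below produces a facts-only, retrieval-only search. The type and function names are illustrative. The length check uses `String.length`, which counts UTF-16 code units and may differ from how the server measures the 1 to 4000 bound.

```typescript
type Mode = 'retrieve' | 'compose';
type Corpus = 'fact' | 'artifact' | 'episode';

interface SearchBody {
  query: string;
  user_id?: string;
  mode?: Mode;
  include?: Corpus[];
}

// Hypothetical builder for a facts-only search that skips the compose step.
function buildFactsOnlySearch(query: string, userId: string): SearchBody {
  // Documented bound is 1 to 4000; the unit the server counts in is assumed.
  if (query.length < 1 || query.length > 4000) {
    throw new RangeError('query must be 1 to 4000 characters long');
  }
  return { query, user_id: userId, mode: 'retrieve', include: ['fact'] };
}

const body = buildFactsOnlySearch('who likes thai food?', 'user-123');
console.log(JSON.stringify(body));
```
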
group_ids
string[]

Optional group tags — another AND scope axis. When non-empty, candidate rows must be tagged to at least one of the requested group(s):

org AND kb_type [ AND user_id ] AND ( group_ids ∩ <group_ids> ) [ AND agent_id ] [ AND app_id ]

Group membership (the ∩ term) is OR / any-of across the list: pass [trip_tokyo, trip_paris] to span both trips.

How it composes with user_id ("scope by what you pass"):

  • omit user_id, pass group_ids — the cross-user whole-group read: every user's rows tagged to those group(s). A traveler's AI sees the whole trip's shared facts, from any traveler.
  • pass both user_id and group_ids — the intersection: only the caller's own rows that are also tagged to those group(s) (the caller's slice of the group).

(agent_id / app_id AND on top when set — they narrow further.) Group ids are server-generated unguessable handles (see POST /v1/groups), so knowing the id is the access boundary; other users' untagged memories never surface.
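
The composition rules above can be modeled in memory as a predicate over rows. This is a sketch of the described semantics only, not the server's Qdrant filter; `TaggedRow`, `ScopeQuery`, and `matchesScope` are hypothetical names, and the org and kb_type terms are left out because they come from auth and from include.

```typescript
interface TaggedRow {
  user_id: string;
  agent_id?: string;
  app_id?: string;
  group_ids: string[];
}

interface ScopeQuery {
  user_id?: string;
  agent_id?: string;
  app_id?: string;
  group_ids?: string[];
}

// Every supplied axis must match (AND); an omitted axis is unconstrained.
// group_ids is any-of: the row needs at least one of the requested groups.
function matchesScope(row: TaggedRow, scope: ScopeQuery): boolean {
  if (scope.user_id !== undefined && row.user_id !== scope.user_id) return false;
  if (scope.agent_id !== undefined && row.agent_id !== scope.agent_id) return false;
  if (scope.app_id !== undefined && row.app_id !== scope.app_id) return false;
  const wanted = scope.group_ids ?? [];
  if (wanted.length > 0 && !wanted.some((g) => row.group_ids.includes(g))) return false;
  return true;
}
```

With rows from two users tagged to the same group, omitting `user_id` matches both users' rows, while passing `user_id` and `group_ids` together matches only that user's rows in the group.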

Example:

["grp_a1b2c3d4e5f6071829304a5b6c7d8e9f"]

agent_id
string | null

Optional agent scope. When set, ANDs onto the active primary scope (whether that's user_id or group_ids): candidate rows must also carry this exact agent_id. Use it to narrow a search to one agent's contributions. Indexed payload key, same axis ingest stamps.

Example:

"bot-7"

app_id
string | null

Optional app scope. When set, ANDs onto the active primary scope (like agent_id): candidate rows must also carry this exact app_id. Indexed payload key, same axis ingest stamps.

Example:

"app-3"

Response

Successful Response

POST /v1/memories/search response.

data is always populated — both modes return the ranked Memory rows from retrieval. context is populated only when mode='compose' with the assembled markdown block from the LLM context-selection step. stage_timings is always present for per-stage latency attribution.

Note that under mode='compose' the rows in data are the retrieval candidates — a possibly larger set than the subset the LLM selected and wove into context. Useful for showing "everything we found" alongside "what we sent to the model."

mode
enum<string>
required

Echoes the request's mode. compose ⇒ both data and context populated; retrieve ⇒ data populated and context null.

Available options: retrieve, compose

object
string
default:search

Constant discriminator for the resource type.

Allowed value: "search"

data
Memory · object[]

Ranked Memory rows from retrieval. Populated for both modes. Sort order is score desc. For mode=compose, these are the candidates considered by the context-selection step — a (possibly larger) superset of what ended up in context.

context
string | null

Assembled markdown context block ready for prompt insertion. Populated only when mode=compose; null otherwise.

stage_timings
Stage Timings · object

Per-stage retrieval-pipeline latencies (seconds). Populated in both modes.
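
For latency attribution, a caller could pick out the slowest stage as in the sketch below. It assumes stage_timings is a flat map of stage name to seconds; the stage names in the usage line are made up for illustration, and `slowestStage` is a hypothetical helper.

```typescript
// Assumption: stage_timings is a flat object of stage name -> seconds.
function slowestStage(
  timings: Record<string, number>,
): { stage: string; ms: number } | null {
  let worst: { stage: string; ms: number } | null = null;
  for (const [stage, seconds] of Object.entries(timings)) {
    const ms = seconds * 1000; // seconds -> milliseconds
    if (worst === null || ms > worst.ms) worst = { stage, ms };
  }
  return worst; // null when no stages were reported
}

// Hypothetical stage names, for illustration only.
console.log(slowestStage({ embed: 0.02, compose: 1.5 }));
```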

context_selection_applied
boolean
default:false

True when xmem's LLM context-selection step ran. False indicates the pipeline fell through (e.g. empty candidate pool, or XMEM_ENABLE_CONTEXT_SELECTION=false).
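
Because context is null in retrieve mode, callers that need prompt text may want a fallback. The sketch below prefers the server-assembled context and otherwise builds a plain bulleted list from the top rows. The interfaces list only the fields used here, `toPromptContext` is a hypothetical helper, and the fallback format is an illustrative choice rather than anything the API produces.

```typescript
interface MemoryRow {
  id: string;
  text: string;
  score?: number;
}

interface SearchResponse {
  mode: 'retrieve' | 'compose';
  data: MemoryRow[];
  context: string | null;
  context_selection_applied: boolean;
}

// Returns prompt-ready text: the server's context when present, otherwise
// a bulleted list of the top rows (data is already sorted by score desc).
function toPromptContext(res: SearchResponse, maxRows = 5): string {
  if (res.context !== null && res.context.length > 0) return res.context;
  return res.data
    .slice(0, maxRows)
    .map((m) => `- ${m.text}`)
    .join('\n');
}
```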