xtrace_sdk.x_vec

XTrace Vec for Python.

This package provides XTrace Vec for Python, an SDK for augmenting Python AI applications with a privacy-preserving vector database. It includes tools and libraries for integrating with the XTrace platform so that your applications can store, retrieve, and manage memory securely and efficiently.

Classes

AWSKMSKeyProvider

Envelope encryption via AWS KMS.

KeyProvider

Protocol for objects that supply a 256-bit AES key and can wrap/unwrap it for storage.

PassphraseKeyProvider

Derives a 256-bit AES key from a passphrase using scrypt.

DataLoader

Encrypts and uploads document collections to XTrace.

Embedding

This class provides an interface to generate embeddings using different providers.

InferenceClient

A client wrapper for interacting with inference models via the OpenAI API.

Retriever

Retrieves and decrypts chunks from XTrace using encrypted Hamming distance search.

ExecutionContext

Bundles a homomorphic encryption client and an AES key under a single key-provider-protected object.

Package Contents

class xtrace_sdk.x_vec.AWSKMSKeyProvider(kms_client, key_id)

Envelope encryption via AWS KMS.

On creation, generates a random 256-bit DEK and wraps it with KMS. On load, unwraps a previously wrapped DEK using KMS.

Requires boto3 at runtime. The KMS key must grant the caller kms:Encrypt and kms:Decrypt permissions.

Parameters:
  • kms_client (object) – A boto3 KMS client (boto3.client("kms")).

  • key_id (str) – KMS key ID, ARN, or alias (e.g. "alias/xtrace").

_kms
_key_id
_key: bytes | None = None
_edek: bytes | None = None
classmethod create(kms_client, key_id)

Generate a fresh DEK via KMS GenerateDataKey.

Parameters:
  • kms_client (object) – A boto3 KMS client.

  • key_id (str) – KMS key ID, ARN, or alias.

Return type:

AWSKMSKeyProvider

classmethod from_wrapped(wrapped, **kwargs)

Unwrap a previously stored EDEK using KMS Decrypt.

Parameters:
  • wrapped (bytes) – The EDEK bytes returned by wrap_key().

  • kms_client – A boto3 KMS client (passed via kwargs).

  • key_id – KMS key ID, ARN, or alias (passed via kwargs).

  • kwargs (Any)

Return type:

AWSKMSKeyProvider

get_key()
Return type:

bytes

wrap_key()
Return type:

bytes

provider_id()
Return type:

str

class xtrace_sdk.x_vec.KeyProvider

Bases: Protocol

Protocol for objects that supply a 256-bit AES key and can wrap/unwrap it for storage.

get_key()

Return the raw 256-bit AES key.

Return type:

bytes

wrap_key()

Return an opaque blob that can be stored alongside ciphertext to recover the key later.

Return type:

bytes

classmethod from_wrapped(wrapped, **kwargs)

Reconstruct a provider from a blob previously returned by wrap_key().

Parameters:
  • wrapped (bytes)

  • kwargs (Any)

Return type:

KeyProvider

provider_id()

Return a short identifier for the provider type (used in serialization).

Return type:

str
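As a rough illustration of what satisfying this protocol involves, the sketch below defines a local mirror of the documented method set and a test-only provider that conforms to it. KeyProviderLike, InMemoryKeyProvider, and the "in-memory" identifier are hypothetical names invented for this example; the provider returns the raw key as its wrapped blob, which is not secure and is only meant for unit tests.

```python
import secrets
from typing import Any, Optional, Protocol, runtime_checkable


@runtime_checkable
class KeyProviderLike(Protocol):
    """Local mirror of the documented method set (not the SDK class)."""

    def get_key(self) -> bytes: ...

    def wrap_key(self) -> bytes: ...

    @classmethod
    def from_wrapped(cls, wrapped: bytes, **kwargs: Any) -> "KeyProviderLike": ...

    def provider_id(self) -> str: ...


class InMemoryKeyProvider:
    """Test-only provider: the 'wrapped' blob is the raw key (NOT secure)."""

    def __init__(self, key: Optional[bytes] = None) -> None:
        self._key = key if key is not None else secrets.token_bytes(32)

    def get_key(self) -> bytes:
        return self._key

    def wrap_key(self) -> bytes:
        return self._key  # a real provider would encrypt or derive here

    @classmethod
    def from_wrapped(cls, wrapped: bytes, **kwargs: Any) -> "InMemoryKeyProvider":
        return cls(wrapped)

    def provider_id(self) -> str:
        return "in-memory"  # hypothetical identifier


provider = InMemoryKeyProvider()
restored = InMemoryKeyProvider.from_wrapped(provider.wrap_key())
```

Because the protocol is structural, the class does not need to inherit from anything; it only needs to expose the four methods.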

class xtrace_sdk.x_vec.PassphraseKeyProvider(passphrase, salt=None)

Derives a 256-bit AES key from a passphrase using scrypt.

This is the default key provider and preserves backwards-compatible behavior. The passphrase is never stored — only the derived key is kept in memory.

_salt = b'xtrace-aes-gcm-v1'
_key
get_key()
Return type:

bytes

wrap_key()
Return type:

bytes

classmethod from_wrapped(wrapped, **kwargs)

Re-derive the key from the passphrase and stored salt.

Parameters:
  • wrapped (bytes) – The salt bytes returned by wrap_key().

  • passphrase – The original passphrase (passed via kwargs).

  • kwargs (Any)

Return type:

PassphraseKeyProvider

provider_id()
Return type:

str
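The derivation can be sketched with the standard library's hashlib.scrypt. The default salt is the one listed for this class; the cost parameters n, r, and p are assumptions chosen for illustration, since the values the SDK uses are not documented here, and derive_key is a hypothetical helper name.

```python
import hashlib

DEFAULT_SALT = b"xtrace-aes-gcm-v1"  # default salt listed for this class


def derive_key(passphrase: str, salt: bytes = DEFAULT_SALT) -> bytes:
    """Derive a 32-byte (256-bit) key with scrypt; cost parameters are assumed."""
    return hashlib.scrypt(
        passphrase.encode("utf-8"),
        salt=salt,
        n=2**14,  # CPU/memory cost (assumption)
        r=8,      # block size (assumption)
        p=1,      # parallelism (assumption)
        dklen=32,
    )


key = derive_key("correct horse battery staple")
```

Since wrap_key() returns only the salt, re-deriving with the same passphrase and that salt is what allows from_wrapped() to recover the key without ever storing the passphrase.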

class xtrace_sdk.x_vec.DataLoader(execution_context, integration)

Encrypts and uploads document collections to XTrace.

DataLoader handles the two encryption steps required before data reaches XTrace:

  1. AES encryption of chunk content (text → ciphertext bytes).

  2. Homomorphic encryption of embedding vectors (float vector → encrypted index).

The encrypted data is then uploaded via XTraceIntegration. Neither the plaintext chunk content nor the raw embedding vectors leave the client.

execution_context
integration
async dump_db(db, index, kb_id, concurrent=False)

Upload a pre-encrypted database to XTrace.

Typically called after load_data_from_memory() or load_data_from_memory_batch() has produced an (index, encrypted_db) pair.

Parameters:
  • db (EncryptedDB) – Encrypted document collection produced by load_data_from_memory().

  • index (EncryptedIndex) – Encrypted embedding vectors produced by load_data_from_memory().

  • kb_id (str) – Destination knowledge-base ID.

  • concurrent (bool, optional) – Upload batches concurrently. Defaults to False.

Returns:

List of server responses, one per upload batch.

Return type:

list[dict]

async upsert_one(chunk, vector, kb_id)

Encrypt and upload a single chunk.

Parameters:
  • chunk (Chunk) – Chunk dict with at minimum a chunk_content string field.

  • vector (list[float]) – Float embedding vector for this chunk.

  • kb_id (str) – Destination knowledge-base ID.

Returns:

Server response list.

Return type:

list[dict]

Raises:

ValueError – If chunk is not a dict.

async delete_chunks(chunk_ids, kb_id)

Delete chunks by ID.

Parameters:
  • chunk_ids (list[int]) – List of chunk IDs to delete.

  • kb_id (str) – Knowledge-base ID the chunks belong to.

Returns:

Server response.

Return type:

dict

async update_chunks(chunk_updates, vectors, kb_id)

Re-encrypt updated chunks and upload them.

Each chunk in chunk_updates must include a chunk_id field identifying the record to replace.

Parameters:
  • chunk_updates (list[Chunk]) – Updated chunk dicts, each containing a chunk_id.

  • vectors (list[list[float]]) – New float embedding vectors, one per chunk.

  • kb_id (str) – Knowledge-base ID the chunks belong to.

Returns:

Server response list.

Return type:

list[dict]

async load_data_from_memory(chunks, vectors, disable_progress=False)

Encrypt a document collection one chunk at a time.

AES-encrypts each chunk’s chunk_content and homomorphically encrypts each float embedding vector into an encrypted index. Results are ready to pass to dump_db().

Parameters:
  • chunks (DocumentCollection) – Document collection — each item must have a chunk_content string field.

  • vectors (list[list[float]]) – Float embedding vectors, one per chunk. Each entry may also be a coroutine (e.g. an unawaited bin_embed() call) — it will be awaited automatically.

  • disable_progress (bool, optional) – If True, suppress the tqdm progress bar. Defaults to False.

Returns:

Tuple of (index, encrypted_db) where index contains the encrypted vectors and encrypted_db contains chunks with AES-encrypted content.

Return type:

tuple[EncryptedIndex, EncryptedDB]

async load_data_from_memory_batch(chunks, vectors, disable_progress=False)

Encrypt a document collection using batch homomorphic encryption.

Faster than load_data_from_memory() for large collections because all embedding vectors are passed to the homomorphic client in a single batch call instead of one at a time. AES encryption is still applied per chunk.

Parameters:
  • chunks (DocumentCollection) – Document collection — each item must have a chunk_content string field.

  • vectors (list[list[float]]) – Float embedding vectors, one per chunk. Each entry may also be a coroutine (e.g. an unawaited bin_embed() call) — it will be awaited automatically.

  • disable_progress (bool, optional) – Unused; kept for API compatibility with load_data_from_memory().

Returns:

Tuple of (index, encrypted_db) where index contains the encrypted vectors and encrypted_db contains chunks with AES-encrypted content.

Return type:

tuple[EncryptedIndex, EncryptedDB]
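Taken together, the loading and upload methods suggest a two-step ingest flow: encrypt locally, then upload. The sketch below shows that flow using only the documented method names and argument order. The ingest helper and StubLoader are hypothetical; the stub stands in for a real DataLoader so the example runs without the SDK or network access, and its return values are made up.

```python
import asyncio


async def ingest(loader, chunks, vectors, kb_id):
    """Encrypt locally in one batch, then upload the encrypted pair."""
    index, encrypted_db = await loader.load_data_from_memory_batch(chunks, vectors)
    # dump_db takes the encrypted database first, then the encrypted index.
    return await loader.dump_db(encrypted_db, index, kb_id)


class StubLoader:
    """Hypothetical stand-in for DataLoader so the sketch runs offline."""

    async def load_data_from_memory_batch(self, chunks, vectors, disable_progress=False):
        index = [f"enc-vec-{i}" for i in range(len(vectors))]
        encrypted_db = [f"enc-chunk-{i}" for i in range(len(chunks))]
        return index, encrypted_db

    async def dump_db(self, db, index, kb_id, concurrent=False):
        return [{"kb_id": kb_id, "uploaded": len(db)}]


chunks = [{"chunk_content": "hello"}, {"chunk_content": "world"}]
vectors = [[0.1, -0.2], [0.3, 0.4]]
responses = asyncio.run(ingest(StubLoader(), chunks, vectors, "kb-demo"))
```

Note the argument order: the loaders return (index, encrypted_db), while dump_db() expects db before index.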

class xtrace_sdk.x_vec.Embedding(provider, model_name, dim)

This class provides an interface to generate embeddings using different providers. Supported providers include “ollama”, “openai”, and “sentence_transformer”. It also includes methods to convert float embeddings to binary format.

dim
provider
url
model_name
__hash__()
Return type:

int

__eq__(other)
Parameters:

other (Any)

Return type:

bool

async embed(text)

Generates an embedding for the given text using the specified provider.

Parameters:

text (str) – The input text to be embedded.

Returns:

A numpy array representing the embedding of the input text.

Return type:

np.ndarray

Raises:

ValueError – If the embedding dimension does not match the expected dimension.

static float_2_bin(float_array)

Convert a list of floats to a list of binary integers. This is a naive implementation that preserves the input dimension.

Parameters:

float_array (np.ndarray or list[float]) – A numpy array or list of floats to be converted.

Returns:

A numpy array of binary integers (0s and 1s).

Return type:

np.ndarray
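One simple way to binarize an embedding while preserving its dimension is to threshold each component at zero. The sketch below assumes that rule; the actual threshold used by float_2_bin() is not stated here, and float_to_bin is a hypothetical pure-Python stand-in that returns a list rather than a numpy array.

```python
def float_to_bin(values: list[float]) -> list[int]:
    """Map each float to 1 if it is positive, else 0 (assumed threshold)."""
    return [1 if v > 0 else 0 for v in values]


bits = float_to_bin([0.12, -0.5, 0.0, 3.4])
```

The output has one bit per input component, which is what makes the result usable for Hamming distance comparisons.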

async bin_embed(text)

Generates a binary embedding for the given text.

Parameters:

text (str) – The input text to be embedded.

Returns:

A numpy array representing the binary embedding of the input text.

Return type:

np.ndarray

class xtrace_sdk.x_vec.InferenceClient(inference_provider, model_name, api_key=None, base_url=None, prompt_template=None)

A client wrapper for interacting with inference models via the OpenAI API.

Parameters:
  • inference_provider (str)

  • model_name (str)

  • api_key (str | None)

  • base_url (str | None)

  • prompt_template (Callable[[str, str], str] | None)

client
model_name
prompt_template
query(query, context=None, stream=False)

Query the inference model.

Parameters:
  • query (str) – The query string to send to the model.

  • context (str, optional) – Optional context string to provide additional information to the model.

  • stream (bool, optional) – If True, stream the response incrementally.

Returns:

The model’s response text.

Return type:

str
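The prompt_template parameter is typed as a callable taking two strings and returning one. A minimal sketch of such a callable follows. The argument order (query first, context second) is an assumption, since the reference above only gives the type, and rag_prompt is a hypothetical name.

```python
def rag_prompt(query: str, context: str) -> str:
    """Build a prompt from a question and retrieved context (order assumed)."""
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


prompt = rag_prompt("What is XTrace?", "XTrace is a privacy-preserving vector database.")
```

A function like this would be passed as prompt_template when constructing the client; check the argument order against the SDK source before relying on it.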

class xtrace_sdk.x_vec.Retriever(execution_context, integration, parallel=False)

Retrieves and decrypts chunks from XTrace using encrypted Hamming distance search.

execution_context
integration
parallel = False
async nn_search_for_ids(query_vector, k=3, kb_id='', meta_filter=None, range_filter=None, include_scores=False)

Find the k nearest neighbors by encrypted Hamming distance.

Parameters:
  • query_vector (list[float]) – Float embedding vector to search with.

  • k (int) – Number of nearest neighbors to return.

  • kb_id (str) – Knowledge-base ID to search.

  • meta_filter (dict | None) – Optional metadata filter dict (MongoDB-style operators).

  • range_filter (list[int] | None) – Optional [min, max] range to restrict which chunks are searched.

  • include_scores (bool) – If True, also return the plain Hamming distances.

Returns:

List of chunk IDs, or (chunk_ids, scores) if include_scores=True.

Return type:

list[int] | tuple[list[int], list[int]]
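To make the ranking criterion concrete, here is a plaintext sketch of k-nearest-neighbor search by Hamming distance over binary vectors. It shows only the quantity being ranked; the real search runs over encrypted data, and the hamming and nearest helpers and the sample vectors are hypothetical.

```python
def hamming(a: list[int], b: list[int]) -> int:
    """Count the positions where two equal-length bit vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def nearest(query: list[int], stored: dict[int, list[int]], k: int = 3):
    """Return (chunk_ids, scores) for the k smallest Hamming distances."""
    ranked = sorted((hamming(query, vec), cid) for cid, vec in stored.items())[:k]
    return [cid for _, cid in ranked], [dist for dist, _ in ranked]


stored = {10: [1, 0, 1, 1], 11: [0, 0, 0, 0], 12: [1, 1, 1, 1]}
ids, scores = nearest([1, 0, 1, 0], stored, k=2)
```

The pair returned here mirrors the (chunk_ids, scores) shape documented for include_scores=True; ties are broken by chunk ID in this sketch, which may differ from the server's behavior.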

async retrieve_and_decrypt(chunk_ids, kb_id, projection=None)

Fetch chunks by ID and AES-decrypt their content.

Parameters:
  • chunk_ids (list[int]) – List of chunk IDs to retrieve.

  • kb_id (str) – Knowledge-base ID the chunks belong to.

  • projection (list[str] | None) – Fields to return; defaults to all standard fields.

Returns:

List of dicts with decrypted chunk_content and meta_data.

Return type:

list[dict]
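The two retrieval methods are naturally chained: search for IDs, then fetch and decrypt those chunks. The sketch below shows that chain using the documented method names. The search helper and StubRetriever are hypothetical; the stub replaces a real Retriever so the example runs offline, and its return values are invented.

```python
import asyncio


async def search(retriever, query_vector, kb_id, k=3):
    """Find the k nearest chunk IDs, then fetch and decrypt those chunks."""
    chunk_ids = await retriever.nn_search_for_ids(query_vector, k=k, kb_id=kb_id)
    return await retriever.retrieve_and_decrypt(chunk_ids, kb_id)


class StubRetriever:
    """Hypothetical stand-in for Retriever so the sketch runs offline."""

    async def nn_search_for_ids(self, query_vector, k=3, kb_id=""):
        return [7, 3][:k]

    async def retrieve_and_decrypt(self, chunk_ids, kb_id, projection=None):
        return [{"chunk_content": f"chunk {i}", "meta_data": {}} for i in chunk_ids]


results = asyncio.run(search(StubRetriever(), [0.1, -0.2], "kb-demo", k=2))
```

With include_scores left at its default, nn_search_for_ids() returns a plain list of IDs, so the result can be passed straight to retrieve_and_decrypt().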

static format_context(contexts)

Format a list of context strings for use in an LLM prompt.

Parameters:

contexts (list)

Return type:

str

class xtrace_sdk.x_vec.ExecutionContext(homomorphic_client, key_provider, context_id=None)

Bundles a homomorphic encryption client and an AES key under a single key-provider-protected object.

An ExecutionContext is the root secret for an XTrace deployment. It holds:

  • A homomorphic client (PaillierClient or PaillierLookupClient) whose secret key is used to decrypt Hamming distances returned by the XTrace server.

  • An AES key supplied by a KeyProvider, used to encrypt chunk content before upload.

The secret key is never transmitted in plaintext — it is AES-encrypted with the key provider’s key before any remote storage.
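For readers unfamiliar with why a Paillier secret key lets the client decrypt a distance computed elsewhere, the toy sketch below shows Paillier's additive homomorphism and one textbook way to sum XOR bits over ciphertexts against a plaintext query. It is a generic illustration with tiny hard-coded primes, not the XTrace protocol or the PaillierClient implementation; every name in it is hypothetical.

```python
import math
import secrets

# Toy parameters: real deployments use moduli of 1024 bits or more.
P, Q = 293, 433
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(p - 1, q - 1)
MU = pow(LAM, -1, N)  # valid because the generator is g = n + 1


def encrypt(m: int) -> int:
    """Encrypt m under the public key n, with generator g = n + 1."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (1 + m * N) % N2 * pow(r, N, N2) % N2


def decrypt(c: int) -> int:
    """Decrypt with the secret key (lambda, mu)."""
    return (pow(c, LAM, N2) - 1) // N * MU % N


def encrypted_hamming(enc_bits: list[int], query_bits: list[int]) -> int:
    """Sum XOR(d_i, q_i) over ciphertexts, using only the public key."""
    total = encrypt(0)
    enc_one = encrypt(1)
    for c, q in zip(enc_bits, query_bits):
        # XOR(d, 0) = d and XOR(d, 1) = 1 - d; both are affine in d.
        term = c if q == 0 else enc_one * pow(c, -1, N2) % N2
        total = total * term % N2  # multiplying ciphertexts adds plaintexts
    return total


stored_bits = [1, 0, 1, 1, 0]
query_bits = [1, 1, 0, 1, 0]
enc_distance = encrypted_hamming([encrypt(b) for b in stored_bits], query_bits)
distance = decrypt(enc_distance)
```

The party computing enc_distance never sees the stored bits or the result; only the holder of the secret key can recover the distance.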

homomorphic
key_provider
aes
classmethod create(passphrase=None, homomorphic_client_type='paillier', embedding_length=512, key_len=1024, salt=None, path=None, key_provider=None)

Create a new execution context and optionally save it to disk.

Supply either key_provider or passphrase (with optional salt). If both are given, key_provider takes precedence.

Parameters:
  • passphrase (str | None) – Secret passphrase used to derive the AES encryption key and protect the homomorphic secret key at rest.

  • homomorphic_client_type (str) – "paillier" or "paillier_lookup".

  • embedding_length (int) – Dimension of the binary embedding vectors (must match the model).

  • key_len (int) – RSA modulus size in bits (minimum 1024).

  • salt (bytes | None) – Optional salt bytes for passphrase-based key derivation.

  • path (str | None) – If provided, persist the context to this file path via save_to_disk().

  • key_provider (xtrace_sdk.x_vec.crypto.key_provider.KeyProvider | None) – Explicit KeyProvider instance (e.g. AWSKMSKeyProvider).

Returns:

Initialised ExecutionContext.

Raises:

ValueError – If homomorphic_client_type is not recognised or embedding_length >= key_len.

Return type:

ExecutionContext

property device: str

Active compute backend: "cpu" or "gpu".

Return type:

str

to_dict_enc()

Return a serialisable dict with the secret key AES-encrypted under the key provider.

Return type:

dict

to_dict_plain()

Return a serialisable dict with the secret key in plaintext. Do not persist or transmit.

Return type:

dict

embed_len()

Embedding vector dimension this context was configured for.

Return type:

int

key_len()

RSA modulus size in bits used for key generation.

Return type:

int

__str__()
Return type:

str

hash()

Compute a deterministic SHA-256 fingerprint of this context’s cryptographic identity.

The device field is excluded so that CPU and GPU contexts sharing the same keys compare as equal.

Returns:

Hex-encoded SHA-256 digest.

Return type:

str
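The idea of a device-independent fingerprint can be sketched as follows: drop the device field, serialize the remaining fields canonically, and hash the result. The field names and the JSON canonicalization are assumptions made for illustration; the SDK's actual input to SHA-256 is not documented here, and fingerprint is a hypothetical helper.

```python
import hashlib
import json


def fingerprint(config: dict) -> str:
    """Hash a config dict with SHA-256, ignoring the 'device' field."""
    identity = {k: v for k, v in config.items() if k != "device"}
    canonical = json.dumps(identity, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


cpu = {"type": "paillier", "embed_len": 512, "key_len": 1024, "device": "cpu"}
gpu = dict(cpu, device="gpu")
```

Sorting keys before hashing keeps the digest stable regardless of dict insertion order, which is what makes the fingerprint deterministic.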

__hash__()
Return type:

int

__eq__(other)
Parameters:

other (Any)

Return type:

bool

_config_with_device()
Return type:

str

serialize_exec_context()

Serialise the execution context to a JSON string suitable for storage or transmission.

The secret key is AES-encrypted under the key provider before inclusion.

Returns:

JSON string representing the encrypted execution context.

Return type:

str

Raises:

ValueError – If the homomorphic client type is not supported.

classmethod _from_serialized_exec_context(json_obj, passphrase=None, key_provider=None, context_id=None)

Reconstruct an ExecutionContext from a previously serialised dict.

Supply either key_provider or passphrase. For passphrase-based contexts the salt is read from the stored wrapped_key field automatically.

Returns:

Restored ExecutionContext.

Raises:

ValueError – If the stored homomorphic client type is not supported.

Return type:

ExecutionContext

dump_tables()

Dump precomputed encryption tables (Paillier-Lookup only) for caching.

Returns:

Dict containing g_table and noise_table, or an empty dict if the underlying client does not support table export.

Return type:

dict

save_to_disk(path)

Persist the execution context to a local file.

The secret key is AES-encrypted before writing. The passphrase/key is not stored.

Parameters:

path (str) – File path to write to.

Return type:

None

classmethod load_from_disk(passphrase=None, path='', key_provider=None)

Load an ExecutionContext from a file previously saved with save_to_disk().

Returns:

Restored ExecutionContext.

Return type:

ExecutionContext

async save_to_remote(integration)

Upload the execution context to XTrace remote storage.

The secret key is AES-encrypted under the key provider before upload — XTrace never sees the plaintext secret key or the passphrase.

Parameters:

integration (xtrace_sdk.integrations.xtrace.XTraceIntegration) – Authenticated XTraceIntegration instance.

Returns:

The context_id assigned by the server.

Return type:

str

async classmethod load_from_remote(passphrase=None, context_id='', integration=None, key_provider=None)

Fetch and decrypt an ExecutionContext from XTrace remote storage.

Returns:

Restored ExecutionContext.

Return type:

ExecutionContext