Quickstart

This page shows how to get started with XTraceSDK.

First, make sure that XTraceSDK is installed. The installation instructions are in the install section.

Loading Data with DataLoader

Say we want to load some data for later semantic retrieval with privacy-preserving capabilities. We can use the DataLoaderBase class to load data from a file, preprocess it, and store it in an encrypted index and database.

from xtrace_sdk.data_loaders.base import DataLoaderBase
from xtrace_sdk.crypto.paillier_client import PaillierClient
from xtrace_sdk.crypto.encryption.aes import AESClient
from xtrace_sdk.inference.embedding import Embedding
from xtrace_sdk.integrations.xtrace import XTraceIntegration
from xtrace_sdk.connectors.local_disk_connector import LocalDiskConnector

Next, we need to create instances of the crypto client, the embedding model, and the XTrace integration module for the data loader to use.

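# Note: ExecutionContext is not among the imports above; import it from the
# module where your XTraceSDK version defines it before running this snippet.
paillier_client = PaillierClient()
execution_context = ExecutionContext(
    paillier_client=paillier_client,
    passphrase="this is a super safe passphrase",
)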
xtrace_integration = XTraceIntegration(org_id="your_org_id", api_key="your_api_key")

embed = Embedding("ollama", "mxbai-embed-large", 1024)
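For readers unfamiliar with Paillier encryption, the toy sketch below illustrates the property that makes it useful for computing on encrypted data: multiplying two ciphertexts yields a ciphertext of the sum of the plaintexts. It is not XTraceSDK code and does not show how PaillierClient is implemented; the function names keygen, encrypt, decrypt and add_encrypted are hypothetical, and the tiny primes are for illustration only and offer no security.

```python
import math
import secrets


def keygen(p: int, q: int):
    """Build a toy Paillier key pair from two small primes (NOT secure)."""
    n = p * q
    lam = (p - 1) * (q - 1)   # phi(n); a valid choice when the generator is n + 1
    mu = pow(lam, -1, n)      # modular inverse of lam modulo n (Python 3.8+)
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    """Encrypt an integer 0 <= m < n under the public key n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1   # random r in [1, n - 1]
        if math.gcd(r, n) == 1:
            break
    # With generator g = n + 1, the ciphertext is g^m * r^n mod n^2.
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n: int, private_key, c: int) -> int:
    """Recover the plaintext from ciphertext c."""
    lam, mu = private_key
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(n: int, c1: int, c2: int) -> int:
    """Combine two ciphertexts so that the result decrypts to m1 + m2 (mod n)."""
    return (c1 * c2) % (n * n)


n, private_key = keygen(61, 53)
c_sum = add_encrypted(n, encrypt(n, 20), encrypt(n, 22))
print(decrypt(n, private_key, c_sum))  # prints 42
```

The sum was computed without decrypting either operand, which is the general idea behind searching an encrypted index with an encrypted query.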

Now we can create an instance of the DataLoaderBase class and use it with a LocalDiskConnector to load our data from a file into an encrypted index and database for later use.

data_loader = DataLoaderBase(execution_context, xtrace_integration)
connector = LocalDiskConnector("path/to/your/data/files")
collection = connector.load_data()
vectors = [embed.bin_embed(item['chunk_content']) for item in collection]
index, db = data_loader.load_data_from_memory(collection, vectors)
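The name bin_embed suggests that the embedding is turned into a binary vector before it is encrypted, although this page does not say how. As a rough illustration only, and assuming a simple threshold-at-zero scheme that XTraceSDK may or may not use, a dense embedding could be binarized like this (binarize is a hypothetical helper, not part of the SDK):

```python
from typing import List, Sequence


def binarize(vector: Sequence[float], threshold: float = 0.0) -> List[int]:
    """Map each component to 1 if it is strictly above the threshold, else 0."""
    return [1 if x > threshold else 0 for x in vector]


dense = [0.12, -0.40, 0.00, 0.93]
print(binarize(dense))  # prints [1, 0, 0, 1]
```

Binary vectors are compact and can be compared with simple bit-level distances, which keeps the arithmetic that has to run on encrypted values small.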

We can also dump the index and database to the XTrace server for later use.

# `await` must run inside an async function (or an async-aware REPL).
await data_loader.dump_db(db, index, kb_id="your_kb_id")
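Because dump_db is awaited, a plain Python script has to call it from inside a coroutine and start that coroutine with asyncio.run. The sketch below shows the pattern with a hypothetical stand-in, dump_db_stub, so that it runs without XTraceSDK or network access; in real code the awaited call would be data_loader.dump_db as shown above.

```python
import asyncio


async def dump_db_stub(db, index, kb_id: str) -> str:
    """Hypothetical stand-in for an awaitable upload call such as dump_db."""
    await asyncio.sleep(0)  # yield to the event loop, as real network I/O would
    return f"stored {len(db)} records and {len(index)} index entries under {kb_id}"


async def main() -> str:
    # Placeholder data; in the quickstart these come from load_data_from_memory.
    db = [{"chunk_content": "hello"}]
    index = [[1, 0, 1]]
    return await dump_db_stub(db, index, kb_id="your_kb_id")


if __name__ == "__main__":
    print(asyncio.run(main()))
```

In a notebook or another environment that already runs an event loop, the await expression can usually be used directly instead of asyncio.run.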

Querying Encrypted Data with an Encrypted Query Using Retriever

Now that we have preprocessed our data, we can query it using a retriever class, here SimpleRetriever. We will use the same crypto clients and embedding as before, together with a compute module that performs the query logic.

from xtrace_sdk.retrievers.simple_retriever import SimpleRetriever


retriever = SimpleRetriever(execution_context, xtrace_integration)

Querying takes three steps: embed the query, search the encrypted index for the ids of its nearest neighbors, and then retrieve and decrypt the matching records from the database we created earlier.

query = "How do I use XTraceSDK?"
query_vector = embed.bin_embed(query)
ids = await retriever.nn_search_for_ids(query_vector, k=3, meta_filter=None)
contexts = await retriever.retrieve_and_decrypt(ids)
print(contexts)
# Output: ['XTraceSDK is a software development kit that allows you to ...', 'To use XTraceSDK, you need to ...', 'XTraceSDK is designed to be ...']
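To make the search step more concrete, here is a plaintext sketch of a k-nearest-neighbor lookup over binary vectors. It assumes Hamming distance as the similarity measure, which this page does not confirm, and it works on unencrypted data, whereas the retriever above operates on an encrypted index. The names hamming_distance and nn_search_plain are hypothetical and not part of XTraceSDK.

```python
import heapq
from typing import Dict, List, Sequence


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions at which two equal-length bit vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def nn_search_plain(query: Sequence[int],
                    index: Dict[int, Sequence[int]],
                    k: int) -> List[int]:
    """Return the ids of the k stored vectors closest to the query.

    Ties in distance are broken by the smaller id so results are deterministic.
    """
    return heapq.nsmallest(
        k, index, key=lambda i: (hamming_distance(query, index[i]), i)
    )


index = {
    0: [1, 0, 1, 1],
    1: [0, 0, 1, 1],
    2: [1, 1, 0, 0],
    3: [1, 0, 1, 0],
}
print(nn_search_plain([1, 0, 1, 1], index, k=3))  # prints [0, 1, 3]
```

The returned ids play the same role as the ids passed to retrieve_and_decrypt in the quickstart: they identify which stored records to fetch and decrypt.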