Quickstart
==========

This page introduces how to get started with XTraceSDK.

First, make sure that you have installed XTraceSDK. You can find the installation instructions in the install section.


Preprocessing Data with DataLoader
--------------------------------------------------
Say we want to preprocess our data. We would import the DataLoaderBase class and all the necessary cryptography, embedding, storage, and connector modules.

.. code-block:: python

    from xtrace_sdk.data_loaders.base import DataLoaderBase
    from xtrace_sdk.crypto.paillier_client import PaillierClient
    from xtrace_sdk.crypto.encryption.aes import AESClient
    from xtrace_sdk.utils.embedding import Embedding
    from xtrace_sdk.integrations.local.storage import LocalStorage
    from xtrace_sdk.connectors.local_disk_connector import LocalDiskConnector

Next, we need to create instances of the crypto clients, the embedding, and the storage module to be used by the data loader.

.. code-block:: python

    paillier_client = PaillierClient()
    aes_client = AESClient("this is a super safe AES Key")
    embedding = Embedding()
    storage = LocalStorage("data/")  # mounts the data directory to the storage module
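
PaillierClient is presumably named after the Paillier cryptosystem, which is additively homomorphic: multiplying two ciphertexts gives an encryption of the sum of the two plaintexts. The following toy sketch illustrates only that property in plain Python. It does not use or reproduce the SDK's implementation, every function name in it is illustrative, and the tiny hard-coded primes offer no security.

.. code-block:: python

    import math
    import random

    def keygen(p, q):
        # Toy key generation from two small primes (assumption: g = n + 1).
        n = p * q
        lam = math.lcm(p - 1, q - 1)
        mu = pow(lam, -1, n)  # modular inverse of lam modulo n
        return n, (lam, mu)

    def encrypt(n, m):
        # c = (n + 1)^m * r^n mod n^2, with r random and coprime to n.
        n2 = n * n
        r = random.randrange(1, n)
        while math.gcd(r, n) != 1:
            r = random.randrange(1, n)
        return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

    def decrypt(n, priv, c):
        lam, mu = priv
        x = pow(c, lam, n * n)
        return ((x - 1) // n * mu) % n

    n, priv = keygen(61, 53)
    c1 = encrypt(n, 15)
    c2 = encrypt(n, 27)
    # Multiplying ciphertexts adds the underlying plaintexts.
    total = decrypt(n, priv, (c1 * c2) % (n * n))

Here ``total`` equals 15 + 27, even though the addition was carried out on ciphertexts only.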

Now we can create an instance of the DataLoaderBase class and use it with a LocalDiskConnector to load our data from files
into an encrypted index and database to be used later.

.. code-block:: python

    data_loader = DataLoaderBase(embedding, aes_client, paillier_client, storage)
    connector = LocalDiskConnector("path/to/your/data/files")
    collection = connector.load_data()
    index, db = data_loader.load_data_from_memory(collection)

We can also dump the index and database to the storage mounted in the storage object for later use:

.. code-block:: python

    # saves the index and database under the data/ directory
    await data_loader.dump_index(index)
    await data_loader.dump_db(db)
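
The ``await`` calls above only work inside a coroutine or in an environment that already runs an event loop, such as a notebook. In a plain script, one option is to wrap them in a coroutine and start it with ``asyncio.run``. The following is a minimal sketch of that pattern. The ``dump_index`` and ``dump_db`` functions here are hypothetical local stand-ins so that the example runs without the SDK; they are not the SDK's methods.

.. code-block:: python

    import asyncio

    saved = []  # records what the stand-in coroutines were asked to save

    async def dump_index(index):
        # Hypothetical stand-in for data_loader.dump_index.
        await asyncio.sleep(0)
        saved.append(("index", index))

    async def dump_db(db):
        # Hypothetical stand-in for data_loader.dump_db.
        await asyncio.sleep(0)
        saved.append(("db", db))

    async def main():
        # Awaited calls are only legal inside a coroutine like this one.
        await dump_index({"doc-1": [1, 0, 1]})
        await dump_db({"doc-1": "ciphertext"})
        return len(saved)

    # asyncio.run creates an event loop, runs main() to completion, and closes the loop.
    count = asyncio.run(main())

In a real script, the body of ``main`` would hold the awaited SDK calls shown on this page.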


Querying Encrypted Data with Encrypted Query using Retriever
------------------------------------------------------------

Now that we have preprocessed our data, we can query it using the Retriever class.
We will use the same crypto clients and embedding as before, together with a compute module
that performs the query logic.

.. code-block:: python

    from xtrace_sdk.compute.local.compute import LocalCompute
    from xtrace_sdk.retrievers.simple_retriever import SimpleRetriever


    compute = LocalCompute('data/', paillier_client)
    retriever = SimpleRetriever(embedding, aes_client, paillier_client, compute)


Querying takes two calls on the retriever object: one to find the IDs of the nearest neighbours of the query in the index and database we created earlier, and one to retrieve and decrypt the matching contexts.

.. code-block:: python

    query = "How do I use XTraceSDK?"
    ids = await retriever.nn_search_for_ids(query, k=3, meta_filter=None)
    contexts = await retriever.retrieve_and_decrypt(ids)
    print(contexts)
    # Output: ['XTraceSDK is a software development kit that allows you to ...', 'To use XTraceSDK, you need to ...', 'XTraceSDK is designed to be ...']
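
To make the idea of a nearest-neighbour search for IDs concrete, here is a minimal plaintext sketch of a top-``k`` search over embedding vectors. It is only an illustration: the SDK runs its search over encrypted data, and this page does not state its distance metric or internals. Cosine similarity, the ``nn_search`` function, and the sample vectors are all assumptions made for the example.

.. code-block:: python

    import heapq
    import math

    def cosine_similarity(a, b):
        # Dot product divided by the product of the vector lengths.
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    def nn_search(query_vec, vectors, k=3):
        # Return the ids of the k vectors most similar to query_vec.
        scored = ((cosine_similarity(query_vec, vec), doc_id)
                  for doc_id, vec in vectors.items())
        return [doc_id for _, doc_id in heapq.nlargest(k, scored)]

    # Hypothetical two-dimensional embeddings keyed by document id.
    vectors = {
        "a": [1.0, 0.0],
        "b": [0.9, 0.1],
        "c": [0.0, 1.0],
        "d": [-1.0, 0.0],
    }
    top = nn_search([1.0, 0.0], vectors, k=2)
    print(top)  # ['a', 'b']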

Integrate with LLMs to Build a Privacy-Preserving RAG Pipeline
---------------------------------------------------------------

You can integrate the retriever with an LLM (Large Language Model) to generate responses grounded in the retrieved contexts.
To do this, use the OllamaClient to interact with an LLM such as DeepSeek. First, make sure you have Ollama installed and configured.

.. code-block:: python

    from xtrace_sdk.inference.ollama import OllamaClient
    from xtrace_sdk.retrievers.base import RetrieverBase

    # Format the decrypted chunk contents into one context string.
    formatted_context = RetrieverBase.format_context([i['chunk_content'] for i in contexts])

    def rag_prompt(context, query):
        # Build the RAG prompt from the retrieved context and the user's question.
        return (f"DOCUMENT:\n{context}\nQUESTION:\n{query}\nINSTRUCTIONS:\n"
                "Answer the user's QUESTION using the DOCUMENT text above. "
                "Keep your answer grounded in the facts of the DOCUMENT.")

    OLLAMA_URL = 'http://localhost:11434'
    deepseek = OllamaClient(OLLAMA_URL, "deepseek-r1:1.5b")
    deepseek.query(rag_prompt(formatted_context, query))  # query is the question string defined earlier
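
To inspect the prompt without a running Ollama server or the SDK, the prompt assembly can be reproduced on its own. The following is a minimal sketch under stated assumptions: ``format_context`` here is a hypothetical stand-in that joins chunks with blank lines, which may differ from what ``RetrieverBase.format_context`` does, and the sample contexts are made up for illustration.

.. code-block:: python

    def format_context(chunks):
        # Hypothetical stand-in: join chunks with blank lines (an assumption).
        return "\n\n".join(chunks)

    def build_rag_prompt(context, query):
        # Same three-part layout as the prompt above: document, question, instructions.
        return (f"DOCUMENT:\n{context}\nQUESTION:\n{query}\nINSTRUCTIONS:\n"
                "Answer the user's QUESTION using the DOCUMENT text above. "
                "Keep your answer grounded in the facts of the DOCUMENT.")

    # Made-up retrieval results shaped like the dictionaries used above.
    contexts = [
        {"chunk_content": "First sample chunk."},
        {"chunk_content": "Second sample chunk."},
    ]
    context = format_context([c["chunk_content"] for c in contexts])
    prompt = build_rag_prompt(context, "What do the chunks say?")
    print(prompt)

Printing the prompt before sending it to the model is a quick way to confirm that the retrieved chunks and the question end up where you expect.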