ColiVara is a web API that abstracts away the difficult parts of visual RAG. It embeds and stores your documents, then returns the best-matching pages when a user makes a query.
We use the Python SDK in this quickstart, but since ColiVara is an API, you can use any language by making standard API calls.
ColiVara accepts a file URL, a base64-encoded file, or a file path. We support over 100 file formats, including PDF, DOCX, PPTX, and more. We will also automatically take a screenshot of URLs (webpages) and index them.
import os
from colivara_py import ColiVara

rag_client = ColiVara(
    # this is the default and can be omitted
    api_key=os.environ.get("COLIVARA_API_KEY"),
    # this is the default and can be omitted
    base_url="https://api.colivara.com",
)
# Upload a document to the default collection
document = rag_client.upsert_document(
    name="attention is all you need",
    url="https://arxiv.org/abs/1706.03762",
    metadata={"published_year": "2017"},
)
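You can also index a local file by sending its contents as base64 instead of a URL. The sketch below shows the idea; the `base64` keyword argument and the example file path are assumptions, so check the SDK reference for the exact parameter name your version expects.

import base64

# Read a local file (hypothetical path) and send its contents as base64.
# NOTE: the `base64` keyword argument is an assumption; verify the exact
# parameter name in the SDK reference.
with open("papers/attention.pdf", "rb") as f:
    encoded_file = base64.b64encode(f.read()).decode("utf-8")

document = rag_client.upsert_document(
    name="attention is all you need (local copy)",
    base64=encoded_file,
    metadata={"published_year": "2017"},
)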
Search
You can filter by collection name, collection metadata, and document metadata. You can also specify the number of results you want.
results = rag_client.search(query="What is the role of self-attention in transformers?")
print(results) # top 3 pages with the most relevant information
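As a sketch of a filtered query, assuming the SDK exposes `collection_name`, `top_k`, and a `query_filter` dict for metadata filters (verify the exact filter schema against the API reference):

# Restrict the search to one collection, ask for more pages, and filter on
# document metadata. The query_filter structure below is an assumption;
# consult the filtering docs for the exact schema.
results = rag_client.search(
    query="What is the role of self-attention in transformers?",
    collection_name="default_collection",
    top_k=5,
    query_filter={
        "on": "document",
        "key": "published_year",
        "value": "2017",
    },
)
print(results)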
FAQ
Do I need a vector database?
No - ColiVara uses Postgres with pgvector to store vectors for you. You DO NOT need to generate, save, or manage embeddings in any way.
Do you convert the documents to markdown/text?
No - ColiVara treats everything as an image and uses vision models. There is no parsing, chunking, or OCR involved. This method outperforms chunking and OCR for both text-based and visual documents.
How do non-PDF documents or web pages work?
We run a pipeline to convert them to images and then perform our normal image-based retrieval. This all happens for you under the hood, and you get the top-k pages when performing retrieval.
Can I use my own vector database?
Yes - we have an embedding endpoint that only generates embeddings, without saving them or doing anything else. You can store these embeddings on your end. Keep in mind that we use late interaction with multi-vectors, which many vector databases do not support yet.
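If you take that route, an embeddings-only call looks roughly like the sketch below; the `create_embedding` method name and the `input_data`/`task` parameters are assumptions, so verify them against the embeddings endpoint documentation.

# Generate multi-vector (late-interaction) embeddings without storing
# anything in ColiVara. Method and parameter names are assumptions;
# check the embeddings endpoint docs for the exact signature.
query_embeddings = rag_client.create_embedding(
    input_data=["What is the role of self-attention in transformers?"],
    task="query",
)

# Each input produces a list of vectors (multi-vector), not one dense vector,
# so your own store must support late-interaction scoring (e.g. MaxSim).
print(query_embeddings)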