Welcome
Welcome to the ColiVara documentation! Here you'll get an overview of the features ColiVara offers to help you build a state-of-the-art retrieval system.
ColiVara is a suite of services that lets you store, search, and retrieve documents based on their visual embeddings.
It is a web-first implementation of the ColPali paper using ColQwen2 as the underlying model. From the end user's standpoint it works exactly like RAG, but it uses vision models instead of chunking and text processing for documents. No OCR, no text extraction, no broken tables, no missing images. What you see is what you get.
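To make "search by visual embeddings" concrete, here is a minimal sketch of the late-interaction (MaxSim) scoring that ColPali-style retrievers use: each query token vector is matched against its best page-patch vector, and the matches are summed. The function names, the two-dimensional toy vectors, and the page names are all hypothetical illustrations; this is not ColiVara's actual code or API.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: List[Vector], page_vecs: List[Vector]) -> float:
    """Late-interaction score: for every query vector, keep the similarity
    of its best-matching page patch, then sum over the query vectors."""
    return sum(max(dot(q, p) for p in page_vecs) for q in query_vecs)


def rank_pages(
    query_vecs: List[Vector], pages: Dict[str, List[Vector]]
) -> List[Tuple[str, float]]:
    """Score every page against the query and sort best-first."""
    scored = [(name, maxsim_score(query_vecs, vecs)) for name, vecs in pages.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data (assumed for illustration): two query token vectors,
# two pages with two patch vectors each.
query = [[1.0, 0.0], [0.0, 1.0]]
pages = {
    "page_with_table": [[0.9, 0.1], [0.1, 0.8]],
    "page_with_photo": [[0.2, 0.1], [0.3, 0.2]],
}

ranking = rank_pages(query, pages)
print(ranking[0][0])  # the best-scoring page name
```

Because the page is embedded as an image, tables and figures contribute patch vectors directly, which is why no OCR or text extraction step appears in the sketch.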
ColiVara's performance is near state of the art for Retrieval-Augmented Generation on the ViDoRe leaderboard. We significantly outperformed current methods for document parsing and processing, such as OCR and captioning.
Our detailed Benchmark Performance Evaluation illustrates ColiVara's performance across diverse benchmarks. We recorded the NDCG@5 score (Normalized Discounted Cumulative Gain at rank 5) and latency for a comprehensive analysis.
Benchmark | ColiVara Score | Avg Latency (s) (lower is better) | Num Docs |
---|---|---|---|
Average | 86.8 | N/A | N/A |
ArxivQA | 87.6 | 3.2 | 500 |
DocVQA | 54.8 | 2.9 | 500 |
InfoVQA | 90.1 | 2.9 | 500 |
Shift Project | 87.7 | 5.3 | 1000 |
Artificial Intelligence | 98.7 | 4.3 | 1000 |
Energy | 96.4 | 4.5 | 1000 |
Government Reports | 96.8 | 4.4 | 1000 |
Healthcare Industry | 98.5 | 4.5 | 1000 |
TabFQuad | 86.6 | 3.7 | 280 |
TatQA | 70.9 | 8.4 | 1663 |

ColiVara dominated visual-heavy benchmarks like ArxivQA and InfoVQA with an NDCG@5 score of 88.1, double the performance of captioning-based systems.
Even on text-centric benchmarks like DocVQA and multimodal benchmarks like InfoVQA, ColiVara outperformed traditional methods by up to 30%.
On the domain-specific benchmarks (Sustainability, Energy, AI, Government Reports, Healthcare), where answering a query depends on combining visual and textual analysis, ColiVara clearly outperforms competing approaches, scoring in the high 90s on Artificial Intelligence, Energy, Government Reports, and Healthcare Industry. This is due to ColiVara's Holistic Multimodal Integration and Spatial Context Awareness.
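For readers unfamiliar with NDCG@5, the metric reported in the benchmark results above, the following is a minimal sketch of how it can be computed for a single query. It assumes the linear-gain variant of DCG and that the relevance list covers every judged document for the query; the function names are hypothetical, and the benchmark scores above appear to be on a 0 to 100 scale rather than the 0 to 1 scale used here. The exact evaluation code behind those numbers may differ.

```python
import math
from typing import List


def dcg_at_k(relevances: List[float], k: int) -> float:
    """Discounted cumulative gain over the top-k results.
    The result at 0-based position `rank` is discounted by log2(rank + 2)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: List[float], k: int = 5) -> float:
    """DCG of the returned ordering divided by the DCG of the ideal ordering.
    `relevances` lists the relevance of each result in the order returned."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal == 0:
        return 0.0  # no relevant document at all for this query
    return dcg_at_k(relevances, k) / ideal


# One relevant page per query, as in a typical page-retrieval setup (assumed).
perfect = ndcg_at_k([1, 0, 0, 0, 0])  # relevant page ranked first
second = ndcg_at_k([0, 1, 0, 0, 0])   # relevant page ranked second
print(perfect, round(second, 3))
```

A benchmark score is then the mean of this per-query value over all queries, so a higher score means relevant pages tend to appear closer to the top of the first five results.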
Create your first RAG pipeline with 2 lines of code
Learn all that ColiVara has to offer
Try the API live