Self-hosting
ColiVara is made up of multiple services. The first is an embedding service that turns images and queries into vectors. It is packaged separately because it needs a GPU. The second is a Postgres database with the pgvector extension, which stores the vectors. The third is a Gotenberg service that converts documents to PDF. The last is a Django-Ninja REST API that handles user requests. Everything except the embedding service is bundled together via docker-compose and runs on a typical VPS.
For production workloads, consider a managed Postgres instance for automatic security updates and regular backups.
Embedding Service
Git clone the service repository
git clone https://github.com/tjmlabs/ColiVarE
Optional: download uv and install it in your environment. We use uv, but you can also use pip to install the requirements.
pip install uv
uv venv # or python -m venv .venv
source .venv/bin/activate # .venv/Scripts/activate on Windows
Compile requirements based on your environment. Because this service uses PyTorch under the hood, the requirements differ depending on your OS and Nvidia GPU availability. We use a Mac for development and Linux in production.
uv pip compile builder/requirements.in -o builder/requirements.txt
Install the requirements
Download the models from Hugging Face and save them in the models_hub directory before building. See src/download_models.py for more details.
from colpali_engine.models import ColQwen2, ColQwen2Processor
import torch

model_name = "vidore/colqwen2-v1.0"

# Pick the best available device: Nvidia GPU, then Apple Silicon, then CPU
if torch.cuda.is_available():
    device_map = "cuda"
elif torch.backends.mps.is_available():
    device_map = "mps"
else:
    device_map = None

model = ColQwen2.from_pretrained(
    model_name,
    cache_dir="models_hub/",  # where to save the model
    device_map=device_map,
)
processor = ColQwen2Processor.from_pretrained(model_name, cache_dir="models_hub/")
Run the service locally using the following command
The embedding service is now running on http://localhost:8000/. You can test it using the following command. Remember that you need a GPU with at least 8 GB of VRAM available. Performance on M-series Macs is also acceptable for local development.
You may consider running this service in an "on-demand" fashion via Docker for cost savings in production settings.
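The two prerequisites named in this section (models saved under models_hub, and at least 8 GB of VRAM) can be checked before starting the service. The following is a minimal sketch, not part of the ColiVarE repository. The function names are hypothetical, the 8 GB figure is interpreted as 8 * 10**9 bytes, and the directory name relies on the Hugging Face hub convention of caching "org/name" under "models--org--name".

```python
from pathlib import Path
from typing import Optional

# Assumption: "8 GB" from the text, read as decimal gigabytes.
MIN_VRAM_BYTES = 8 * 10**9


def model_cache_dir(model_name: str, cache_dir: str = "models_hub") -> Path:
    """Directory where the Hugging Face hub cache stores a given model."""
    return Path(cache_dir) / ("models--" + model_name.replace("/", "--"))


def model_is_cached(model_name: str, cache_dir: str = "models_hub") -> bool:
    """True if the model's cache directory exists and is not empty."""
    directory = model_cache_dir(model_name, cache_dir)
    return directory.is_dir() and any(directory.iterdir())


def has_enough_vram(total_bytes: int, minimum: int = MIN_VRAM_BYTES) -> bool:
    """Compare a device's total memory against the required minimum."""
    return total_bytes >= minimum


def detect_vram_bytes() -> Optional[int]:
    """Total memory of CUDA device 0, or None if torch or CUDA is unavailable."""
    try:
        import torch  # imported lazily so the other helpers work without torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory
```

A preflight script could call model_is_cached("vidore/colqwen2-v1.0") and, when detect_vram_bytes() returns a number, pass it to has_enough_vram. On Apple Silicon detect_vram_bytes() returns None, so the VRAM check would simply be skipped there.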
REST API
Clone the ColiVara repository
Create a .env.dev file in the root directory with the following variables:
Run all the services via docker-compose
The application will be running at http://localhost:8001 and the Swagger documentation at http://localhost:8001/v1/docs.
The Swagger documentation page is also a playground where you can try all the endpoints using the token created earlier.
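Outside the Swagger playground, the same token can be sent from a script. The sketch below only builds the request and does not send it. It assumes the token is passed as a Bearer token in the Authorization header; build_request is a hypothetical helper, and the /v1/health/ path in the usage note is an assumption to confirm against the Swagger page.

```python
import urllib.request

# Base URL taken from the text above.
API_BASE = "http://localhost:8001"


def build_request(path: str, token: str, base: str = API_BASE) -> urllib.request.Request:
    """Build an authenticated GET request for the locally running REST API.

    Assumption: the API accepts the token as "Authorization: Bearer <token>".
    """
    url = base.rstrip("/") + "/" + path.lstrip("/")
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )
```

To send it once the services are up, pass the result to urllib.request.urlopen, for example urllib.request.urlopen(build_request("/v1/health/", token)), replacing the path with any GET endpoint listed at http://localhost:8001/v1/docs.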

Development
Follow the steps above to get the service up and running.
Run the tests and type checking (we have 100% test coverage)
Make a branch with your changes and additional code
Open a pull request on GitHub. We have CI/CD and pre-commit hooks to format and test your changes
We welcome contributions and discussion.