Meet Turbovec: A Rust Vector Index with Python Bindings, Built on Google’s TurboQuant Algorithm

Vector search underpins most retrieval-augmented generation (RAG) pipelines. At scale, it gets expensive. Storing 10 million document embeddings as float32 consumes 31 GB of RAM. For development teams running local or on-premises inference, that number is a real constraint.

A new open source library called turbovec addresses this directly. It is a vector index written in Rust with Python bindings, built on TurboQuant, a quantization algorithm from Google Research. The same collection of 10 million documents fits into 4 GB with Turbovec. On ARM devices, its search speed beats FAISS IndexPQFastScan by 12–20%.
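The memory figures above can be roughly reproduced with back-of-the-envelope arithmetic. The article does not state the embedding dimension or bit width behind them, so the values below (768 dimensions, 4-bit codes, one float32 norm per vector) are assumptions, chosen only because they land close to 31 GB and 4 GB. The helper names are hypothetical and not part of Turbovec.

```python
def float32_bytes(n_vectors: int, dim: int) -> int:
    """Raw float32 storage: 4 bytes per coordinate."""
    return n_vectors * dim * 4


def index_bytes(n_vectors: int, dim: int, bits: int, norm_bytes: int = 4) -> int:
    """Quantized storage: `bits` per coordinate plus one stored norm per vector."""
    return n_vectors * (dim * bits // 8 + norm_bytes)


n, dim = 10_000_000, 768  # assumed corpus size and embedding dimension
raw_gb = float32_bytes(n, dim) / 1e9
quantized_gb = index_bytes(n, dim, bits=4) / 1e9
print(f"float32: {raw_gb:.1f} GB, 4-bit: {quantized_gb:.1f} GB")
```

Under these assumptions the script prints roughly 30.7 GB versus 3.9 GB. Real indexes may carry extra overhead that this estimate ignores.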

The TurboQuant Paper

TurboQuant comes from the Google Research team, which proposes it as a data-oblivious quantizer. It achieves near-optimal distortion rates across all bit widths and dimensions. It requires no training and no passes over the data.

Most production-grade vector quantizers, including FAISS’s product quantizers, require a codebook training step. You must run k-means on a representative sample of your vectors before indexing can begin. If your corpus grows or drifts, you may need to retrain and rebuild the entire index. TurboQuant skips all of that. It relies on an analytic property of rotated vectors instead of data-dependent calibration.

How Turbovec quantizes vectors

The quantization pipeline consists of four steps:

(1) Every vector is normalized. The length (norm) is stripped and stored as a single float. Each vector becomes a unit direction on a high-dimensional hypersphere.

(2) A random rotation is applied. All vectors are multiplied by the same random orthogonal matrix. After rotation, each coordinate follows a Beta distribution, which in high dimensions converges to a Gaussian N(0, 1/d). This holds for any input data: the rotation makes the coordinate distribution predictable.

(3) Lloyd-Max scalar quantization is applied. Since the distribution is known analytically, the optimal bucket boundaries and centroids can be computed in advance from the math alone. For 2-bit quantization, this means 4 buckets per coordinate. For 4 bits, this means 16 buckets. No pass over the data is needed.

(4) The quantized coordinates are bit-packed into bytes. A 1536-dimensional vector shrinks from 6,144 bytes at FP32 to 384 bytes at 2-bit. This is a 16x compression ratio.
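The four steps can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not Turbovec's actual implementation: the rotation is drawn via a QR decomposition, the Lloyd-Max buckets are found by iterating on a numerical grid over the Gaussian density N(0, 1/d), and the function names `lloyd_max_gaussian` and `pack_2bit` are hypothetical.

```python
import numpy as np


def lloyd_max_gaussian(levels: int, sigma: float, iters: int = 200):
    """Lloyd-Max boundaries and centroids for N(0, sigma^2), from the density alone."""
    x = np.linspace(-6 * sigma, 6 * sigma, 20001)  # integration grid, not data
    w = np.exp(-0.5 * (x / sigma) ** 2)            # unnormalized Gaussian density
    centroids = np.linspace(-2 * sigma, 2 * sigma, levels)
    for _ in range(iters):
        bounds = (centroids[:-1] + centroids[1:]) / 2  # midpoints between centroids
        bucket = np.searchsorted(bounds, x)
        centroids = np.array([
            np.average(x[bucket == b], weights=w[bucket == b]) for b in range(levels)
        ])
    return (centroids[:-1] + centroids[1:]) / 2, centroids


def pack_2bit(codes: np.ndarray) -> np.ndarray:
    """Pack four 2-bit codes into each byte."""
    c = codes.reshape(codes.shape[0], -1, 4).astype(np.uint8)
    return c[..., 0] | (c[..., 1] << 2) | (c[..., 2] << 4) | (c[..., 3] << 6)


rng = np.random.default_rng(0)
dim = 1536
vectors = rng.standard_normal((8, dim)).astype(np.float32) * 5.0

# (1) Normalize: keep the norm as one float, quantize only the direction.
norms = np.linalg.norm(vectors, axis=1)
unit = vectors / norms[:, None]

# (2) Rotate every vector with the same random orthogonal matrix.
rotation, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
rotated = unit @ rotation.T

# (3) Scalar-quantize each coordinate against buckets computed without data.
bounds, centroids = lloyd_max_gaussian(levels=4, sigma=1 / np.sqrt(dim))
codes = np.searchsorted(bounds, rotated)

# (4) Bit-pack four 2-bit codes per byte.
packed = pack_2bit(codes)
print(vectors[0].nbytes, "->", packed[0].nbytes, "bytes per vector")
```

With `dim = 1536` at 2 bits, each row goes from 6,144 bytes to 384 bytes, matching the 16x ratio quoted in step (4). The per-vector norm from step (1) is stored separately.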

At search time, the query is rotated once into the same domain. Scoring happens directly against the codebook values. The scoring kernel uses SIMD instructions (NEON on ARM and AVX-512BW on modern x86, with an AVX2 fallback) together with partitioned lookup tables for throughput.
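A plain NumPy sketch of this scoring idea is shown below. It is not the library's SIMD kernel: it assumes 2-bit codes, uses the standard published Lloyd-Max levels for a unit Gaussian (scaled to N(0, 1/d)), and the `search` function is hypothetical. The key point is that the query is rotated once, a small per-coordinate lookup table is built, and database vectors are scored from their codes without being decompressed.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 256, 1000

# 2-bit Lloyd-Max levels for a unit Gaussian, rescaled to N(0, 1/d).
centroids = np.array([-1.510, -0.4528, 0.4528, 1.510]) / np.sqrt(d)
bounds = np.array([-0.9816, 0.0, 0.9816]) / np.sqrt(d)

# Build a toy index: unit vectors, rotated, then quantized to codes in {0..3}.
rotation, _ = np.linalg.qr(rng.standard_normal((d, d)))
db = rng.standard_normal((n, d))
db /= np.linalg.norm(db, axis=1, keepdims=True)
codes = np.searchsorted(bounds, db @ rotation.T)  # shape (n, d)


def search(query: np.ndarray, k: int):
    """Approximate inner-product search over the quantized codes."""
    q = rotation @ (query / np.linalg.norm(query))  # rotate the query once
    lut = q[:, None] * centroids[None, :]           # (d, 4): q_j * centroid_c
    scores = lut[np.arange(d), codes].sum(axis=1)   # gather per code, accumulate
    top = np.argsort(-scores)[:k]
    return scores[top], top


query = db[42] + 0.01 * rng.standard_normal(d)  # a slightly perturbed copy of row 42
scores, ids = search(query, k=5)
print(ids[0])
```

Because the toy database holds unit vectors, the scores approximate cosine similarity. For unnormalized data, each score would be multiplied by the stored norm of the corresponding vector.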

TurboQuant achieves distortion within a factor of approximately 2.7 of the information-theoretic Shannon lower bound.

Recall and speed: numbers

All benchmarks use 100K vectors, 1000 queries, k=64, and report an average of 5 runs.

For recall, Turbovec is compared against FAISS IndexPQ (LUT256, nbits=8, float32 LUT). This is a strong baseline: FAISS uses a higher-precision lookup table at scoring time and k-means++ for codebook training. Even so, TurboQuant and FAISS are within 0–1 point of each other at R@1 for OpenAI embeddings at d=1536 and d=3072, and both converge to 1.0 recall by k=4–8. GloVe at d=200 is harder. At this dimension, TurboQuant trails FAISS by 3–6 points at R@1, and the gap closes by k≈16–32.
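For readers unfamiliar with the R@k notation, the sketch below shows one common way to compute it. It assumes R@k means the fraction of queries whose exact nearest neighbor appears among the first k results returned by the approximate index. The article does not spell out the benchmark's definition, so this convention and the `recall_at_k` helper are assumptions.

```python
import numpy as np


def recall_at_k(true_nn: np.ndarray, retrieved: np.ndarray, k: int) -> float:
    """Fraction of queries whose exact nearest neighbor is in the first k results."""
    hits = (retrieved[:, :k] == true_nn[:, None]).any(axis=1)
    return float(hits.mean())


# Toy data: 3 queries, each with a ranked list of 4 retrieved ids.
true_nn = np.array([7, 3, 9])
retrieved = np.array([
    [7, 1, 2, 4],  # hit at rank 1
    [5, 3, 8, 0],  # hit at rank 2
    [1, 2, 4, 9],  # hit at rank 4
])
for k in (1, 2, 4):
    print(k, recall_at_k(true_nn, retrieved, k))
```

On this toy data, recall rises from one in three queries at k=1 to all three at k=4, the same kind of convergence the benchmark describes as k grows.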

On speed, the ARM (Apple M3 Max) results show Turbovec beating FAISS IndexPQFastScan by 12–20% in every configuration. On x86 (Intel Xeon Platinum 8481C / Sapphire Rapids, 8 vCPUs), Turbovec wins every 4-bit configuration by 1–6% and runs within ~1% of FAISS in the 2-bit single-threaded configurations. Two configurations fall slightly behind FAISS: 2-bit multi-threaded at d=1536 and d=3072. There, the inner accumulate loop is too short for amortization to pay off, and FAISS’s AVX-512 VBMI path keeps a 2–4% advantage.

Python API

Installation is one command: pip install turbovec. The primary class is TurboQuantIndex, initialized with a dimension and a bit width.

from turbovec import TurboQuantIndex

index = TurboQuantIndex(dim=1536, bit_width=4)
index.add(vectors)
scores, indices = index.search(query, k=10)
index.write("my_index.tq")

A second class, IdMapIndex, supports stable external uint64 IDs that survive deletions. Removal is O(1) by ID. This is useful for document stores where vectors are frequently updated or deleted.

Turbovec integrates with LangChain (pip install turbovec[langchain]), LlamaIndex (pip install turbovec[llama-index]), and Haystack (pip install turbovec[haystack]). The Rust crate is available via cargo add turbovec.

Marktechpost’s Visual Explainer

What is Turbovec?

Turbovec is a vector index written in Rust with Python bindings. It is built on Google Research’s TurboQuant algorithm, a data-oblivious quantizer that requires no codebook training. A collection of 10 million documents that takes up 31 GB as float32 fits into 4 GB with Turbovec.

16x compression at 2 bits

💨 Outperforms FAISS on ARM by 12–20%

🔒 Completely local – no data egress

📦 MIT licensed

Installation

Install the Python package from PyPI with a single command. For Rust, add the crate via Cargo.

# Python
pip install turbovec

# Rust
cargo add turbovec

Note: To build from source, install maturin, then run maturin build --release inside the turbovec-python/ directory. For Rust, run cargo build --release.

Basic Usage – TurboQuantIndex

TurboQuantIndex is the primary class. Initialize it with a vector dim and a bit_width of 2 or 4. Vectors are indexed immediately on add(); no training step is required.

from turbovec import TurboQuantIndex

index = TurboQuantIndex(dim=1536, bit_width=4)

# Add vectors (numpy float32 array, shape (n, dim))
index.add(vectors)
index.add(more_vectors)  # incremental adds are fine

# Search: returns top-k scores and positional indices
scores, indices = index.search(query, k=10)

Stable IDs – IdMapIndex

Use IdMapIndex when you need external uint64 IDs that survive deletions. Removal is O(1) by ID, which is useful for document stores where vectors change over time.

import numpy as np
from turbovec import IdMapIndex

index = IdMapIndex(dim=1536, bit_width=4)

# Map vectors to your own uint64 external IDs
index.add_with_ids(vectors, np.array([1001, 1002, 1003], dtype=np.uint64))

# Search returns your external IDs, not positional indices
scores, ids = index.search(query, k=10)

# O(1) delete by external ID
index.remove(1002)
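One common way to get O(1) removal by external ID is a hash map from ID to a dense slot, combined with a swap-with-last delete. The sketch below illustrates that general technique only. The article does not describe how IdMapIndex is implemented internally, so the `SwapRemoveIdMap` class and its fields are hypothetical.

```python
class SwapRemoveIdMap:
    """Hypothetical ID map: dict for id -> slot, dense lists for slot -> data."""

    def __init__(self):
        self.slot_of = {}  # external id -> dense slot
        self.ids = []      # dense slot -> external id
        self.rows = []     # dense slot -> stored payload (e.g. packed codes)

    def add(self, ext_id, row):
        self.slot_of[ext_id] = len(self.ids)
        self.ids.append(ext_id)
        self.rows.append(row)

    def remove(self, ext_id):
        slot = self.slot_of.pop(ext_id)
        last_id, last_row = self.ids.pop(), self.rows.pop()
        if last_id != ext_id:
            # Move the last entry into the freed slot so storage stays dense.
            self.ids[slot], self.rows[slot] = last_id, last_row
            self.slot_of[last_id] = slot


m = SwapRemoveIdMap()
for ext_id in (1001, 1002, 1003):
    m.add(ext_id, f"codes-{ext_id}")
m.remove(1002)
print(m.ids)
```

The trade-off in this design is that positional order is not preserved after a delete, which is why lookups go through the external ID rather than the slot number.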

Save and load the index

Both index types support persistence. TurboQuantIndex writes .tq files. IdMapIndex writes .tvim files.

from turbovec import TurboQuantIndex, IdMapIndex

# TurboQuantIndex -> .tq
index.write("my_index.tq")
loaded = TurboQuantIndex.load("my_index.tq")

# IdMapIndex -> .tvim
index.write("my_index.tvim")
loaded = IdMapIndex.load("my_index.tvim")

Framework integrations

Turbovec ships optional integrations for LangChain, LlamaIndex, and Haystack. Install the extra that matches your stack.

# LangChain
pip install turbovec[langchain]

# LlamaIndex
pip install turbovec[llama-index]

# Haystack
pip install turbovec[haystack]

Tip: Each integration plugs Turbovec in as a vector store. See docs/integration/ in the repo for full usage examples with each framework.

Using Turbovec from Rust

The Rust API mirrors the Python API. Both TurboQuantIndex and IdMapIndex are available. All x86_64 builds target AVX2 as the baseline; AVX-512 is enabled at runtime via feature detection.

use turbovec::TurboQuantIndex;

let mut index = TurboQuantIndex::new(1536, 4);
index.add(&vectors);

let results = index.search(&queries, 10);

index.write("index.tv").unwrap();
let loaded = TurboQuantIndex::load("index.tv").unwrap();

📚 Full API: docs/api.md

⭐ github.com/RyanCodrai/turbovec

Key takeaways

  • No codebook training. Turbovec indexes vectors immediately: no k-means, no rebuilds as the corpus grows.
  • 16x compression. A 1536-dim float32 vector shrinks from 6,144 bytes to 384 bytes at 2-bit quantization.
  • Faster than FAISS on ARM. Turbovec outperforms FAISS IndexPQFastScan by 12–20% on ARM in every configuration.
  • Near-optimal distortion. TurboQuant stays within a factor of approximately 2.7 of the Shannon lower bound, close to the theoretical limit.
  • Completely local. No managed service and no data egress; it pairs with any open source embedding model for an air-gapped RAG stack.
