NeurondB Documentation
Quantization Strategies
Product Quantization (PQ)
Splits each vector into sub-vectors, assigns each sub-vector to its nearest codebook centroid, and stores the compact codes for fast approximate distance computation, trading a small recall loss for aggressive compression.
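The encode-and-compare cycle described above can be sketched in NumPy. This is an illustrative model of PQ, not NeurondB internals: the codebooks here are random stand-ins for the k-means-trained centroids a real index would learn, and the distance shown is the standard asymmetric (query vs. reconstructed centroid) form.

```python
import numpy as np

rng = np.random.default_rng(0)

dim, sub_dim, n_centroids = 128, 32, 256
n_sub = dim // sub_dim  # 4 sub-vectors per vector

# Toy codebooks: one set of centroids per subspace.
# In practice these are learned with k-means over a training sample.
codebooks = rng.standard_normal((n_sub, n_centroids, sub_dim)).astype(np.float32)

def pq_encode(v):
    """Map each sub-vector to the index of its nearest centroid."""
    codes = np.empty(n_sub, dtype=np.uint8)
    for s in range(n_sub):
        sub = v[s * sub_dim:(s + 1) * sub_dim]
        dists = ((codebooks[s] - sub) ** 2).sum(axis=1)
        codes[s] = np.argmin(dists)
    return codes

def pq_adc_distance(query, codes):
    """Asymmetric distance: exact query vs. centroids named by the codes."""
    total = 0.0
    for s in range(n_sub):
        sub_q = query[s * sub_dim:(s + 1) * sub_dim]
        total += ((codebooks[s][codes[s]] - sub_q) ** 2).sum()
    return float(total)

v = rng.standard_normal(dim).astype(np.float32)
codes = pq_encode(v)            # 4 bytes instead of 512 (128 x float32)
approx = pq_adc_distance(v, codes)
```

Only the codes are stored per row; one shared codebook table serves every distance computation, which is what makes the scan memory-bound rather than compute-bound.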
Parameters
neurondb.pq_subvector_dim - Dimensions per sub-vector (default: 32)
neurondb.pq_codebooks - Centroids per subspace (default: 256, 2^8)
neurondb.pq_use_residuals - Store residual vectors to improve recall (default: on)
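These two size parameters determine the compression ratio. A quick worked example with the defaults, using a hypothetical 1536-dimensional embedding (substitute your model's actual dimensionality):

```python
# Compressed size under the defaults above for an assumed 1536-d embedding.
dim = 1536
subvector_dim = 32
codebooks = 256                  # 2^8 centroids -> 1 byte per code

n_codes = dim // subvector_dim   # 48 sub-vectors
code_bytes = n_codes * 1         # 48 bytes per vector
raw_bytes = dim * 4              # 6144 bytes as float32
print(raw_bytes // code_bytes)   # 128x compression, before residuals/overhead
```

Enabling residuals adds storage per vector on top of the code bytes, so the effective ratio is lower when pq_use_residuals is on.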
PQ index
CREATE INDEX ON documents
USING neurondb_ivf_hnsw (embedding)
WITH (
metric = 'cosine',
pq_enabled = true,
pq_subvector_dim = 32,
pq_codebooks = 256,
pq_residual = true
);
Scalar Quantization (SQ)
Quantizes each dimension independently to narrow integer codes (4 or 8 bits). The simplest compression scheme, with predictable error bounds and GPU-friendly arithmetic.
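A minimal NumPy sketch of per-dimension 8-bit scalar quantization with a percentile-based dynamic range, in the spirit of the parameters below. The percentile cutoffs and training procedure here are illustrative assumptions, not NeurondB's actual calibration code:

```python
import numpy as np

def sq_train(samples, lo_pct=1.0, hi_pct=99.0):
    """Per-dimension offset/scale from a percentile range (clips outliers)."""
    lo = np.percentile(samples, lo_pct, axis=0)
    hi = np.percentile(samples, hi_pct, axis=0)
    scale = (hi - lo) / 255.0      # map the range onto 8-bit codes
    scale[scale == 0] = 1.0        # guard against flat dimensions
    return lo, scale

def sq_encode(v, lo, scale):
    q = np.round((v - lo) / scale)
    return np.clip(q, 0, 255).astype(np.uint8)

def sq_decode(q, lo, scale):
    return q.astype(np.float32) * scale + lo

rng = np.random.default_rng(1)
train = rng.standard_normal((1000, 64)).astype(np.float32)
lo, scale = sq_train(train)

v = train[0]
q = sq_encode(v, lo, scale)        # 64 bytes instead of 256
err = float(np.abs(sq_decode(q, lo, scale) - v).max())
```

The error bound is what makes SQ predictable: inside the calibrated range, reconstruction error per dimension is at most half a quantization step (scale / 2); only the clipped outlier tails can exceed it.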
Parameters
neurondb.sq_bits_per_dim - Bits per dimension; 4 or 8 recommended (default: 8)
neurondb.sq_dynamic_range - Range estimation method; auto-scales using a percentile range (default: percentile)
neurondb.sq_rebalance_interval - Seconds between quantizer recalibrations (default: 3600s)
Scalar quantization
ALTER TABLE telemetry_embeddings
ALTER COLUMN embedding
SET STORAGE neurondb_scalar(8);
SELECT neurondb_rebalance_scalar_quantizer('telemetry_embeddings', 'embedding');
Binary Quantization
Thresholds each vector component into a binary code searched with Hamming distance. Ideal for large-scale deduplication and anomaly-fingerprinting workloads.
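The threshold-then-popcount pipeline can be sketched in a few lines of NumPy. This models the technique generically; the per-dimension median threshold mirrors the default below, while the byte-level packing stands in for the wider SIMD word packing a real engine would use:

```python
import numpy as np

def binarize(vectors):
    """Threshold each dimension at its median, then pack bits into bytes."""
    thresholds = np.median(vectors, axis=0)
    bits = (vectors > thresholds).astype(np.uint8)
    # packbits yields uint8 codes; engines typically repack into 64-bit
    # words so Hamming distance becomes a handful of XOR + popcount ops.
    return np.packbits(bits, axis=1), thresholds

def hamming(a, b):
    """Hamming distance between two packed binary codes."""
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())

rng = np.random.default_rng(2)
vecs = rng.standard_normal((100, 128)).astype(np.float32)
codes, thresholds = binarize(vecs)   # 16 bytes per 128-d vector

d_self = hamming(codes[0], codes[0])
d_other = hamming(codes[0], codes[1])
```

A per-dimension median threshold splits the corpus evenly on every bit, which maximizes the information each bit carries and keeps Hamming distances discriminative.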
Parameters
neurondb.binary_threshold - Per-dimension threshold strategy (default: median)
neurondb.binary_pack_width - Bit-packing width for SIMD execution (default: 64)
neurondb.binary_use_gpu - When auto, uses the GPU above 1M vectors (default: auto)
Binary quantization
UPDATE media_fingerprints
SET fingerprint = neurondb_to_binary(embedding);
CREATE INDEX ON media_fingerprints
USING neurondb_hamming (fingerprint);
Next Steps
- Performance Tuning - Optimize quantized search
- Automation Workers - Auto-rebalance quantizers
- Indexing - Create quantized indexes