NeurondB Documentation

Quantization Strategies

Product Quantization (PQ)

Splits each vector into subvectors, assigns each subvector to its nearest codebook centroid, and stores the compact codes for fast approximate distance computation, trading a small recall loss for aggressive compression.

Parameters

  • neurondb.pq_subvector_dim - Dimensions per subvector (default: 32)
  • neurondb.pq_codebooks - Centroids per subspace (default: 256, 2^8)
  • neurondb.pq_use_residuals - Store residual vectors to improve recall (default: on)

PQ index

CREATE INDEX ON documents
USING neurondb_ivf_hnsw (embedding)
WITH (
  metric = 'cosine',
  pq_enabled = true,
  pq_subvector_dim = 32,
  pq_codebooks = 256,
  pq_residual = true
);
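The encoding and distance steps above can be sketched in a few lines. This is an illustrative toy, not NeurondB's internals: it splits a vector into subvectors, encodes each as the index of its nearest centroid, and computes an asymmetric approximate distance (exact query subvector vs. quantized centroid). Codebook training (typically k-means) is assumed to have happened already.

```python
# Toy product-quantization sketch (illustrative only, not NeurondB's implementation).

def split(vec, sub_dim):
    # break a vector into consecutive subvectors of sub_dim dimensions
    return [tuple(vec[i:i + sub_dim]) for i in range(0, len(vec), sub_dim)]

def nearest(sub, centroids):
    # index of the centroid with the smallest squared Euclidean distance
    return min(range(len(centroids)),
               key=lambda c: sum((a - b) ** 2 for a, b in zip(sub, centroids[c])))

def pq_encode(vec, codebooks, sub_dim):
    # one small integer code per subspace; 256 centroids fit in one byte
    return [nearest(sub, codebooks[m]) for m, sub in enumerate(split(vec, sub_dim))]

def pq_distance(query, codes, codebooks, sub_dim):
    # asymmetric distance: exact query subvector vs. the stored centroid
    total = 0.0
    for m, sub in enumerate(split(query, sub_dim)):
        cent = codebooks[m][codes[m]]
        total += sum((a - b) ** 2 for a, b in zip(sub, cent))
    return total
```

With pq_subvector_dim = 32 and pq_codebooks = 256, a 1024-dimension float vector compresses to 32 one-byte codes.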

Scalar Quantization (SQ)

Quantizes each dimension independently to a low-precision integer (typically 4 or 8 bits). The simplest compression scheme, with predictable error bounds and GPU-friendly arithmetic.

Parameters

  • neurondb.sq_bits_per_dim - Bits per dimension; 4 or 8 recommended (default: 8)
  • neurondb.sq_dynamic_range - How the per-dimension range is computed; percentile-based ranges clip outliers (default: percentile)
  • neurondb.sq_rebalance_interval - Interval between quantizer recalibrations (default: 3600s)

Scalar quantization

ALTER TABLE telemetry_embeddings
ALTER COLUMN embedding
SET STORAGE neurondb_scalar(8);

SELECT neurondb_rebalance_scalar_quantizer('telemetry_embeddings', 'embedding');
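The mechanics are straightforward to sketch. The snippet below is an illustrative stand-in for what a scalar quantizer does, not NeurondB code: fit a per-dimension range from sample data (a percentile range, as in sq_dynamic_range above, would simply clip the extremes), then map each component linearly onto the integer levels that fit in the chosen bit width.

```python
# Toy 8-bit scalar-quantization sketch (illustrative only, not NeurondB's implementation).

def fit_range(vectors, dim):
    # per-dimension min/max over a sample; a percentile range would clip outliers
    lo = min(v[dim] for v in vectors)
    hi = max(v[dim] for v in vectors)
    return lo, hi

def sq_encode(vec, ranges, bits=8):
    # map each component linearly onto [0, 2^bits - 1] and clamp
    levels = (1 << bits) - 1
    codes = []
    for x, (lo, hi) in zip(vec, ranges):
        scale = (hi - lo) or 1.0
        q = round((x - lo) / scale * levels)
        codes.append(max(0, min(levels, q)))
    return codes

def sq_decode(codes, ranges, bits=8):
    # reconstruct approximate floats; error is bounded by half a quantization step
    levels = (1 << bits) - 1
    return [lo + c / levels * (hi - lo) for c, (lo, hi) in zip(codes, ranges)]
```

Because each dimension is handled independently, the worst-case reconstruction error per dimension is half a quantization step: (hi - lo) / (2 * (2^bits - 1)).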

Binary Quantization

Thresholds each vector component to a single bit, producing binary codes searched with Hamming distance. Well suited to large-scale deduplication and anomaly-fingerprinting workloads.

Parameters

  • neurondb.binary_threshold - Per-dimension thresholding strategy (default: median)
  • neurondb.binary_pack_width - Word width for bit packing in SIMD execution (default: 64)
  • neurondb.binary_use_gpu - GPU offload for Hamming search; auto enables it above ~1M vectors (default: auto)

Binary quantization

UPDATE media_fingerprints
SET fingerprint = neurondb_to_binary(embedding);

CREATE INDEX ON media_fingerprints
USING neurondb_hamming (fingerprint);
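The full pipeline fits in a short sketch. This is illustrative pseudocode for the technique, not what neurondb_to_binary does internally: compute per-dimension median thresholds, pack one bit per dimension into machine words (mirroring binary_pack_width), and score with a popcount over the XOR of the packed words.

```python
# Toy binary-quantization sketch with 64-bit packing and Hamming distance
# (illustrative only, not NeurondB's implementation).

def medians(vectors):
    # per-dimension median used as the binarization threshold
    dims = len(vectors[0])
    meds = []
    for d in range(dims):
        vals = sorted(v[d] for v in vectors)
        meds.append(vals[len(vals) // 2])
    return meds

def to_binary(vec, thresholds, word_bits=64):
    # set bit d when component d exceeds its threshold; pack bits into words
    words = [0] * ((len(vec) + word_bits - 1) // word_bits)
    for d, (x, t) in enumerate(zip(vec, thresholds)):
        if x > t:
            words[d // word_bits] |= 1 << (d % word_bits)
    return words

def hamming(a, b):
    # popcount of the XOR across packed words
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))
```

Packing 64 dimensions per word is what makes Hamming search fast: each comparison becomes one XOR plus one popcount instruction per word.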

Next Steps