NeurondB Documentation

Vector Engine

Overview

The Vector Engine is NeurondB's high-performance approximate nearest neighbor (ANN) search system, designed for billion-scale vector similarity search with millisecond latency. It combines state-of-the-art indexing algorithms (HNSW and IVF) with multiple distance metrics and advanced quantization techniques to deliver production-ready vector search capabilities directly in PostgreSQL.

Key Capabilities

  • HNSW Indexing: Hierarchical Navigable Small World graphs for excellent recall and speed
  • IVF Indexing: Inverted File indexes for large-scale datasets with efficient memory usage
  • Multiple Distance Metrics: Cosine, L2 (Euclidean), inner product, and custom metrics
  • Quantization: Scalar and product quantization to reduce memory footprint by roughly 2-16x, depending on the technique
  • GPU Acceleration: Optional CUDA/ROCm/Metal support for 10-100x faster distance computation
  • Adaptive Index Selection: Automatic index type selection based on dataset characteristics

Performance Benchmarks

Dataset Size    Index Type    Query Latency (p95)    Recall@10
1M vectors      HNSW          2.3 ms                 98.5%
10M vectors     HNSW          4.7 ms                 97.2%
100M vectors    IVF           8.1 ms                 94.8%
1B vectors      IVF + PQ      12.4 ms                92.1%

Indexing Algorithms

HNSW (Hierarchical Navigable Small World)

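HNSW builds a multi-layer graph in which each higher layer contains a sparser subset of the nodes in the layer below. A search starts in the sparse top layer, moves greedily toward the query, and descends layer by layer to the full graph at the bottom. This hierarchy gives roughly logarithmic search complexity and high recall.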
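
The layered structure can be illustrated with the level-assignment rule from the original HNSW algorithm description, where each new node draws its top layer from an exponentially decaying distribution. The sketch below is a minimal illustration only: whether NeurondB uses exactly this rule is an assumption, and the value m = 16 and the function name `assign_level` are chosen for the example.

```python
import math
import random
from collections import Counter

def assign_level(m: int, rng: random.Random) -> int:
    """Draw the top layer of a new node as floor(-ln(U) * mL), with mL = 1 / ln(m)."""
    m_l = 1.0 / math.log(m)
    u = 1.0 - rng.random()  # uniform in (0, 1], so log(u) is always defined
    return int(-math.log(u) * m_l)

rng = random.Random(42)  # fixed seed so the run is repeatable
top_levels = Counter(assign_level(16, rng) for _ in range(100_000))

# A node whose top layer is L is present in every layer from 0 to L,
# so each layer is a subset of the one below it.
layer_sizes = [
    sum(count for level, count in top_levels.items() if level >= layer)
    for layer in range(max(top_levels) + 1)
]
print(layer_sizes)
```

Under this rule each layer holds about 1/m of the nodes in the layer below, which is why a search can cover long distances in the upper layers before refining in the dense bottom layer.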

Advantages

  • Excellent recall (95-99% for typical configurations)
  • Fast query performance (2-5ms for millions of vectors)
  • Incremental updates without full rebuilds
  • Works well for datasets up to 100M vectors

Configuration

Create HNSW index

-- Basic HNSW index with cosine distance
CREATE INDEX documents_embedding_idx ON documents 
USING hnsw (embedding vector_cosine_ops);

-- Tuned HNSW index for high recall
-- (alternative to the basic index above; both use the same index name, so create only one)
CREATE INDEX documents_embedding_idx ON documents 
USING hnsw (embedding vector_cosine_ops)
WITH (m = 32, ef_construction = 128);

-- Runtime tuning for query performance
SET hnsw.ef_search = 100;  -- Higher = better recall, slower queries

IVF (Inverted File)

IVF partitions the vector space into clusters using k-means, with each cluster represented by its centroid. At query time, only the clusters whose centroids are nearest to the query are searched, which dramatically reduces the search space for large datasets.
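
The following sketch shows the idea in plain Python under simplifying assumptions: the centroids are taken as already trained, distances are L2, and the helper names (`build_lists`, `ivf_search`) are hypothetical and not part of NeurondB. It is meant to show why searching fewer lists trades recall for speed, not how the engine is implemented.

```python
import math

def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_lists(vectors, centroids):
    """Assign each vector id to the inverted list of its nearest centroid."""
    lists = {c: [] for c in range(len(centroids))}
    for vid, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: l2(vec, centroids[c]))
        lists[nearest].append(vid)
    return lists

def ivf_search(query, vectors, centroids, lists, probes, k):
    """Scan only the `probes` lists whose centroids are closest to the query."""
    ranked = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))
    candidates = [vid for c in ranked[:probes] for vid in lists[c]]
    return sorted(candidates, key=lambda vid: l2(query, vectors[vid]))[:k]

# Toy data; the centroids are assumed to come from a prior k-means run.
centroids = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
vectors = [(0.5, 0.2), (9.5, 0.1), (0.1, 9.0), (1.0, 1.0), (10.2, 0.3)]
lists = build_lists(vectors, centroids)
result = ivf_search((0.4, 0.4), vectors, centroids, lists, probes=1, k=2)
print(lists, result)
```

With probes=1 only the list around the first centroid is scanned. A true neighbor that sits just across a cluster boundary would be missed, which is the recall cost that a higher probes value reduces.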

Advantages

  • Efficient for very large datasets (100M+ vectors)
  • Lower memory footprint than HNSW
  • Faster index build time
  • Scales to billions of vectors with quantization

Configuration

Create IVF index

-- Basic IVF index
CREATE INDEX documents_embedding_idx ON documents 
USING ivf (embedding vector_l2_ops)
WITH (lists = 100);

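-- IVF for large datasets (sqrt(num_rows) is a good starting point for lists)
-- (alternative to the basic index above; both use the same index name, so create only one)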
CREATE INDEX documents_embedding_idx ON documents 
USING ivf (embedding vector_l2_ops)
WITH (lists = 3162);  -- For ~10M vectors

-- Runtime tuning
SET ivf.probes = 20;  -- Number of clusters to search
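
The sizing rule in the comments above is simple arithmetic, sketched here in Python. The function names are hypothetical, and the scan fraction assumes lists of roughly equal size, which real k-means clusters only approximate.

```python
import math

def suggested_lists(num_rows: int) -> int:
    """Starting point for the lists parameter: the square root of the row count."""
    return max(1, round(math.sqrt(num_rows)))

def scan_fraction(probes: int, lists: int) -> float:
    """Approximate share of the dataset scanned per query, assuming equal-sized lists."""
    return min(1.0, probes / lists)

lists_10m = suggested_lists(10_000_000)
fraction = scan_fraction(probes=20, lists=lists_10m)
print(lists_10m, f"{fraction:.2%}")
```

For about 10M rows this reproduces the lists = 3162 value used above, and probes = 20 then scans well under 1% of the data per query.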

Index Selection Guide

Dataset Size          Recommended Index    Reason
< 1M vectors          HNSW                 Best recall and speed
1M - 10M vectors      HNSW                 Excellent performance, manageable memory
10M - 100M vectors    HNSW or IVF          HNSW if memory allows, IVF when memory is constrained
100M+ vectors         IVF + PQ             Memory efficiency with quantization

Distance Metrics

NeurondB supports multiple distance metrics, each optimized for different use cases and data types. The choice of distance metric significantly impacts search quality and should match your embedding model's training objective.

Cosine Distance

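Defined as 1 minus the cosine of the angle between two vectors, so values fall in the range [0, 2]: 0 for vectors pointing in the same direction, 1 for orthogonal vectors, and 2 for opposite vectors. Ideal for text embeddings where direction matters more than magnitude. Most embedding models (OpenAI, Cohere, Sentence Transformers) are optimized for cosine similarity.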
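
As a reference for the numbers the operator returns, cosine distance can be written as 1 minus cosine similarity. The plain-Python sketch below uses that standard definition; it is an illustration of the formula, not NeurondB's implementation.

```python
import math

def cosine_distance(a, b):
    """1 - (a . b) / (|a| * |b|); the result lies in [0, 2]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

same = cosine_distance([1.0, 2.0], [2.0, 4.0])        # same direction, different length
orthogonal = cosine_distance([1.0, 0.0], [0.0, 1.0])  # 90 degrees apart
opposite = cosine_distance([1.0, 2.0], [-1.0, -2.0])  # opposite direction
print(same, orthogonal, opposite)
```

The first pair differs only in magnitude and still scores close to 0, which is the property that makes this metric a good fit for text embeddings.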

Cosine distance search

-- Cosine distance operator (<=>)
SELECT id, content, embedding <=> query_vec AS distance
FROM documents,
     (SELECT embed_text('search query', 'text-embedding-ada-002') AS query_vec) q
ORDER BY distance
LIMIT 10;

L2 (Euclidean) Distance

The straight-line distance between two points in vector space. Lower values indicate greater similarity. Commonly used for image embeddings and spatial data where absolute distances matter.

L2 distance search

-- L2 distance operator (<->)
SELECT id, content, embedding <-> query_vec AS distance
FROM documents,
     (SELECT '[0.1, 0.2, 0.3, ...]'::vector AS query_vec) q
ORDER BY distance
LIMIT 10;

Inner Product (Dot Product)

Computes the dot product of two vectors. Higher values indicate greater similarity. For unit-normalized vectors, ranking by inner product gives the same order as ranking by cosine similarity, so maximum inner product search (MIPS) can stand in for cosine search.
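
The equivalence for normalized vectors can be checked directly. The sketch below normalizes a few toy vectors and ranks them by inner product, cosine distance, and squared L2 distance. For unit vectors the squared L2 distance equals 2 - 2(a . b), so all three orderings agree. The data and helper names are made up for the example.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def l2_squared(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

query = normalize([2.0, 1.0])
candidates = [normalize(v) for v in ([1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [-1.0, 0.5])]
ids = range(len(candidates))

by_inner_product = sorted(ids, key=lambda i: -dot(query, candidates[i]))  # higher is better
by_cosine = sorted(ids, key=lambda i: 1.0 - dot(query, candidates[i]))    # unit norms: 1 - a.b
by_l2 = sorted(ids, key=lambda i: l2_squared(query, candidates[i]))       # equals 2 - 2 a.b
print(by_inner_product, by_cosine, by_l2)
```

If the stored vectors are not normalized, the three metrics can rank results differently, so the substitution is only safe when normalization is guaranteed.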

Inner product search

-- Inner product operator (<#>)
SELECT id, content, embedding <#> query_vec AS score
FROM documents,
     (SELECT '[0.1, 0.2, 0.3, ...]'::vector AS query_vec) q
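ORDER BY score DESC  -- Higher is better, assuming <#> returns the raw inner product;
                     -- if it returns the negated value (the pgvector convention), sort ascending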
LIMIT 10;

Distance Metric Selection

Use Case                            Recommended Metric    Reason
Text embeddings (OpenAI, Cohere)    Cosine                Models trained for cosine similarity
Image embeddings (CLIP)             Cosine or L2          Depends on model training
Spatial/geometric data              L2                    Preserves actual distances
Normalized vectors                  Inner Product         Equivalent to cosine, faster computation

Quantization Techniques

Quantization reduces memory footprint and can accelerate search by compressing vector representations. NeurondB supports both scalar quantization (per-dimension) and product quantization (subspace-based).

Scalar Quantization

Reduces precision from 32-bit floats to 8-bit integers (INT8) or 16-bit floats (FP16). INT8 gives a 4x memory reduction and FP16 a 2x reduction, both with minimal accuracy loss (typically < 1% recall degradation).

Scalar quantization

-- Create quantized index (INT8)
CREATE INDEX documents_embedding_idx ON documents 
USING hnsw (embedding vector_cosine_ops)
WITH (quantization = 'int8');

-- Memory savings: 4x reduction
-- Example: 1M vectors × 1536 dims × 4 bytes = 6.1 GB
--          With INT8: 1M × 1536 × 1 byte = 1.5 GB
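
To make the INT8 case concrete, here is a minimal sketch of min-max scalar quantization in plain Python. The per-vector offset and scale are an assumption for illustration; the source does not describe which scheme NeurondB uses, and the function names are hypothetical. The last lines reproduce the memory arithmetic from the comments above.

```python
def quantize_int8(vec):
    """Map each float to an 8-bit code in 0..255 using the vector's own min and max."""
    lo, hi = min(vec), max(vec)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 for constant vectors
    codes = bytes(min(255, round((x - lo) / scale)) for x in vec)
    return codes, lo, scale

def dequantize_int8(codes, lo, scale):
    """Reconstruct approximate floats from the codes."""
    return [lo + c * scale for c in codes]

vec = [-0.12, 0.0, 0.37, 0.05, -0.4]
codes, lo, scale = quantize_int8(vec)
approx = dequantize_int8(codes, lo, scale)
max_error = max(abs(x - y) for x, y in zip(vec, approx))

# Storage for 1M vectors of 1536 dimensions: 4 bytes per value vs 1 byte per code.
fp32_bytes = 1_000_000 * 1536 * 4
int8_bytes = 1_000_000 * 1536 * 1
print(max_error, fp32_bytes / 1e9, int8_bytes / 1e9)
```

The reconstruction error per dimension is bounded by half a quantization step (scale / 2), which is why recall usually changes little when distances are computed on the codes.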

Product Quantization (PQ)

Divides vectors into subvectors and quantizes each subspace independently against its own codebook. Provides 8-16x memory reduction with a somewhat higher accuracy loss than scalar quantization (typically 2-5% recall degradation). Ideal for billion-scale datasets.

Product quantization

-- Create PQ index
CREATE INDEX documents_embedding_idx ON documents 
USING ivf (embedding vector_l2_ops)
WITH (
  lists = 1000,
  quantization = 'pq',
  pq_m = 64,      -- Number of subvectors
  pq_k = 256      -- Codebook size per subvector
);

-- Memory savings: 8-16x reduction
-- Example: 1B vectors × 1536 dims × 4 bytes = 6.1 TB
--          With PQ: ~400-800 GB
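
The roles of pq_m and pq_k can be sketched in a few lines of Python. The encoder below picks the nearest codeword per subvector, and `pq_code_bytes` counts the bytes needed for the resulting codes. Both function names are hypothetical, and the codebooks are toy values that would normally be learned with k-means. The byte count covers only the raw codes; codebooks, list structures, row identifiers and any retained originals are not modeled, so it is a lower bound and not a prediction of total index size.

```python
import math

def pq_code_bytes(pq_m: int, pq_k: int) -> int:
    """Bytes needed to store one vector as pq_m codebook indexes."""
    bits_per_code = math.ceil(math.log2(pq_k))
    return math.ceil(pq_m * bits_per_code / 8)

def pq_encode(vec, codebooks):
    """Replace each subvector with the index of its nearest codeword."""
    sub_dims = len(vec) // len(codebooks)
    codes = []
    for i, book in enumerate(codebooks):
        sub = vec[i * sub_dims:(i + 1) * sub_dims]
        nearest = min(
            range(len(book)),
            key=lambda j: sum((x - y) ** 2 for x, y in zip(sub, book[j])),
        )
        codes.append(nearest)
    return codes

# Parameters from the index definition above: 64 subvectors, 256 codewords each.
bytes_per_vector = pq_code_bytes(pq_m=64, pq_k=256)
dims_per_subvector = 1536 // 64

# Toy example: a 4-dimensional vector, 2 subvectors, 2 codewords per codebook.
toy_codebooks = [[(0.0, 0.0), (1.0, 0.0)], [(-1.0, 0.0), (1.0, 0.0)]]
toy_codes = pq_encode([0.9, 0.1, -0.8, 0.2], toy_codebooks)
print(bytes_per_vector, dims_per_subvector, toy_codes)
```

A larger pq_m means shorter subvectors and finer approximation at the cost of longer codes, while pq_k = 256 keeps each index within a single byte.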

Quantization Trade-offs

Quantization Type    Memory Reduction    Recall Impact    Query Speed    Best For
None (FP32)          1x                  100%             Baseline       Small datasets, maximum accuracy
INT8 (Scalar)        4x                  99%+             Faster         Medium datasets, balanced
FP16 (Scalar)        2x                  99.5%+           Similar        GPU acceleration
PQ                   8-16x               95-98%           Faster         Billion-scale datasets

Performance Characteristics

Query Latency

  • HNSW: 2-5ms for 1M vectors, 4-8ms for 10M vectors
  • IVF: 5-10ms for 100M vectors, 10-15ms for 1B vectors
  • With GPU: 2-3x faster for batch queries

Index Build Time

  • HNSW: ~10-30 minutes for 10M vectors
  • IVF: ~5-15 minutes for 10M vectors
  • With GPU: 3-5x faster clustering for IVF

Memory Usage

  • HNSW: ~1.5-2x vector size (graph structure overhead)
  • IVF: ~1.1-1.3x vector size (centroid storage)
  • With INT8: 4x reduction
  • With PQ: 8-16x reduction

Optimization Tips

  • Use HNSW for datasets < 100M vectors for best recall
  • Use IVF + PQ for billion-scale datasets
  • Enable GPU acceleration for batch operations
  • Tune ef_search (HNSW) or probes (IVF) to balance recall and latency
  • Monitor index fragmentation and rebuild when needed
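
Balancing recall against latency requires a way to measure recall. One common approach, sketched below, compares the ids returned by the index with the ids from an exact (unindexed) search for the same query. The function name and the sample ids are illustrative only.

```python
def recall_at_k(approx_ids, exact_ids, k=10):
    """Fraction of the true top-k neighbors that the approximate search returned."""
    exact = set(exact_ids[:k])
    if not exact:
        return 0.0
    return len(set(approx_ids[:k]) & exact) / len(exact)

# Hypothetical ids: the index found 9 of the 10 true nearest neighbors.
exact_top10 = [4, 8, 15, 16, 23, 42, 51, 60, 77, 91]
index_top10 = [4, 8, 15, 16, 23, 42, 51, 60, 77, 13]
print(recall_at_k(index_top10, exact_top10, k=10))
```

Averaging this value over a sample of representative queries at several ef_search or probes settings shows where additional latency stops buying meaningful recall.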

Use Cases

Semantic Search

Find documents, products, or content based on meaning rather than exact keywords.

Semantic search example

-- Find similar documents (the query embedding is computed once and reused)
SELECT d.id, d.title, d.content
FROM documents d,
     (SELECT embed_text('machine learning algorithms', 'text-embedding-ada-002') AS query_vec) q
WHERE d.embedding <=> q.query_vec < 0.3
ORDER BY d.embedding <=> q.query_vec
LIMIT 10;

Recommendation Systems

Find similar items, users, or content for personalized recommendations.

Image Search

Search images by visual similarity or text descriptions using CLIP embeddings.

Anomaly Detection

Identify outliers by finding vectors far from their nearest neighbors.
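
One simple way to turn this idea into a score is the mean distance from each vector to its k nearest neighbors. The sketch below uses brute force on toy 2-D points; in practice the neighbor lookup would come from the vector index, and the function name and data are assumptions for illustration.

```python
import math

def knn_outlier_scores(points, k):
    """Mean distance from each point to its k nearest neighbors; higher means more isolated."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
scores = knn_outlier_scores(points, k=2)
most_isolated = scores.index(max(scores))
print(scores, most_isolated)
```

The last point sits far from the cluster around the origin, so it receives the highest score. A threshold on this score, chosen from the data, separates outliers from normal points.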

Clustering

Group similar vectors together for data analysis and organization.
