Vector Engine

Overview

The Vector Engine is NeurondB's approximate nearest neighbor (ANN) search system, designed for vector similarity search at up to billion-vector scale with latencies in the millisecond range. It combines two index types (HNSW and IVF) with multiple distance metrics and quantization techniques to provide production-ready vector search directly in PostgreSQL.

Key Capabilities

HNSW Indexing: Hierarchical Navigable Small World graphs for excellent recall and speed
IVF Indexing: Inverted File indexes for large-scale datasets with efficient memory usage
Multiple Distance Metrics: Cosine, L2 (Euclidean), inner product, and custom metrics
Quantization: Scalar and product quantization to reduce memory footprint by 2-16x, depending on the method
GPU Acceleration: Optional CUDA/ROCm/Metal support for 10-100x faster distance computation
Adaptive Index Selection: Automatic index type selection based on dataset characteristics

Performance Benchmarks

Dataset Size	Index Type	Query Latency (p95)	Recall@10
1M vectors	HNSW	2.3ms	98.5%
10M vectors	HNSW	4.7ms	97.2%
100M vectors	IVF	8.1ms	94.8%
1B vectors	IVF + PQ	12.4ms	92.1%
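
The Recall@10 column can be read as the share of the true 10 nearest neighbors that the index actually returned. The sketch below assumes that standard definition (the benchmark methodology itself is not described here); the function name and the ID lists are hypothetical.

```python
def recall_at_k(approx_ids, exact_ids, k=10):
    """Fraction of the true top-k neighbors that the approximate search returned."""
    if k <= 0:
        raise ValueError("k must be positive")
    truth = set(exact_ids[:k])
    if not truth:
        return 0.0
    found = truth.intersection(approx_ids[:k])
    return len(found) / len(truth)


# Hypothetical result lists for one query (IDs are made up for illustration).
exact = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]    # from an exact (brute-force) scan
approx = [1, 2, 3, 4, 5, 6, 7, 8, 9, 42]   # from the ANN index
print(recall_at_k(approx, exact, k=10))    # 9 of the 10 true neighbors were found
```

Averaging this value over a representative set of queries gives a single recall figure comparable to the ones in the table.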

Indexing Algorithms

HNSW (Hierarchical Navigable Small World)

HNSW builds a multi-layer graph in which each upper layer contains a subset of the nodes in the layer below it, forming a hierarchy for navigation. A search starts in the sparse top layer and descends toward the dense bottom layer, which gives approximately logarithmic search complexity and high recall.
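
To see why the upper layers stay small, the sketch below uses the level-assignment rule from the original HNSW algorithm description: each new node draws its top layer as floor(-ln(U) * mL) with mL = 1 / ln(m). Whether NeurondB uses exactly this rule is an assumption; `random_level` is a hypothetical helper, and m = 32 simply mirrors the tuned configuration shown in this section.

```python
import math
import random
from collections import Counter


def random_level(m, rng):
    """Draw the top layer for a new node: floor(-ln(U) * mL) with mL = 1 / ln(m)."""
    m_l = 1.0 / math.log(m)
    u = 1.0 - rng.random()   # in (0, 1], avoids log(0)
    return int(-math.log(u) * m_l)


rng = random.Random(0)
levels = Counter(random_level(32, rng) for _ in range(100_000))

# A node whose top layer is L is also present in every layer below L,
# so the population of layer i is the number of nodes with level >= i.
population = {i: sum(c for lvl, c in levels.items() if lvl >= i)
              for i in range(max(levels) + 1)}
for layer, count in sorted(population.items()):
    print(layer, count)
```

Each layer holds roughly 1/m of the nodes in the layer below, so the number of layers grows with the logarithm of the dataset size.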

Advantages

Excellent recall (95-99% for typical configurations)
Fast query performance (2-5ms for millions of vectors)
Incremental updates without full rebuilds
Works well for datasets up to 100M vectors

Configuration

Create HNSW index

-- Basic HNSW index with cosine distance
CREATE INDEX documents_embedding_idx ON documents 
USING hnsw (embedding vector_cosine_ops);

-- Alternative: tuned HNSW index for higher recall
-- (named differently, because index names must be unique within a schema)
CREATE INDEX documents_embedding_tuned_idx ON documents 
USING hnsw (embedding vector_cosine_ops)
WITH (m = 32, ef_construction = 128);

-- Runtime tuning for query performance
SET hnsw.ef_search = 100;  -- Higher = better recall, slower queries

IVF (Inverted File)

IVF partitions the vector space into clusters using k-means and represents each cluster by its centroid. At query time, only the clusters whose centroids are nearest to the query are searched, which greatly reduces the search space for large datasets.
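
The partition-then-probe idea can be sketched in a few lines. This is a toy illustration, not NeurondB's implementation: the centroids are fixed by hand where a real build would run k-means, and `build_ivf` and `ivf_search` are hypothetical names.

```python
import math


def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_ivf(vectors, centroids):
    """Assign every vector ID to the inverted list of its nearest centroid."""
    lists = {c: [] for c in range(len(centroids))}
    for vec_id, vec in vectors.items():
        nearest = min(range(len(centroids)), key=lambda c: l2(vec, centroids[c]))
        lists[nearest].append(vec_id)
    return lists


def ivf_search(query, vectors, centroids, lists, probes, k):
    """Scan only the `probes` lists whose centroids are closest to the query."""
    ranked = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))
    candidates = [vid for c in ranked[:probes] for vid in lists[c]]
    candidates.sort(key=lambda vid: l2(query, vectors[vid]))
    return candidates[:k]


# Toy 2-D data; centroids are chosen by hand instead of being learned by k-means.
vectors = {1: (0.1, 0.2), 2: (0.0, 0.4), 3: (5.0, 5.1), 4: (5.2, 4.9), 5: (9.8, 0.1)}
centroids = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
lists = build_ivf(vectors, centroids)

print(lists)
print(ivf_search((4.9, 5.0), vectors, centroids, lists, probes=1, k=2))
```

With probes=1 only one of the three lists is scanned. Raising probes scans more lists, which trades latency for recall when a true neighbor sits just across a cluster boundary.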

Advantages

Efficient for very large datasets (100M+ vectors)
Lower memory footprint than HNSW
Faster index build time
Scales to billions of vectors with quantization

Configuration

Create IVF index

-- Basic IVF index
CREATE INDEX documents_embedding_idx ON documents 
USING ivf (embedding vector_l2_ops)
WITH (lists = 100);

-- Alternative for large datasets (sqrt(num_rows) is a good starting point for lists);
-- named differently, because index names must be unique within a schema
CREATE INDEX documents_embedding_large_idx ON documents 
USING ivf (embedding vector_l2_ops)
WITH (lists = 3162);  -- For ~10M vectors

-- Runtime tuning
SET ivf.probes = 20;  -- Number of clusters to search
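
The sqrt(num_rows) starting point mentioned in the comments above is easy to compute ahead of time. A small sketch (the helper name `suggested_lists` is hypothetical, and the result is only a starting value to tune from):

```python
import math


def suggested_lists(num_rows):
    """Starting value for `lists` using the sqrt(num_rows) rule of thumb."""
    return max(1, round(math.sqrt(num_rows)))


for n in (10_000, 1_000_000, 10_000_000):
    print(n, suggested_lists(n))
```

For roughly 10M rows this gives the value 3162 used in the example, and lists = 100 corresponds to a table of about 10,000 rows.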

Index Selection Guide

Dataset Size	Recommended Index	Reason
< 1M vectors	HNSW	Best recall and speed
1M - 10M vectors	HNSW	Excellent performance, manageable memory
10M - 100M vectors	HNSW or IVF	HNSW if memory allows, IVF for constraints
100M+ vectors	IVF + PQ	Memory efficiency with quantization
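
The guide can also be written down as a small helper, for example in a provisioning script. This is a sketch of the table only: `recommend_index` is a hypothetical function, and treating exactly 10M and 100M vectors as belonging to the larger band is an assumption, since the table leaves the boundaries open.

```python
def recommend_index(num_vectors, memory_constrained=False):
    """Map a dataset size to the index type suggested by the selection guide."""
    if num_vectors < 10_000_000:
        return "HNSW"
    if num_vectors < 100_000_000:
        # The guide allows either; prefer HNSW unless memory is the constraint.
        return "IVF" if memory_constrained else "HNSW"
    return "IVF + PQ"


for n in (500_000, 5_000_000, 50_000_000, 1_000_000_000):
    print(n, recommend_index(n), recommend_index(n, memory_constrained=True))
```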

Distance Metrics

NeurondB supports multiple distance metrics, each optimized for different use cases and data types. The choice of distance metric significantly impacts search quality and should match your embedding model's training objective.

Cosine Distance

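Cosine distance is 1 minus the cosine of the angle between two vectors, so it ranges over [0, 2]: 0 for vectors pointing in the same direction, 1 for orthogonal vectors, and 2 for opposite vectors. It is ideal for text embeddings, where direction matters more than magnitude. Most embedding models (OpenAI, Cohere, Sentence Transformers) are optimized for cosine similarity.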

Cosine distance search

-- Cosine distance operator (<=>)
SELECT id, content, embedding <=> query_vec AS distance
FROM documents,
     (SELECT embed_text('search query', 'text-embedding-ada-002') AS query_vec) q
ORDER BY distance
LIMIT 10;
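
For reference, the value the cosine distance operator is expected to return can be reproduced outside the database. The sketch below assumes the usual definition, 1 minus cosine similarity; `cosine_distance` is a hypothetical helper, not a NeurondB function.

```python
import math


def cosine_distance(a, b):
    """1 - cos(theta): 0 for same direction, 1 for orthogonal, 2 for opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


print(cosine_distance([1.0, 0.0], [3.0, 0.0]))   # same direction, different magnitude
print(cosine_distance([1.0, 0.0], [0.0, 2.0]))   # orthogonal
print(cosine_distance([1.0, 0.0], [-1.0, 0.0]))  # opposite direction
```

The first call shows that magnitude is ignored: scaling a vector does not change its cosine distance to any other vector.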

L2 (Euclidean) Distance

The straight-line distance between two points in vector space. Lower values indicate greater similarity. Commonly used for image embeddings and spatial data where absolute distances matter.

L2 distance search

-- L2 distance operator (<->)
SELECT id, content, embedding <-> query_vec AS distance
FROM documents,
     (SELECT '[0.1, 0.2, 0.3, ...]'::vector AS query_vec) q
ORDER BY distance
LIMIT 10;

Inner Product (Dot Product)

Computes the dot product of two vectors. Higher values indicate greater similarity. For normalized (unit-length) vectors, maximum inner product search (MIPS) produces the same ranking as cosine similarity.

Inner product search

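-- Inner product operator (<#>)
-- Note: pgvector's <#> returns the negated inner product; if NeurondB follows
-- that convention, order ascending instead of DESC below.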
SELECT id, content, embedding <#> query_vec AS score
FROM documents,
     (SELECT '[0.1, 0.2, 0.3, ...]'::vector AS query_vec) q
ORDER BY score DESC  -- Higher is better for inner product
LIMIT 10;
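
The equivalence for normalized vectors follows from two identities for unit vectors: cosine distance equals 1 - a·b, and squared L2 distance equals 2 - 2(a·b). The sketch below checks that all three metrics then produce the same ranking; the vectors and document labels are made up for illustration.

```python
import math


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def l2_squared(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


query = normalize([0.2, 0.9, 0.4])
docs = {"a": normalize([0.1, 1.0, 0.3]),
        "b": normalize([0.9, 0.1, 0.0]),
        "c": normalize([0.3, 0.5, 0.8])}

# Inner product: higher is better. Distances: lower is better.
by_inner_product = sorted(docs, key=lambda d: dot(query, docs[d]), reverse=True)
# For unit vectors the cosine distance reduces to 1 - dot product.
by_cosine = sorted(docs, key=lambda d: 1.0 - dot(query, docs[d]))
by_l2 = sorted(docs, key=lambda d: l2_squared(query, docs[d]))

print(by_inner_product, by_cosine, by_l2)
```

Because the rankings agree, normalizing embeddings once at insert time lets the cheaper inner product stand in for cosine distance.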

Distance Metric Selection

Use Case	Recommended Metric	Reason
Text embeddings (OpenAI, Cohere)	Cosine	Models trained for cosine similarity
Image embeddings (CLIP)	Cosine or L2	Depends on model training
Spatial/geometric data	L2	Preserves actual distances
Normalized vectors	Inner Product	Equivalent to cosine, faster computation

Quantization Techniques

Quantization reduces memory footprint and can accelerate search by compressing vector representations. NeurondB supports both scalar quantization (per-dimension) and product quantization (subspace-based).

Scalar Quantization

Reduces precision from 32-bit floats to 8-bit integers (INT8) or 16-bit floats (FP16). INT8 provides a 4x memory reduction and FP16 a 2x reduction, both with minimal accuracy loss (typically < 1% recall degradation).

Scalar quantization

-- Create quantized index (INT8)
CREATE INDEX documents_embedding_idx ON documents 
USING hnsw (embedding vector_cosine_ops)
WITH (quantization = 'int8');

-- Memory savings: 4x reduction
-- Example: 1M vectors × 1536 dims × 4 bytes = 6.1 GB
--          With INT8: 1M × 1536 × 1 byte = 1.5 GB
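
To make the 4x figure concrete, here is a minimal sketch of 8-bit scalar quantization. It assumes a simple linear mapping of one vector's own min/max range onto unsigned byte codes 0..255; how NeurondB chooses ranges and stores codes is not described here, and `quantize_8bit` and `dequantize_8bit` are hypothetical names.

```python
def quantize_8bit(vec):
    """Map floats linearly onto 0..255 using the vector's own min/max range."""
    lo, hi = min(vec), max(vec)
    scale = (hi - lo) / 255 or 1.0   # fall back to 1.0 for constant vectors
    codes = bytes(round((x - lo) / scale) for x in vec)
    return codes, lo, scale


def dequantize_8bit(codes, lo, scale):
    """Recover approximate floats from the byte codes."""
    return [lo + c * scale for c in codes]


vec = [0.12, -0.40, 0.88, 0.03]
codes, lo, scale = quantize_8bit(vec)
restored = dequantize_8bit(codes, lo, scale)

print(len(codes), "bytes instead of", 4 * len(vec))
print(max(abs(a - b) for a, b in zip(vec, restored)))   # bounded by scale / 2
```

Each component is off by at most half a quantization step, which is why recall typically drops only slightly.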

Product Quantization (PQ)

Divides vectors into subvectors and quantizes each subspace independently. Provides 8-16x memory reduction with slightly higher accuracy loss (typically 2-5% recall degradation). Ideal for billion-scale datasets.

Product quantization

-- Create PQ index
CREATE INDEX documents_embedding_idx ON documents 
USING ivf (embedding vector_l2_ops)
WITH (
  lists = 1000,
  quantization = 'pq',
  pq_m = 64,      -- Number of subvectors
  pq_k = 256      -- Codebook size per subvector
);

-- Memory savings: 8-16x reduction
-- Example: 1B vectors × 1536 dims × 4 bytes = 6.1 TB
--          With PQ: ~400-800 GB
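
The storage cost of the PQ codes themselves can be estimated from pq_m and pq_k. The sketch below assumes the textbook PQ layout (one codebook index per subvector, codebooks shared by all vectors); the helper names are hypothetical and NeurondB's on-disk format may differ.

```python
import math


def pq_code_bytes(pq_m, pq_k):
    """Bytes per encoded vector: one codebook index of log2(pq_k) bits per subvector."""
    bits_per_code = math.ceil(math.log2(pq_k))
    return math.ceil(pq_m * bits_per_code / 8)


def pq_codebook_bytes(dims, pq_k, bytes_per_float=4):
    """All codebooks together hold pq_k centroids covering the full dimensionality."""
    return pq_k * dims * bytes_per_float


dims, pq_m, pq_k = 1536, 64, 256
raw = dims * 4
code = pq_code_bytes(pq_m, pq_k)

print("raw bytes per vector:", raw)
print("PQ code bytes per vector:", code)
print("codebook bytes (shared):", pq_codebook_bytes(dims, pq_k))
print("code-only compression ratio:", raw / code)
```

Under these assumptions the codes alone compress far more than 8-16x, so the quoted range presumably reflects different parameters or data stored alongside the codes; the exact accounting is not stated here.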

Quantization Trade-offs

Quantization Type	Memory Reduction	Recall Impact	Query Speed	Best For
None (FP32)	1x	100%	Baseline	Small datasets, maximum accuracy
INT8 (Scalar)	4x	99%+	Faster	Medium datasets, balanced
FP16 (Scalar)	2x	99.5%+	Similar	GPU acceleration
PQ	8-16x	95-98%	Faster	Billion-scale datasets

Performance Characteristics

Query Latency

HNSW: 2-5ms for 1M vectors, 4-8ms for 10M vectors
IVF: 5-10ms for 100M vectors, 10-15ms for 1B vectors
With GPU: 2-3x faster for batch queries

Index Build Time

HNSW: ~10-30 minutes for 10M vectors
IVF: ~5-15 minutes for 10M vectors
With GPU: 3-5x faster clustering for IVF

Memory Usage

HNSW: ~1.5-2x vector size (graph structure overhead)
IVF: ~1.1-1.3x vector size (centroid storage)
With INT8: 4x reduction
With PQ: 8-16x reduction

Optimization Tips

Use HNSW for datasets < 100M vectors for best recall
Use IVF + PQ for billion-scale datasets
Enable GPU acceleration for batch operations
Tune ef_search (HNSW) or probes (IVF) to balance recall and latency
Monitor index fragmentation and rebuild when needed

Use Cases

Semantic Search

Find documents, products, or content based on meaning rather than exact keywords.

Semantic search example

-- Find similar documents (embed the query once and reuse it)
SELECT d.id, d.title, d.content
FROM documents d,
     (SELECT embed_text('machine learning algorithms', 'text-embedding-ada-002') AS query_vec) q
WHERE d.embedding <=> q.query_vec < 0.3
ORDER BY d.embedding <=> q.query_vec
LIMIT 10;

Recommendation Systems

Find similar items, users, or content for personalized recommendations.

Image Search

Search images by visual similarity or text descriptions using CLIP embeddings.

Anomaly Detection

Identify outliers by finding vectors far from their nearest neighbors.
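
One common way to turn this idea into a score is the mean distance from each vector to its k nearest neighbors. The sketch below does this by brute force on toy 2-D points; it is an illustration of the technique rather than a NeurondB feature, and `knn_outlier_scores` is a hypothetical name. In practice the neighbor lookup would be served by the vector index.

```python
import math


def knn_outlier_scores(vectors, k=2):
    """Score each vector by its mean L2 distance to its k nearest neighbors."""
    scores = {}
    for vid, vec in vectors.items():
        dists = sorted(math.dist(vec, other)
                       for oid, other in vectors.items() if oid != vid)
        scores[vid] = sum(dists[:k]) / k
    return scores


points = {"a": (0.0, 0.0), "b": (0.1, 0.0), "c": (0.0, 0.1),
          "d": (0.1, 0.1), "e": (5.0, 5.0)}
scores = knn_outlier_scores(points, k=2)
print(max(scores, key=scores.get))   # the point farthest from its neighbors
```

Vectors whose score exceeds a chosen threshold (for example, a high percentile of all scores) can then be flagged for review.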

Clustering

Group similar vectors together for data analysis and organization.
