NeurondB Documentation

Performance

Performance Benchmarks

Test Environment: AWS r6i.2xlarge (8 vCPU, 64GB RAM), 10M vectors, 768 dimensions

| Operation | Throughput | Latency (p95) | Notes |
|---|---|---|---|
| Vector Insert | 50K/sec | 2ms | Bulk COPY |
| HNSW Search (k=10) | 10K QPS | 5ms | ef_search=40 |
| Embedding Generation | 1K/sec | 10ms | Batch size 32 |
| Hybrid Search | 5K QPS | 8ms | Vector+FTS |
| Reranking | 2K/sec | 15ms | Cross-encoder |
| GPU K-Means | 55K vectors/sec | 18ms | 10 clusters |

Optimization Techniques

1. SIMD Acceleration

Distance calculations are automatically accelerated with SIMD (Single Instruction, Multiple Data) instructions: AVX2 or AVX-512 on x86, NEON on ARM.

  • AVX2 Speedup: 4-8x
  • AVX-512 Speedup: 8-16x
  • Auto Detection: Automatically enabled when available
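
Because detection is automatic, any distance computation benefits without configuration. A minimal sketch, assuming a pgvector-style `vector` column and `<->` distance operator (the table and column names are illustrative):

```sql
-- This nearest-neighbor scan uses the SIMD distance kernels
-- automatically when the CPU supports them.
SELECT id
FROM docs
ORDER BY embedding <-> '[0.1, 0.2, 0.3]'::vector
LIMIT 10;
```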

2. Intelligent Caching

  • Embedding Cache: 95%+ hit rate, 50x faster than generation
  • Model Cache: Models loaded in shared memory, 99.8% hit rate
  • ANN Buffer: Hot centroids and entry points cached
  • Index Page Cache: 92%+ hit rate for frequently accessed vectors
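
As an illustration of the embedding cache, repeated calls with identical input are served from the cache instead of re-running the model. A sketch using the `embed_text` function shown later on this page:

```sql
-- First call runs model inference and populates the embedding cache
SELECT embed_text('what is vector search?');

-- Identical input: served from the cache, ~50x faster than generation
SELECT embed_text('what is vector search?');
```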

3. Query Planning

Intelligent cost-based query planning chooses optimal execution paths:

  • Small result sets → Sequential scan
  • Medium result sets → IVF index
  • Large result sets → HNSW index
  • GPU available + large batch → GPU acceleration
  • Hybrid query → Parallel vector + FTS execution
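
To see which path the planner picked for a given query, standard PostgreSQL `EXPLAIN` works as usual; the table, column, and operator below are illustrative:

```sql
-- The plan output shows whether a sequential scan, an IVF scan,
-- or an HNSW index scan was chosen for this top-k query.
EXPLAIN (ANALYZE, BUFFERS)
SELECT id
FROM docs
ORDER BY embedding <-> '[0.1, 0.2, 0.3]'::vector
LIMIT 10;
```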

Best Practices

1. Index Selection

| Dataset Size | Recommended Index | Parameters |
|---|---|---|
| < 100K vectors | HNSW | m=16, ef=200 |
| 100K - 10M vectors | HNSW or IVF | m=32, ef=400 or nlist=sqrt(n) |
| > 10M vectors | IVF + PQ | nlist=4000, PQ compression |
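
The parameters above map onto index creation roughly as follows. This is a sketch: the access-method and operator-class names (`hnsw`, `ivfflat`, `vector_l2_ops`) follow pgvector-style conventions and are assumptions, not confirmed NeuronDB syntax:

```sql
-- HNSW for small-to-medium datasets (first row of the table)
CREATE INDEX ON docs USING hnsw (embedding vector_l2_ops)
  WITH (m = 16, ef_construction = 200);

-- IVF for > 10M vectors; lists corresponds to nlist in the table
CREATE INDEX ON docs USING ivfflat (embedding vector_l2_ops)
  WITH (lists = 4000);
```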

2. Use Batch Operations

Batch embedding generation

```sql
-- Good: batch embedding generation (roughly 5x faster).
-- Rows are grouped into batches of ~100, embedded in one model call
-- per batch, then unnested back into (id, embedding) pairs.
UPDATE docs SET embedding = batch.emb
FROM (
  SELECT unnest(g.ids) AS id,
         unnest(embed_text_batch(g.contents)) AS emb
  FROM (
    SELECT array_agg(id) AS ids, array_agg(content) AS contents
    FROM docs
    GROUP BY id % 100  -- batches of ~100 rows
  ) g
) batch
WHERE docs.id = batch.id;

-- Bad: one model call per row
UPDATE docs SET embedding = embed_text(content);  -- Slow!
```

3. Monitor Cache Hit Rates

Cache statistics

```sql
SELECT * FROM neurondb_cache_stats();

-- Target hit rates:
--   Embeddings: > 50%
--   Models: > 95%
--   Index pages: > 90%
```

Next Steps