Performance
Performance Benchmarks
Test Environment: AWS r6i.2xlarge (8 vCPU, 64GB RAM), 10M vectors, 768 dimensions
| Operation | Throughput | Latency (p95) | Notes |
|---|---|---|---|
| Vector Insert | 50K/sec | 2ms | Bulk COPY |
| HNSW Search (k=10) | 10K QPS | 5ms | ef_search=40 |
| Embedding Generation | 1K/sec | 10ms | Batch size 32 |
| Hybrid Search | 5K QPS | 8ms | Vector+FTS |
| Reranking | 2K/sec | 15ms | Cross-encoder |
| GPU K-Means | 55K vectors/sec | 18ms | 10 clusters |
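As a rough sanity check on the test environment, the raw size of the benchmark dataset can be estimated from the figures above. This is only a sketch: it assumes 4-byte float32 components, which the benchmark description does not state, and it ignores index structures and per-row overhead.

```python
def raw_vector_bytes(num_vectors: int, dims: int, bytes_per_component: int = 4) -> int:
    """Raw storage for the vector components alone (no index, no row headers)."""
    return num_vectors * dims * bytes_per_component


# Benchmark dataset: 10M vectors, 768 dimensions. float32 is an assumption.
total = raw_vector_bytes(10_000_000, 768)
print(f"{total / 1e9:.2f} GB")  # 30.72 GB
```

Under that assumption the raw vectors occupy roughly half of the instance's 64GB of RAM, before any index is built.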
Optimization Techniques
1. SIMD Acceleration
Distance calculations are automatically accelerated with SIMD (Single Instruction, Multiple Data) instructions: AVX2 or AVX-512 on x86, and NEON on ARM.
- AVX2 Speedup: 4-8x
- AVX-512 Speedup: 8-16x
- Auto Detection: Automatically enabled when available
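To illustrate why wider registers help, here is a conceptual model of a SIMD distance kernel in plain Python. It is not NeurondB's implementation and it will not run faster than the scalar loop in Python; it only shows the lane-wise accumulation pattern. The function names are hypothetical. The lane counts follow from register widths: a 256-bit AVX2 register holds 8 float32 values, a 512-bit AVX-512 register holds 16, and a 128-bit NEON register holds 4.

```python
def l2_squared_scalar(a, b):
    """Reference: one subtract and multiply-add per loop iteration."""
    total = 0.0
    for x, y in zip(a, b):
        d = x - y
        total += d * d
    return total


def l2_squared_lanes(a, b, lanes=8):
    """Model of a SIMD kernel: `lanes` independent accumulators advance together."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    acc = [0.0] * lanes
    full = len(a) - len(a) % lanes
    # Main loop: each outer step stands in for one vector subtract + multiply-add.
    for i in range(0, full, lanes):
        for j in range(lanes):
            d = a[i + j] - b[i + j]
            acc[j] += d * d
    # Horizontal reduction of the lane accumulators.
    total = sum(acc)
    # Scalar tail for lengths that are not a multiple of the lane count.
    for i in range(full, len(a)):
        d = a[i] - b[i]
        total += d * d
    return total


a = [float(i) for i in range(768)]
b = [float(i) + 0.5 for i in range(768)]
print(l2_squared_scalar(a, b), l2_squared_lanes(a, b))  # 192.0 192.0
```

With 8 lanes, a 768-dimensional vector takes 96 outer steps instead of 768 scalar iterations, which is where the idealized speedup comes from.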
2. Intelligent Caching
- Embedding Cache: 95%+ hit rate, 50x faster than generation
- Model Cache: Models loaded in shared memory, 99.8% hit rate
- ANN Buffer: Hot centroids and entry points cached
- Index Page Cache: 92%+ hit rate for frequently accessed vectors
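The embedding cache figures above can be combined into an average speedup. The sketch below assumes a simplified cost model that is not from the source: a cache miss costs exactly one full generation, a hit costs `1 / hit_speedup` of that, and lookup overhead is ignored. The function name is hypothetical.

```python
def effective_speedup(hit_rate: float, hit_speedup: float) -> float:
    """Average speedup over always generating, for a given cache hit rate."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    if hit_speedup <= 0:
        raise ValueError("hit_speedup must be positive")
    # Relative cost per request: hits cost 1/hit_speedup, misses cost 1.
    relative_cost = hit_rate / hit_speedup + (1.0 - hit_rate)
    return 1.0 / relative_cost


# 95% hit rate, hits 50x faster than generation (figures from the list above).
print(round(effective_speedup(0.95, 50), 1))  # 14.5
```

Under this model the remaining 5% of misses dominate the average cost, so raising the hit rate matters more than making hits faster.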
3. Query Planning
Intelligent cost-based query planning chooses optimal execution paths:
- Small result sets → Sequential scan
- Medium result sets → IVF index
- Large result sets → HNSW index
- GPU available + large batch → GPU acceleration
- Hybrid query → Parallel vector + FTS execution
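The selection rules above can be sketched as a simple rule-based chooser. This is an illustration only. A real cost-based planner compares estimated costs rather than fixed cut-offs, and the thresholds, the precedence order of the rules, and all names below are assumptions that do not come from NeurondB.

```python
from dataclasses import dataclass

# Hypothetical cut-offs: the documentation does not define
# "small", "medium", "large", or "large batch".
SMALL_ROWS = 1_000
LARGE_ROWS = 100_000
LARGE_BATCH = 1_000


@dataclass
class QueryInfo:
    estimated_rows: int            # planner's estimate of the result set size
    batch_size: int = 1            # number of query vectors submitted together
    gpu_available: bool = False
    has_text_filter: bool = False  # True for hybrid vector + full-text queries


def choose_path(q: QueryInfo) -> str:
    """Map a query description to one of the execution paths listed above."""
    if q.has_text_filter:
        return "parallel vector + FTS"
    if q.gpu_available and q.batch_size >= LARGE_BATCH:
        return "GPU acceleration"
    if q.estimated_rows < SMALL_ROWS:
        return "sequential scan"
    if q.estimated_rows < LARGE_ROWS:
        return "IVF index"
    return "HNSW index"


print(choose_path(QueryInfo(estimated_rows=500)))        # sequential scan
print(choose_path(QueryInfo(estimated_rows=5_000_000)))  # HNSW index
```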
Best Practices
1. Index Selection
| Dataset Size | Recommended Index | Parameters |
|---|---|---|
| < 100K vectors | HNSW | m=16, ef=200 |
| 100K - 10M vectors | HNSW or IVF | m=32, ef=400 or nlist=sqrt(n) |
| > 10M vectors | IVF + PQ | nlist=4000, PQ compression |
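The table can be encoded as a small lookup helper, including the `nlist=sqrt(n)` rule for the middle tier. This is a minimal sketch: the function name and the shape of the returned dictionary are hypothetical, and the parameter names are copied from the table as written.

```python
import math


def recommend_index(num_vectors: int) -> dict:
    """Return the table's recommendation for a dataset of `num_vectors` rows."""
    if num_vectors < 0:
        raise ValueError("num_vectors must be non-negative")
    if num_vectors < 100_000:
        return {"index": "HNSW", "m": 16, "ef": 200}
    if num_vectors <= 10_000_000:
        return {
            "index": "HNSW or IVF",
            "hnsw": {"m": 32, "ef": 400},
            # nlist = sqrt(n), rounded to the nearest whole number of lists
            "ivf": {"nlist": round(math.sqrt(num_vectors))},
        }
    return {"index": "IVF + PQ", "nlist": 4000}


print(recommend_index(1_000_000)["ivf"]["nlist"])  # 1000
```

For the 10M-vector benchmark dataset the same rule gives `nlist` of about 3162.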
2. Use Batch Operations
Batch embedding generation
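```sql
-- Good: batch embedding generation (5x faster)
-- Pairs each id with its embedding; assumes embed_text_batch returns
-- one embedding per input element, in input order.
UPDATE docs SET embedding = batch.emb
FROM (
  SELECT unnest(array_agg(id ORDER BY id)) AS id,
         unnest(embed_text_batch(array_agg(content ORDER BY id))) AS emb
  FROM docs GROUP BY id % 100
) batch WHERE docs.id = batch.id;

-- Bad: individual calls, one embed_text invocation per row (slow)
UPDATE docs SET embedding = embed_text(content);
```
3. Monitor Cache Hit Rates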
Cache statistics
```sql
SELECT * FROM neurondb_cache_stats();
-- Target hit rates:
--   Embeddings: > 50%
--   Models: > 95%
--   Index pages: > 90%
```
Next Steps
- Indexing Guide - Learn about HNSW and IVF indexes
- GPU Acceleration - Enable GPU for 100x speedup
- Configuration - Tune GUC parameters