NeuronDB Documentation
Blend lexical and semantic ranking for precision
Scoring architecture
Hybrid retrieval runs multiple rankers, typically a lexical (BM25) ranker and a semantic (vector) ranker, and merges their results. Use the default weighted sum or build custom scoring pipelines with SQL window functions.
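As one illustration of a custom pipeline built on window functions, the sketch below merges two rankers with reciprocal rank fusion (RRF), a technique swapped in here for illustration rather than something this page documents as a NeuronDB feature. The sample rows, the column names, and the constant 60 are assumptions; only standard PostgreSQL is used, so it runs without any extension.

```sql
-- Hypothetical sample data standing in for the output of two rankers.
WITH lex(id, bm25_score) AS (
    VALUES (1, 12.4), (2, 8.1), (3, 3.3)
),
sem(id, cosine_distance) AS (
    VALUES (1, 0.30), (2, 0.10), (4, 0.25)
),
-- Convert raw scores to rank positions (1 = best) with window functions.
lex_ranked AS (
    SELECT id, RANK() OVER (ORDER BY bm25_score DESC) AS r FROM lex
),
sem_ranked AS (
    SELECT id, RANK() OVER (ORDER BY cosine_distance ASC) AS r FROM sem
)
SELECT
    COALESCE(l.id, s.id) AS id,
    -- RRF: sum of 1 / (k + rank); k = 60 is an assumed, commonly used constant.
    -- A document missing from one ranker contributes 0 for that ranker.
    COALESCE(1.0 / (60 + l.r), 0) + COALESCE(1.0 / (60 + s.r), 0) AS rrf_score
FROM lex_ranked l
FULL OUTER JOIN sem_ranked s ON s.id = l.id
ORDER BY rrf_score DESC;
```

Because RRF works on rank positions rather than raw scores, it avoids having to put BM25 scores and distances on a common scale. The FULL OUTER JOIN also keeps documents that only one ranker returned, which an inner join would drop.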
Weighted hybrid scoring
WITH hybrid AS (
    SELECT
        lex.id,
        lex.rank AS bm25_rank,
        sem.distance AS cosine_distance,
        -- Assumes lex.rank is a relevance score (higher is better)
        -- on a scale comparable to 1 - sem.distance.
        lex.rank * 0.4 + (1 - sem.distance) * 0.6 AS combined_score
    FROM
        lex_ranked_documents lex
        JOIN sem_ranked_documents sem ON sem.id = lex.id
)
SELECT *
FROM hybrid
ORDER BY combined_score DESC
LIMIT 10;
Metadata filtering
Apply row-level filters and tenant boundaries before scoring to keep results relevant. Use JSONB containment or partitioning for customer isolation.
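The JSONB containment approach mentioned above can be sketched as follows. The table `kb_demo`, its rows, and the index name are hypothetical and exist only for this illustration; the `@>` operator and the `jsonb_path_ops` GIN operator class are standard PostgreSQL.

```sql
-- Hypothetical demo table; a real knowledge base would also carry an embedding column.
CREATE TEMP TABLE kb_demo (
    id       int PRIMARY KEY,
    title    text,
    metadata jsonb
);

INSERT INTO kb_demo VALUES
    (1, 'Failover guide', '{"tenant": "enterprise", "lang": "en"}'),
    (2, 'Quick start',    '{"tenant": "starter",    "lang": "en"}');

-- A GIN index with jsonb_path_ops can serve @> containment filters.
CREATE INDEX kb_demo_metadata_idx
    ON kb_demo USING gin (metadata jsonb_path_ops);

-- Keep only rows whose metadata contains the given tenant key/value pair.
SELECT id, title
FROM kb_demo
WHERE metadata @> '{"tenant": "enterprise"}';
```

Containment lets a single index cover filters on any metadata key, whereas a filter written with `->>` generally needs an expression index on that specific key to avoid a sequential scan.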
Tenant scoping
WITH query_input AS (
    SELECT embed_text('high availability failover guide') AS q_emb,
           'enterprise'::text AS tenant
)
SELECT kb.id,
       kb.title,
       kb.metadata ->> 'tenant' AS tenant,
       kb.embedding <-> q.q_emb AS distance
FROM knowledge_base kb
CROSS JOIN query_input q
WHERE kb.metadata ->> 'tenant' = q.tenant
ORDER BY distance
LIMIT 15;
Reranking with cross-encoders
After retrieving the top K candidates, rerank them with an ONNX cross-encoder for better semantic matching. Combine the reranker with canary weights so that the query fails open, that is, still returns the first-stage ranking, if the reranker is unavailable.
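One possible reading of the fail-open behavior is sketched below. It is an assumption, not NeuronDB's documented mechanism: the reranker is taken to yield NULL when it cannot score a candidate, `rerank_weight` is a hypothetical canary weight, and both scores are assumed to lie roughly in the range 0 to 1. The data is inline so the query runs on plain PostgreSQL.

```sql
-- Hypothetical candidates; cross_score is NULL where the reranker gave no answer.
WITH candidates(id, distance, cross_score) AS (
    VALUES
        (1, 0.20, 0.91),
        (2, 0.10, NULL),
        (3, 0.35, 0.40)
),
params AS (
    SELECT 0.5 AS rerank_weight  -- assumed canary weight in [0, 1]
)
SELECT
    c.id,
    CASE
        -- Fail open: no reranker score, so use the first-stage similarity alone.
        WHEN c.cross_score IS NULL THEN 1 - c.distance
        -- Otherwise blend reranker score and first-stage similarity.
        ELSE p.rerank_weight * c.cross_score
             + (1 - p.rerank_weight) * (1 - c.distance)
    END AS final_score
FROM candidates c
CROSS JOIN params p
ORDER BY final_score DESC;
```

Setting `rerank_weight` to 0 reproduces the first-stage ordering, so the weight can be raised gradually while the reranker is being rolled out.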
Rerank candidates
WITH
initial AS (
    SELECT id,
           title,
           embedding <-> embed_text('PostgreSQL failover') AS distance
    FROM docs
    ORDER BY distance
    LIMIT 80
),
ranked AS (
    SELECT id,
           neurondb_rerank(
               model_name => 'cross-encoder-nli-base',
               query      => 'PostgreSQL failover',
               document   => title
           ) AS cross_score
    FROM initial
)
SELECT id, cross_score
FROM ranked
ORDER BY cross_score DESC
LIMIT 15;
Hybrid Search Function
NeuronDB provides a hybrid_search function that combines vector similarity and full-text search in a single call:
Hybrid search function
-- Hybrid search combining vector and text search
SELECT
    search_result.id,
    hybrid_search_test.title,
    hybrid_search_test.content,
    search_result.score
FROM hybrid_search_test,
     LATERAL hybrid_search(
         'hybrid_search_test',                                -- table name
         embed_text('database systems', 'all-MiniLM-L6-v2'),  -- query vector
         'database systems',                                  -- query text for FTS
         '{}'::text,                                          -- additional config
         0.7,                                                 -- alpha: vector weight (0-1)
         5                                                    -- top K results
     ) AS search_result(id, score)
WHERE hybrid_search_test.id = search_result.id
ORDER BY search_result.score DESC
LIMIT 5;
Function Signature:
hybrid_search(
    table_name   TEXT,     -- Source table name
    query_vector VECTOR,   -- Query embedding vector
    query_text   TEXT,     -- Query text for full-text search
    config       TEXT,     -- Additional configuration (JSON string)
    alpha        REAL,     -- Vector weight (0.0-1.0), 1-alpha = text weight
    top_k        INTEGER   -- Number of results to return
) RETURNS TABLE (
    id    INTEGER,         -- Row ID from source table
    score REAL             -- Combined relevance score
)
Next Steps
- Multi-Vector Search - Multiple embeddings per document
- Faceted Search - Category-aware filtering
- Temporal Search - Time-decay relevance
- RAG Playbooks - Complete RAG workflows
- Distance Metrics - Tune distance functions
- Reranking Guide - Cross-encoder reranking