RAG Pipeline

What is RAG?

Retrieval-Augmented Generation (RAG) improves LLM responses by retrieving relevant context from your database before generating an answer. Grounding the output in your own data reduces hallucinations and improves accuracy.

RAG Workflow

  1. User Question: "What is PostgreSQL replication?"
  2. Retrieve: Find relevant documents using hybrid search
  3. Rerank: Score and sort results by relevance
  4. Generate: LLM creates answer using retrieved context
  5. Response: Return answer with source citations
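
The five steps above can be sketched as a small function that wires the stages together. This is only an illustration of the data flow, not NeuronDB's implementation: `run_rag`, `Passage`, `RagResponse`, and the `toy_*` functions are hypothetical names, and the toy stages use simple word overlap in place of real search, reranking, and an LLM.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass
class Passage:
    source: str  # where the text came from, used for citations
    text: str


@dataclass
class RagResponse:
    answer: str
    sources: List[str]


def run_rag(
    question: str,
    retrieve: Callable[[str], List[Passage]],
    rerank: Callable[[str, List[Passage]], List[Passage]],
    generate: Callable[[str, List[Passage]], str],
    top_k: int = 2,
) -> RagResponse:
    candidates = retrieve(question)                # step 2: retrieve
    ranked = rerank(question, candidates)[:top_k]  # step 3: rerank, keep the best
    answer = generate(question, ranked)            # step 4: generate
    # step 5: return the answer together with the sources it was built from
    return RagResponse(answer, [p.source for p in ranked])


# Toy stand-ins so the sketch runs without a database or an LLM.
CORPUS = [
    Passage("replication.md", "PostgreSQL replication copies WAL records to standby servers."),
    Passage("indexes.md", "B-tree indexes speed up equality and range lookups."),
    Passage("backup.md", "pg_dump writes a logical backup of a PostgreSQL database."),
]


def _words(text: str) -> Set[str]:
    return {w.strip(".,?").lower() for w in text.split()}


def _overlap(question: str, passage: Passage) -> int:
    return len(_words(question) & _words(passage.text))


def toy_retrieve(question: str) -> List[Passage]:
    return [p for p in CORPUS if _overlap(question, p) > 0]


def toy_rerank(question: str, passages: List[Passage]) -> List[Passage]:
    return sorted(passages, key=lambda p: _overlap(question, p), reverse=True)


def toy_generate(question: str, passages: List[Passage]) -> str:
    return f"Answer drawn from {len(passages)} passage(s)."


response = run_rag("What is PostgreSQL replication?", toy_retrieve, toy_rerank, toy_generate)
print(response.answer, response.sources)
```

Keeping each stage behind a plain callable makes it easy to swap the toy functions for database queries and a model call later.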

Implementation

1. Document Ingestion

Ingest documents

CREATE TABLE documents (
  id SERIAL PRIMARY KEY,
  content TEXT,
  embedding vector(1536),  -- must match the model's output size (1536 for text-embedding-ada-002)
  metadata JSONB
);

-- Generate embeddings during insert
INSERT INTO documents (content, embedding, metadata)
SELECT 
  content,
  neurondb_embed(content, 'text-embedding-ada-002'),
  jsonb_build_object('source', 'docs', 'timestamp', now())
FROM source_documents;

2. Retrieval

Retrieve relevant documents

-- Vector similarity search
SELECT id, content, metadata,
       embedding <=> neurondb_embed('What is PostgreSQL?', 'text-embedding-ada-002') AS distance
FROM documents
ORDER BY distance
LIMIT 10;
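
To make the ordering in the query above concrete, here is a minimal Python sketch of what a cosine-distance search computes. It assumes that `<=>` denotes cosine distance, as it does in pgvector; whether NeuronDB uses exactly the same definition is an assumption. The function names and the three-dimensional vectors are invented for illustration.

```python
import math
from typing import Dict, List, Tuple


def cosine_distance(a: List[float], b: List[float]) -> float:
    """1 - cosine similarity: 0 for the same direction, 1 for orthogonal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest(
    query: List[float], rows: Dict[int, List[float]], limit: int
) -> List[Tuple[int, float]]:
    """Mimics ORDER BY distance LIMIT n: smallest distance first."""
    scored = [(doc_id, cosine_distance(vec, query)) for doc_id, vec in rows.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:limit]


# Hypothetical document embeddings keyed by id.
rows = {
    1: [1.0, 0.0, 0.0],
    2: [0.0, 1.0, 0.0],
    3: [1.0, 1.0, 0.0],
}
top = nearest([1.0, 0.0, 0.0], rows, limit=2)
print(top)
```

A smaller distance means a closer match, which is why the SQL sorts ascending rather than descending.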

3. Generation

Generate answer with context

-- Combine retrieved context with the LLM prompt
WITH retrieved AS (
  SELECT content
  FROM documents
  ORDER BY embedding <=> neurondb_embed('What is PostgreSQL?', 'text-embedding-ada-002')
  LIMIT 5
)
SELECT neurondb_llm_generate(
  'gpt-4',
  'Answer the question using only the provided context: ' ||
  string_agg(content, E'\n\n') ||  -- separate passages with a blank line
  E'\n\nQuestion: What is PostgreSQL?'
) AS answer
FROM retrieved;  -- aggregate over the rows selected by the CTE
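
The prompt in the query above is plain string concatenation: an instruction, the retrieved passages separated by blank lines, then the question. The Python sketch below builds the same string outside the database, which can help when inspecting or testing a prompt before sending it to a model. `build_prompt` is a hypothetical helper and the passages are made up.

```python
from typing import List

INSTRUCTION = "Answer the question using only the provided context: "


def build_prompt(question: str, passages: List[str]) -> str:
    # Joining with a blank line plays the same role as string_agg in the SQL.
    context = "\n\n".join(passages)
    return f"{INSTRUCTION}{context}\n\nQuestion: {question}"


prompt = build_prompt(
    "What is PostgreSQL?",
    [
        "PostgreSQL is an open-source relational database.",
        "It supports extensions.",
    ],
)
print(prompt)
```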

Reranking

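Use a cross-encoder model to rerank the initial results. A cross-encoder scores each query-document pair jointly, which is usually more accurate than comparing embeddings but also slower, so apply it only to a small candidate set (the top 20 below) and keep the best few.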

Reranking

-- Rerank with cross-encoder
SELECT id, content,
       neurondb_rerank(
         'cross-encoder/ms-marco-MiniLM-L-6-v2',
         content,
         'What is PostgreSQL replication?'
       ) AS relevance_score
FROM (
  SELECT id, content
  FROM documents
  ORDER BY embedding <=> neurondb_embed('What is PostgreSQL replication?', 'text-embedding-ada-002')
  LIMIT 20
) candidates
ORDER BY relevance_score DESC
LIMIT 5;
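
The same retrieve-then-rerank pattern can be sketched in Python. The scoring function here is only a stand-in for a cross-encoder: it counts shared words, whereas a real model would score the pair with a neural network. All names and sample rows are hypothetical.

```python
from typing import Callable, List, Set, Tuple


def rerank(
    query: str,
    candidates: List[Tuple[int, str]],
    score: Callable[[str, str], float],
    keep: int,
) -> List[Tuple[int, str, float]]:
    """Score every candidate against the query and keep the highest-scoring rows."""
    scored = [(doc_id, text, score(text, query)) for doc_id, text in candidates]
    scored.sort(key=lambda row: row[2], reverse=True)  # ORDER BY relevance_score DESC
    return scored[:keep]                               # LIMIT keep


def _words(text: str) -> Set[str]:
    return {w.strip("?.,").lower() for w in text.split()}


def toy_score(document: str, query: str) -> float:
    """Stand-in for a cross-encoder: fraction of query words found in the document."""
    query_words = _words(query)
    return len(query_words & _words(document)) / len(query_words)


# Candidates in the order a vector search might have returned them.
candidates = [
    (7, "PostgreSQL supports JSONB columns."),
    (3, "B-tree indexes speed up range lookups."),
    (9, "PostgreSQL replication can be synchronous or asynchronous."),
]
best = rerank("What is PostgreSQL replication?", candidates, toy_score, keep=2)
print(best)
```

The candidate that arrived last moves to the top because the second-stage score, not the original retrieval order, decides the final ranking.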
