NeurondB Documentation

Embedding Engine

Overview

The Embedding Engine provides multi-modal embedding generation capabilities, transforming text, images, and mixed data into dense vector representations using state-of-the-art transformer models. With support for OpenAI, Cohere, HuggingFace, and custom models, the Embedding Engine enables semantic search, similarity matching, and AI-powered applications directly in PostgreSQL.

Key Features

  • Multi-Modal Support: Text, images, audio, and mixed data embeddings
  • Multiple Providers: OpenAI, Cohere, HuggingFace, and custom models
  • Automatic Caching: Intelligent caching to reduce API calls and latency
  • Batch Processing: Efficient batch generation for high-throughput scenarios
  • GPU Acceleration: Automatic GPU offload for transformer models
  • Model Management: Version control, A/B testing, and model switching

What Are Embeddings?

Embeddings are dense vector representations that capture semantic meaning in a high-dimensional space. Unlike traditional keyword-based representations, embeddings encode contextual relationships, enabling machines to understand similarity and meaning across different data types.

Why Use Embeddings?

  • Semantic Understanding: Find conceptually similar content, not just exact matches
  • Language Independence: Similar concepts in different languages have similar embeddings
  • Cross-Modal Search: Search images with text, or text with images
  • Context Awareness: Understand meaning based on surrounding context

Text Embeddings

Generate embeddings from text using various transformer models optimized for different use cases, languages, and quality requirements.

Basic Usage

Generate text embedding

-- Generate embedding for a single text
SELECT embed_text('Machine learning with PostgreSQL', 'text-embedding-ada-002');

-- Use in similarity search
SELECT 
  id,
  content,
  embedding <=> embed_text('PostgreSQL vector search', 'text-embedding-ada-002') AS distance
FROM documents
ORDER BY distance
LIMIT 10;
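The `<=>` operator above computes cosine distance (under the common pgvector convention, 1 minus cosine similarity), so smaller values mean more similar vectors and an ascending ORDER BY returns the best matches first. A minimal Python sketch of that ordering, using toy 2-d vectors rather than real embeddings:

```python
import math

def cosine_distance(a, b):
    # Equivalent of `<=>` under the common pgvector convention:
    # 1 - cosine similarity, so lower values mean more similar vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

# Toy vectors standing in for real embeddings.
query = [1.0, 0.0]
docs = {"close": [0.9, 0.1], "far": [0.0, 1.0]}

# Sorting by distance ascending puts the most similar document first,
# mirroring ORDER BY distance LIMIT 10 in the query above.
ranked = sorted(docs, key=lambda name: cosine_distance(query, docs[name]))
```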

Batch Generation

Generate embeddings for multiple texts efficiently with automatic batching and parallel processing.

Batch text embeddings

-- Generate embeddings for multiple texts
SELECT embed_text_batch(
  ARRAY[
    'First document text',
    'Second document text',
    'Third document text'
  ],
  'text-embedding-ada-002'
) AS embeddings;

-- Bulk insert with embeddings
INSERT INTO documents (content, embedding)
SELECT 
  content,
  embed_text(content, 'text-embedding-ada-002')
FROM source_documents
WHERE embedding IS NULL;

Supported Text Models

Provider     Model                           Dimensions  Best For
-----------  ------------------------------  ----------  ---------------------------------
OpenAI       text-embedding-ada-002          1536        General purpose, production-ready
OpenAI       text-embedding-3-small          1536        Cost-effective, high quality
OpenAI       text-embedding-3-large          3072        Maximum quality, larger vectors
Cohere       embed-english-v3.0              1024        English text, high quality
Cohere       embed-multilingual-v3.0         1024        100+ languages
HuggingFace  all-MiniLM-L6-v2                384         Fast, lightweight, self-hosted
HuggingFace  all-mpnet-base-v2               768         High quality, self-hosted
HuggingFace  paraphrase-multilingual-MiniLM  384         50+ languages, self-hosted
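Note that the `vector(N)` column you store embeddings in must match the chosen model's output dimension, and vectors produced by models with different dimensions cannot be compared. A small illustrative helper (the `MODEL_DIMS` mapping restates the table above; `vector_column_type` is a hypothetical convenience, not a NeurondB function):

```python
# Map each model from the table above to its output dimension, so the
# vector column type can be generated consistently at schema-creation time.
MODEL_DIMS = {
    "text-embedding-ada-002": 1536,
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "embed-english-v3.0": 1024,
    "embed-multilingual-v3.0": 1024,
    "all-MiniLM-L6-v2": 384,
    "all-mpnet-base-v2": 768,
    "paraphrase-multilingual-MiniLM": 384,
}

def vector_column_type(model: str) -> str:
    # e.g. "vector(1536)" -- a mismatch here surfaces as insert or
    # comparison errors later.
    return f"vector({MODEL_DIMS[model]})"
```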

Image Embeddings

Generate embeddings from images using CLIP (Contrastive Language-Image Pre-training) models, enabling cross-modal search between images and text.

Basic Usage

Generate image embedding

-- Generate embedding from image file
SELECT embed_image('/path/to/image.jpg', 'CLIP-ViT-B-32');

-- Generate from image URL
SELECT embed_image_url('https://example.com/image.jpg', 'CLIP-ViT-B-32');

-- Generate from base64 encoded image
SELECT embed_image_base64(base64_data, 'CLIP-ViT-B-32')
FROM image_data;

Image Search

Search images by visual similarity or using text descriptions.

Image similarity search

-- Find similar images
SELECT 
  id,
  image_path,
  image_embedding <=> query_embedding AS distance
FROM images,
     (SELECT embed_image('/path/to/query.jpg', 'CLIP-ViT-B-32') AS query_embedding) q
ORDER BY distance
LIMIT 10;

-- Search images with text
SELECT 
  id,
  image_path,
  image_embedding <=> embed_text('a red sports car', 'CLIP-ViT-B-32') AS distance
FROM images
ORDER BY distance
LIMIT 10;

Supported Image Models

Model          Dimensions  Best For
-------------  ----------  -----------------------------
CLIP-ViT-B-32  512         General purpose, fast
CLIP-ViT-L-14  768         High quality, detailed images
CLIP-ViT-B-16  512         Balanced quality and speed

Multi-Modal Embeddings

Combine text and image embeddings in the same vector space, enabling cross-modal search and unified semantic understanding across different data types.

Cross-Modal Search

Cross-modal search

-- Search images with text
SELECT 
  id,
  image_path,
  image_embedding <=> embed_text('a sunset over mountains', 'CLIP-ViT-B-32') AS distance
FROM images
ORDER BY distance
LIMIT 10;

-- Search text with images
SELECT 
  id,
  content,
  text_embedding <=> embed_image('/path/to/query.jpg', 'CLIP-ViT-B-32') AS distance
FROM documents
ORDER BY distance
LIMIT 10;

Unified Embedding Space

Store text and image embeddings in the same table and search across both modalities simultaneously.

Unified search

-- Create unified content table
CREATE TABLE content (
  id SERIAL PRIMARY KEY,
  type TEXT,  -- 'text' or 'image'
  content TEXT,  -- text content or image path
  embedding vector(512)  -- CLIP embeddings
);

-- Insert text and images
INSERT INTO content (type, content, embedding)
SELECT 
  'text',
  content,
  embed_text(content, 'CLIP-ViT-B-32')
FROM text_documents;

INSERT INTO content (type, content, embedding)
SELECT 
  'image',
  image_path,
  embed_image(image_path, 'CLIP-ViT-B-32')
FROM images;

-- Search across all content types
SELECT 
  type,
  content,
  embedding <=> embed_text('nature photography', 'CLIP-ViT-B-32') AS distance
FROM content
ORDER BY distance
LIMIT 20;

Supported Models

OpenAI Models

  • text-embedding-ada-002: General purpose, 1536 dimensions, production-ready
  • text-embedding-3-small: Cost-effective, 1536 dimensions
  • text-embedding-3-large: Maximum quality, 3072 dimensions

Cohere Models

  • embed-english-v3.0: High-quality English embeddings, 1024 dimensions
  • embed-multilingual-v3.0: 100+ languages, 1024 dimensions

HuggingFace Models

  • all-MiniLM-L6-v2: Fast, lightweight, 384 dimensions
  • all-mpnet-base-v2: High quality, 768 dimensions
  • paraphrase-multilingual-MiniLM: 50+ languages, 384 dimensions

CLIP Models

  • CLIP-ViT-B-32: General purpose, 512 dimensions
  • CLIP-ViT-L-14: High quality, 768 dimensions

Custom Models

Deploy custom transformer models in ONNX format for specialized use cases.

Deploy custom model

-- Deploy custom ONNX model
SELECT deploy_embedding_model(
  model_name => 'custom_text_encoder',
  model_path => '/path/to/model.onnx',
  input_type => 'text',
  output_dim => 768
);

-- Use custom model
SELECT embed_text('sample text', 'custom_text_encoder');

Caching & Performance

Automatic Caching

The Embedding Engine automatically caches embeddings to reduce API calls, latency, and costs. Identical inputs return cached results instantly.

Caching configuration

-- Configure embedding cache
SET neurondb.embedding_cache_size = 10000;  -- Cache 10K embeddings
SET neurondb.embedding_cache_ttl = 86400;   -- 24 hour TTL

-- Check cache statistics
SELECT * FROM neurondb_embedding_cache_stats();

-- Clear cache
SELECT clear_embedding_cache();
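The behavior described above can be pictured as a content-addressed cache with a TTL. The following is a conceptual Python sketch, not NeurondB's actual implementation: the key hashes the model name and input together, and repeated lookups within the TTL skip the expensive embedding call.

```python
import hashlib
import time

class EmbeddingCache:
    """Conceptual sketch: identical (text, model) inputs hit the cache;
    entries older than the TTL are recomputed."""

    def __init__(self, ttl_seconds=86400):
        self.ttl = ttl_seconds
        self.store = {}   # key -> (timestamp, embedding)
        self.hits = 0
        self.misses = 0

    def _key(self, text, model):
        # Content-addressed key over model + input text.
        return hashlib.sha256(f"{model}:{text}".encode()).hexdigest()

    def get_or_compute(self, text, model, compute):
        key = self._key(text, model)
        entry = self.store.get(key)
        if entry is not None and time.time() - entry[0] < self.ttl:
            self.hits += 1
            return entry[1]
        self.misses += 1
        embedding = compute(text)   # the expensive call (API or local model)
        self.store[key] = (time.time(), embedding)
        return embedding
```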

Batch Processing

Process multiple embeddings in parallel for improved throughput and efficiency.

Batch processing

-- Batch processing with automatic parallelization
SELECT embed_text_batch(
  texts,
  'text-embedding-ada-002',
  batch_size => 100,
  parallel => true
)
FROM (
  SELECT array_agg(content) AS texts
  FROM documents
  WHERE embedding IS NULL
) batch;
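Conceptually, the engine splits the input array into fixed-size chunks and issues one provider call per chunk. A client-side sketch of that chunking (illustrative only; the real batching and parallelization happen inside the extension):

```python
def make_batches(items, batch_size=100):
    # Split the pending texts into chunks of at most batch_size,
    # preserving order -- one provider call per chunk.
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

def embed_all(texts, embed_batch, batch_size=100):
    # `embed_batch` stands in for a single provider call that returns
    # one embedding per input text, in order.
    embeddings = []
    for batch in make_batches(texts, batch_size):
        embeddings.extend(embed_batch(batch))
    return embeddings
```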

Performance Metrics

  • Single Embedding (API): 50-200ms (depends on provider)
  • Single Embedding (Cached): < 1ms
  • Batch (100 items, API): 200-500ms
  • Batch (100 items, Local): 10-50ms (GPU: 2-10ms)

Optimization Tips

  • Enable caching for frequently accessed content
  • Use batch processing for bulk operations
  • Deploy local models (HuggingFace) for low-latency requirements
  • Use GPU acceleration for local transformer models
  • Pre-compute embeddings during data ingestion

Use Cases

Semantic Search

Find documents, products, or content based on meaning rather than exact keywords.

Recommendation Systems

Recommend similar items, users, or content based on embedding similarity.

Image Search

Search images by visual similarity or text descriptions using CLIP embeddings.

Content Moderation

Identify similar content, detect duplicates, and flag inappropriate material.

Multilingual Search

Search across multiple languages using multilingual embedding models.

RAG Applications

Generate embeddings for retrieval augmented generation (RAG) pipelines.
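In a RAG pipeline, similarity queries like the ones shown earlier supply the top-k passages, which are then stitched into the generator model's prompt. A minimal, illustrative prompt-assembly helper (`build_rag_prompt` is hypothetical, not a NeurondB function):

```python
def build_rag_prompt(question, passages):
    # Number each retrieved passage so the generator can cite it.
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```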

Related Documentation