NeurondB Documentation

Machine Learning & Embeddings

ML Capabilities

In-Database ML Inference

Run ML models directly inside PostgreSQL with zero data movement, batch inference for high throughput, real-time predictions with low latency, and automatic GPU acceleration when available.

Embedding Generation

Generate embeddings from text, images, and more. Supports models from OpenAI, Cohere, and HuggingFace, as well as custom model deployment, automatic batching and caching, and multi-modal embeddings (text, image, audio).

Model Management

Deploy and manage ML models efficiently with model versioning and rollback, A/B testing support, resource quota management, and performance monitoring.

Supported Models

Text Embeddings

text-embedding-ada-002 (OpenAI) - 1536 dimensions - General text similarity
text-embedding-3-small (OpenAI) - 1536 dimensions - Efficient embeddings
text-embedding-3-large (OpenAI) - 3072 dimensions - High quality embeddings
embed-english-v3.0 (Cohere) - 1024 dimensions - English text
embed-multilingual-v3.0 (Cohere) - 1024 dimensions - Multilingual text

Sentence Transformers

all-MiniLM-L6-v2 (HuggingFace) - 384 dimensions - Fast, lightweight
all-mpnet-base-v2 (HuggingFace) - 768 dimensions - High quality
paraphrase-multilingual-MiniLM (HuggingFace) - 384 dimensions - 50+ languages

Multimodal

CLIP-ViT-B-32 (OpenAI) - 512 dimensions - Image + text
CLIP-ViT-L-14 (OpenAI) - 768 dimensions - High quality image search

ML Functions

embed_text()

Generate text embeddings with automatic caching.

Signature: embed_text(text TEXT, model TEXT DEFAULT 'all-MiniLM-L6-v2') RETURNS vector

Example

SELECT embed_text('Machine learning with PostgreSQL');
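
A common next step is to store embeddings alongside the source text. The following is a minimal sketch, not a definitive schema: the articles table and its columns are hypothetical, and it assumes the vector type returned by embed_text() can be used as a column type without a declared dimension.

```sql
-- Hypothetical table; the name and columns are illustrative only.
CREATE TABLE articles (
    id        BIGSERIAL PRIMARY KEY,
    body      TEXT NOT NULL,
    embedding vector  -- assumed column type; 384 dimensions with the default model
);

INSERT INTO articles (body)
VALUES ('Machine learning with PostgreSQL'),
       ('Indexing strategies for large tables');

-- Fill in embeddings for rows that do not have one yet, using the default model.
UPDATE articles
SET embedding = embed_text(body)
WHERE embedding IS NULL;

-- Pass the second argument to select a different model from the list above.
SELECT embed_text('Machine learning with PostgreSQL', 'all-mpnet-base-v2');
```

Models produce vectors of different dimensions (384 for all-MiniLM-L6-v2, 768 for all-mpnet-base-v2), so a single column should hold embeddings from one model only.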

embed_text_batch()

Generate embeddings for multiple texts efficiently.

Signature: embed_text_batch(texts TEXT[], model TEXT DEFAULT 'all-MiniLM-L6-v2') RETURNS vector[]

Example

SELECT embed_text_batch(ARRAY['text1', 'text2'], 'all-MiniLM-L6-v2');
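
To pair each input text with its embedding, the returned array can be expanded next to the input array. This sketch assumes that embed_text_batch() returns one vector per input, in input order; that ordering is not stated in the signature above, so verify it before relying on it.

```sql
-- Expand the input array and the result array side by side.
-- Assumption: vectors come back in the same order as the input texts.
WITH input AS (
    SELECT ARRAY['text1', 'text2'] AS texts
)
SELECT r.txt, r.vec
FROM input,
     unnest(input.texts, embed_text_batch(input.texts)) AS r(txt, vec);
```

The multi-argument form of unnest() is standard PostgreSQL and is only valid in the FROM clause.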

train_random_forest_classifier()

Train a Random Forest classifier with GPU support.

Signature: train_random_forest_classifier(table_name TEXT, features_col TEXT, label_col TEXT, n_trees INT, max_depth INT)

Example

SELECT train_random_forest_classifier('training_data', 'features', 'label', 100, 10);
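
The example above presumes that a training_data table already exists. One way it might be set up is sketched below; the layout is an assumption, in particular that the features column can hold a vector (the signature only gives the column name, not the type it must have). The sample rows are illustrative.

```sql
-- Hypothetical training table; the feature column type is an assumption.
CREATE TABLE training_data (
    id       BIGSERIAL PRIMARY KEY,
    features vector,   -- assumed: features stored as a vector
    label    INT       -- class label, e.g. 1 = positive, 0 = negative
);

-- Illustrative rows: text embeddings used as features.
INSERT INTO training_data (features, label)
VALUES (embed_text('great product, works well'), 1),
       (embed_text('broke after one day'), 0);

-- Train with 100 trees and a maximum depth of 10, as in the example above.
SELECT train_random_forest_classifier('training_data', 'features', 'label', 100, 10);
```

A real training set needs far more than two rows; the inserts only show the expected shape of the data.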

cluster_kmeans()

K-means clustering with GPU acceleration.

Signature: cluster_kmeans(table_name TEXT, vector_column TEXT, k INTEGER, max_iter INTEGER DEFAULT 100)

Example

SELECT cluster_kmeans('documents', 'embedding', 5, 100);
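
Because max_iter has a default of 100 in the signature, the fourth argument can be omitted. The sketch below builds a small documents table first; the table layout and sample rows are hypothetical, and the shape of the value returned by cluster_kmeans() is not documented here, so it is not shown.

```sql
-- Hypothetical table matching the names used in the example above.
CREATE TABLE documents (
    id        BIGSERIAL PRIMARY KEY,
    content   TEXT NOT NULL,
    embedding vector  -- assumed column type, filled by embed_text()
);

-- Illustrative rows covering two rough topics.
INSERT INTO documents (content, embedding)
SELECT t.c, embed_text(t.c)
FROM (VALUES
        ('Tuning PostgreSQL autovacuum'),
        ('Choosing a B-tree or GIN index'),
        ('Partitioning large tables'),
        ('Training a random forest'),
        ('K-means clustering basics'),
        ('Evaluating a classifier')
     ) AS t(c);

-- Cluster into 2 groups; max_iter is omitted and falls back to its default of 100.
SELECT cluster_kmeans('documents', 'embedding', 2);
```

Keep k no larger than the number of rows in the table.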

Next Steps

ONNX Inference - Deploy ONNX models
Embeddings - Generate embeddings
Clustering - ML clustering algorithms
