Outlier Detection

Z-score Outlier Detection

The Z-score method identifies outliers by measuring how many standard deviations a data point lies from the mean: z = (x - mean) / stddev. With a threshold of 3.0, any value more than 3 standard deviations from the mean is flagged as an outlier.

Detect outliers

-- Detect outliers using Z-score method
SELECT detect_outliers_zscore(
    'train_data',        -- Table name
    'features',          -- Column with feature vectors
    3.0,                 -- Threshold (standard deviations)
    'zscore'             -- Method
) as outliers;
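To make the threshold concrete, here is a minimal Python sketch of the Z-score rule applied to plain scalar values. It is not NeuronDB's implementation: the function name `zscore_outliers` is hypothetical, it uses the population standard deviation, and it does not cover how `detect_outliers_zscore` treats multi-dimensional feature vectors, which this page does not specify.

```python
from statistics import mean, pstdev


def zscore_outliers(values, threshold=3.0):
    """Return the indices of values whose absolute Z-score exceeds threshold.

    Hypothetical helper for illustration; assumes scalar inputs and the
    population standard deviation.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so nothing deviates from the mean.
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Nineteen identical readings and one extreme value at index 19.
readings = [10.0] * 19 + [100.0]
print(zscore_outliers(readings))  # → [19]
```

Note that in a small sample a single extreme value inflates the standard deviation it is measured against, so a threshold of 3.0 can miss outliers when only a handful of rows are available.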

Isolation Forest

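Isolation Forest detects outliers by building an ensemble of random trees, each of which repeatedly splits the data on a randomly chosen feature at a randomly chosen value. Anomalies tend to be isolated after only a few splits, so points with a short average path length across the trees are flagged as outliers. The n_estimators parameter sets the number of trees.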

Isolation forest

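-- Detect outliers using the Isolation Forest method
SELECT detect_outliers_isolation_forest(
    'train_data',        -- Table name
    'features',          -- Column with feature vectors
    100                  -- n_estimators (number of trees)
) as outliers;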
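The isolation idea can be sketched in a few dozen lines of Python. This is a generic illustration of the standard algorithm, not NeuronDB's implementation: every function name below is hypothetical, and the `sample_size` and `seed` parameters are assumptions that do not appear in the SQL call above.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def avg_path_length(n):
    """Average path length of an unsuccessful binary search tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + EULER_GAMMA  # approximation of H(n - 1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Split on a random feature at a random value until points are isolated."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        # No spread along this feature, so stop splitting here.
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(point, node, depth=0):
    """Depth at which the point lands, plus a correction for unsplit leaves."""
    if "size" in node:
        return depth + avg_path_length(node["size"])
    child = "left" if point[node["dim"]] < node["split"] else "right"
    return path_length(point, node[child], depth + 1)


def isolation_forest_scores(points, n_estimators=100, sample_size=256, seed=0):
    """Return one anomaly score per point; higher means more anomalous."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    rng = random.Random(seed)
    sample_size = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(points, sample_size), 0, max_depth, rng)
        for _ in range(n_estimators)
    ]
    norm = avg_path_length(sample_size)
    scores = []
    for p in points:
        mean_depth = sum(path_length(p, t) for t in trees) / n_estimators
        scores.append(2.0 ** (-mean_depth / norm))
    return scores


# A tight cluster of 50 points plus one distant point at index 50.
data_rng = random.Random(1)
cluster = [(data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)) for _ in range(50)]
points = cluster + [(10.0, 10.0)]
scores = isolation_forest_scores(points, n_estimators=100)
most_anomalous = max(range(len(points)), key=scores.__getitem__)
print(most_anomalous, round(scores[most_anomalous], 3))
```

Scores close to 1 indicate points that are isolated after very few splits, while scores around 0.5 or below are typical of points inside a dense region. More trees reduce the variance of the score at the cost of extra computation.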

Next Steps