Additional information
- If there is no labelled data, we use an unsupervised learner with the syntax
CREATE ANOMALY DETECTION MODEL <model_name>
without specifying the target to predict. MindsDB then adds a column called outlier when generating results.
- If we have labelled data, we use the regular model creation syntax. Backend logic then chooses between a semi-supervised algorithm (currently XGBOD) and a supervised algorithm (currently CatBoost).
- If multiple models are provided, we create an ensemble and combine their predictions by majority voting.
- See the anomaly detection proposal document for more information: https://docs.google.com/document/d/1Yd7ARZVg_67xlcY-JR2kuO7mak9Ia2YER1Jk0EdpEa0/edit#heading=h.mo4wxsae6t1d
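To illustrate the majority-voting idea mentioned in the list above, here is a minimal Python sketch. It is not the handler's actual code: the function name majority_vote, the 0/1 encoding of the outlier flag, and the rule that ties count as "not an outlier" are all assumptions made for the example.

```python
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-model outlier flags (1 = outlier, 0 = inlier) row by row.

    Hypothetical helper for illustration; ties resolve to 0 (assumption).
    """
    if not predictions:
        raise ValueError("at least one model's predictions are required")
    n_rows = len(predictions[0])
    if any(len(p) != n_rows for p in predictions):
        raise ValueError("all models must predict the same number of rows")

    combined = []
    # zip(*predictions) yields, for each row, the flags from every model.
    for row_flags in zip(*predictions):
        outlier_votes = sum(row_flags)
        # Flag the row only if strictly more than half of the models voted outlier.
        combined.append(1 if outlier_votes * 2 > len(row_flags) else 0)
    return combined


# Three hypothetical models scoring the same four rows.
model_a = [0, 1, 1, 0]
model_b = [0, 1, 0, 0]
model_c = [1, 1, 0, 0]
ensemble_flags = majority_vote([model_a, model_b, model_c])
```

With an odd number of models there are no ties, so the tie-breaking assumption only matters when the ensemble has an even number of members.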
Example usage
To run example queries, use the CSV in tests/unit/ml_handlers/anomaly_detection.csv