Anomaly detection in Centreon helps maintain system reliability and performance by identifying issues before they escalate. This article explains how Centreon's anomaly detection mechanism works and which machine learning techniques it relies on.

 
Data Retrieval and Preprocessing
For any given metric tied to a service and host, the service first retrieves historical data, with a default look-back period of 30 days. This period can be adjusted to the specific requirements of the metric in question. The retrieved data is then preprocessed into a format suitable for analysis and model training.
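As an illustration, the retrieval window and one plausible preprocessing step can be sketched as follows. Only the 30-day default comes from the description above. The function names, the hourly resampling and the forward-fill of gaps are assumptions made for this example, not the service's actual pipeline.

```python
from datetime import datetime, timedelta

DEFAULT_LOOKBACK_DAYS = 30  # default stated in the article; adjustable per metric


def lookback_window(now, days=DEFAULT_LOOKBACK_DAYS):
    """Return the (start, end) datetimes of the history to retrieve."""
    return now - timedelta(days=days), now


def resample_hourly(points, start, end):
    """Average raw (timestamp, value) points into hourly buckets.

    Empty buckets take the previous bucket's value (forward fill);
    leading empty buckets stay None. Both choices are assumptions.
    """
    n_buckets = int((end - start).total_seconds() // 3600)
    sums = [0.0] * n_buckets
    counts = [0] * n_buckets
    for ts, value in points:
        if start <= ts < end:
            idx = int((ts - start).total_seconds() // 3600)
            if idx < n_buckets:
                sums[idx] += value
                counts[idx] += 1
    series, last = [], None
    for total, count in zip(sums, counts):
        if count:
            last = total / count
        series.append(last)
    return series
```

With the default window, `resample_hourly` returns 720 values (30 days of 24 hourly buckets), which gives the later steps a regular time grid to work on.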

Machine Learning Model
Model Training
The core of the anomaly detection service is a regression-based supervised machine learning model, designed to predict the values of a given metric for the next 48 hours. Model training takes various input parameters (features) into account, including:
  • Time of the day
  • Day of the week
  • Month
  • Seasonality aspects (e.g., the value of the metric in the previous week, if weekly seasonality is detected)
  • Other parameters not listed here
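The parameters listed above could be turned into one input row per timestamp roughly like this. This is a minimal sketch: `build_features`, the feature names and the `history` mapping are hypothetical, and the real service uses more inputs than shown here.

```python
from datetime import datetime, timedelta


def build_features(ts, history, weekly_seasonality=False):
    """Build one feature row for timestamp `ts`.

    `history` maps datetime -> observed value (hypothetical structure).
    """
    features = {
        "hour": ts.hour,          # time of the day
        "weekday": ts.weekday(),  # day of the week, Monday = 0
        "month": ts.month,
    }
    if weekly_seasonality:
        # Lag feature: the metric's value at the same time one week earlier.
        features["value_prev_week"] = history.get(ts - timedelta(weeks=1))
    return features
```

The lag feature is only added when weekly seasonality has been detected, which mirrors the conditional wording of the list above.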
Seasonality is determined automatically using the Fast Fourier Transform (FFT) technique, which helps in identifying and incorporating cyclic patterns into the prediction model.
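To give an idea of how a Fourier transform reveals a cycle, here is a small sketch that returns the period of the strongest periodic component in a series. To stay dependency-free it computes a naive discrete Fourier transform rather than a true FFT; a library routine such as `numpy.fft.rfft` yields the same spectrum much faster. The function name and the "pick the largest peak" rule are assumptions for illustration, not the service's actual detection logic.

```python
import cmath
import math


def dominant_period(series):
    """Return the period (in samples) of the strongest cyclic component.

    Computes a naive discrete Fourier transform of the mean-removed series
    and picks the frequency bin with the largest power.
    """
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]  # drop the constant (zero-frequency) term
    best_k, best_power = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return n / best_k if best_k else None
```

For an hourly series with a clean daily cycle the function returns a period of 24 samples; for a weekly cycle, 168. A production version would also need to check that the peak stands out clearly from noise before declaring a seasonality.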

The service tests hundreds of machine learning models to select the most effective one for making predictions. This selection is based on the model's accuracy and reliability in forecasting future metric values.
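The selection step can be pictured as a hold-out comparison. The sketch below uses three toy forecasters in place of the hundreds of models the service tests, and it scores them with mean absolute error on the last 48 points. The candidate models, the metric and all names are assumptions chosen for this example.

```python
def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def forecast_mean(train, horizon):
    """Predict the training mean for every future step."""
    avg = sum(train) / len(train)
    return [avg] * horizon


def forecast_last_value(train, horizon):
    """Repeat the last observed value."""
    return [train[-1]] * horizon


def forecast_seasonal_naive(train, horizon, period=24):
    """Repeat the last full cycle of length `period`."""
    last_cycle = train[-period:]
    return [last_cycle[i % period] for i in range(horizon)]


# Toy candidates standing in for the real model zoo.
CANDIDATES = {
    "mean": forecast_mean,
    "last_value": forecast_last_value,
    "seasonal_naive": forecast_seasonal_naive,
}


def select_best_model(series, horizon=48):
    """Hold out the last `horizon` points and keep the lowest-error candidate."""
    train, holdout = series[:-horizon], series[-horizon:]
    scores = {
        name: mean_absolute_error(holdout, model(train, horizon))
        for name, model in CANDIDATES.items()
    }
    best = min(scores, key=scores.get)
    return best, scores
```

On a metric that repeats every 24 samples, the seasonal candidate wins because it reproduces the held-out cycle exactly, while the other two ignore the pattern.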


Predictions and Error Range Calculation
Once a model is trained, it is used to compute predictions for the upcoming 48 hours. Alongside predictions, the service calculates the error range, which represents the expected deviation between the predicted values and the actual values that will be observed. This error range is crucial for understanding the confidence level of the predictions.
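The article does not say how the error range is computed. One common approach, shown here purely as an assumption, is to derive a band from the spread of past residuals (actual minus predicted) and apply it around each predicted value. The function names and the multiplier `z` are hypothetical.

```python
import statistics


def error_range(actual, predicted, z=1.96):
    """Half-width of the band, taken as z times the standard deviation
    of past residuals (an assumed method, not the service's own)."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    return z * statistics.pstdev(residuals)


def prediction_band(forecast, half_width):
    """Return (lower, upper) bounds around each predicted value."""
    return [(p - half_width, p + half_width) for p in forecast]
```

A wider band means the model was less accurate on past data, so a larger deviation is needed before an observed value looks unusual.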

Integration with Centreon
The predictions and their associated error ranges are not pushed directly to the Centreon interface. Instead, they are stored securely, and Centreon fetches them every 8 hours, so that the monitoring system works from the latest available forecast data.

Process Scheduling
Model training and prediction generation run independently of each other:
  • Model Training: Event-driven. The model is retrained when new data or changing patterns in the metric call for an update.
  • Prediction Generation: Predictions are calculated at least once a day to provide up-to-date forecasts.
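The figures quoted in this article (48-hour forecasts, at least one prediction run per day, a fetch every 8 hours) can be combined into a quick sanity check of how much forecast coverage remains in the worst case. This is a back-of-the-envelope sketch that assumes a forecast becomes fetchable as soon as it is generated; the function name is hypothetical.

```python
from datetime import timedelta

HORIZON = timedelta(hours=48)            # forecast horizon stated in the article
FETCH_INTERVAL = timedelta(hours=8)      # Centreon fetch period stated in the article
MAX_PREDICTION_GAP = timedelta(days=1)   # "at least once a day"


def worst_case_margin(horizon=HORIZON,
                      fetch_interval=FETCH_INTERVAL,
                      prediction_gap=MAX_PREDICTION_GAP):
    """Coverage left in the old forecast when the next one reaches Centreon
    at the latest possible moment."""
    latest_refresh = prediction_gap + fetch_interval
    return horizon - latest_refresh
```

Under these assumptions the margin is 16 hours: the next forecast is generated at most 24 hours after the previous one and fetched at most 8 hours later, well before the previous 48-hour forecast runs out.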