Anomaly Detection in the Elasticsearch Service

Jennifer Andersson

doi:10.5281/zenodo.3550764

Published November 22, 2019 | Version v1

Report Open

The Elasticsearch Service is a distributed search and analytics engine widely used across CERN. Currently,
issues in the service are resolved manually after being detected through internal monitoring by service
managers. However, the number of clusters and metrics is large, which makes them difficult to track, and
issues are often discovered and reported by users. This is time-consuming and disturbs the workflow of the
service users. In light of this, the main objective of this project is to develop a model capable of identifying
anomalies in the Elasticsearch Service clusters, in order to predict and eliminate service issues before they
cause problems. This is done by analyzing the history of cluster data using machine learning methods. In
this way, a single metric signaling service issues can be obtained and used to alert service managers to
upcoming issues. In 2017, a deep neural network model was developed for this purpose. However, several
issues were identified with the model, the most severe being convergence issues in the autoencoder. In this
project, a revised autoencoder based on long short-term memory neural networks (LSTMs) is developed,
tuned and evaluated. Finally, it is used on new Elasticsearch Service cluster data. The final model shows
improved convergence compared to the previous model, and is able to detect real service issues based on
the anomaly scores obtained. By combining the anomaly scores with those obtained by a model simply
predicting the cluster state as a moving average of preceding states, the rate of false positives is reduced.
The conclusion is that a combined model, reporting anomalies based on a combination of the anomaly
scores obtained by the LSTM-based model and the moving average model, is the most sensitive to real
service issues.
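
The idea of combining the two anomaly scores can be illustrated with a minimal sketch. It is not the implementation from the report: the window size, the thresholds, the rule that both scores must exceed their thresholds, and the function names `moving_average_scores` and `combined_anomalies` are all assumptions made for illustration, and the LSTM scores are hypothetical placeholder values rather than model output.

```python
from statistics import mean


def moving_average_scores(series, window=3):
    """Score each point by its absolute deviation from the mean of the
    preceding `window` points (the moving-average prediction)."""
    scores = []
    for i, value in enumerate(series):
        if i < window:
            # Not enough history to form a prediction; assume no anomaly.
            scores.append(0.0)
        else:
            prediction = mean(series[i - window:i])
            scores.append(abs(value - prediction))
    return scores


def combined_anomalies(lstm_scores, ma_scores, lstm_threshold, ma_threshold):
    """Flag a time step only when BOTH scores exceed their thresholds.

    The AND rule is an assumption of this sketch; the abstract states only
    that the scores are combined, not how.
    """
    if len(lstm_scores) != len(ma_scores):
        raise ValueError("score sequences must have the same length")
    return [
        l > lstm_threshold and m > ma_threshold
        for l, m in zip(lstm_scores, ma_scores)
    ]


# Toy cluster metric with one spike at index 4.
metric = [1.0, 1.0, 1.0, 1.0, 5.0, 1.0]
ma_scores = moving_average_scores(metric, window=3)

# Hypothetical LSTM reconstruction-error scores (not real model output).
lstm_scores = [0.1, 0.1, 0.9, 0.1, 0.8, 0.1]

flags = combined_anomalies(lstm_scores, ma_scores,
                           lstm_threshold=0.5, ma_threshold=2.0)
print(flags)  # [False, False, False, False, True, False]
```

In this toy example the hypothetical LSTM score at index 2 exceeds its threshold on its own, but the moving-average score does not, so the step is not reported. Only the spike at index 4 is flagged by both scores, which shows how requiring agreement could lower the rate of false positives.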

Files

Report_Jennifer_Andersson.pdf (1.9 MB, md5:af2f56ef873fc3aa32c84fae6b4044d7)
