Report Open Access

Anomaly Detection in the Elasticsearch Service

Jennifer Andersson

The Elasticsearch Service is a distributed search and analytics engine widely used across CERN. Currently,
issues in the service are resolved manually after service managers detect them through internal
monitoring. However, the number of clusters and metrics is large, which makes them difficult to track, and
issues are often discovered and reported by users instead. This is time-consuming and disrupts the workflow
of the service users. In light of this, the main objective of this project is to develop a model capable of
identifying anomalies in the Elasticsearch Service clusters, in order to predict and eliminate service issues
before they cause problems. This is done by analyzing the history of cluster data using machine learning
methods. In this way, a single metric signaling service issues can be obtained and used to alert service
managers to upcoming issues. In 2017, a deep neural network model was developed for this purpose. However,
several issues were identified with the model, the most severe being convergence issues in the autoencoder.
In this project, a revised autoencoder based on long short-term memory (LSTM) neural networks is developed,
tuned and evaluated. Finally, it is applied to new Elasticsearch Service cluster data. The final model shows
improved convergence compared to the previous model and is able to detect real service issues based on
the anomaly scores obtained. Combining these anomaly scores with those obtained by a model that simply
predicts the cluster state as a moving average of preceding states reduces the rate of false positives.
The conclusion is that a combined model, reporting anomalies based on a combination of the anomaly
scores obtained by the LSTM-based model and the moving average model, is the most sensitive to real
service issues.
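
The combination step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the two sets of anomaly scores are combined, so the rule used here (flag a point only when both scores exceed a threshold) is an assumption, as are the function names, the window size, the thresholds and the toy data. The LSTM autoencoder itself is not implemented; a hand-written list stands in for its anomaly scores.

```python
from statistics import mean


def moving_average_scores(series, window=3):
    """Score each point by its absolute deviation from the mean of the
    preceding `window` points. The first `window` points get 0.0 because
    there is not enough history to predict them."""
    scores = []
    for i, value in enumerate(series):
        if i < window:
            scores.append(0.0)
        else:
            predicted = mean(series[i - window:i])
            scores.append(abs(value - predicted))
    return scores


def combined_anomalies(model_scores, baseline_scores,
                       model_threshold, baseline_threshold):
    """Return the indices where BOTH scores exceed their thresholds.
    This AND rule is an assumed combination, not the report's method."""
    if len(model_scores) != len(baseline_scores):
        raise ValueError("score sequences must have equal length")
    return [
        i for i, (m, b) in enumerate(zip(model_scores, baseline_scores))
        if m > model_threshold and b > baseline_threshold
    ]


# Toy cluster metric with a single spike at index 5 (made-up data).
metric = [1.0, 1.0, 1.0, 1.0, 1.0, 5.0, 1.0, 1.0]
baseline = moving_average_scores(metric, window=3)

# Hypothetical stand-in for the LSTM autoencoder's anomaly scores.
lstm_scores = [0.1, 0.1, 0.9, 0.1, 0.1, 0.8, 0.1, 0.1]

flagged = combined_anomalies(lstm_scores, baseline,
                             model_threshold=0.5, baseline_threshold=1.0)
print(flagged)
```

In this toy run the stand-in model scores alone would flag indices 2 and 5, and the moving average alone would flag 5, 6 and 7 (the spike contaminates the following windows). Requiring agreement leaves only index 5, which is the kind of false-positive reduction the abstract reports.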

Files (1.9 MB)
Report_Jennifer_Andersson.pdf (1.9 MB)
md5:af2f56ef873fc3aa32c84fae6b4044d7