Ravi Charan Nudurupati
2020-06-08
<p>A massive amount of data is generated by the OpenStack cloud services in the form of service logs. Besides timestamp and log-level fields, these logs contain additional information useful for pattern analysis. Unfortunately, this information is generally exposed as semi-structured text, which does not allow direct analysis without additional munging of the data. Traditional approaches to extracting information from those fields are rule-based, mainly applying regular expressions built on knowledge of the text structure. These approaches require prior knowledge of all text patterns and do not scale as the services grow. This report proposes a solution that combines MinHash Locality Sensitive Hashing with the DBSCAN algorithm for data clustering.
</p>
https://doi.org/10.5281/zenodo.3885380
oai:zenodo.org:3885380
Zenodo
https://doi.org/10.5281/zenodo.3885379
info:eu-repo/semantics/openAccess
Creative Commons Attribution 4.0 International
https://creativecommons.org/licenses/by/4.0/legalcode
CERN openlab
summer-student programme
Machine Learning applications on OpenStack log data analysis
info:eu-repo/semantics/report