Published September 7, 2021 | Version 1
Project deliverable Open

Deep Autoencoders for ATLAS Data Compression - George Dialektakis - Google Summer of Code 2021 Project

  • 1. Aristotle University of Thessaloniki
  • 1. Lund University
  • 2. Ohio State University
  • 3. University of Glasgow

Description

Storage is one of the main limiting factors in recording information from proton-proton collision events at the Large Hadron Collider (LHC) at CERN in Geneva. The ATLAS experiment at the LHC therefore uses a so-called trigger system, which selects interesting events and transfers them to the data storage system while filtering out the rest. However, interesting events that are buried in very large backgrounds and are difficult for the trigger system to identify as signal are discarded together with the background. To alleviate this problem, different compression algorithms are already in use to reduce the size of the recorded data. One state-of-the-art approach is an autoencoder, a neural network trained to approximate the identity function f(x) = x. Given some input data, its encoder network creates a lower-dimensional representation of those data in a latent space. When collisions happen in the ATLAS detector, the encoder is run on the produced data and only the latent-space representation is saved. The decoder network can then reconstruct an approximation of the original data offline from this latent representation.
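
The encode, store, decode workflow described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the project report: the toy events (`make_events`), the 4-to-2 dimensionality, the purely linear layers without biases, and the plain stochastic gradient descent are hypothetical stand-ins for real ATLAS variables and for the deep networks used in the project.

```python
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def make_events(n, seed=0):
    """Hypothetical toy events: 4 correlated variables driven by 2 hidden factors."""
    rng = random.Random(seed)
    events = []
    for _ in range(n):
        a, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
        events.append([a, b, a + b, a - b])
    return events


class LinearAutoencoder:
    """Toy linear autoencoder: encoder n_in -> n_latent, decoder n_latent -> n_in."""

    def __init__(self, n_in, n_latent, seed=0):
        rng = random.Random(seed)
        self.We = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_latent)]
        self.Wd = [[rng.uniform(-0.5, 0.5) for _ in range(n_latent)] for _ in range(n_in)]

    def encode(self, x):
        # This is the part that would run online; only its output is stored.
        return matvec(self.We, x)

    def decode(self, z):
        # This is the part that would run offline on the stored latent vector.
        return matvec(self.Wd, z)

    def train_step(self, x, lr):
        """One SGD step on the loss 0.5 * ||decode(encode(x)) - x||^2."""
        z = self.encode(x)
        x_hat = self.decode(z)
        err = [xh - xi for xh, xi in zip(x_hat, x)]
        # Back-propagate the error to the latent vector before changing Wd.
        grad_z = [sum(self.Wd[i][j] * err[i] for i in range(len(err)))
                  for j in range(len(z))]
        for i in range(len(err)):
            for j in range(len(z)):
                self.Wd[i][j] -= lr * err[i] * z[j]
        for j in range(len(z)):
            for k in range(len(x)):
                self.We[j][k] -= lr * grad_z[j] * x[k]


def train(model, events, epochs, lr):
    for _ in range(epochs):
        for x in events:
            model.train_step(x, lr)


def mean_loss(model, events):
    """Average reconstruction loss over a set of events, without training."""
    total = 0.0
    for x in events:
        x_hat = model.decode(model.encode(x))
        total += 0.5 * sum((a - b) ** 2 for a, b in zip(x_hat, x))
    return total / len(events)


if __name__ == "__main__":
    events = make_events(200)
    model = LinearAutoencoder(n_in=4, n_latent=2)
    print("mean loss before training:", mean_loss(model, events))
    train(model, events, epochs=100, lr=0.02)
    print("mean loss after training: ", mean_loss(model, events))
    latent = model.encode(events[0])   # 2 numbers stored instead of 4
    print("stored:", latent, "reconstructed:", model.decode(latent))
```

In this toy setting each event is stored as 2 numbers instead of 4. Real event variables are generally not exactly low-dimensional, so reconstruction is lossy, and the trade-off between compression ratio and reconstruction error is what such a project has to measure.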


The goal of this project is to experiment in depth with different types of autoencoders for data compression and to optimize their performance in reconstructing ATLAS event data. To this end, three kinds of autoencoders are proposed: the Standard Autoencoder, the Variational Autoencoder, and the Sparse Autoencoder. These autoencoders are thoroughly tested using different parameters and data normalization techniques, as the ultimate goal is to obtain the best possible reconstructions of the original event data. The proposed implementations are intended to contribute to future testing and analysis for the ATLAS experiment at CERN, and to help overcome the growing need for storage space caused by the increasing size of the data generated by the continuous proton-proton collision events in CERN's Large Hadron Collider.
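
One common way to distinguish the three autoencoder types is by their training objective. The sketch below uses textbook formulations and is not necessarily what the final report implements: the function names, the L1 form of the sparsity penalty, and the weights `l1_weight` and `kl_weight` are assumptions chosen for illustration.

```python
import math


def mse(x, x_hat):
    """Mean squared reconstruction error between an input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def standard_loss(x, x_hat):
    # Standard autoencoder: reconstruction error only.
    return mse(x, x_hat)


def sparse_loss(x, x_hat, z, l1_weight=1e-3):
    # Sparse autoencoder (L1 variant, assumed here): reconstruction error plus
    # a penalty that pushes latent activations z towards zero.
    return mse(x, x_hat) + l1_weight * sum(abs(v) for v in z)


def vae_loss(x, x_hat, mu, log_var, kl_weight=1.0):
    # Variational autoencoder: reconstruction error plus the KL divergence
    # between the encoder's Gaussian N(mu, exp(log_var)) and a standard normal.
    kl = -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var))
    return mse(x, x_hat) + kl_weight * kl
```

For example, `vae_loss([0.0], [0.0], mu=[0.0], log_var=[0.0])` has no reconstruction error and a KL term of zero, because the encoder distribution already equals the standard normal prior.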

Files

Final_report.pdf

Files (2.9 MB)

Checksum Size
md5:9c43bfb7f6d38bd036835a26476979cc 848.3 kB
md5:ae9b21b795fb91b45f6b50f5f747cb01 1.6 MB
md5:91cc10589b3fd0cc309febffd00a8d3c 459.2 kB