LHCb Particle Identification Compression Challenge

Weisser, Constantin

doi:10.5281/zenodo.1231531

Published April 26, 2018 | Version v2

Dataset Open

Weisser, Constantin¹

1. Massachusetts Institute of Technology

A dataset containing Particle Identification information for a challenge to compress the contained features.

An example of how to use this dataset and a compression benchmark is given here: https://github.com/weissercn/LHCb_PID_Compression

Missing values are encoded as -999.
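A minimal sketch of how the -999 sentinel might be handled when reading the file, using only the Python standard library. It assumes the CSV has a header row and is comma-delimited; neither is confirmed by this record. The column names in the in-memory sample and the helper names `parse_cell` and `read_rows` are hypothetical.

```python
import csv
import io
import math

MISSING = -999.0  # missing-value identifier stated in the dataset description


def parse_cell(text, missing=MISSING):
    """Convert one CSV cell to float, mapping the sentinel to NaN."""
    try:
        value = float(text)
    except (TypeError, ValueError):
        return text  # leave non-numeric or absent cells unchanged
    return math.nan if value == missing else value


def read_rows(handle, missing=MISSING):
    """Yield one dict per CSV row (assumes a header row is present)."""
    for row in csv.DictReader(handle):
        yield {name: parse_cell(text, missing) for name, text in row.items()}


# Tiny in-memory stand-in for the real 2.0 GB file; column names are hypothetical.
sample = io.StringIO("S0x0,S0aux0,pid\n1.5,-999,11\n-999,0.25,211\n")
rows = list(read_rows(sample))
```

Replacing the sentinel with NaN keeps a missing entry from being mistaken for a genuine measurement when features are scaled or fed to a model. For the real file, pass an open file handle to `read_rows` instead of the sample buffer.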

The goal is to compress the features whose names have the form S[n]x[m], where [n] and [m] are integers. Features named S[n]aux[m] hold auxiliary information that is available at all times, during both compression and decompression, so they do not have to be compressed. The feature labelled 'pid' is the truth label giving the particle type of each example. It is not available in real running conditions or during compression or decompression, and may only be used to evaluate how well the autoencoder performed. The example code linked above includes such an evaluation.
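The naming convention described here can be turned into a simple column filter. The sketch below sorts column names into compressible, auxiliary, and truth groups with regular expressions. The function name `split_columns`, the group labels, and the sample names in the usage line are illustrative, not taken from the dataset or the example repository.

```python
import re

# Patterns follow the naming scheme in the description:
# S[n]x[m] is compressible, S[n]aux[m] is auxiliary, [n] and [m] are integers.
COMPRESSIBLE = re.compile(r"S\d+x\d+")
AUXILIARY = re.compile(r"S\d+aux\d+")


def split_columns(names):
    """Group column names by their role in the compression challenge."""
    groups = {"compress": [], "aux": [], "truth": [], "other": []}
    for name in names:
        if COMPRESSIBLE.fullmatch(name):
            groups["compress"].append(name)
        elif AUXILIARY.fullmatch(name):
            groups["aux"].append(name)
        elif name == "pid":
            groups["truth"].append(name)  # evaluation only, never a model input
        else:
            groups["other"].append(name)
    return groups


# Hypothetical column names, for illustration only.
groups = split_columns(["S0x0", "S12x3", "S0aux1", "pid", "index"])
```

Keeping 'pid' in its own group makes it harder to pass the truth label to the encoder or decoder by accident.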

Notes

This version distinguishes between auxiliary and compressible variables.

Files

Files (2.0 GB)

Name	Size
LHCb_PID_obscured.csv md5:4bf4eab072df3919ce0a47a6b316d788	2.0 GB
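Since the record lists an MD5 checksum for the file, a download can be checked against it. The following is a minimal sketch using Python's `hashlib`; the helper names `md5_of_file` and `is_intact` and the chunk size are illustrative choices, and the local file path is whatever location the file was saved to.

```python
import hashlib

# Checksum listed for LHCb_PID_obscured.csv in this record.
EXPECTED_MD5 = "4bf4eab072df3919ce0a47a6b316d788"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, expected=EXPECTED_MD5):
    """True if the file's MD5 digest matches the expected checksum."""
    return md5_of_file(path) == expected
```

Reading in chunks matters here because the file is about 2.0 GB; calling `is_intact("LHCb_PID_obscured.csv")` after the download finishes would flag a truncated or corrupted copy.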

	All versions	This version
Views	605	326
Downloads	140	97
Data volume	297.0 GB	205.5 GB
