Top Quark Tagging Reference Dataset

Kasieczka, Gregor; Plehn, Tilman; Thompson, Jennifer; Russell, Michael

doi:10.5281/zenodo.2603256

Published March 22, 2019 | Version v0 (2018_03_27)

Dataset Open

1. Institut für Experimentalphysik, Universität Hamburg, Germany
2. Institut für Theoretische Physik, Universität Heidelberg, Germany

A set of MC simulated training/testing events for the evaluation of top quark tagging architectures.

The dataset contains 1.2M training events, 400k validation events, and 400k test events in total. Use “train” for training, “val” for validation during training, and “test” for final testing and reporting results.

Description

14 TeV, hadronic tops for signal, QCD dijets for background, Delphes ATLAS detector card with Pythia8
No MPI/pile-up included
Clustering of particle-flow entries (produced by Delphes E-flow) into anti-kT 0.8 jets in the pT range [550,650] GeV
All top jets are matched to a parton-level top within ∆R = 0.8, and to all top decay partons within 0.8
Jets are required to have |eta| < 2
The leading 200 jet constituent four-momenta are stored, with zero-padding for jets with fewer than 200
Constituents are sorted by pT, with the highest pT one first
The truth top four-momentum is stored as truth_px etc.
A flag (1 for top, 0 for QCD) is kept for each jet. It is called is_signal_new
The variable "ttv" (= test/train/validation) is kept for each jet. It indicates to which dataset the jet belongs. It is redundant as the different sets are already distributed as different files.
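
The storage conventions listed above (zero-padding to 200 constituents, pT ordering, one four-momentum per constituent) can be illustrated with a minimal sketch. It does not read the dataset files: it assumes the constituents have already been reshaped into a NumPy array of shape (n_jets, 200, 4) ordered as (E, px, py, pz), which is a choice made here and not something this page specifies. The function names and the toy jet are hypothetical.

```python
import numpy as np

MAX_CONSTITUENTS = 200  # from the description: leading 200 constituents are stored


def four_momenta_from_pt_eta_phi(pt, eta, phi):
    """Build massless (E, px, py, pz) rows; used only to create toy input."""
    px = pt * np.cos(phi)
    py = pt * np.sin(phi)
    pz = pt * np.sinh(eta)
    energy = pt * np.cosh(eta)
    return np.stack([energy, px, py, pz], axis=-1)


def jet_summary(constituents):
    """Summarise zero-padded jets.

    constituents: array of shape (n_jets, 200, 4), assumed ordered (E, px, py, pz).
    All-zero rows are treated as padding.
    """
    energy, px, py, pz = np.moveaxis(constituents, -1, 0)
    pt = np.hypot(px, py)

    # Number of stored (non-padding) constituents per jet.
    n_constituents = (energy > 0).sum(axis=1)

    # Constituents should be ordered by decreasing pT; padding has pT = 0.
    pt_sorted = np.all(np.diff(pt, axis=1) <= 0, axis=1)

    # Jet four-momentum is the sum over constituents (padding adds nothing).
    jet_e, jet_px, jet_py, jet_pz = (c.sum(axis=1) for c in (energy, px, py, pz))
    jet_pt = np.hypot(jet_px, jet_py)
    jet_eta = np.arcsinh(jet_pz / jet_pt)
    mass_squared = jet_e**2 - jet_px**2 - jet_py**2 - jet_pz**2
    jet_mass = np.sqrt(np.clip(mass_squared, 0.0, None))

    return {
        "n_constituents": n_constituents,
        "pt_sorted": pt_sorted,
        "jet_pt": jet_pt,
        "jet_eta": jet_eta,
        "jet_mass": jet_mass,
    }


# Toy jet with three constituents, zero-padded to 200 rows.
toy = np.zeros((1, MAX_CONSTITUENTS, 4))
toy[0, :3] = four_momenta_from_pt_eta_phi(
    np.array([300.0, 200.0, 100.0]),
    np.array([0.1, 0.0, -0.1]),
    np.array([0.0, 0.2, -0.2]),
)
summary = jet_summary(toy)
```

On real data, the same function could be used as a sanity check that the summed jet pT falls in the [550,650] GeV window and that |eta| < 2. Jets with more than 200 constituents are truncated in the files, so the summed values may differ slightly from the original jet.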

Files

Files (1.7 GB)

Name	Size	MD5
test.h5	347.8 MB	13163479dee30a5fe546e4536cc3d04d
train.h5	1.0 GB	45663819f47c13724f67eb0fd80bfa5c
val.h5	347.4 MB	dca4b7248027618f041f9baa86d360fc
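
Since MD5 checksums are published for all three files, a download can be checked before use. The following is a minimal sketch using only the Python standard library. The helper names are hypothetical, and it assumes the files were saved under their original names.

```python
import hashlib

# Checksums as listed in the file table above.
EXPECTED_MD5 = {
    "test.h5": "13163479dee30a5fe546e4536cc3d04d",
    "train.h5": "45663819f47c13724f67eb0fd80bfa5c",
    "val.h5": "dca4b7248027618f041f9baa86d360fc",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_md5(path, name):
    """Compare a local file against the published checksum for `name`."""
    return md5_of_file(path) == EXPECTED_MD5[name]
```

For example, `matches_published_md5("train.h5", "train.h5")` should return True for an intact download. Reading in chunks avoids loading the 1.0 GB training file into memory at once.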

Additional details

Butter, Anja; Kasieczka, Gregor; Plehn, Tilman and Russell, Michael (2017). Based on data from 10.21468/SciPostPhys.5.3.028 (1707.08966)
Kasieczka, Gregor et al. (2019). Dataset used for arXiv:1902.09914 (The Machine Learning Landscape of Top Taggers)
