Published August 8, 2019 | Version v0
Dataset Open

CMS 2011A Simulation | Pythia 6 QCD 170-300 | pT > 375 GeV | MOD HDF5 Format


Simulated QCD jets from the Simulated QCD 170-300 Dataset of the CMS 2011 Open Data reprocessed into the MOD HDF5 format. Jets are provided at generator (truth) level in the GEN files and after GEANT4 detector simulation in the SIM files (which also contain associated GEN jets to facilitate studies involving both types of jets). Jets are selected from the hardest two anti-kT R=0.5 jets in events passing the Jet300 High Level Trigger (only relevant for SIM) and are required to have \(p_T^\text{jet}>375\) GeV, where \(p_T^\text{jet}\) includes a jet energy correction factor (again, only relevant for SIM). GEN jets contain truth-level particles with kinematic and PDG ID information, and SIM jets contain Particle Flow Candidates (PFCs) with kinematic, PDG ID, and vertex information. Additionally, jets have metadata describing their kinematics and provenance in the original CMS AOD files.
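
The jet selection described above can be illustrated with a minimal NumPy sketch. It is not the code used to produce the dataset: the function name `select_jets`, its arguments, and the choice to rank jets by corrected \(p_T\) are assumptions made for illustration, and the Jet300 trigger requirement is not modeled.

```python
import numpy as np

PT_MIN_GEV = 375.0  # threshold quoted in the dataset description


def select_jets(raw_pts, jec_factors, pt_min=PT_MIN_GEV, n_hardest=2):
    """Return indices of the jets kept in a single event (hypothetical helper).

    raw_pts     : uncorrected jet transverse momenta in GeV
    jec_factors : multiplicative jet energy correction factor for each jet
    """
    # Corrected pT = raw pT times the jet energy correction factor.
    corr_pts = np.asarray(raw_pts, dtype=float) * np.asarray(jec_factors, dtype=float)

    # Assumption: "hardest" is decided by corrected pT. Keep the top n_hardest.
    hardest = np.argsort(corr_pts)[::-1][:n_hardest]

    # Of those candidates, keep only jets above the pT threshold.
    return [int(i) for i in hardest if corr_pts[i] > pt_min]


# Example: the third jet has the second-largest raw pT, but its correction
# factor moves it below the second jet, so jets 0 and 1 are the candidates.
kept = select_jets([400.0, 380.0, 390.0, 100.0], [1.0, 1.0, 0.9, 1.0])
```

For GEN jets no correction applies, which corresponds to passing correction factors of 1 in this sketch.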

For additional details about the dataset, please see the accompanying paper, Exploring the Space of Jets with CMS Open Data. There, jets were further restricted to have \(|\eta^\text{jet}|<1.9\) to ensure tracking coverage and (in the case of SIM) have "medium" quality to reject fake jets.
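
The additional restrictions used in the paper can be sketched as a boolean mask over per-jet arrays. The helper name `paper_selection_mask` is hypothetical, and the integer quality encoding (with 2 standing for "medium") is an assumption that should be checked against the EnergyFlow documentation before use.

```python
import numpy as np

ETA_MAX = 1.9       # |eta| bound quoted in the text above
MEDIUM_QUALITY = 2  # assumed integer code for "medium" jet quality


def paper_selection_mask(jet_etas, qualities=None, eta_max=ETA_MAX,
                         min_quality=MEDIUM_QUALITY):
    """Return a boolean mask of jets passing the paper-level cuts.

    jet_etas  : jet pseudorapidities
    qualities : integer jet quality codes (SIM only); pass None for GEN jets,
                where no quality requirement applies
    """
    etas = np.asarray(jet_etas, dtype=float)
    mask = np.abs(etas) < eta_max  # strict inequality, as in the text

    if qualities is not None:
        mask &= np.asarray(qualities) >= min_quality

    return mask


# SIM-style usage: apply both the eta cut and the quality cut.
sim_mask = paper_selection_mask([0.5, -2.0, 1.89, -1.9], qualities=[3, 3, 1, 2])

# GEN-style usage: only the eta cut applies.
gen_mask = paper_selection_mask([0.5, -2.0, 1.89, -1.9])
```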

The supported method for downloading, reading, and using this dataset is through the EnergyFlow Python package, which has additional documentation about how to read and use this and related datasets. Should any problems be encountered, please submit an issue on GitHub.
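
As a rough sketch of what loading through EnergyFlow might look like, the wrapper below calls `energyflow.mod.load` with selection strings matching the cuts from the paper. The wrapper name `load_sim_jets`, the selection-string syntax, and the keyword arguments (`amount`, `dataset`, `cache_dir`) are written from memory of the EnergyFlow documentation and should be treated as assumptions to verify there. Calling the function downloads files from Zenodo if they are not already cached.

```python
# Selection strings in the style accepted by EnergyFlow's MOD loader
# (assumed syntax; verify against the EnergyFlow documentation).
SELECTION = ("abs_jet_eta < 1.9", "quality >= 2")


def load_sim_jets(amount=1, selection=SELECTION, cache_dir="~/.energyflow"):
    """Load SIM jets via EnergyFlow (hypothetical convenience wrapper).

    amount    : passed through to the loader; assumed to control how much
                of the dataset is read
    selection : requirement strings each jet must satisfy
    cache_dir : directory where downloaded files are cached
    """
    # Imported here so that defining the wrapper does not require the package.
    import energyflow as ef

    return ef.mod.load(*selection, amount=amount, dataset="sim",
                       cache_dir=cache_dir)


# Usage (not executed here because it triggers a download):
#     sim = load_sim_jets(amount=1)
```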

For reference, the other corresponding datasets of simulated jets available on Zenodo are:

There is an associated dataset of jets recorded by the CMS detector available on Zenodo:


Files (45.8 MB)

Name Size
14.3 MB
31.5 MB

Additional details

Related works

Is supplement to
arXiv:1908.08542 (arXiv)
