Published March 15, 2023 | Version 1.0
Top quark pair production at the LHC, with all hadronic resolved decays for solving event combinatorics


R&D Datasets for solving event combinatorics in all hadronic top quark pair events at the LHC.

Used in the development of Topographs: Topological Reconstruction of Particle Physics Processes using Graph Neural Networks


The datasets contain 5.8M ttbar events in the all hadronic decay channel, with jets matched to the truth partons in the top quark decays.


Event generation

  • Centre of Mass energy: 13 TeV
  • MC Generator: MadGraph5_aMC@NLO v3.1.0, with MadSpin modelling the decays of the top quarks and W bosons.
  • Parton Shower: Pythia v8.243
  • Detector response: Delphes v3.4.2 using ATLAS-like geometry
  • Jets reconstructed with anti-kt algorithm, R=0.4, using FastJet
  • b-Tagging corresponds to inclusive 70% b-jet efficiency

Event selection and truth matching

  • All events are required to have at least six reconstructed jets and exactly zero leptons (electrons or muons)
  • Partons are matched to jets using \(\Delta R\) matching, with \(\Delta R < 0.4\)
  • Events in which a parton is matched to multiple jets, or a jet is matched to multiple partons, are discarded
  • Up to 16 jets are stored per event
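
The matching and ambiguity rules above can be sketched in plain Python. This is an illustrative sketch only, not the code used to produce the datasets: the function names `delta_r` and `match_partons_to_jets` and the (eta, phi) tuple inputs are assumptions made for the example. Any parton with more than one jet inside the cone, or any jet claimed by more than one parton, marks the event as ambiguous.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance in the eta-phi plane, with delta-phi wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def match_partons_to_jets(partons, jets, max_dr=0.4):
    """Return the parton index per jet (-1 = unmatched), or None if ambiguous.

    partons, jets: lists of (eta, phi) tuples (hypothetical input format).
    """
    jet_to_parton = [-1] * len(jets)
    for p_idx, (p_eta, p_phi) in enumerate(partons):
        close = [j for j, (j_eta, j_phi) in enumerate(jets)
                 if delta_r(p_eta, p_phi, j_eta, j_phi) < max_dr]
        if len(close) > 1:
            return None  # parton matched to multiple jets: discard event
        if close:
            if jet_to_parton[close[0]] != -1:
                return None  # jet matched to multiple partons: discard event
            jet_to_parton[close[0]] = p_idx
    return jet_to_parton
```

A parton with no jet inside the cone simply stays unmatched; the event is kept, and its jets carry -1 where no parton was assigned.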

In the training dataset, 1,340,000 events have all partons from the ttbar decays matched to jets.

In the validation dataset, 71,000 events have all partons from the ttbar decays matched to jets.

In the testing dataset, 76,000 events have all partons from the ttbar decays matched to jets.

Dataset format

The datasets are stored in HDF5 (.h5) format. The key 'delphes' holds the following numpy arrays (per-event length in parentheses where applicable):

jets (16), jets_indices (16), matchability, nbjets, njets, partons (10)

jets:

  • Structured numpy array per event with fields (pt, eta, phi, energy, is_tagged)


jets_indices:

  • Integer corresponding to the parton the jet is matched to
  • Values 0 to 5 correspond to b1, W1j1, W1j2, b2, W2j1, W2j2 (1 = from top, 2 = from antitop)
  • -1 indicates not matched to a parton
  • Properties of matched partons can be obtained from the partons array
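
As an illustration of how jets_indices can be combined with the jet kinematics, the sketch below groups the jets matched to each top quark and computes the invariant mass of each triplet. The function names `four_vector` and `top_candidate_masses` are hypothetical, and the inputs are assumed to be plain Python lists of (pt, eta, phi, energy) tuples rather than the structured arrays stored in the files.

```python
import math

def four_vector(pt, eta, phi, energy):
    """Convert (pt, eta, phi, energy) to Cartesian (E, px, py, pz)."""
    return (energy, pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta))

def top_candidate_masses(jets, jets_indices):
    """Invariant mass of the jets matched to each top, or None if incomplete.

    jets: list of (pt, eta, phi, energy); jets_indices: parton index per jet.
    Parton indices 0-2 belong to the first top, 3-5 to the second.
    """
    masses = []
    for partons in ((0, 1, 2), (3, 4, 5)):
        selected = [jets[i] for i, p in enumerate(jets_indices) if p in partons]
        if len(selected) != 3:
            masses.append(None)  # this top is not fully matched
            continue
        # Sum the four-vectors component by component.
        e, px, py, pz = (sum(c) for c in zip(*(four_vector(*j) for j in selected)))
        m2 = e * e - px * px - py * py - pz * pz
        masses.append(math.sqrt(max(m2, 0.0)))  # guard against rounding below zero
    return masses
```

The mass is returned in whatever energy unit the stored jets use; the unit is not stated in this record.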


matchability:

  • Which partons are matched to jets in the event
  • Binary representation with one bit per parton (6 bits), e.g. 0b111111
  • From left to right: b1 W1j1 W1j2 b2 W2j1 W2j2
  • 0b111000 (56) is one top fully matched, 0b000111 (7) is the other top fully matched, 0b111111 (63) is both tops fully matched
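
The bit layout described above can be decoded with a few lines of Python. This is a minimal sketch; the names `PARTON_NAMES`, `decode_matchability`, and `fully_matched_tops` are introduced here for illustration and are not part of the dataset or any accompanying library.

```python
PARTON_NAMES = ("b1", "W1j1", "W1j2", "b2", "W2j1", "W2j2")

def decode_matchability(value):
    """Map a 6-bit matchability value to {parton name: is matched}.

    The leftmost (most significant) bit corresponds to b1.
    """
    bits = format(value & 0b111111, "06b")
    return {name: bit == "1" for name, bit in zip(PARTON_NAMES, bits)}

def fully_matched_tops(value):
    """Return (first_top_fully_matched, second_top_fully_matched)."""
    return ((value & 0b111000) == 0b111000, (value & 0b000111) == 0b000111)
```

For example, selecting events with `fully_matched_tops(value) == (True, True)` is equivalent to requiring a matchability of 63.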

njets, nbjets:

  • Number of jets and number of b-tagged jets in the event


partons:

  • List of truth particles from the ttbar decay (tops, Ws, quarks), ordered as the top quark and its decay products followed by the anti-top quark and its decay products
  • Fields: PDGID, pt, eta, phi, mass





Files (1.6 GB)

  • 84.4 MB
  • 1.5 GB
  • 78.9 MB

Additional details


Swiss National Science Foundation
Robust Deep Density Models for High-Energy Particle Physics and Solar Flare Analysis (RODEM) CRSII5_193716