Published August 20, 2024 | Version 1
Dataset Open

Dataset for flavour tagging R&D

  • 1. Laboratório de Instrumentação e Física Experimental de Partículas (Portugal)
  • 2. SLAC National Accelerator Laboratory

Description

This is a dataset for flavour tagging R&D.

It consists of b-jets, c-jets and light-jets in equal numbers, with matching distributions of transverse momentum, pseudorapidity and track multiplicity.
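
One possible way to obtain flavour-balanced samples like those described above is to downsample within kinematic cells. The sketch below is only an illustration and not necessarily the procedure used for this dataset: the binning, the dictionary layout of a jet, the flavour labels and the function names are all assumptions.

```python
import bisect
import random
from collections import defaultdict


def bin_key(jet, pt_edges, eta_edges):
    """Return the (pt bin, eta bin, track multiplicity) cell of a jet."""
    return (
        bisect.bisect_right(pt_edges, jet["jet_pt"]),
        bisect.bisect_right(eta_edges, jet["jet_eta"]),
        jet["n_trks"],
    )


def balance_flavours(jets, flavours, pt_edges, eta_edges, seed=0):
    """Keep the same number of jets of each flavour in every cell.

    `jets` is a list of dicts (hypothetical layout) with the keys
    jet_pt, jet_eta, n_trks and jet_flav.
    """
    rng = random.Random(seed)
    cells = defaultdict(lambda: defaultdict(list))
    for jet in jets:
        cells[bin_key(jet, pt_edges, eta_edges)][jet["jet_flav"]].append(jet)

    kept = []
    for by_flavour in cells.values():
        # The rarest flavour in a cell sets how many jets each flavour keeps.
        n_keep = min(len(by_flavour[f]) for f in flavours)
        for f in flavours:
            kept.extend(rng.sample(by_flavour[f], n_keep))
    return kept
```

Downsampling to the rarest flavour per cell equalises both the totals and the per-cell distributions, at the cost of discarding jets in cells where one flavour is scarce.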

The jets are sampled from ttbar events produced in proton-proton collisions at 14 TeV and generated with Pythia8. An ATLAS-like detector is parameterized using Delphes. Jets are defined with the anti-kT algorithm (R=0.4) using calorimeter inputs.

Jets are labelled as b-jets if there is a b-hadron within dR<0.3 of the jet, otherwise as c-jets if there is a c-hadron within dR<0.3, otherwise as light-jets. 
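
The labelling priority above (b first, then c, then light) can be sketched as follows. This is a minimal illustration: it assumes dR is the usual distance in the (eta, phi) plane, and the `(flavour, eta, phi)` tuple layout and the function names are hypothetical.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(d_eta^2 + d_phi^2), with d_phi wrapped into [-pi, pi]."""
    d_phi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, d_phi)


def label_jet(jet_eta, jet_phi, hadrons, max_dr=0.3):
    """Return 'b', 'c' or 'light' for one jet.

    `hadrons` is a list of (flavour, eta, phi) tuples with flavour 'b' or 'c'
    (a hypothetical layout chosen for this example).
    """
    nearby = {
        flavour
        for flavour, eta, phi in hadrons
        if delta_r(jet_eta, jet_phi, eta, phi) < max_dr
    }
    if "b" in nearby:
        return "b"
    if "c" in nearby:
        return "c"
    return "light"
```

For example, `label_jet(0.0, 0.0, [("c", 0.1, 0.1), ("b", 0.2, 0.1)])` gives `"b"`, because a nearby b-hadron takes precedence over a nearby c-hadron.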

Tracks are associated with a jet if they lie within dR<0.4 of it. If a track satisfies this condition for more than one jet, it is assigned only to the closest one. Tracks are represented by their perigee parameters (d0, z0, pt, phi, cotan(theta)), which are smeared according to uncertainties that depend on the track pt and eta.
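
The closest-jet rule for track association can be sketched as below. It assumes dR is computed in the (eta, phi) plane; the `(eta, phi)` tuple inputs and the function names are hypothetical and not taken from the dataset code.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(d_eta^2 + d_phi^2), with d_phi wrapped into [-pi, pi]."""
    d_phi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, d_phi)


def associate_tracks(tracks, jets, max_dr=0.4):
    """Assign each track to the closest jet within max_dr.

    `tracks` and `jets` are lists of (eta, phi) tuples. Returns a dict that
    maps each jet index to the list of track indices assigned to it.
    """
    assignment = {j_idx: [] for j_idx in range(len(jets))}
    for t_idx, (t_eta, t_phi) in enumerate(tracks):
        best_jet, best_dr = None, max_dr
        for j_idx, (j_eta, j_phi) in enumerate(jets):
            dr = delta_r(t_eta, t_phi, j_eta, j_phi)
            if dr < best_dr:  # strictly closer than anything seen so far
                best_jet, best_dr = j_idx, dr
        if best_jet is not None:
            assignment[best_jet].append(t_idx)
    return assignment
```

Tracks that are farther than `max_dr` from every jet are left unassigned, so each track appears in at most one jet.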

This dataset is heavily based on the Secondary Vertex Finding in Jets Dataset and we thank J. Shlomi for the code and input.

Technical info

Full statistics available in: train_*.root (140M jets), validate.root (47M jets), test.root (47M jets)

Reduced statistics available in: train_small.root (2.2M jets), validate_small.root (750k jets), test_small.root (747k jets)

 

Dataset contents & definitions:

  • Jets:
    • jet_pt, jet_eta, jet_phi, jet_M, jet_flav
  • Tracks:
    • n_trks: number of tracks per jet
    • trk_pythia_i
    • trk_pdg_id
    • trk_vtx_index
    • trk_d0, trk_z0, trk_phi, trk_ctgtheta
    • trk_pt, trk_charge, trk_eta
    • trk_d0err, trk_z0err, trk_phierr, trk_ctgthetaerr, trk_pterr
    • trk_prod_x, trk_prod_y, trk_prod_z
    • trk_dec_x, trk_dec_y, trk_dec_z
  • Hadrons:
    • hadron_pdgid
    • hadron_decx, hadron_decy, hadron_decz: decay point
    • hadron_x, hadron_y, hadron_z: production point
  • Vertices:
    • n_vertices: number of distinct particle production points within a jet; vertices within 0.0001 mm of each other are merged
    • true_vtx_x, true_vtx_y, true_vtx_z
    • true_vtx_L3D
    • true_vtx_ntrks
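
The merging of production points within 0.0001 mm that is mentioned for n_vertices could look roughly like the greedy sketch below. The actual merging procedure is not specified here, so the algorithm, the function name and the returned per-point index are assumptions made for illustration.

```python
import math


def merge_vertices(points, tol=1e-4):
    """Greedily merge 3D points (in mm) that lie within `tol` of a known vertex.

    Returns (vertices, index): the list of representative vertex positions and,
    for each input point, the index of the vertex it was assigned to.
    """
    vertices = []
    index = []
    for point in points:
        for v_idx, vertex in enumerate(vertices):
            if math.dist(point, vertex) < tol:
                index.append(v_idx)
                break
        else:
            # No existing vertex is close enough, so this point starts a new one.
            vertices.append(point)
            index.append(len(vertices) - 1)
    return vertices, index
```

With this sketch, the number of merged vertices is `len(vertices)`, and counting how often each value occurs in `index` gives the number of points attached to each vertex.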

Files (137.8 GB)

md5:b1e138e4380cfe4700bcb07a3be0a0f0 (27.3 GB)
md5:db0300032bfe6312fcf1ad2c7a078789 (436.6 MB)
md5:5071d484de42499b6acd8c6b2f00cdb4 (8.1 GB)
md5:022c088efb01881af0cbae663d3de89b (7.0 GB)
md5:da3f44c24bd133c621a9ce40531946ad (8.1 GB)
md5:bbdad7b16050fe6e4bc645aac4b2caf9 (8.1 GB)
md5:2cf7af1850b8a3d5dd7ae533322c623e (8.3 GB)
md5:544b32411a5a5ee9686e129388c0e17e (8.5 GB)
md5:36e196da4c5a9495ab5d3a083f1f9528 (8.6 GB)
md5:ee600c43ab5680d426449061dddd6d41 (8.5 GB)
md5:bafb7f2fcd59390bddc4961c74bd8070 (8.1 GB)
md5:b957d0a868e1382e6cc73f846203fbba (7.8 GB)
md5:442b9eb821b3ae11843479a9e5cae31e (1.3 GB)
md5:e4c36edd74c7b5c371d96403da734d5c (27.4 GB)
md5:604a3d8dd76e1bb8fc4e36b39c959588 (434.8 MB)
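
A downloaded file can be checked against the md5 values listed above with a few lines of standard-library Python. This is a minimal sketch: the helper names are hypothetical, and the file is read in chunks because the files are several GB in size.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file against a listed value such as 'md5:0123...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

Calling `matches_checksum(path, "md5:...")` with the local path of a download and the checksum listed for that file returns `True` when the download is intact.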

Additional details

Related works

Is new version of
Peer review: 10.1103/PhysRevD.110.052010 (DOI)
Dataset: 10.5281/zenodo.4044628 (DOI)
Is part of
Preprint: arXiv:2409.12589 (arXiv)

Funding

Fundação para a Ciência e Tecnologia
FCT RESTART 2023.00042.RESTART
European Union
Marie Skłodowska-Curie 847648
United States Department of Energy
OPERATION AND MAINTENANCE LINEAR ACCELERATOR DEAC02-76SF00515
Knut and Alice Wallenberg Foundation
Knut and Alice Wallenberg Foundation Postdoctoral Scholarship KAW 2022.0358

References

  • Shlomi, J. (2020). Secondary Vertex Finding in Jets Dataset [Data set]. Zenodo. https://doi.org/10.5281/zenodo.4044628
  • Bierlich, C. et al. (2022). A comprehensive guide to the physics and usage of PYTHIA 8.3. SciPost Phys. Codeb. 2022 (2022) 8. arXiv:2203.11601
  • The DELPHES 3 collaboration (2014). DELPHES 3: a modular framework for fast simulation of a generic collider experiment. JHEP 02 (2014) 057. arXiv:1307.6346