Dataset for flavour tagging R&D
Description
This is a dataset for flavour tagging R&D.
It consists of b-jets, c-jets and light-jets in equal numbers, with matching distributions of transverse momentum, pseudo-rapidity and track multiplicity.
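The balance between flavours can be checked by counting jets per flavour in bins of a kinematic variable. The following is a minimal sketch of such a check, not part of the dataset tooling. The flavour labels ("b", "c", "light"), the bin edges and the assumption that jet_pt is in GeV are illustrative only, since the encoding of jet_flav and the units are not stated here.

```python
from bisect import bisect_right
from collections import Counter


def counts_per_bin(values, flavours, edges):
    """Count jets per (bin index, flavour).

    values   : per-jet quantity, e.g. jet_pt (units assumed, not confirmed)
    flavours : per-jet flavour label (hypothetical encoding)
    edges    : increasing bin edges; values outside [edges[0], edges[-1]) are skipped
    """
    counts = Counter()
    for value, flav in zip(values, flavours):
        if edges[0] <= value < edges[-1]:
            bin_index = bisect_right(edges, value) - 1
            counts[(bin_index, flav)] += 1
    return counts


# Toy input: two bins, one jet of each flavour in each bin.
toy_pt = [25.0, 30.0, 60.0, 70.0, 25.0, 60.0]
toy_flav = ["b", "c", "b", "c", "light", "light"]
toy_counts = counts_per_bin(toy_pt, toy_flav, edges=[20.0, 50.0, 100.0])
```

If the flavours are balanced in the chosen variable, the counts for a given bin index should agree across flavours up to statistical fluctuations.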
The jets are sampled from ttbar events produced from proton-proton collisions at 14 TeV, using Pythia8. An ATLAS-like detector is parameterized using Delphes. The anti-kT R=0.4 algorithm with calorimeter inputs is used to define the jets.
Jets are labelled as b-jets if there is a b-hadron within dR<0.3 of the jet, otherwise as c-jets if there is a c-hadron within dR<0.3, otherwise as light-jets.
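The labelling priority described above can be sketched as follows. This is an illustration under assumptions, not the code used to produce the dataset: dR is taken to be the distance in the (eta, phi) plane, hadrons are assumed to be already classified as b- or c-hadrons, and the function names and the string labels are hypothetical.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Distance in the (eta, phi) plane, with the phi difference wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def label_jet(jet, b_hadrons, c_hadrons, max_dr=0.3):
    """Label a jet from nearby hadrons; jet and hadrons are (eta, phi) tuples.

    b-hadrons take priority over c-hadrons, and a jet with neither
    within max_dr is labelled as a light-jet.
    """
    if any(delta_r(*jet, *hadron) < max_dr for hadron in b_hadrons):
        return "b"
    if any(delta_r(*jet, *hadron) < max_dr for hadron in c_hadrons):
        return "c"
    return "light"
```

A jet with both a b-hadron and a c-hadron inside the cone is labelled as a b-jet, which is the case that makes the order of the checks matter.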
A track is associated with a jet if it lies within dR<0.4 of the jet. If this condition holds for more than one jet, the track is assigned only to the closest one. Tracks are represented by their perigee parameters (d0, z0, pt, phi, cotan(theta)), which are smeared according to uncertainties that depend on track pt and track eta.
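The association rule can be sketched as a nearest-jet search with a cut at dR<0.4. This is a minimal illustration, assuming dR is computed in the (eta, phi) plane; the function names are hypothetical and the code is not taken from the dataset production.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Distance in the (eta, phi) plane, with the phi difference wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def associate_tracks(tracks, jets, max_dr=0.4):
    """For each track, return the index of the closest jet within max_dr, or None.

    tracks and jets are lists of (eta, phi) tuples.
    """
    assignment = []
    for trk_eta, trk_phi in tracks:
        best_index, best_dr = None, max_dr
        for index, (jet_eta, jet_phi) in enumerate(jets):
            dr = delta_r(trk_eta, trk_phi, jet_eta, jet_phi)
            if dr < best_dr:  # strictly closer than anything seen so far
                best_index, best_dr = index, dr
        assignment.append(best_index)
    return assignment


# Toy input: two jets 0.5 apart in eta, and three tracks.
toy_jets = [(0.0, 0.0), (0.5, 0.0)]
toy_tracks = [(0.2, 0.0), (0.3, 0.0), (2.0, 2.0)]
toy_assignment = associate_tracks(toy_tracks, toy_jets)
```

The first two toy tracks are within 0.4 of both jets, so each is assigned to the nearer one only; the third is far from both and stays unassociated.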
This dataset is heavily based on the Secondary Vertex Finding in Jets Dataset and we thank J. Shlomi for the code and input.
Technical info
Full statistics available in: train_*.root (140M jets), validate.root (47M jets), test.root (47M jets)
Reduced statistics available in: train_small.root (2.2M jets), validate_small.root (750k jets), test_small.root (747k jets)
Dataset contents & definitions:
- Jets:
  - jet_pt, jet_eta, jet_phi, jet_M, jet_flav
- Tracks:
  - n_trks: number of tracks per jet
  - trk_pythia_i
  - trk_pdg_id
  - trk_vtx_index
  - trk_d0, trk_z0, trk_phi, trk_ctgtheta
  - trk_pt, trk_charge, trk_eta
  - trk_d0err, trk_z0err, trk_phierr, trk_ctgthetaerr, trk_pterr
  - trk_prod_x, trk_prod_y, trk_prod_z
  - trk_dec_x, trk_dec_y, trk_dec_z
- Hadrons:
  - hadron_pdgid
  - hadron_decx, hadron_decy, hadron_decz: decay point
  - hadron_x, hadron_y, hadron_z: production point
- Vertices:
  - n_vertices: counter for number of particle production points within a jet; vertices within 0.0001 mm of each other are merged
  - true_vtx_x, true_vtx_y, true_vtx_z
  - true_vtx_L3D
  - true_vtx_ntrks
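The merging of production points within 0.0001 mm of each other, mentioned for n_vertices, could look like the following greedy sketch. The exact merging procedure is not specified here, so the choice of a greedy first-match strategy, the use of the 3D Euclidean distance and the function name are all assumptions.

```python
import math


def merge_vertices(points, tol=1e-4):
    """Greedily merge production points closer than tol (in mm).

    points : list of (x, y, z) tuples
    Returns (vertices, index), where vertices holds one representative
    point per merged vertex and index[i] is the vertex of points[i].
    """
    vertices, index = [], []
    for point in points:
        for i, vertex in enumerate(vertices):
            if math.dist(point, vertex) < tol:
                index.append(i)  # join the first vertex within tolerance
                break
        else:
            vertices.append(point)  # no match: start a new vertex
            index.append(len(vertices) - 1)
    return vertices, index


# Toy input: the first two points are 0.00005 mm apart and merge into one vertex.
toy_points = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.00005), (1.0, 0.0, 0.0)]
toy_vertices, toy_index = merge_vertices(toy_points)
```

In this sketch the number of merged vertices plays the role of n_vertices and the per-point index plays the role of trk_vtx_index, although that correspondence is an interpretation of the branch names rather than a documented definition.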
Files
(137.8 GB)
Checksum | Size |
---|---|
md5:b1e138e4380cfe4700bcb07a3be0a0f0 | 27.3 GB |
md5:db0300032bfe6312fcf1ad2c7a078789 | 436.6 MB |
md5:5071d484de42499b6acd8c6b2f00cdb4 | 8.1 GB |
md5:022c088efb01881af0cbae663d3de89b | 7.0 GB |
md5:da3f44c24bd133c621a9ce40531946ad | 8.1 GB |
md5:bbdad7b16050fe6e4bc645aac4b2caf9 | 8.1 GB |
md5:2cf7af1850b8a3d5dd7ae533322c623e | 8.3 GB |
md5:544b32411a5a5ee9686e129388c0e17e | 8.5 GB |
md5:36e196da4c5a9495ab5d3a083f1f9528 | 8.6 GB |
md5:ee600c43ab5680d426449061dddd6d41 | 8.5 GB |
md5:bafb7f2fcd59390bddc4961c74bd8070 | 8.1 GB |
md5:b957d0a868e1382e6cc73f846203fbba | 7.8 GB |
md5:442b9eb821b3ae11843479a9e5cae31e | 1.3 GB |
md5:e4c36edd74c7b5c371d96403da734d5c | 27.4 GB |
md5:604a3d8dd76e1bb8fc4e36b39c959588 | 434.8 MB |
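Given the file sizes, it is worth verifying each download against its MD5 checksum. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and which checksum belongs to which file has to be taken from the record page.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or '<hex>'."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

Reading in chunks keeps the memory use constant, which matters for the multi-gigabyte files. A call would look like `matches_checksum("downloaded_file.root", "md5:<checksum from the listing>")`, where the file name is a placeholder.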
Additional details
Related works
- Is new version of
  - Peer review: 10.1103/PhysRevD.110.052010 (DOI)
  - Dataset: 10.5281/zenodo.4044628 (DOI)
- Is part of
  - Preprint: arXiv:2409.12589 (arXiv)
Funding
- Fundação para a Ciência e Tecnologia
  - FCT RESTART 2023.00042.RESTART
- European Union
  - Marie Skłodowska-Curie 847648
- United States Department of Energy
  - OPERATION AND MAINTENANCE LINEAR ACCELERATOR DEAC02-76SF00515
- Knut and Alice Wallenberg Foundation
  - Knut and Alice Wallenberg Foundation Postdoctoral Scholarship KAW 2022.0358
References
- Shlomi, J. (2020). Secondary Vertex Finding in Jets Dataset [Data set]. Zenodo. https://doi.org/10.5281/zenodo.4044628
- Bierlich, C. et al. (2022). A comprehensive guide to the physics and usage of PYTHIA 8.3. SciPost Phys. Codeb. 2022 (2022) 8. arXiv:2203.11601
- The Delphes3 collaboration. DELPHES 3: a modular framework for fast simulation of a generic collider experiment. JHEP 02 (2014) 057. arXiv:1307.6346