Published October 29, 2024 | Version v1
Dataset Open

PYTHIA Jet Datasets for cDDPM Unfolder

  • 1. Tufts University

Description

The QCD jet datasets used to study unfolding in "Towards Universal Unfolding using Denoising Diffusion Probabilistic Models" were produced with two different detector simulation frameworks:

Data-driven detector smearing:

  • Events were generated using PYTHIA 8.3 for proton-proton collisions at √s = 14 TeV
  • Several physics processes were simulated:
    • ttbar production with various PDFs (CT14lo, NNPDF23, CTEQ6L1)
    • Z+jets (Z → μμ) with CT14lo, NNPDF23, CTEQ6L1
    • W+jets (W → μν) with CT14lo, NNPDF23, CTEQ6L1
    • Dijet production
    • Leptoquark production
  • Jets with radius parameter R = 0.4 were reconstructed using the anti-kT algorithm at particle-level ("gen_jets"), and then detector effects were applied ("reco_jets")
  • Detector effects were simulated using jet resolution functions for pT, η, and φ derived from ATLAS 8 TeV calibration data
  • Phase space bias was applied to some samples to enhance high-pT statistics: (pT_hat/pT_ref)^a with pT_ref = 100 GeV and a = 5
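
The smearing step described above can be illustrated with a minimal sketch. It assumes independent Gaussian resolution functions with made-up widths; the actual resolution functions derived from ATLAS calibration data are not given in this record, and the names `smear_jet`, `wrap_phi`, `REL_PT_RES`, `ETA_RES`, and `PHI_RES` are hypothetical.

```python
import math
import random

# Hypothetical resolution widths, chosen only for illustration.
# They are NOT the ATLAS-derived values used to build the dataset.
REL_PT_RES = 0.10  # relative pT resolution (sigma_pT / pT)
ETA_RES = 0.02     # absolute eta resolution
PHI_RES = 0.02     # absolute phi resolution in radians


def wrap_phi(phi):
    """Map an azimuthal angle back into the interval [-pi, pi)."""
    return (phi + math.pi) % (2.0 * math.pi) - math.pi


def smear_jet(pt, eta, phi, rng):
    """Turn a particle-level jet into a detector-level (pt, eta, phi).

    Assumes independent Gaussian smearing in each variable.
    """
    reco_pt = max(0.0, rng.gauss(pt, REL_PT_RES * pt))  # pT cannot be negative
    reco_eta = rng.gauss(eta, ETA_RES)
    reco_phi = wrap_phi(rng.gauss(phi, PHI_RES))
    return reco_pt, reco_eta, reco_phi


rng = random.Random(42)       # fixed seed so the example is reproducible
gen_jet = (150.0, 0.5, 3.1)   # toy "gen_jets" entry: pT in GeV, eta, phi
reco_jet = smear_jet(*gen_jet, rng)
```

Each "gen_jets" entry is paired with exactly one "reco_jets" entry in this picture, which is what makes the pairs usable as training examples for an unfolding model.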

DELPHES CMS detector simulation:

  • Events were generated using PYTHIA 8.3 for proton-proton collisions at √s = 14 TeV
  • Physics processes included:
    • ttbar production with CTEQ6L1
    • Z+jets (Z → μμ) with CTEQ6L1
    • W+jets (W → μν) with CTEQ6L1
    • Dijet production with CTEQ6L1
    • Leptoquark production with CTEQ6L1
  • Events were passed through DELPHES 3.4.2 fast detector simulation using the CMS detector configuration
  • Jets with radius parameter R = 0.4 were reconstructed using the anti-kT algorithm at both particle level ("gen_jets") and detector level ("reco_jets")
  • Phase space bias was applied to some samples using (pT_hat/pT_ref)^a with pT_ref = 100 GeV and a = 5
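
To make the bias expression concrete, the sketch below evaluates (pT_hat/pT_ref)^a for a few values of pT_hat. It assumes, as is usual for biased sampling, that each event carries a compensating weight equal to the inverse of the bias factor; whether such weights are stored in the files is not stated in this record. The function names are hypothetical.

```python
PT_REF = 100.0  # GeV, reference scale quoted in the description
POWER = 5.0     # exponent a quoted in the description


def bias_factor(pt_hat, pt_ref=PT_REF, power=POWER):
    """Factor by which the sampling rate at pt_hat is enhanced."""
    return (pt_hat / pt_ref) ** power


def compensating_weight(pt_hat, pt_ref=PT_REF, power=POWER):
    """Event weight that undoes the bias (assumed here to be 1 / bias_factor)."""
    return 1.0 / bias_factor(pt_hat, pt_ref, power)


for pt_hat in (50.0, 100.0, 200.0):
    print(pt_hat, bias_factor(pt_hat), compensating_weight(pt_hat))
```

With a = 5, an event at pT_hat = 200 GeV is sampled 32 times more often than it would be without the bias, while one at 50 GeV is sampled 32 times less often. Under the stated assumption, distributions from biased samples need the compensating weights applied before they are compared with unbiased ones.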

 

For both frameworks, each dataset consists of several arrays containing jet kinematic information (pT, η, φ, E, px, py, pz) at both truth ("gen_jets") and detector ("reco_jets") level. Additional features such as event identifiers ("event_num") are included to enable reconstruction of event-level observables.
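
As an illustration of how "event_num" can be used, the sketch below groups jets by event and computes the invariant mass of the two highest-pT jets. It assumes one array entry per jet and uses toy values in place of the real arrays; the file format and exact array layout are not specified in this record, and the helper names are hypothetical.

```python
import math
from collections import defaultdict

# Toy stand-ins for the per-jet arrays (assumed layout: one entry per jet).
event_num = [0, 0, 1, 1, 1]
jets = [  # (E, px, py, pz) in GeV
    (100.0, 60.0, 0.0, 80.0),
    (100.0, -60.0, 0.0, 80.0),
    (50.0, 30.0, 40.0, 0.0),
    (50.0, -30.0, -40.0, 0.0),
    (13.0, 5.0, 0.0, 12.0),
]


def group_by_event(event_ids, four_vectors):
    """Collect the jet four-vectors belonging to each event identifier."""
    events = defaultdict(list)
    for evt, vec in zip(event_ids, four_vectors):
        events[evt].append(vec)
    return events


def dijet_mass(jet_list):
    """Invariant mass of the two highest-pT jets, or None if fewer than two."""
    if len(jet_list) < 2:
        return None
    # Rank jets by transverse momentum, pT = sqrt(px^2 + py^2).
    leading = sorted(jet_list, key=lambda j: math.hypot(j[1], j[2]), reverse=True)[:2]
    e, px, py, pz = (sum(component) for component in zip(*leading))
    m2 = e * e - (px * px + py * py + pz * pz)
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding errors


masses = {evt: dijet_mass(js) for evt, js in group_by_event(event_num, jets).items()}
```

The same grouping can be applied to "gen_jets" and "reco_jets" separately to compare an event-level observable before and after detector effects.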

Files

Files (4.1 GB)

  • md5:8ae0033b8981bffb86dfbeacfc15dbb9 (3.2 GB)
  • md5:ca5c2b6e0cc092fe630ba4fe33f65c10 (896.8 MB)
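
The listed checksums can be used to confirm that a download completed intact. The following is a minimal sketch using Python's standard `hashlib`; the file names are not given in this record, so the path must be supplied by the user, and the helper names are hypothetical.

```python
import hashlib

# Checksum listed above for the 3.2 GB file.
EXPECTED_MD5 = "8ae0033b8981bffb86dfbeacfc15dbb9"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected checksum."""
    return md5_of_file(path) == expected
```

Reading in chunks keeps memory use small even for the multi-gigabyte file.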