Published October 25, 2019 | Version v0
Dataset Open

DCTR: Pythia e+e- -> Z -> dijets datasets

  • 1. Lawrence Berkeley National Lab

Description

A collection of datasets used in the paper "Neural Networks for Full Phase-space Reweighting and Parameter Tuning". Sample code for reproducing the results is available on GitHub.

Each dataset was generated with the Pythia 8.230 event generator. Particle-level \(e^+ e^- \to Z \to \text{dijet}\) events, with about 100 particles per event, are clustered into jets using the anti-\(k_t\) clustering algorithm (R = 0.8) with FastJet 3.0.3. Every jet is represented as a list of constituents, each given as \((p_T, \eta, \phi, \text{particle ID}, \theta)\), where \(\theta = (\texttt{TimeShower:alphaSvalue}, \texttt{StringZ:aLund}, \texttt{StringFlav:probStoUD})\).

Training Datasets:
Each training file contains two arrays, X and Y.
X is an array of jets, and Y is 0 (1) if the jet was generated with default (non-default) Pythia parameters. For a non-default jet (Y=1), the \(\theta\) in each constituent is the value of the Pythia parameters that were used to generate it. Note that for a default jet (Y=0), the \(\theta\) in each constituent is not the set of default Pythia parameters; it is a \(\theta\) sampled uniformly from the same range as for the Y=1 jets.
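As a rough illustration of the layout described above, the following sketch separates the kinematic columns of each constituent from the \(\theta\) columns appended to it. It assumes that X is a dense NumPy array of shape (n_jets, n_constituents, 4 + n_theta) with columns ordered as \((p_T, \eta, \phi, \text{particle ID}, \theta)\). That shape, the helper name split_jets, and the synthetic demo data are assumptions made for illustration and are not taken from the dataset files.

```python
import numpy as np


def split_jets(X, n_theta):
    """Split jets into per-constituent kinematics and per-jet theta.

    Assumption (not confirmed by the dataset description): X has shape
    (n_jets, n_constituents, 4 + n_theta), and the last n_theta columns
    hold the same theta for every constituent of a jet.
    """
    X = np.asarray(X, dtype=float)
    if X.ndim != 3 or X.shape[-1] != 4 + n_theta:
        raise ValueError("expected shape (n_jets, n_constituents, 4 + n_theta)")
    kinematics = X[..., :4]   # (pT, eta, phi, particle ID)
    theta = X[:, 0, 4:]       # theta read from the first constituent of each jet
    return kinematics, theta


# Synthetic stand-in for a 1D training file (one varied parameter).
rng = np.random.default_rng(0)
n_jets, n_const = 4, 5
kin = rng.normal(size=(n_jets, n_const, 4))
theta_per_jet = rng.uniform(0.10, 0.18, size=(n_jets, 1))
theta_cols = np.repeat(theta_per_jet[:, None, :], n_const, axis=1)
X_demo = np.concatenate([kin, theta_cols], axis=-1)
Y_demo = np.array([0, 1, 0, 1])  # 0: default parameters, 1: non-default

kinematics, theta = split_jets(X_demo, n_theta=1)
```

With real data, X and Y would come from a training file instead of the synthetic arrays, and n_theta would be 1 for the 1D datasets and 3 for the 3D dataset.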

The parameters were sampled uniformly in the following ranges:

  • \(\texttt{TimeShower:alphaSvalue} \in [0.10, 0.18]\)
  • \(\texttt{StringZ:aLund} \in [0.50, 0.90]\)
  • \(\texttt{StringFlav:probStoUD} \in [0.10, 0.30]\)

The 1D datasets are labeled by the parameter that was varied, and the 3D dataset varies all three parameters simultaneously.
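The sampling scheme above can be sketched as follows. The ranges are the ones listed in this description; the helper name sample_theta, the seed handling, and the use of NumPy are assumptions for illustration and do not reproduce the authors' generation code.

```python
import numpy as np

# Ranges quoted in the dataset description.
PARAM_RANGES = {
    "TimeShower:alphaSvalue": (0.10, 0.18),
    "StringZ:aLund": (0.50, 0.90),
    "StringFlav:probStoUD": (0.10, 0.30),
}


def sample_theta(n_samples, names=None, seed=None):
    """Draw theta uniformly within the quoted ranges.

    names selects which parameters to vary: one name mimics a 1D dataset,
    None (all three) mimics the 3D dataset.
    """
    names = list(PARAM_RANGES) if names is None else list(names)
    rng = np.random.default_rng(seed)
    low = np.array([PARAM_RANGES[name][0] for name in names])
    high = np.array([PARAM_RANGES[name][1] for name in names])
    # low/high broadcast across the last axis, one column per parameter.
    return rng.uniform(low, high, size=(n_samples, len(names)))


theta_3d = sample_theta(1000, seed=1)
theta_1d = sample_theta(10, names=["StringZ:aLund"], seed=1)
```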

Test Datasets:
Each test dataset consists of a dictionary containing:

  • 'jet': the jet constituents
  • 'multiplicity': number of particles in the jet
  • 'tau21': N-subjettiness observable
  • 'tau32': N-subjettiness observable
  • 'ECF_N3_B4': Energy Correlation Function (N=3, \(\beta\)=4)
  • 'ECF_N4_B4': Energy Correlation Function (N=4, \(\beta\)=4)

The corresponding \(\theta\) values for each test set are described in the paper.
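A minimal sketch of how the test dictionary might be inspected is given below. The key names are the ones listed above; the assumption that each observable is stored as a one-dimensional array with one entry per jet, the helper name summarize_observables, and the synthetic values are hypothetical.

```python
import numpy as np

# Observable keys listed in the dataset description ('jet' is handled separately).
OBSERVABLE_KEYS = ("multiplicity", "tau21", "tau32", "ECF_N3_B4", "ECF_N4_B4")


def summarize_observables(dataset):
    """Return {key: (mean, std)} for each per-jet observable.

    Assumes (hypothetically) that every observable is a 1D array-like
    with one entry per jet.
    """
    summary = {}
    for key in OBSERVABLE_KEYS:
        values = np.asarray(dataset[key], dtype=float)
        summary[key] = (float(values.mean()), float(values.std()))
    return summary


# Synthetic stand-in for a loaded test dataset with three jets.
demo = {
    "jet": np.zeros((3, 5, 4)),
    "multiplicity": np.array([10, 20, 30]),
    "tau21": np.array([0.2, 0.4, 0.6]),
    "tau32": np.array([0.5, 0.6, 0.7]),
    "ECF_N3_B4": np.array([1.0, 2.0, 3.0]),
    "ECF_N4_B4": np.array([0.1, 0.2, 0.3]),
}
summary = summarize_observables(demo)
```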

Files

Files (4.7 GB)

Checksum and size of each file:

  • md5:ec59049d88732242b3e4451f673107c7 (667.4 MB)
  • md5:d3fbfe8930aa78b45596b89e87464dfb (665.9 MB)
  • md5:a7d25719c4a1284d22d08a6af3e35ee1 (661.1 MB)
  • md5:bbb5593aa1ce99be9926087f732cf3dd (686.5 MB)
  • md5:0555c32df68c38ad13a68a67f064bdb7 (365.5 MB)
  • md5:82c765c4442184627275b4c1adebe89b (336.3 MB)
  • md5:9588ebf3e7a0a56f103d2e648184c93e (331.2 MB)
  • md5:74f20d553dc9468844c31cc1d9cf7165 (331.0 MB)
  • md5:1209f29f6b858dea4aad902505e89ba8 (309.0 MB)
  • md5:bf8877f39b51497120e7fbe17b54f559 (378.2 MB)
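Since the files are several hundred megabytes each, it may be worth checking a download against its listed MD5 checksum. The sketch below uses only the Python standard library; the helper names md5_of_file and matches_listed_md5 are hypothetical, and the local file path is whatever name the file was saved under.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, matches_listed_md5(local_path, "md5:ec59049d88732242b3e4451f673107c7") would return True only if the file at local_path is an intact copy of the first file in the listing.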