Test Sets for Jet Anomaly Detection at the LHC
Description
A few datasets have been updated in Version 2.1. The updated files are tagged with 'new' in their file names.
- For the pT ~ 1.2 TeV top jets, the mass of the mother Z' is slightly adjusted so that the jet pT peaks closer to 1.2 TeV.
- In particular, the previous version had a bug in the top jet samples with mass 80 GeV (`*top_m80_100k*`): the top jet mass was not set correctly. Please be careful when using those datasets.
Data Description
These datasets are generated as a series of test sets for anomalous jet tagging at the LHC. They include boosted W jets, Top jets, and Higgs jets. The jet transverse momentum is concentrated around 600 GeV and 1200 GeV (the latter with prefix "pt1200_" in file names). Each file starts from 100k original events from MadGraph, but the final h5 files might contain slightly fewer events due to the fatjet pre-selection. Production processes include:
- pp -> W' -> W (jj) Z( );
- pp -> Z' -> t t~; for m_t = 80 GeV, the decay product W mass is set to 20 GeV.
- pp -> HH -> (hh) (hh), (h -> jj);
Data Generation
Jet samples in this dataset are generated with MadGraph, Pythia8, and Delphes (no pile-up effects are simulated). Particle-flow objects are clustered into jets with FastJet, using the anti-kt algorithm with radius parameter R=1.0.
- Leading jet:
; sub-leading jet:
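For readers unfamiliar with the anti-kt algorithm mentioned above, the distance measures it uses can be sketched as below. This is only an illustration of the standard definitions with R=1.0, not the FastJet implementation used to produce the files; the function names are hypothetical, and the angular distance is written in terms of rapidity y and azimuth phi.

```python
import math

R = 1.0  # jet radius parameter quoted in the text


def delta_r2(y1, phi1, y2, phi2):
    """Squared angular distance, with the azimuthal difference wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return (y1 - y2) ** 2 + dphi ** 2


def antikt_dij(pt1, y1, phi1, pt2, y2, phi2, radius=R):
    """Anti-kt pairwise distance: min(pt_i^-2, pt_j^-2) * dR_ij^2 / R^2."""
    return min(pt1 ** -2, pt2 ** -2) * delta_r2(y1, phi1, y2, phi2) / radius ** 2


def antikt_dib(pt):
    """Anti-kt beam distance: pt_i^-2."""
    return pt ** -2
```

At each step the algorithm merges the pair with the smallest `antikt_dij`, or declares an object a final jet when its `antikt_dib` is the smallest distance. Because the measure is weighted by inverse pT squared, hard particles collect the soft radiation around them first, which is what yields the roughly cone-shaped jets.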
Data Structure
- To get jets: f['objects/jets']
- For jets, there are two datasets: ['constituents', 'obs'] (jet information is stored with the higher-pt jet first).
  - `obs[:, n_j - 1]`: jet four vectors and n-subjettiness for the n_j-th jet (pt, eta, phi, m, tau1, tau2, tau3, tau4, tau5)
  - `constituents[:, n_j - 1]`: pt-sorted (highest first) jet constituent information, stored in variable-length arrays for the n_j-th jet (PID: PDG for tracks; [22] for photons; [0] for neutral hadrons)
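A minimal sketch of reading the jet observables with this layout is shown below. It assumes that `obs` is indexed as (event, jet, observable) with the nine observables in the order listed above, and that the third-party `h5py` package is installed; the helper names are hypothetical and not part of the dataset.

```python
# Column order of the jet observables as listed in the data structure description.
OBS_COLUMNS = ("pt", "eta", "phi", "m", "tau1", "tau2", "tau3", "tau4", "tau5")


def obs_row_to_dict(row):
    """Label one jet's nine observables with the column names above."""
    values = [float(v) for v in row]
    if len(values) != len(OBS_COLUMNS):
        raise ValueError(f"expected {len(OBS_COLUMNS)} values, got {len(values)}")
    return dict(zip(OBS_COLUMNS, values))


def tau_ratio(jet, numerator="tau2", denominator="tau1"):
    """N-subjettiness ratio (tau2/tau1 by default); None if the denominator is not positive."""
    den = jet[denominator]
    return jet[numerator] / den if den > 0 else None


def load_jet_obs(path, n_j=1):
    """Read obs[:, n_j - 1] for the n_j-th jet (1 = leading) from one h5 file."""
    import h5py  # third-party dependency, imported lazily

    with h5py.File(path, "r") as f:
        # Assumption: axis 1 of 'obs' is the jet index, as in the description above.
        return f["objects/jets/obs"][:, n_j - 1]
```

With a downloaded file, `obs_row_to_dict(load_jet_obs(path)[0])` would label the leading jet of the first event, and `tau_ratio` gives the ratios commonly used as two-prong or three-prong discriminants.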
Extra Notes
- Since the dataset is structured as events, only the leading jet is available for W jet samples, while for Top and Higgs jets both the leading and sub-leading jets are valid. One might need to restrict the jet index n_j range accordingly.
  - e.g. to get leading jet constituents: `f["objects/jets/constituents"][:, 0]`
- The file names indicate the corresponding generation process. Each file was generated from 100k original MadGraph events; after the pre-selection, a small fraction of events is discarded.
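The restriction on which jets are valid per sample type can be sketched as a small lookup. The sample labels and function names here are hypothetical (they are supplied by hand, not parsed from file names), and the per-event input is assumed to be indexable by jet slot in the same way as `constituents[:, n_j - 1]`.

```python
# Assumption: valid jet slots (1 = leading, 2 = sub-leading) per sample type,
# following the note above. The labels are chosen by hand for this sketch.
VALID_JET_SLOTS = {"W": (1,), "top": (1, 2), "Higgs": (1, 2)}


def jet_columns(sample_kind):
    """Zero-based jet indices that hold valid signal jets for this sample type."""
    return [n_j - 1 for n_j in VALID_JET_SLOTS[sample_kind]]


def select_valid_jets(per_event_jets, sample_kind):
    """Flatten a sequence of events into a list containing only the valid jets."""
    columns = jet_columns(sample_kind)
    return [event[c] for event in per_event_jets for c in columns]
```

For a W sample this keeps one jet per event, while for Top and Higgs samples it keeps both, so the resulting jet count is up to twice the event count.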
Contact
- We welcome any feedback, suggestions, or requests for new test samples. Please contact chengtaoli.1990@gmail.com for more information.
Files
(10.5 GB)
Checksum | Size |
---|---|
md5:878a2e82124505b473f170bc761e0375 | 415.0 MB |
md5:160449b582832b0f7df649fbf9055106 | 309.1 MB |
md5:9d290f8e0b1cf382014ea896110a1192 | 653.3 MB |
md5:bf10738d086a8867720d1de3619d8ef0 | 653.2 MB |
md5:69151006bbd7ac9ef6165fa49c9f865b | 406.6 MB |
md5:67f4492488021e1a871d61be6455007a | 406.7 MB |
md5:872693cdf8e93fca3fe0d4ca48674904 | 654.3 MB |
md5:ecc587c614acf970f23939846f701abf | 654.7 MB |
md5:c2ca330a8f1620044fd0881a1663b426 | 1.1 GB |
md5:2b804a1c0388b4cd4446146ddf26310e | 875.6 MB |
md5:24a78b4e7b41385b09d1e1f787b6c17e | 206.5 MB |
md5:3caf57fb762192b5c2d463fb6b647cd0 | 206.5 MB |
md5:6b022b77298bd8c4b226f5b8259edd36 | 231.1 MB |
md5:1cdc3b9d84d7e55b6e477503a4650fd8 | 230.9 MB |
md5:3ebc9991a962eaca93215edbefee8c86 | 253.6 MB |
md5:2955d2882458b48c182d7c6e5840b681 | 253.5 MB |
md5:8ca4178c101948b2f653533c66de5486 | 189.7 MB |
md5:ef11fbd81537a67092c6fefd7d7d2e7c | 190.0 MB |
md5:99db554eded764956fdd79114e9b2cca | 346.0 MB |
md5:1bea3066d64a7ed5031771f20af1cfde | 347.2 MB |
md5:808c88ebf6da4a7c03152a489115f942 | 237.4 MB |
md5:8b744da7d65106b44e7e11bbe4c84488 | 236.4 MB |
md5:927e8848ae3727a87f853e93915b1082 | 174.6 MB |
md5:02bcf73b3ff7c9d1d96b2bb25580d5f6 | 174.8 MB |
md5:bb7669eb2dc7bd73261b93a0130086af | 189.2 MB |
md5:aa9e8b182d2433c512b360598dd827a6 | 188.5 MB |
md5:bb88288e6d00b8e16fdfddfb3162f877 | 194.9 MB |
md5:d0152da6c7eabe19c7b243c575e14cd3 | 193.9 MB |
md5:10c694766b0601b723c9bec31196ecb4 | 162.7 MB |
md5:ceb2faa355e39cfb638133e92cab9443 | 162.7 MB |
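Since each file in the listing above carries an md5 checksum, a download can be checked against it before use. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and the local path is whatever the file was saved as.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file against a listed checksum given as 'md5:<hex>' or as bare hex."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

A mismatch usually points to an interrupted or corrupted download, in which case fetching the file again is the simplest remedy.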