TrackML Throughput Phase
Creators
- Salzburger, Andreas (1)
- Innocente, Vincenzo (1)
- Vlimant, Jean-Roch (2)
- Rousseau, David (3)
- Gligorov, Vladimir (3)
- Basara, Laurent (3)
- Estrade, Victor (3)
- Calafiura, Paolo (4)
- Farell, Steven (4)
- Gray, Heather (4)
- Golling, Tobias (5)
- Kiehn, Moritz (5)
- Amrouche, Sabrina (5)
- Hushchyn, Mikhail (6)
- Ustyuzhanin, Andrey (6)
- Moyse, Edward (7)
- Germain, Cecile (8)
- Guyon, Isabelle (8)
Affiliations
- 1. CERN
- 2. California Institute of Technology
- 3. CNRS
- 4. LBNL
- 5. University of Geneva
- 6. School of Data Analysis
- 7. University of Massachusetts
- 8. INRIA
Description
Original source on Codalab: https://competitions.codalab.org/competitions/20112
The dataset comprises multiple independent events, where each event contains simulated measurements (essentially 3D points) of particles generated in a collision between proton bunches at the Large Hadron Collider at CERN. The goal of the tracking machine learning challenge is to group the recorded measurements or hits for each event into tracks, sets of hits that belong to the same initial particle. A solution must uniquely associate each hit to one track. The training dataset contains the recorded hits, their ground truth counterpart and their association to particles, and the initial parameters of those particles. The test dataset contains only the recorded hits.
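The requirement that each hit belongs to exactly one track can be checked mechanically. The following is a minimal sketch, assuming a solution is held as (hit_id, track_id) pairs; that layout and the function name check_unique_assignment are illustrative assumptions, not the challenge's official submission format.

```python
from collections import Counter


def check_unique_assignment(hit_ids, solution):
    """Return the hit ids that are missing from, or repeated in, a solution.

    hit_ids  -- iterable of all hit ids in the event
    solution -- iterable of (hit_id, track_id) pairs (assumed layout)
    """
    counts = Counter(hit_id for hit_id, _ in solution)
    missing = sorted(h for h in hit_ids if counts[h] == 0)
    duplicated = sorted(h for h, n in counts.items() if n > 1)
    return missing, duplicated


# Made-up example: hit 4 is unassigned and hit 2 is assigned twice.
missing, duplicated = check_unique_assignment(
    [1, 2, 3, 4],
    [(1, 10), (2, 10), (2, 11), (3, 12)],
)
print(missing, duplicated)
```

A solution is valid in this sense only when both returned lists are empty.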
Once unzipped, the dataset is provided as a set of plain .csv files. Each event has four associated files that contain hits, hit cells, particles, and the ground truth association between them. The common prefix, e.g. event000000010, is always event followed by 9 digits.
event000000000-hits.csv
event000000000-cells.csv
event000000000-particles.csv
event000000000-truth.csv
event000000001-hits.csv
event000000001-cells.csv
event000000001-particles.csv
event000000001-truth.csv
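The naming scheme above can be turned into a small loader. This is a sketch using only the Python standard library; it assumes each .csv file starts with a header row naming its columns, and the helper names event_prefix and load_event are hypothetical.

```python
import csv
from pathlib import Path

FILE_KINDS = ("hits", "cells", "particles", "truth")


def event_prefix(event_number):
    """Build the common prefix: 'event' followed by 9 zero-padded digits."""
    return f"event{event_number:09d}"


def load_event(directory, event_number):
    """Read the per-event files that exist into lists of dict rows.

    Values are kept as strings; convert columns as needed.
    """
    prefix = event_prefix(event_number)
    tables = {}
    for kind in FILE_KINDS:
        path = Path(directory) / f"{prefix}-{kind}.csv"
        # Some files may be absent, e.g. truth in the test dataset.
        if path.exists():
            with path.open(newline="") as handle:
                tables[kind] = list(csv.DictReader(handle))
    return tables
```

For example, load_event("train", 10) would look for train/event000000010-hits.csv and its three companion files, where "train" stands for whatever directory the archive was unzipped into.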
Event hits
The hits file contains the following values for each hit/entry:
- hit_id: numerical identifier of the hit inside the event.
- x, y, z: measured x, y, z position (in millimeters) of the hit in global coordinates.
- volume_id: numerical identifier of the detector group.
- layer_id: numerical identifier of the detector layer inside the group.
- module_id: numerical identifier of the detector module inside the layer.
The volume/layer/module id could in principle be deduced from x, y, z. They are given here to simplify detector-specific data handling.
Event truth
The truth file contains the mapping between hits and generating particles and the true particle state at each measured hit. Each entry maps one hit to one particle.
- hit_id: numerical identifier of the hit as defined in the hits file.
- particle_id: numerical identifier of the generating particle as defined in the particles file. A value of 0 means that the hit did not originate from a reconstructible particle, but e.g. from detector noise.
- tx, ty, tz: true intersection point in global coordinates (in millimeters) between the particle trajectory and the sensitive surface.
- tpx, tpy, tpz: true particle momentum (in GeV/c) in the global coordinate system at the intersection point. The corresponding vector is tangent to the particle trajectory at the intersection point.
- weight: per-hit weight used for the scoring metric; the sum of weights within one event equals one.
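The truth columns are enough to rebuild the ground-truth tracks and to sanity-check the weights. The sketch below assumes rows are dictionaries keyed by the column names above (for instance as produced by csv.DictReader); the function names and the sample rows are made up for illustration.

```python
from collections import defaultdict


def tracks_from_truth(truth_rows):
    """Group hit ids by generating particle, skipping particle_id 0 (noise)."""
    tracks = defaultdict(list)
    for row in truth_rows:
        particle_id = int(row["particle_id"])
        if particle_id != 0:
            tracks[particle_id].append(int(row["hit_id"]))
    return dict(tracks)


def total_weight(truth_rows):
    """Sum the per-hit weights; expected to be close to 1 for a full event."""
    return sum(float(row["weight"]) for row in truth_rows)


# Tiny made-up rows for illustration only (not real data).
truth_rows = [
    {"hit_id": "1", "particle_id": "42", "weight": "0.5"},
    {"hit_id": "2", "particle_id": "42", "weight": "0.25"},
    {"hit_id": "3", "particle_id": "0", "weight": "0"},
    {"hit_id": "4", "particle_id": "7", "weight": "0.25"},
]
print(tracks_from_truth(truth_rows))
print(total_weight(truth_rows))
```

With real data the weight sum is a floating-point total, so compare it to one with a tolerance rather than exact equality.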
Event particles
The particles file contains the following values for each particle/entry:
- particle_id: numerical identifier of the particle inside the event.
- vx, vy, vz: initial position or vertex (in millimeters) in global coordinates.
- px, py, pz: initial momentum (in GeV/c) along each global axis.
- q: particle charge (as multiple of the absolute electron charge).
- nhits: number of hits generated by this particle.
All entries contain the generated information or ground truth.
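Derived kinematic quantities follow directly from px, py, pz. As a sketch, the snippet below computes the momentum magnitude and the transverse momentum, assuming the global z axis is the beam axis (a common convention that is not stated in this description); the function name is hypothetical.

```python
import math


def momentum_summary(px, py, pz):
    """Return (|p|, pT) in GeV/c from the momentum components.

    pT is the component perpendicular to the z axis, which is
    assumed here to be the beam axis.
    """
    p = math.sqrt(px * px + py * py + pz * pz)
    pt = math.hypot(px, py)
    return p, pt


# Made-up values chosen so the results are easy to verify by hand.
p, pt = momentum_summary(3.0, 4.0, 12.0)
print(p, pt)
```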
Event hit cells
The cells file contains the constituent active detector cells that comprise each hit. The cells can be used to refine the hit to track association. A cell is the smallest granularity inside each detector module, much like a pixel on a screen, except that depending on the volume_id a cell can be a square or a long rectangle. It is identified by two channel identifiers that are unique within each detector module and encode the position, much like column/row numbers of a matrix. A cell can provide signal information that the detector module has recorded in addition to the position. Depending on the detector type only one of the channel identifiers is valid, e.g. for the strip detectors, and the value might have different resolution.
- hit_id: numerical identifier of the hit as defined in the hits file.
- ch0, ch1: channel identifier/coordinates unique within one module.
- value: signal value information, e.g. how much charge a particle has deposited.
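Because several cells can belong to one hit, a per-hit summary is often a useful first step before refining the hit-to-track association. The following sketch counts the cells and sums the signal value for each hit; the function name and the sample rows are illustrative assumptions, and rows are taken to be dictionaries keyed by the column names above.

```python
from collections import defaultdict


def summarize_cells(cell_rows):
    """Return {hit_id: (number_of_cells, summed_value)}."""
    count = defaultdict(int)
    total = defaultdict(float)
    for row in cell_rows:
        hit_id = int(row["hit_id"])
        count[hit_id] += 1
        total[hit_id] += float(row["value"])
    return {hit_id: (count[hit_id], total[hit_id]) for hit_id in count}


# Made-up rows for illustration only (not real data).
cell_rows = [
    {"hit_id": "1", "ch0": "10", "ch1": "20", "value": "0.25"},
    {"hit_id": "1", "ch0": "11", "ch1": "20", "value": "0.5"},
    {"hit_id": "2", "ch0": "3", "ch1": "0", "value": "1.0"},
]
summary = summarize_cells(cell_rows)
print(summary)
```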
Additional detector geometry information
The detector is built from silicon slabs (or modules, rectangular or trapezoidal), arranged in cylinders and disks, which measure the position (or hits) of the particles that cross them. The detector modules are organized into detector groups or volumes identified by a volume_id. Inside a volume they are further grouped into layers identified by a layer_id. Each layer can contain an arbitrary number of detector modules, the smallest geometrically distinct detector object, each identified by a module_id. Within each group, detector modules are of the same type and have, e.g., the same granularity. All simulated detector modules are so-called semiconductor sensors that are built from thin silicon sensor chips. Each module can be represented by a two-dimensional, planar, bounded sensitive surface. These sensitive surfaces are subdivided into regular grids that define the detector cells, the smallest granularity within the detector.
Each module has a different position and orientation described in the detectors file. A local, right-handed coordinate system is defined on each sensitive surface such that the first two coordinates u and v are on the sensitive surface and the third coordinate w is normal to the surface. The orientation and position are defined by the following transformation
pos_xyz = rotation_matrix * pos_uvw + translation
which transforms a position described in local coordinates u,v,w into the equivalent position x,y,z in global coordinates using a rotation matrix and a translation vector (cx,cy,cz).
- volume_id: numerical identifier of the detector group.
- layer_id: numerical identifier of the detector layer inside the group.
- module_id: numerical identifier of the detector module inside the layer.
- cx, cy, cz: position of the local origin in the global coordinate system (in millimeters).
- rot_xu, rot_xv, rot_xw, rot_yu, ...: components of the rotation matrix to rotate from local u,v,w to global x,y,z coordinates.
- module_t: half thickness of the detector module (in millimeters).
- module_minhu, module_maxhu: the minimum/maximum half-length of the module boundary along the local u direction (in millimeters).
- module_hv: the half-length of the module boundary along the local v direction (in millimeters).
- pitch_u, pitch_v: the size of detector cells along the local u and v direction (in millimeters).
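The transformation pos_xyz = rotation_matrix * pos_uvw + translation can be written out in a few lines. This sketch takes the rotation matrix as an already assembled 3x3 nested list (rows x, y, z; columns u, v, w) and the translation as (cx, cy, cz); the function names and the example module are made up, and the inverse relies on the standard fact that the inverse of a rotation matrix is its transpose.

```python
def local_to_global(pos_uvw, rotation, translation):
    """Apply pos_xyz = rotation * pos_uvw + translation.

    rotation    -- 3x3 nested list; rows are x, y, z, columns are u, v, w
    translation -- (cx, cy, cz), the local origin in global coordinates
    """
    return tuple(
        sum(rotation[i][j] * pos_uvw[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def global_to_local(pos_xyz, rotation, translation):
    """Invert the transform using the transpose of the rotation matrix."""
    shifted = [pos_xyz[i] - translation[i] for i in range(3)]
    return tuple(
        sum(rotation[i][j] * shifted[i] for i in range(3))
        for j in range(3)
    )


# Made-up module: local u along global y, v along global z, w along global x.
rotation = [
    [0.0, 0.0, 1.0],  # first row: rot_xu, rot_xv, rot_xw
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
]
translation = (32.0, 0.0, -500.0)

xyz = local_to_global((1.5, -2.0, 0.0), rotation, translation)
uvw = global_to_local(xyz, rotation, translation)
print(xyz, uvw)
```

Points on the sensitive surface have w = 0, so a round trip through both functions should return the original (u, v, 0) up to floating-point error.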
There are two different module shapes in the detector, rectangular and trapezoidal. The pixel detector (with volume_id = 7, 8, 9) is fully built from rectangular modules, and so are the cylindrical barrels in volume_id = 13, 17. The remaining layers are made of disks that need trapezoidal shapes to cover the full disk.
Files
Total size: 67.6 GB. The detector geometry is provided in detectors.csv.
MD5 checksum | Size |
---|---|
440dbb939f62cc94de563de8cdae47d1 | 1.9 MB |
72380b0d17260cc1b450ac30dbd5c24f | 7.0 GB |
4ef4e1fc3fb4f2386c71bb88c5482a13 | 7.3 GB |
b0c407c72e2753ed4e1e619c6c030e62 | 6.9 GB |
6a3e8529a95be085b564f9391f8c4c87 | 7.7 GB |
11180caac1a0b125ac13d65938c21354 | 7.7 GB |
2f1df2e1480925466b69454ab4eba427 | 7.7 GB |
d5fe471441dd450ede18ff3956ab72c0 | 7.7 GB |
1274fb71105334e9915686fec5761b6c | 7.7 GB |
109dd519f313909fff98c71a2b2b96b6 | 7.7 GB |
Additional details
References
- The Acts project: track reconstruction software for HL-LHC and beyond https://doi.org/10.1051/epjconf/202024510003