Dataset Open Access

Top Quark Tagging Reference Dataset

Kasieczka, Gregor; Plehn, Tilman; Thompson, Jennifer; Russell, Michael


MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nmm##2200000uu#4500</leader>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">Butter, Anja; Kasieczka, Gregor; Plehn, Tilman and Russell, Michael (2017). Based on data from 10.21468/SciPostPhys.5.3.028 (1707.08966)</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">Kasieczka, Gregor et al. (2019). Dataset used for arXiv:1902.09914 (The Machine Learning Landscape of Top Taggers)</subfield>
  </datafield>
  <controlfield tag="005">20200124192456.0</controlfield>
  <controlfield tag="001">2603256</controlfield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Institut für Theoretische Physik, Universität Heidelberg, Germany</subfield>
    <subfield code="a">Plehn, Tilman</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Institut für Theoretische Physik, Universität Heidelberg, Germany</subfield>
    <subfield code="a">Thompson, Jennifer</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Institut für Theoretische Physik, Universität Heidelberg, Germany</subfield>
    <subfield code="a">Russell, Michael</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">347849376</subfield>
    <subfield code="z">md5:13163479dee30a5fe546e4536cc3d04d</subfield>
    <subfield code="u">https://zenodo.org/record/2603256/files/test.h5</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">1038496555</subfield>
    <subfield code="z">md5:45663819f47c13724f67eb0fd80bfa5c</subfield>
    <subfield code="u">https://zenodo.org/record/2603256/files/train.h5</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">347378076</subfield>
    <subfield code="z">md5:dca4b7248027618f041f9baa86d360fc</subfield>
    <subfield code="u">https://zenodo.org/record/2603256/files/val.h5</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2019-03-22</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire_data</subfield>
    <subfield code="o">oai:zenodo.org:2603256</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">Institut für Experimentalphysik, Universität Hamburg, Germany</subfield>
    <subfield code="a">Kasieczka, Gregor</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Top Quark Tagging Reference Dataset</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">https://creativecommons.org/licenses/by/4.0/legalcode</subfield>
    <subfield code="a">Creative Commons Attribution 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;A set of MC simulated training/testing events for the evaluation of top quark tagging architectures.&lt;/p&gt;

&lt;p&gt;In total there are 1.2M training events, 400k validation events and 400k test events. Use &amp;ldquo;train&amp;rdquo; for training, &amp;ldquo;val&amp;rdquo; for validation during the training and &amp;ldquo;test&amp;rdquo; for final testing and reporting results.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Description&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;
	&lt;p&gt;14 TeV, hadronic tops for signal, QCD dijets background, Delphes ATLAS detector card with Pythia8&lt;/p&gt;
	&lt;/li&gt;
	&lt;li&gt;
	&lt;p&gt;No MPI/pile-up included&lt;/p&gt;
	&lt;/li&gt;
	&lt;li&gt;
	&lt;p&gt;Clustering of particle-flow entries (produced by Delphes E-flow) into anti-kT 0.8 jets in the pT range [550, 650] GeV&lt;/p&gt;
	&lt;/li&gt;
	&lt;li&gt;
	&lt;p&gt;All top jets are matched to a parton-level top within ∆R = 0.8, and to all top decay partons within 0.8&lt;/p&gt;
	&lt;/li&gt;
	&lt;li&gt;
	&lt;p&gt;Jets are required to have |eta| &amp;lt; 2&lt;/p&gt;
	&lt;/li&gt;
	&lt;li&gt;
	&lt;p&gt;The leading 200 jet constituent four-momenta are stored, with zero-padding for jets with fewer than 200&lt;/p&gt;
	&lt;/li&gt;
	&lt;li&gt;
	&lt;p&gt;Constituents are sorted by pT, with the highest pT one first&lt;/p&gt;
	&lt;/li&gt;
	&lt;li&gt;
	&lt;p&gt;The truth top four-momentum is stored as truth_px etc.&lt;/p&gt;
	&lt;/li&gt;
	&lt;li&gt;
	&lt;p&gt;A flag (1 for top, 0 for QCD) is kept for each jet. It is called is_signal_new&lt;/p&gt;
	&lt;/li&gt;
	&lt;li&gt;
	&lt;p&gt;The variable &amp;quot;ttv&amp;quot; (= test/train/validation) is kept for each jet. It indicates to which dataset the jet belongs. It is redundant as the different sets are already distributed as different files.&lt;/p&gt;
	&lt;/li&gt;
&lt;/ul&gt;</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">isVersionOf</subfield>
    <subfield code="a">10.5281/zenodo.2603255</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.5281/zenodo.2603256</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">dataset</subfield>
  </datafield>
</record>
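
The record above lists each file's URL, size in bytes and MD5 checksum in its 856 datafields. A minimal Python sketch of how a download might be checked against those values follows. The helper names `list_files` and `md5_of` and the local file path are assumptions made for illustration and are not part of the record or of any Zenodo tooling.

```python
import hashlib
import xml.etree.ElementTree as ET

# Namespace declared on the <record> element of the MARC21 export.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}


def list_files(marc_xml):
    """Return url, size and md5 for every 856 datafield of a MARC21 slim record.

    Hypothetical helper: it assumes subfield "u" holds the URL, "s" the size
    in bytes and "z" an "md5:<hex>" string, as in the record above.
    """
    root = ET.fromstring(marc_xml)
    files = []
    for field in root.findall("marc:datafield[@tag='856']", MARC_NS):
        sub = {
            s.get("code"): (s.text or "")
            for s in field.findall("marc:subfield", MARC_NS)
        }
        files.append({
            "url": sub.get("u", ""),
            "size": int(sub.get("s", "0")),
            # Drop the "md5:" prefix, keeping only the hex digest.
            "md5": sub.get("z", "").split(":", 1)[-1],
        })
    return files


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a local file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

With the export saved locally, one would compare `md5_of("test.h5")` (an assumed local path) against the `md5` entry that `list_files` returns for the matching URL. Reading in chunks keeps memory use flat, since the largest listed file is about 1 GB.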