Presentation Open Access

PIDs, Petabytes and Neutrons

Gareth Murphy

MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="">
  <datafield tag="041" ind1=" " ind2=" ">
    <subfield code="a">eng</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">neutron, PID, spallation</subfield>
  </datafield>
  <controlfield tag="005">20200120173452.0</controlfield>
  <controlfield tag="001">2547497</controlfield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">20377407</subfield>
    <subfield code="z">md5:3f852639785a893d357ebe3f714843d0</subfield>
    <subfield code="u"></subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2019-01-23</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire</subfield>
    <subfield code="p">user-pidapalooza19</subfield>
    <subfield code="o"></subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">European Spallation Source ERIC</subfield>
    <subfield code="0">(orcid)0000-0002-2785-3674</subfield>
    <subfield code="a">Gareth Murphy</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">PIDs, Petabytes and Neutrons</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-pidapalooza19</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u"></subfield>
    <subfield code="a">Creative Commons Attribution 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2"></subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;Neutron science generates an exploding volume of scientific research data, which needs to be managed under the FAIR data principles. The European Spallation Source (ESS) is expected to generate tens of petabytes of data per year.&lt;br&gt;
This volume of scientific research data benefits from a &amp;ldquo;PIDcentric&amp;rdquo; approach to managing the many science users, data sets and instruments.&lt;br&gt;
In partnership with the Swiss and Swedish national synchrotron radiation facilities at PSI and MAX IV, ESS has developed SciCat, a new data catalogue. SciCat allows users to register and access data using persistent identifiers.&lt;br&gt;
SciCat users are identified using the ORCID database, published datasets are identified using DOIs, and raw and derived datasets are also assigned identifiers.&lt;br&gt;
SciCat uses a document-oriented database, MongoDB, which allows datasets to be tagged with ad hoc, unstructured scientific metadata as well as traditionally structured metadata. This metadata can then be retrieved using the DOI or Handle.&lt;br&gt;
In day-to-day operations, SciCat also has to deal with the problem of legacy scientific research data. Legacy data with incomplete or lost metadata is challenging to include in a data catalogue. In my session, I will talk about ESS&amp;rsquo;s experiences with PIDs, data catalogues and legacy data.&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">isVersionOf</subfield>
    <subfield code="a">10.5281/zenodo.2547496</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.5281/zenodo.2547497</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">presentation</subfield>
  </datafield>
</record>
                   All versions   This version
Views              264            263
Downloads          105            105
Data volume        2.1 GB         2.1 GB
Unique views       248            247
Unique downloads   93             93

