Published November 8, 2019 | Version v1
Report Open

Provisional Data Management Plan for DiSSCo infrastructure. Deliverable D6.6

  • 1. Cardiff University: Cardiff, South Glamorgan, GB

Description

Executive summary

DiSSCo, the "Distributed System of Scientific Collections", is a pan-European Research Infrastructure mobilising, unifying and delivering bio- and geo-diversity digital information to scientific communities and beyond as a single digital virtual collection. With approximately 1.5 billion objects to be digitised, bringing natural science collections into the information age is expected to result in 100 petabytes of new data over the next two decades, used on average by 5,000 to 15,000 unique users every day.
The DiSSCo Data Management Plan (DMP) is a living document reflecting the active data management planning and stewardship philosophy of DiSSCo. It focuses on achieving maximum accessibility and reusability of data according to the core principles of 'findable, accessible, interoperable and reusable' (FAIR), longevity of data and data preservation, community curation, linking to third-party information and reproducible science. The DiSSCo DMP offers unified data management principles for data providers, data managers and users, and guidance to engineers and programmers on technical standards and best practices. It applies to the data management activities (production and acquisition, curation, publishing, processing and use) of the geographically distributed collection-holding organisations (the DiSSCo Facilities) and to all DiSSCo Hub activities.
DiSSCo adopts Digital Object Architecture (DOA) as its foundation for two reasons: its future-proof flexibility over long timescales in the face of technological change, and because DOA has been shown to offer adherence to the FAIR principles as an integral characteristic, with inherent mechanisms that directly address the specific principles to be promoted. In DOA the core concept is the 'digital object'.
Digitisation is the process of making data about physical objects digitally available, and the outputs of that process are Digital Specimens and Digital Collections. Digital Specimens and Digital Collections are specific types of 'digital objects', which are the fundamental entities subject to data management in DiSSCo.
Each instance of a digital object collects and organizes all the core information about the physical thing it represents. These identified objects are amenable to processing and to transport from one system to another, making DOA a powerful yet simple extension of the existing Internet. A Digital Specimen must maintain a link to the physical specimen it represents. This link is the identifier of the physical specimen. Digital Specimen objects are the principal data that DiSSCo manages. Each Digital Specimen or other digital object instance handled by the DiSSCo infrastructure must be unambiguously, universally and persistently identified by an identifier (Natural Science Identifier, NSId), which shall be assigned when the object is first created. Each DiSSCo Facility shall be responsible for creating (minting) and managing its own NSIds in accordance with the DiSSCo policy for NSIds, and for registering its own Digital Specimens with the DiSSCo Hub infrastructure. Resolution of an NSId shall always return the current version of an object's content, as well as any interpretations and annotations associated with it.
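The identification and resolution rules above can be illustrated with a minimal sketch. It is not the DiSSCo implementation: the class names `DigitalSpecimen` and `HubRegistry`, the field names, and the identifier strings are hypothetical, and the Hub is reduced to an in-memory dictionary. The sketch assumes only what the summary states: an NSId is fixed at creation, a Digital Specimen must link to its physical specimen, and resolution returns the current content together with its annotations.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class DigitalSpecimen:
    """Hypothetical, simplified Digital Specimen record."""
    nsid: str                  # persistent identifier, assigned at creation
    physical_specimen_id: str  # link to the physical specimen it represents
    content: Dict[str, Any] = field(default_factory=dict)
    annotations: List[str] = field(default_factory=list)


class HubRegistry:
    """In-memory stand-in (an assumption) for registration with the Hub."""

    def __init__(self) -> None:
        self._objects: Dict[str, DigitalSpecimen] = {}

    def register(self, specimen: DigitalSpecimen) -> None:
        # The link to the physical specimen is mandatory.
        if not specimen.physical_specimen_id:
            raise ValueError("Digital Specimen must link to a physical specimen")
        # An NSId identifies exactly one object, so duplicates are rejected.
        if specimen.nsid in self._objects:
            raise ValueError(f"NSId already registered: {specimen.nsid}")
        self._objects[specimen.nsid] = specimen

    def resolve(self, nsid: str) -> Dict[str, Any]:
        # Resolution returns the current content plus associated annotations.
        s = self._objects[nsid]
        return {
            "nsid": s.nsid,
            "physical_specimen_id": s.physical_specimen_id,
            "content": dict(s.content),
            "annotations": list(s.annotations),
        }


# Illustrative usage with made-up identifiers and values.
registry = HubRegistry()
specimen = DigitalSpecimen(nsid="nsid:example/0001",
                           physical_specimen_id="EX-HERB-000123")
registry.register(specimen)
specimen.content["scientificName"] = "Bellis perennis"
specimen.annotations.append("identification confirmed")
current = registry.resolve("nsid:example/0001")
```

Because `resolve` reads the stored object at call time, updates made after registration are visible to later resolutions, which mirrors the requirement that an NSId always resolves to the current version.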
The principal object types in DiSSCo (Digital Specimens, Digital Collections) are treated as mutable objects with access control and object history (provenance), meaning that they can be updated as new knowledge becomes available. Provenance data must be generated and preserved by all operations acting upon DiSSCo data objects. Timestamped records of change (provenance data) allow reconstruction of a specific 'version' of a digital object as it existed at a given date and time in the past.
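One way to picture version reconstruction from timestamped provenance is to replay the change records up to the requested moment. The sketch below is an assumption about how this could work, not the mechanism DiSSCo specifies: the `Change` record, its fields, the `reconstruct` function and the sample log are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict, List


@dataclass(frozen=True)
class Change:
    """Hypothetical provenance record: who set which field to what, and when."""
    timestamp: datetime
    agent: str
    field_name: str
    new_value: Any


def reconstruct(log: List[Change], as_of: datetime) -> Dict[str, Any]:
    """Replay changes in time order and stop after the requested moment."""
    state: Dict[str, Any] = {}
    for change in sorted(log, key=lambda c: c.timestamp):
        if change.timestamp > as_of:
            break
        state[change.field_name] = change.new_value
    return state


UTC = timezone.utc
# Made-up provenance log for a single digital object.
log = [
    Change(datetime(2019, 3, 1, tzinfo=UTC), "curator-a", "scientificName", "Bellis sp."),
    Change(datetime(2019, 6, 15, tzinfo=UTC), "curator-a", "country", "GB"),
    Change(datetime(2020, 2, 10, tzinfo=UTC), "curator-b", "scientificName", "Bellis perennis"),
]

version_2019 = reconstruct(log, datetime(2019, 12, 31, tzinfo=UTC))
version_now = reconstruct(log, datetime(2021, 1, 1, tzinfo=UTC))
```

Here `version_2019` holds the earlier name `Bellis sp.` while `version_now` holds the revised one, so the object stays mutable without losing any past state. Replay only works if every operation appends a record, which is why provenance generation is mandatory for all operations.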
Information about Digital Specimens and Digital Collections must be published and managed as part of the European Collection Objects Index. DiSSCo Facilities are encouraged to publish the fullest available digital data about their individual specimens and collections at the earliest opportunity, aiming as best practice to achieve at least MIDS level 2 for Digital Specimens and MICS level 2 for Digital Collections information.
Several characteristics are essential for developing long-term community trust in DiSSCo: centrality, accuracy and authenticity of the Digital Specimen, protection of data, preservation of readability, traceability/provenance, and annotation history. These are the protected characteristics of DiSSCo, and they must be preserved throughout the DiSSCo lifetime. Thus, all design decisions (technical, procedural, organisational, etc.) must be assessed for their effect on the protected characteristics. Such decisions and changes must not destroy or lessen the protected characteristics.

Files

Deliverable D6.6_ICEDIG_Provisional DMP for DiSSCo.pdf

Files (1.7 MB)

Additional details

Funding

ICEDIG – Innovation and consolidation for large scale digitisation of natural heritage 777483
European Commission