Report Open Access
DiSSCo, the “Distributed System of Scientific Collections”, is a pan‐European Research Infrastructure that mobilises, unifies and delivers bio‐ and geo‐diversity digital information to scientific communities and beyond as a single virtual collection. With approximately 1.5 billion objects to be digitised, bringing natural science collections into the information age is expected to generate 100 petabytes of new data over the next two decades, used on average by 5,000 – 15,000 unique users every day.
The DiSSCo Data Management Plan (DMP) is a living document reflecting the active data management planning and stewardship philosophy of DiSSCo. It focuses on achieving maximum accessibility and reusability of data according to the core principles of 'findable, accessible, interoperable and reusable' (FAIR), together with longevity of data and data preservation, community curation, linking to third‐party information and reproducible science. The DiSSCo DMP offers unified data management principles for data providers, data managers and users, and guidance to engineers and programmers on technical standards and best practices. It applies to the data management activities (production and acquisition, curation, publishing, processing and use) of the geographically distributed collection‐holding organisations (the DiSSCo Facilities) and to all DiSSCo Hub activities.
DiSSCo adopts Digital Object Architecture (DOA) as its foundation for two reasons. First, DOA offers future‐proof flexibility over long timescales in the face of technological change. Second, DOA has been shown to offer adherence to the FAIR principles as an integral characteristic, with inherent mechanisms that directly address the specific principles to be promoted. In DOA the core concept is the ‘digital object’.
Digitisation is the process of making data about physical objects digitally available, and the outputs of that process are Digital Specimens and Digital Collections. Digital Specimens and Digital Collections are specific types of ‘digital objects’, which are the fundamental entities subject to data management in DiSSCo.
Each instance of a digital object collects and organises all the core information about the physical thing it represents. These identified objects are amenable to processing and to transport from one system to another, making DOA a powerful yet simple extension of the existing Internet. Each Digital Specimen must maintain a link to the physical specimen it represents. This link is the identifier of the physical specimen. Digital Specimen objects are the principal data that DiSSCo manages. Each Digital Specimen or other digital object instance handled by the DiSSCo infrastructure must be unambiguously, universally and persistently identified by an identifier (Natural Science Identifier, NSId), which shall be assigned when the object is first created. Each DiSSCo Facility shall be responsible for creating (minting) and managing its own NSIds in accordance with the DiSSCo policy for NSIds, and for registering its own Digital Specimens with the DiSSCo Hub infrastructure. Resolution of an NSId shall always return the current version of an object’s content, as well as any interpretations and annotations associated with it.
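As an illustration only, the identification and resolution requirements above can be sketched in a few lines of Python. The class names, the field names and the NSId string format used here are assumptions made for the example and are not defined by the DMP; the in-memory `Registry` is a hypothetical stand-in for whatever registration and resolution service the DiSSCo Hub provides.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalSpecimen:
    """Hypothetical minimal Digital Specimen record (field names are assumptions)."""
    nsid: str                  # persistent identifier, assigned once at creation
    physical_specimen_id: str  # link back to the physical specimen
    content: dict              # current core information about the specimen
    annotations: list = field(default_factory=list)


class Registry:
    """Hypothetical in-memory stand-in for a registration/resolution service."""

    def __init__(self):
        self._objects = {}

    def register(self, specimen):
        # An NSId must identify exactly one object, so duplicates are rejected.
        if specimen.nsid in self._objects:
            raise ValueError(f"NSId already registered: {specimen.nsid}")
        self._objects[specimen.nsid] = specimen

    def resolve(self, nsid):
        # Resolution returns the current content together with the
        # annotations associated with the object.
        specimen = self._objects[nsid]
        return {
            "nsid": specimen.nsid,
            "physical_specimen_id": specimen.physical_specimen_id,
            "content": dict(specimen.content),
            "annotations": list(specimen.annotations),
        }
```

In this sketch a Facility would build a `DigitalSpecimen`, call `register` once, and any later `resolve` call with the same NSId would reflect whatever content and annotations the object holds at that moment.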
The principal object types in DiSSCo (Digital Specimens, Digital Collections) are treated as mutable objects with access control and object history (provenance), meaning that they can be updated as new knowledge becomes available. Provenance data must be generated and preserved by all operations acting upon DiSSCo data objects. Timestamped records of change (provenance data) allow reconstruction of a specific ‘version’ of a digital object as it stood at a given date and time in the past.
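The reconstruction of a past ‘version’ from timestamped change records can be illustrated with a minimal sketch. It assumes, purely for the example, that each provenance record stores a timestamp, an agent and the attribute values that changed, and that records are appended in time order; the DMP does not prescribe this representation, and the class and method names are hypothetical.

```python
from datetime import datetime, timezone


class VersionedObject:
    """Hypothetical mutable digital object with an append-only provenance log."""

    def __init__(self, nsid, created_at, content):
        self.nsid = nsid
        # Each record is (timestamp, agent, changed attributes), oldest first.
        self.provenance = [(created_at, "creator", dict(content))]

    def update(self, timestamp, agent, changes):
        # Every change is recorded; existing records are never rewritten.
        if timestamp < self.provenance[-1][0]:
            raise ValueError("provenance records must be appended in time order")
        self.provenance.append((timestamp, agent, dict(changes)))

    def version_at(self, when):
        # Replay all changes recorded up to and including 'when'.
        state = {}
        for timestamp, _agent, changes in self.provenance:
            if timestamp > when:
                break
            state.update(changes)
        return state

    def current(self):
        # The current version is the state after the latest record.
        return self.version_at(self.provenance[-1][0])


# Usage with made-up values: a determination is refined after creation.
created = datetime(2020, 1, 1, tzinfo=timezone.utc)
revised = datetime(2021, 6, 1, tzinfo=timezone.utc)
obj = VersionedObject("nsid:example/0001", created, {"scientificName": "Bellis sp."})
obj.update(revised, "curator", {"scientificName": "Bellis perennis"})
earlier = obj.version_at(datetime(2020, 12, 31, tzinfo=timezone.utc))
```

Here `earlier` holds the object as it stood at the end of 2020, while `obj.current()` reflects the later revision. Storing changes rather than full snapshots is one design choice among several; a snapshot per version would trade storage for simpler reads.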
Information about Digital Specimens and Digital Collections must be published and managed as part of the European Collection Objects Index. DiSSCo Facilities are encouraged to publish the fullest available digital data about their individual specimens and collections at the earliest opportunity, aiming as best practice to achieve at least MIDS level 2 for Digital Specimens and MICS level 2 for Digital Collections information.
Several characteristics are essential for developing long‐term community trust in DiSSCo: centrality, accuracy and authenticity of the Digital Specimen, protection of data, preservation of readability, traceability/provenance, and annotation history. These are the protected characteristics of DiSSCo, and they must be safeguarded throughout the DiSSCo lifetime. All design decisions (technical, procedural, organisational, etc.) must therefore be assessed for their effect on the protected characteristics, and no decision or change may destroy or lessen them.
Deliverable D6.6_ICEDIG_Provisional DMP for DiSSCo.pdf