10 Simple rules for design, provision, and reuse of persistent identifiers for life science data

McMurry, Julie; Blomberg, Niklas; Burdett, Tony; Conte, Nathalie; Dumontier, Michel; Fellows, Donal K; Gonzalez-Beltran, Alejandra; Gormanns, Philipp; Hastings, Janna; Haendel, Melissa A; Hermjakob, Henning; Hériché, Jean-Karim; Ison, Jon C; Jimenez, Rafael C; Jupp, Simon; Juty, Nick; Laibe, Camille; Le Novère, Nicolas; Malone, James; Martin, Maria J; McEntyre, Johanna R; Morris, Chris; Muilu, Juha; Müller, Wolfgang; Mungall, Christopher J; Rocca-Serra, Philippe; Sansone, Susanna-Assunta; Sariyar, Murat; Snoep, Jacky L; Stanford, Natalie J; Swainston, Neil; Washington, Nicole; Williams, Alan R; Wolstencroft, Katherine; Goble, Carole; Parkinson, Helen

doi:10.5281/zenodo.18003

Published May 26, 2015 | Version v1

Preprint Open


1. European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
2. ELIXIR Hub, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
3. Center for Biomedical Informatics Research, Stanford University, Stanford, California, USA
4. School of Computer Science, The University of Manchester, Manchester, United Kingdom
5. Oxford e-Research Centre, University of Oxford, Oxford, United Kingdom
6. Institute of Experimental Genetics, Helmholtz Centre Munich - German Research Center for Environmental Health (GmbH), Neuherberg, Germany
7. Department of Medical Informatics and Epidemiology and OHSU Library, Oregon Health & Science University, Portland, USA
8. European Molecular Biology Laboratory, Heidelberg, Germany
9. Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Lyngby, Denmark
10. European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom | Babraham Institute, Cambridge, United Kingdom
11. STFC, Daresbury Laboratory, Warrington, United Kingdom
12. Genomics Coordination Center, Department of Genetics, University Medical Center Groningen and Groningen Bioinformatics Center, University of Groningen, Groningen, Netherlands
13. SDBV, HITS, Heidelberg, Germany
14. Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
15. Institute of Pathology, Charite – University Medicine Berlin, Berlin, Germany | TMF – Technologie- und Methodenplattform e. V. Berlin, Germany
16. MIB, University of Manchester, Manchester, UK | Department of Biochemistry, Stellenbosch University, Stellenbosch, South Africa
17. Manchester Centre for Synthetic Biology of Fine and Speciality Chemicals (SYNBIOCHEM), University of Manchester, Manchester, UK
18. Leiden Institute of Advanced Computer Science, Leiden University, Leiden, Netherlands

In the life sciences, problems with identifiers impede the flow and integrity of information. This is especially challenging within “synthesis research” disciplines such as systems biology, translational medicine, and ecology. Implementation-driven initiatives such as ELIXIR, BD2K, and others have therefore been actively working to understand and address underlying problems with identifiers.

Good, global-scale, persistent identifier design is harder than it appears, and is essential for data to be Findable, Accessible, Interoperable, and Reusable (Data FAIRport principles). Here, we build on emerging conventions and existing general recommendations and summarise the identifier characteristics most important to optimising the utility of life-science data. We propose actions to take in the identifier ‘green field’ and offer guidance for using real-world identifiers from diverse sources.

Notes

ORCIDs corresponding to the authors are:
http://orcid.org/0000-0002-9353-5498
http://orcid.org/0000-0003-4155-5910
http://orcid.org/0000-0002-2513-5396
http://orcid.org/0000-0002-1010-3121
http://orcid.org/0000-0003-4727-9435
http://orcid.org/0000-0002-9091-5938
http://orcid.org/0000-0003-3499-8262
http://orcid.org/0000-0001-9823-1621
http://orcid.org/0000-0002-3469-4923
http://orcid.org/0000-0001-9114-8737
http://orcid.org/0000-0001-8479-0262
http://orcid.org/0000-0001-6867-9425
http://orcid.org/0000-0001-6666-1520
http://orcid.org/0000-0001-5404-7670
http://orcid.org/0000-0002-0643-3144
http://orcid.org/0000-0002-2036-8350
http://orcid.org/0000-0002-4625-743X
http://orcid.org/0000-0002-6309-7327
http://orcid.org/0000-0002-1615-2899
http://orcid.org/0000-0001-5454-2815
http://orcid.org/0000-0002-1611-6935
http://orcid.org/0000-0002-9533-5684
http://orcid.org/0000-0002-1034-5171
http://orcid.org/0000-0002-4980-3512
http://orcid.org/0000-0002-6601-2165
http://orcid.org/0000-0001-9853-5668
http://orcid.org/0000-0001-5306-5690
http://orcid.org/0000-0002-5595-689X
http://orcid.org/0000-0002-0405-8854
http://orcid.org/0000-0003-4958-0184
http://orcid.org/0000-0001-7020-1236
http://orcid.org/0000-0001-8936-9143
http://orcid.org/0000-0003-3156-2105
http://orcid.org/0000-0002-1279-5133
http://orcid.org/0000-0003-1219-2137
http://orcid.org/0000-0003-3035-4195

Files

Name	Size
MS_2015-05-23.pdf (md5:09f2036e95d7f259bc75ac3e0af75c08)	1.1 MB

Additional details

Is previous version of: 10.5281/zenodo.31765 (DOI)

European Commission: BIOMEDBRIDGES - Building data bridges between biological and medical infrastructures in Europe (284209)
European Commission: DIACHRON - Managing the Evolution and Preservation of the Data Web (601043)
European Commission: ISBE - Infrastructure for Systems Biology - Europe (312455)
European Commission: ELIXIR - European Life-science Infrastructure for Biological Information (211601)

	All versions	This version
Views	3,936	2,072
Downloads	1,295	773
Data volume	932.8 MB	892.3 MB
