10 Simple rules for design, provision, and reuse of persistent identifiers for life science data
Creators
- McMurry, Julie (1)
- Blomberg, Niklas (2)
- Burdett, Tony (1)
- Conte, Nathalie (1)
- Dumontier, Michel (3)
- Fellows, Donal K (4)
- Gonzalez-Beltran, Alejandra (5)
- Gormanns, Philipp (6)
- Hastings, Janna (1)
- Haendel, Melissa A (7)
- Hermjakob, Henning (1)
- Hériché, Jean-Karim (8)
- Ison, Jon C (9)
- Jimenez, Rafael C (2)
- Jupp, Simon (1)
- Juty, Nick (1)
- Laibe, Camille (1)
- Le Novère, Nicolas (10)
- Malone, James (1)
- Martin, Maria J (1)
- McEntyre, Johanna R (1)
- Morris, Chris (11)
- Muilu, Juha (12)
- Müller, Wolfgang (13)
- Mungall, Christopher J (14)
- Rocca-Serra, Philippe (5)
- Sansone, Susanna-Assunta (5)
- Sariyar, Murat (15)
- Snoep, Jacky L (16)
- Stanford, Natalie J (4)
- Swainston, Neil (17)
- Washington, Nicole (14)
- Williams, Alan R (4)
- Wolstencroft, Katherine (18)
- Goble, Carole (4)
- Parkinson, Helen (1)
- 1. European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
- 2. ELIXIR Hub, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
- 3. Center for Biomedical Informatics Research, Stanford University, Stanford, California, USA
- 4. School of Computer Science, The University of Manchester, Manchester, United Kingdom
- 5. Oxford e-Research Centre, University of Oxford, Oxford, United Kingdom
- 6. Institute of Experimental Genetics, Helmholtz Centre Munich - German Research Center for Environmental Health (GmbH), Neuherberg, Germany
- 7. Department of Medical Informatics and Epidemiology and OHSU Library, Oregon Health & Science University, Portland, USA
- 8. European Molecular Biology Laboratory, Heidelberg, Germany
- 9. Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Lyngby, Denmark
- 10. European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom | Babraham Institute, Cambridge, United Kingdom
- 11. STFC, Daresbury Laboratory, Warrington, United Kingdom
- 12. Genomics Coordination Center, Department of Genetics, University Medical Center Groningen and Groningen Bioinformatics Center, University of Groningen, Groningen, Netherlands
- 13. SDBV, HITS, Heidelberg, Germany
- 14. Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- 15. Institute of Pathology, Charité – University Medicine Berlin, Berlin, Germany | TMF – Technologie- und Methodenplattform e. V., Berlin, Germany
- 16. MIB, University of Manchester, Manchester, UK | Department of Biochemistry, Stellenbosch University, Stellenbosch, South Africa
- 17. Manchester Centre for Synthetic Biology of Fine and Speciality Chemicals (SYNBIOCHEM), University of Manchester, Manchester, UK
- 18. Leiden Institute of Advanced Computer Science, Leiden University, Leiden, Netherlands
Description
In the life sciences, problems with identifiers impede the flow and integrity of information. These problems are especially acute in “synthesis research” disciplines such as systems biology, translational medicine, and ecology. Implementation-driven initiatives such as ELIXIR, BD2K, and others have therefore been actively working to understand and address the underlying problems with identifiers.
Good, global-scale, persistent identifier design is harder than it appears, and is essential for data to be Findable, Accessible, Interoperable, and Reusable (Data FAIRport principles). Here, we build on emerging conventions and existing general recommendations and summarise the identifier characteristics most important to optimising the utility of life-science data. We propose actions to take in the identifier ‘green field’ and offer guidance for using real-world identifiers from diverse sources.
Files
Name | Size |
---|---|
MS_2015-05-23.pdf (md5:09f2036e95d7f259bc75ac3e0af75c08) | 1.1 MB |
Additional details
Related works
- Is previous version of: 10.5281/zenodo.31765 (DOI)
Funding
- BIOMEDBRIDGES – Building data bridges between biological and medical infrastructures in Europe (grant 284209), European Commission
- DIACHRON – Managing the Evolution and Preservation of the Data Web (grant 601043), European Commission
- ISBE – Infrastructure for Systems Biology - Europe (grant 312455), European Commission
- ELIXIR – European Life-science Infrastructure for Biological Information (grant 211601), European Commission