31765
doi
10.5281/zenodo.31765
oai:zenodo.org:31765
user-eu
Blomberg, Niklas
ELIXIR Hub, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
Burdett, Tony
European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
Conte, Nathalie
European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
Dumontier, Michel
Center for Biomedical Informatics Research, Stanford University, Stanford, California, USA
Fellows, Donal K
School of Computer Science, The University of Manchester, Manchester, United Kingdom
Gonzalez-Beltran, Alejandra
Oxford e-Research Centre, University of Oxford, Oxford, United Kingdom
Gormanns, Philipp
Institute of Experimental Genetics, Helmholtz Centre Munich - German Research Center for Environmental Health (GmbH), Neuherberg, Germany
Hastings, Janna
European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
Haendel, Melissa A
Department of Medical Informatics and Epidemiology and OHSU Library, Oregon Health & Science University, Portland, USA
Hermjakob, Henning
European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
Hériché, Jean-Karim
European Molecular Biology Laboratory, Heidelberg, Germany
Ison, Jon C
Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Lyngby, Denmark
Jimenez, Rafael C
ELIXIR Hub, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
Jupp, Simon
European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
Juty, Nick
European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
Laibe, Camille
European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
Le Novère, Nicolas
European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom | Babraham Institute, Cambridge, United Kingdom
Malone, James
European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
Martin, Maria J
McEntyre, Johanna R
Morris, Chris
STFC, Daresbury Laboratory, Warrington, United Kingdom
Muilu, Juha
Genomics Coordination Center, Department of Genetics, University Medical Center Groningen and Groningen Bioinformatics Center, University of Groningen, Groningen, Netherlands
Müller, Wolfgang
SDBV, HITS, Heidelberg, Germany
Mungall, Christopher J
Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
Rocca-Serra, Philippe
Oxford e-Research Centre, University of Oxford, Oxford, United Kingdom
Sansone, Susanna-Assunta
Oxford e-Research Centre, University of Oxford, Oxford, United Kingdom
Sariyar, Murat
Institute of Pathology, Charite – University Medicine Berlin, Berlin, Germany | TMF – Technologie- und Methodenplattform e. V. Berlin, Germany
Snoep, Jacky L
MIB, University of Manchester, Manchester, UK | Department of Biochemistry, Stellenbosch University, Stellenbosch, South Africa
Stanford, Natalie J
School of Computer Science, The University of Manchester, Manchester, United Kingdom
Swainston, Neil
Manchester Centre for Synthetic Biology of Fine and Speciality Chemicals (SYNBIOCHEM), University of Manchester, Manchester, UK
Washington, Nicole
Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
Williams, Alan R
School of Computer Science, The University of Manchester, Manchester, United Kingdom
Wolstencroft, Katherine
Leiden Institute of Advanced Computer Science, Leiden University, Leiden, Netherlands
Goble, Carole
School of Computer Science, The University of Manchester, Manchester, United Kingdom
Parkinson, Helen
European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
10 Simple rules for design, provision, and reuse of identifiers for web-based life science data
Julie McMurry
European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
doi:10.5281/zenodo.18003
info:eu-repo/semantics/openAccess
Creative Commons Attribution 4.0 International
https://creativecommons.org/licenses/by/4.0/legalcode
identifiers
identifier design
web-based identifiers
accessions
databases
big data
interoperability
reproducibility
synthesis research
standards
e-science
<p>Life science data is evolving to be ever larger, more distributed, and more natively web-based. However, our collective handling of identifiers has lagged behind these advances. Diverse identifier issues (for instance, “link rot” and “content drift”) have hampered our ability to integrate data and derive new knowledge from it. Optimizing web-based identifiers is harder than it appears, and no single scheme is perfect: identifiers are reused in different ways, for different reasons, by different consumers. Moreover, digital entities (e.g., files), physical entities (e.g., biosamples), and descriptive entities (e.g., ‘mitosis’) have different requirements for identifiers. Nevertheless, there is substantial room for improvement throughout the life sciences, and several other groups have been converging on identifier standards that are broadly applicable.</p>
<p>Building on these efforts and drawing on our experience, we focus on the use case of large-scale data integration: we outline the identifier qualities and best practices that we feel are most important in this context. Specifically, we propose actions that providers of online databases (repositories, registries, and knowledgebases) should take when designing new identifiers or maintaining existing ones (<strong>Rules 1-9</strong>). In <strong>Rule 10</strong>, we conclude with guidance to data integrators and redistributors on how best to reference identifiers from these diverse sources. This article may also be useful to data generators and end users, as it offers insight into the issues associated with data provision in a web environment. We call upon data providers to take a long-term view of their entities’ scope and lifecycle, and to consider existing identifier platforms and services.</p>
<p>Rule 1. Use established identifiers</p>
<p>Rule 2. Design identifiers for use by others</p>
<p>Rule 3. Help local identifiers travel well: document Prefix and Namespace</p>
<p>Rule 4. Opt for simple durable web resolution</p>
<p>Rule 5. Avoid embedding meaning</p>
<p>Rule 6. Make URIs clear and findable</p>
<p>Rule 7. Implement a version management policy</p>
<p>Rule 8. Do not re-assign or delete identifiers</p>
<p>Rule 9. Document the identifiers you issue and use</p>
<p>Rule 10. Reference responsibly</p>
This manuscript is a revision of doi:10.5281/zenodo.18003 and was recently resubmitted to PLOS Computational Biology.
Zenodo
2015-10-02
info:eu-repo/semantics/preprint
610288
user-eu
award_title=Building data bridges between biological and medical infrastructures in Europe; award_number=284209; award_identifiers_scheme=url; award_identifiers_identifier=https://cordis.europa.eu/projects/284209; funder_id=00k4n6c32; funder_name=European Commission;
award_title=European Life-science Infrastructure for Biological Information; award_number=211601; award_identifiers_scheme=url; award_identifiers_identifier=https://cordis.europa.eu/projects/211601; funder_id=00k4n6c32; funder_name=European Commission;
award_title=DIACHRON – Managing the Evolution and Preservation of the Data Web; award_number=601043; award_identifiers_scheme=url; award_identifiers_identifier=https://cordis.europa.eu/projects/601043; funder_id=00k4n6c32; funder_name=European Commission;
award_title=Infrastructure for Systems Biology - Europe; award_number=312455; award_identifiers_scheme=url; award_identifiers_identifier=https://cordis.europa.eu/projects/312455; funder_id=00k4n6c32; funder_name=European Commission;
1579542245.830979
39732
md5:b10cc7ba34f117afbd7cee96cd391d30
https://zenodo.org/records/31765/files/10RulesIdentifiers_S1-S6_2015-09-24_Final.docx
130459
md5:5313278a2210529fdaccbf0799faea84
https://zenodo.org/records/31765/files/10RulesIdentifiers_MS_2015-09-24_Final_Clean.docx
22065
md5:cd82a44d92ed308739a86098022ea51a
https://zenodo.org/records/31765/files/10RulesIdentifiersResubmission_Authors_2015-09-23.docx
public
10.5281/zenodo.18003
Is new version of
doi
10.5281/zenodo.610288
isVersionOf
doi