Published March 10, 2019 | Version v1
Dataset Open

Bibliographic dataset characterizing studies that use online biodiversity databases

  • 1. Field Museum of Natural History
  • 2. Florida Museum of Natural History, University of Florida, Gainesville
  • 3. Department of Environmental Biology, Universidad de Navarra

Description

This dataset includes bibliographic information for 501 papers that were published from 2010 to April 2017 (the time of the search) and that use online biodiversity databases for research purposes. Our overarching goal in this study is to determine how research uses of biodiversity data developed during a time of unprecedented growth in online data resources. We also determine which uses have the highest number of citations, how online occurrence data are linked to other data types, and whether and how data quality is addressed. Specifically, we address the following questions:

1.) What primary biodiversity databases have been cited in published research, and which databases have been cited most often?

2.) Is the biodiversity research community citing databases appropriately, and are the cited databases currently accessible online?

3.) What are the most common uses, general taxa addressed, and data linkages, and how have they changed over time?

4.) What uses have the highest impact, as measured through the mean number of citations per year?

5.) Are certain uses applied more often for plants/invertebrates/vertebrates?

6.) Are links to specific data types associated more often with particular uses?

7.) How often are major data quality issues addressed?

8.) What data quality issues tend to be addressed for the top uses?
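
The impact measure in question 4, mean number of citations per year, can be sketched as follows. This is only an illustration: the description does not define the measure precisely, so the formula (citations divided by whole years since publication, with a minimum of one year), the function name, and the example papers are all assumptions, not the authors' method or data.

```python
def mean_citations_per_year(citations: int, pub_year: int, as_of_year: int) -> float:
    """Citations divided by years elapsed since publication.

    Assumption: papers published in the as-of year count as one year old,
    which avoids dividing by zero.
    """
    years_elapsed = max(as_of_year - pub_year, 1)
    return citations / years_elapsed


# Made-up papers for illustration only: (label, citation count, publication year)
papers = [("paper_a", 120, 2010), ("paper_b", 30, 2016)]

# 2017 is used as the reference year because the search ended in April 2017.
rates = {label: mean_citations_per_year(c, y, 2017) for label, c, y in papers}
```

Normalizing by age in this way keeps older papers, which have had longer to accumulate citations, from dominating a ranking of uses.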

Relevant papers for this analysis include those that use online and openly accessible primary occurrence records, or those that add data to an online database. Google Scholar (GS) provides full-text indexing, which was important for identifying data sources that often appear buried in the methods section of a paper. Our search was therefore restricted to GS. All authors discussed and agreed upon representative search terms, which were relatively broad in order to capture a variety of databases hosting primary occurrence records. The terms included: “species occurrence” database (8,800 results), “natural history collection” database (634 results), herbarium database (16,500 results), “biodiversity database” (3,350 results), “primary biodiversity data” database (483 results), “museum collection” database (4,480 results), “digital accessible information” database (10 results), and “digital accessible knowledge” database (52 results). Quotation marks are part of the search terms where a specific phrase must be matched in whole. We downloaded all records returned by each search (or the first 500 if there were more) into a Zotero reference management database. About one third of the 2,500 papers in the final dataset were relevant. Three of the authors with specialized knowledge of the field characterized relevant papers using a standardized tagging protocol based on a series of key topics of interest. We developed a list of potential tags and descriptions for each topic, including: database(s) used, database accessibility, scale of study, region of study, taxa addressed, research use of data, other data types linked to species occurrence data, data quality issues addressed, authors, institutions, and funding sources. Each tagged paper was thoroughly checked by a second tagger.
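
The download rule described above (all records per search, or the first 500 if there were more) can be sketched as a simple cap applied to the reported result counts. The result counts are taken from the text; the variable and function names are hypothetical, and the total is only an upper bound on records retrieved, since the description does not say how overlaps between searches were handled.

```python
# Result counts per search term, as reported in the description.
RESULT_COUNTS = {
    '"species occurrence" database': 8800,
    '"natural history collection" database': 634,
    "herbarium database": 16500,
    '"biodiversity database"': 3350,
    '"primary biodiversity data" database': 483,
    '"museum collection" database': 4480,
    '"digital accessible information" database': 10,
    '"digital accessible knowledge" database': 52,
}

DOWNLOAD_CAP = 500  # "the first 500 if there were more"


def records_downloaded(result_count: int, cap: int = DOWNLOAD_CAP) -> int:
    """Number of records kept for one search under the stated cap."""
    return min(result_count, cap)


downloaded = {term: records_downloaded(n) for term, n in RESULT_COUNTS.items()}

# Upper bound on records across all searches, before any merging of
# papers returned by more than one search (an assumption; not described).
total_downloaded = sum(downloaded.values())
```

Under this rule, five of the eight searches hit the 500-record cap and the remaining three contribute 483, 10, and 52 records.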

The final dataset of tagged papers allows us to quantify the general areas of research made possible by the expansion of online species occurrence databases, and how they have changed over time. Analyses of these data will be published in a separate quantitative review.

Files

Citations.csv

Files (2.6 MB)

Checksum Size
md5:3ea1fe569e0be50d6e2d47e81dc130ef 391.6 kB
md5:999d75029703298aad4955327853b9a3 46.8 kB
md5:23ea2fd5546cd83f4e713f7f2513a706 1.8 MB
md5:61fe8a4f8c999ef101827d1b4a5d1a5c 156.4 kB
md5:5c56951ca7379f8d6dfb07586a06cc05 111.1 kB
md5:19126559f405133874071162db564ead 1.9 kB
md5:e22273a64cf539e90698476edf08167d 13.8 kB
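
A downloaded file can be checked against the MD5 checksums listed above. The following is a minimal sketch using Python's standard `hashlib` module; the helper names are hypothetical, and because the listing does not show which file name belongs to which checksum, the caller has to supply that pairing.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path: str, listed_checksum: str) -> bool:
    """Compare a file against a checksum written as 'md5:<hex digest>'."""
    expected = listed_checksum.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listing("some_downloaded_file.csv", "md5:3ea1fe569e0be50d6e2d47e81dc130ef")` would return `True` only if the local file (the path here is a placeholder) is byte-for-byte identical to the first file in the listing.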

Additional details

Related works