Published May 26, 2021 | Version v1
Journal article Open

A Test Collection for Dataset Retrieval in Biodiversity Research

  • 1. Friedrich Schiller University Jena, Department of Mathematics and Computer Science, Heinz Nixdorf Chair for Distributed Information Systems, Jena, Germany
  • 2. Department Forest Nature Conservation, Georg-August-Universität Göttingen, Göttingen, Germany
  • 3. Michael-Stifel-Center for Data-Driven and Simulation Science, Jena, Germany; German Center for Integrative Biodiversity Research (iDiv), Halle-Jena-Leipzig, Germany; Friedrich Schiller University Jena, Department of Mathematics and Computer Science, Heinz Nixdorf Chair for Distributed Information Systems, Jena, Germany
  • 4. German Center for Integrative Biodiversity Research (iDiv), Halle-Jena-Leipzig, Germany; Institute of Biology / Geobotany and Botanical Garden, Martin Luther University Halle-Wittenberg, Halle, Germany
  • 5. Michael-Stifel-Center for Data-Driven and Simulation Science, Jena, Germany; Department of Citizen Science, Institute of Data Science, German Aerospace Center (DLR), Jena, Germany

Description

Searching for scientific datasets is a prominent task in scholars' daily research practice. A variety of data publishers, archives and data portals offer search applications that allow the discovery of datasets. The evaluation of such dataset retrieval systems requires proper test collections, including questions that reflect real-world information needs of scholars, a set of datasets and human judgements assessing the relevance of the datasets to the questions in the benchmark corpus. Unfortunately, only very few test collections exist for dataset search. In this paper, we introduce the BEF-China test collection, the very first test collection for dataset retrieval in biodiversity research, a research field with an increasing demand for data discovery services. The test collection consists of 14 questions, a corpus of 372 datasets from the BEF-China project and binary relevance judgements provided by a biodiversity expert.
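
To illustrate how a test collection of this shape (questions, a dataset corpus, binary relevance judgements) is typically used, the following is a minimal sketch of scoring a ranked result list against binary judgements. It is not taken from the article: the question and dataset identifiers (`Q1`, `ds_3`, ...), the dictionary layout, and the choice of precision at k and mean average precision as metrics are all assumptions made for illustration, and the toy data below is invented rather than drawn from the BEF-China corpus.

```python
from typing import Dict, List, Set


def precision_at_k(ranking: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k retrieved datasets that are judged relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for dataset_id in ranking[:k] if dataset_id in relevant)
    return hits / k


def average_precision(ranking: List[str], relevant: Set[str]) -> float:
    """Average of precision values at each rank where a relevant dataset appears.

    Assumes the ranking contains no duplicate dataset identifiers.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, dataset_id in enumerate(ranking, start=1):
        if dataset_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(
    run: Dict[str, List[str]], qrels: Dict[str, Set[str]]
) -> float:
    """Mean of average precision over all judged questions."""
    scores = [
        average_precision(run.get(question_id, []), relevant)
        for question_id, relevant in qrels.items()
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical binary judgements: question id -> set of relevant dataset ids.
qrels = {"Q1": {"ds_3", "ds_7"}, "Q2": {"ds_1"}}

# Hypothetical system output: question id -> ranked list of dataset ids.
run = {
    "Q1": ["ds_3", "ds_5", "ds_7", "ds_9"],
    "Q2": ["ds_2", "ds_1", "ds_4"],
}

if __name__ == "__main__":
    print("P@2 for Q1:", precision_at_k(run["Q1"], qrels["Q1"], 2))
    print("MAP:", mean_average_precision(run, qrels))
```

With real judgements, `qrels` would hold one entry per question in the collection and `run` would hold the ranked output of the retrieval system under evaluation; how the published judgements are actually serialized is not stated here, so a loader would have to be written against the released files.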

Files

RIO_article_67887.pdf

Files (263.1 kB)

Checksum and size:
md5:5ff4b861d44b3cee8c044c3c0eb6bddd, 207.7 kB
md5:aa7424c1d8ccfe09926f147f234eea21, 55.4 kB

Additional details

Related works

Has part
Figure: 10.3897/rio.7.e67887.figure1 (DOI)