Published February 7, 2022 | Version v1
Dataset Open

ir_metadata: An Extensible Metadata Schema for Information Retrieval Experiments

  • 1. TH Köln


This dataset accompanies our work introducing a metadata schema for TREC run files based on the PRIMAD model. PRIMAD identifies the essential components of a computational experiment that can affect reproducibility at a conceptual level, and we propose aligning the metadata annotations with these components. To demonstrate the potential of metadata annotations, we curated a dataset of run files derived from experiments with different instantiations of the PRIMAD components and annotated them with the corresponding metadata. With this work, we hope to encourage IR researchers to annotate their run files and thereby further improve the reuse value of experimental artifacts.
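To illustrate what working with an annotated run file might look like, the following minimal sketch separates annotation lines from ranking lines. It assumes, purely for illustration, that the annotations are stored as `#`-prefixed comment lines inside the run file; this description does not specify the layout, so check the files themselves. The ranking lines are assumed to follow the usual six-column TREC run format (query ID, `Q0`, document ID, rank, score, run tag). The function name `split_annotated_run` is hypothetical.

```python
def split_annotated_run(lines):
    """Split an annotated run file into metadata text lines and parsed run rows.

    Assumption (not confirmed by the dataset description): metadata lines
    start with '#', and all other non-empty lines are six-column TREC rows.
    """
    metadata_lines = []
    run_rows = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("#"):
            text = line[1:]
            # Drop a single separating space but keep any further
            # indentation, which is significant in YAML.
            if text.startswith(" "):
                text = text[1:]
            metadata_lines.append(text)
        else:
            qid, _q0, docid, rank, score, tag = line.split()
            run_rows.append((qid, docid, int(rank), float(score), tag))
    return metadata_lines, run_rows
```

The returned metadata lines can be joined with newlines and passed to a YAML parser of your choice; the run rows can go straight into an evaluation script.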


This archive contains the following data:

  • demo.tar.xz : Selected annotated run files used in the Colab demonstration.

  • : YAML files containing only the metadata annotations for each run.

  • : The entire set of run files with annotations.
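As a minimal sketch of how one of these archives could be inspected before extraction, the snippet below lists the regular files inside an xz-compressed tar archive using Python's standard `tarfile` module. The helper name `list_archive_members` is hypothetical, and the internal directory layout of the archives is not described here, so the output depends on the actual contents.

```python
import tarfile


def list_archive_members(archive_path):
    """Return the names of all regular files in an xz-compressed tar archive."""
    with tarfile.open(archive_path, mode="r:xz") as archive:
        # Keep regular files only; directories and links are skipped.
        return [member.name for member in archive.getmembers() if member.isfile()]
```

For example, `list_archive_members("demo.tar.xz")` would return the paths of the demo run files once the archive has been downloaded to the working directory.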


The annotated runs result from the following experiments:


Files (4.4 GB)

Name Size
534.2 MB
1.0 MB
3.8 GB