Published November 21, 2022 | Version 1.0.0
Software Open

TREC-doc-2-doc-relevance assessment interface

Description

TREC-doc-2-doc-relevance

The code, data and docs in this repository aim to facilitate the creation of a doc-2-doc relevance assessment on PMIDs used in the TREC 2005 Genomics track. A doc-2-doc relevance assessment takes one document as the reference and assesses a second document for its relevance to that reference. The resulting doc-2-doc collection will be used to evaluate the doc-2-doc recommendation approaches that we are working on.

The TREC 2005 Genomics track corresponds to a document-2-topic relevance assessment. Our assumption is that articles relevant to a topic will be more relevant to each other than to articles that are not relevant to the topic. This premise has been used in previous approaches. With this TREC-doc-2-doc-relevance project, we want to double-check this assumption by creating a doc-2-doc relevance assessment corpus based on the TREC 2005 Genomics track.
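The assumption above can be illustrated with a minimal sketch that turns topic-level judgments into candidate document pairs. This is not the pipeline used in this repository: the `judgments` list, its tuple layout, the topic ids and the PMIDs are all hypothetical, and `candidate_pairs` is a helper name introduced only for this example.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical topic-level judgments: (topic_id, pmid, is_relevant).
# Topic ids and PMIDs are made up for illustration only.
judgments = [
    (100, "11111111", True),
    (100, "22222222", True),
    (100, "33333333", False),
    (101, "22222222", True),
    (101, "44444444", True),
]


def candidate_pairs(judgments):
    """Return the set of document pairs judged relevant to the same topic."""
    relevant_by_topic = defaultdict(set)
    for topic_id, pmid, is_relevant in judgments:
        if is_relevant:
            relevant_by_topic[topic_id].add(pmid)

    pairs = set()
    for pmids in relevant_by_topic.values():
        # Sort so that each unordered pair has one canonical form.
        for first, second in combinations(sorted(pmids), 2):
            pairs.add((first, second))
    return pairs


pairs = candidate_pairs(judgments)
print(sorted(pairs))
```

Pairs produced this way are only expected to be related under the stated assumption; whether they really are is what a doc-2-doc assessment by human annotators would check.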

Data

The TREC 2005 Genomics Track data can be found at https://trec.nist.gov/data/t14_genomics.html. The data used for the doc-2-doc relevance assessment and the assessments carried out by four annotators can be found at https://zenodo.org/record/7324822, while the Fleiss' kappa data is available at https://zenodo.org/record/7338056.
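For readers unfamiliar with Fleiss' kappa, the following is a minimal sketch of the standard formula for agreement among a fixed number of raters. It is not the code used to produce the published kappa data: the `fleiss_kappa` function, the three-level relevance scale (0, 1, 2) and the example ratings are assumptions made for illustration.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for items that were each rated by the same number of raters.

    ratings: list of lists, one inner list of category labels per item.
    categories: all labels that a rater may assign.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number of ratings")

    counts = [Counter(item) for item in ratings]

    # Observed agreement: share of agreeing rater pairs, averaged over items.
    per_item = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_item) / n_items

    # Expected agreement from the overall share of each category.
    shares = [
        sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories
    ]
    p_expected = sum(share ** 2 for share in shares)

    if p_expected == 1.0:
        # All ratings fall into one category, so kappa is undefined.
        return float("nan")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical example: four annotators rate four document pairs on an
# assumed scale of 0 (not relevant), 1 (partially relevant), 2 (relevant).
example_ratings = [
    [2, 2, 2, 2],
    [0, 0, 1, 0],
    [1, 1, 2, 2],
    [0, 0, 0, 0],
]
kappa = fleiss_kappa(example_ratings, categories=[0, 1, 2])
print(round(kappa, 3))
```

A value of 1 indicates complete agreement, while values near 0 indicate agreement no better than chance.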

Releases

v1.0.0 is the initial release of the TREC-doc-2-doc-relevance assessment interface. The tool has been used by four annotators on a corpus extracted from the TREC 2005 Genomics Track.

Acknowledgements

This work is part of the STELLA project funded by DFG (project no. 407518790). This work was supported by the BMBF-funded de.NBI Cloud within the German Network for Bioinformatics Infrastructure (de.NBI) (031A532B, 031A533A, 031A533B, 031A534A, 031A535A, 031A537A, 031A537B, 031A537C, 031A537D, 031A538A).

Files

zbmed-semtec/TREC-doc-2-doc-relevance-v1.0.0.zip

Size: 36.7 MB
md5:9d8c63bb80d95bb268acf52b743a1264

Additional details

References

  • Giraldo O, Solanki D, Rebholz-Schuhmann D, Castro LJ. Fleiss kappa for doc-2-doc relevance assessment. Zenodo; 2022. doi:10.5281/zenodo.7338056
  • Hersh W, Cohen A, Yang J, Bhupatiraju RT, Roberts P, Hearst M. TREC 2005 Genomics Track Overview.