Published September 21, 2021 | Version 1.0
Dataset Open

Evaluating Elements of Web-based Data Enrichment for Pseudo-Relevance Feedback Retrieval

  • 1. TH Köln

Description

This data archive accompanies our work, in which we analyze a pseudo-relevance feedback retrieval method based on the results of web search engines. By enriching topics with text data from web search engine result pages and the linked contents, we train topic-specific, cost-efficient classifiers that can be used to search test collections for relevant documents. Building on attempts initially made at TREC Common Core 2018 by Grossman and Cormack, we address how system performance varies over time and across different search engines, queries, and test collections. Our experimental results show how and to what extent these components affect retrieval performance. Overall, the analyzed method is robust in terms of average retrieval performance and is a promising way to use web content for the data enrichment of relevance feedback methods.
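The general idea of a topic-specific classifier trained on web text can be illustrated with a minimal sketch. It is not the implementation behind the archived runs. The choice of a bag-of-words logistic regression, the use of unrelated texts as negative examples, all function names, and the toy strings are assumptions made purely for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase whitespace tokenization; keeps alphabetic tokens only.
    return [tok for tok in text.lower().split() if tok.isalpha()]


def _sigmoid(z):
    # Clamp the logit so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def _logit(weights, bias, features):
    return bias + sum(weights.get(w, 0.0) * c for w, c in features.items())


def train_topic_classifier(web_texts, background_texts, epochs=200, lr=0.5):
    """Fit a tiny logistic regression for one topic with plain SGD.

    web_texts: texts gathered for the topic (treated as pseudo-relevant).
    background_texts: unrelated texts (treated as non-relevant; an assumption).
    """
    examples = [(Counter(tokenize(t)), 1.0) for t in web_texts]
    examples += [(Counter(tokenize(t)), 0.0) for t in background_texts]
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for features, label in examples:
            error = label - _sigmoid(_logit(weights, bias, features))
            bias += lr * error
            for word, count in features.items():
                weights[word] = weights.get(word, 0.0) + lr * error * count
    return weights, bias


def score(model, text):
    # Raw decision value; higher means "more likely relevant to the topic".
    weights, bias = model
    return _logit(weights, bias, Counter(tokenize(text)))


def rank_collection(model, collection):
    # Return document ids ordered from most to least likely relevant.
    return sorted(collection, key=lambda d: score(model, collection[d]), reverse=True)


# Toy data standing in for scraped web text and a test collection (hypothetical).
web_texts = ["climate change and global warming", "carbon emissions drive global warming"]
background_texts = ["football team wins the match", "stock market closes higher"]
collection = {
    "d1": "global warming raises sea levels",
    "d2": "the football team lost the match",
    "d3": "stock market report",
}

model = train_topic_classifier(web_texts, background_texts)
ranking = rank_collection(model, collection)
```

In this sketch no relevance judgments from the test collection are needed for training, which is what makes the approach a form of pseudo-relevance feedback: the web text alone stands in for known relevant documents.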

Files

run.zip

Files (2.0 GB)

Name Size
md5:dcf218a3264ff370cf1d480aed993694 560.4 MB
md5:28fb06b35e82f387caab91414f4e1606 678.5 MB
md5:8c93a725838649c5b7261f519e983eff 725.8 MB
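The MD5 checksums listed above can be used to confirm that a download completed without corruption. The following is a small sketch using Python's standard `hashlib` module. The function names are hypothetical, and the listing does not show which checksum belongs to which file, so the file path passed in is left to the reader.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    # Stream the file in chunks so large archives do not need to fit in memory.
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed_checksum):
    # Accept the value as it appears in the listing, with or without "md5:".
    expected = listed_checksum.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

A call such as `matches_listing(local_path, "md5:dcf218a3264ff370cf1d480aed993694")` returns `True` only if the file at `local_path` hashes to the listed value.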