The Webis Known-Item Question Corpus 2013 (Webis-KIQC-13) contains annotations for 2,755 questions posted on Yahoo! Answers. For each question, two annotators were asked to decide whether it expresses a known-item information need, to identify a ClueWeb09 website representing the known item, and to indicate whether the description of the need contains false memories. The corpus records the annotators' final decisions, reached after they discussed the few questions on which they initially disagreed.
The corpus contains the IDs of the ClueWeb09 documents representing each known item and, for questions containing a false memory, an annotated categorization and correction of that false memory.
Matthias Hagen, Daniel Wägner, and Benno Stein. A Corpus of Realistic Known-Item Topics with Associated Web Pages in the ClueWeb09. In Advances in Information Retrieval. 37th European Conference on IR Research (ECIR 2015), volume 9022 of Lecture Notes in Computer Science, pages 741-754, Berlin Heidelberg New York, March 2015. Springer.