Ponza, Marco; Ceccarelli, Diego; Ferragina, Paolo; Meij, Edgar; Kothari, Sambhav
This repository contains the enrichments of The New York Times Annotated Corpus that were developed for the paper:
“Marco Ponza, Diego Ceccarelli, Paolo Ferragina, Edgar Meij, Sambhav Kothari. Contextualizing Trending Entities in News Stories. In Proceedings of the 14th ACM International Conference on Web Search and Data Mining (WSDM 2021).”
The dataset comprises 149 trends made up of a total of 120K entities. For a given trending entity, the goal is to retrieve a set of entities ranked by how useful they are in explaining why that entity is trending.
Format
The repository contains the enrichments in JSON format.
The New York Times news stories from which these enrichments were derived are available from the Linguistic Data Consortium (LDC).
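Since the enrichments ship as JSON inside a zip archive, a first look at the data can be taken without unpacking it. The following is a minimal sketch, not the authors' loader: the function name `iter_json_members` is hypothetical, and it assumes that the enrichment files carry a `.json` extension and hold either a single JSON document or one JSON object per line. Neither assumption is confirmed by this page, and the schema of the records is not described here.

```python
import json
import zipfile


def iter_json_members(zip_path):
    """Yield (member_name, parsed_content) for every .json file in the archive.

    Assumption: each file is either one JSON document or JSON Lines
    (one object per line). The actual layout is not documented here.
    """
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".json"):
                continue  # skip directories and non-JSON members
            with archive.open(name) as handle:
                text = handle.read().decode("utf-8")
            try:
                # Case 1: the whole file is a single JSON document.
                yield name, json.loads(text)
            except json.JSONDecodeError:
                # Case 2: fall back to one JSON object per non-empty line.
                yield name, [
                    json.loads(line)
                    for line in text.splitlines()
                    if line.strip()
                ]
```

Calling `iter_json_members("contextualizing-trending-entities.zip")` and printing the type of each parsed value is a quick way to discover the actual structure before writing any task-specific code.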
Data Splits
We perform two kinds of evaluation.
Use
Please cite the dataset and the accompanying paper if you find the resources in this repository useful:
@inproceedings{ponza2021,
  title = {Contextualizing Trending Entities in News Stories},
  author = {Ponza, Marco and Ceccarelli, Diego and Ferragina, Paolo and Meij, Edgar and Kothari, Sambhav},
  booktitle = {Proceedings of the 14th ACM International Conference on Web Search and Data Mining},
  year = {2021},
}
Name | MD5 | Size |
---|---|---|
contextualizing-trending-entities.zip | 3cc71b7e1461637531143549f6c4c5ba | 19.0 MB |
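The MD5 checksum listed for the archive can be used to confirm that a download is complete and uncorrupted. The sketch below uses only Python's standard `hashlib` module; the helper names `md5_of_file` and `verify_download` are hypothetical, and the local file path passed to them is whatever location the archive was saved to.

```python
import hashlib

# Checksum published alongside contextualizing-trending-entities.zip.
EXPECTED_MD5 = "3cc71b7e1461637531143549f6c4c5ba"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

For example, `verify_download("contextualizing-trending-entities.zip")` should return `True` for an intact copy of the archive; a `False` result suggests the file should be downloaded again.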