A Data set for Information Spreading over the News
Creators
- 1. Jozef Stefan Institute and Jozef Stefan International Postgraduate School
Description
Abstract:
Analyzing the spread of information related to a specific event in the news has many potential applications. Consequently, various systems have been developed to facilitate the analysis of information spreading, such as detecting disease propagation and identifying the spread of fake news through social media. There are several open challenges in the process of discerning information propagation, among them the lack of resources for training and evaluation. This paper describes the process of compiling a corpus from the EventRegistry global media monitoring system. We focus on information spreading in three domains: sports (i.e. the FIFA World Cup), natural disasters (i.e. earthquakes), and climate change (i.e. global warming). This corpus is a valuable addition to the currently available datasets for examining the spreading of information about various kinds of events.
Introduction:
Domain-specific gaps in information spreading are ubiquitous and may exist due to economic conditions, political factors, or linguistic, geographical, time-zone, cultural, and other barriers. These factors potentially contribute to obstructing the flow of local as well as international news. We believe that there is a lack of research studies that examine, identify, and uncover the reasons for barriers in information spreading. Additionally, there is limited availability of datasets containing news text and metadata including time, place, source, and other relevant information. When a piece of information starts spreading, it implicitly raises questions such as:
- How far does the information in the form of news reach out to the public?
- Does the content of the news remain the same, or does it change to a certain extent?
- Do cultural values affect the information, especially when the same news is translated into other languages?
Statistics about datasets:
No. | Domain | Event Type | Articles Per Language | Total Articles |
---|---|---|---|---|
1 | Sports | FIFA World Cup | 983-en, 762-sp, 711-de, 10-sl, 216-pt | 2679 |
2 | Natural Disaster | Earthquake | 941-en, 999-sp, 937-de, 19-sl, 251-pt | 3194 |
3 | Climate Changes | Global Warming | 996-en, 298-sp, 545-de, 8-sl, 97-pt | 1945 |
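The "Articles Per Language" cells use a compact `count-language` notation. As a minimal sketch of how those cells could be read programmatically, the Python snippet below parses them and tallies articles per language code across the three event types. The function names and the dictionary layout are illustrative assumptions, not part of the dataset; the numbers are copied from the table above.

```python
from collections import Counter

# Per-language article counts, copied verbatim from the statistics table.
ARTICLES_PER_LANGUAGE = {
    "FIFA World Cup": "983-en, 762-sp, 711-de, 10-sl, 216-pt",
    "Earthquake": "941-en, 999-sp, 937-de, 19-sl, 251-pt",
    "Global Warming": "996-en, 298-sp, 545-de, 8-sl, 97-pt",
}


def parse_counts(cell):
    """Turn a cell such as '983-en, 762-sp' into {'en': 983, 'sp': 762}."""
    counts = {}
    for item in cell.split(","):
        number, language = item.strip().split("-", 1)
        counts[language] = int(number)
    return counts


def totals_by_language(table):
    """Sum the per-language counts over all event types (hypothetical helper)."""
    totals = Counter()
    for cell in table.values():
        totals.update(parse_counts(cell))  # Counter.update adds counts
    return dict(totals)


language_totals = totals_by_language(ARTICLES_PER_LANGUAGE)
for language, count in sorted(language_totals.items()):
    print(language, count)
```

A per-language tally like this makes the imbalance between languages easy to see before any cross-lingual comparison is attempted.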
Files
Earthquake - Metadata.csv
Files (24.3 MB)
Checksum | Size |
---|---|
md5:20fdd239047b52ce756640ba18a5a438 | 880.5 kB |
md5:44320c69650daf92dd2dd349625ee7c7 | 6.3 MB |
md5:af0b0bf78d17378ca23da612387248c6 | 1.6 MB |
md5:aa6e04d49e0f0485ae893a1587c50206 | 5.9 kB |
md5:2c998649ea8ec6981fd96cb9f50021cb | 9.6 kB |
md5:e72d21452d0e3d77a8ee039a52d9fbfa | 2.8 kB |
md5:b123d87240e9554f8b1421d4a4c82292 | 44.3 kB |
md5:21d9562f9f29aaf38bc03d87d7f477b8 | 32.1 kB |
md5:ea3c48c66709db95af475666f6c5ad5f | 8.0 kB |
md5:840044f4bcf56598dd6333b1c541df41 | 6.5 MB |
md5:67c1f47c271930cbbd34dfb595109a25 | 761.7 kB |
md5:a5a6af65128797cd1acca7dd33560e0c | 1.3 MB |
md5:966bda8fc63290c7f7f1087e49992383 | 13.0 kB |
md5:a69f7a66cc18674cda62c6f7de7e3a58 | 11.4 kB |
md5:f44d388484cb5efc4ce66177fe4ef058 | 4.1 kB |
md5:6b88d23e645b3637e885bb8f5f389df5 | 20.7 kB |
md5:e23e567caccc3c3b11892d98e2724644 | 10.9 kB |
md5:1038e0875b1d9357fdc525008e928a27 | 8.4 kB |
md5:4811da7726f19d3afe429efef0087da5 | 5.3 MB |
md5:9aab8ab7156e06651b9e69f10f8ad103 | 557.0 kB |
md5:78451b831912313821db15307e9ee1f3 | 945.7 kB |
md5:7592d2abc8afa143d71e9e9fccaac724 | 3.9 kB |
md5:083d8162ba96d313d29928a430b2146d | 8.0 kB |
md5:dd0758b4d2a26f81bd5cb108a23d1619 | 4.3 kB |
md5:5ae1aa036c2099acecaab7f19e1fbe5f | 14.5 kB |
md5:7fbfee6b67a10417faef28aeefea01ee | 5.3 kB |
md5:aa231fffddc382a32de7968ae36ac479 | 9.2 kB |
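Each file in the listing is published with an MD5 checksum, which can be used to confirm that a download is complete and unmodified. The sketch below shows one way to do this with Python's standard `hashlib` module. The function names are hypothetical, and because the listing does not show which checksum belongs to which file name, the pairing of a local file with its checksum has to be taken from the record page itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a local file against a listed value such as 'md5:20fd...'.

    The optional 'md5:' prefix is stripped and the comparison is
    case-insensitive. This helper is an illustrative assumption, not
    something shipped with the dataset.
    """
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For a downloaded file, a call would look like `matches_listed_checksum(local_path, listed_value)`, where `local_path` is wherever the file was saved and `listed_value` is the checksum shown for that file. A result of `False` suggests the download should be repeated.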