Representation of crowd accidents in popular media

Feliciani, Claudio; Corbetta, Alessandro; Haghani, Milad; Nishinari, Katsuhiro

doi:10.5281/zenodo.10432170

Published December 26, 2023 | Version v3

Dataset Open

1. The University of Tokyo
2. Eindhoven University of Technology
3. The University of New South Wales

This repository contains results from the analysis of a corpus of news reports covering crowd accidents. To facilitate online visualization and offline analysis, each file name is prefixed with a class number. The numbering scheme and the contents of each class are described below:

Class 0 – This contains the same files provided in this repository, but they are organized into folders to make analysis easier. If you intend to analyze the data from our lexical analysis, we suggest using this file since it is better organized and can be directly downloaded.
Class 1 – This contains the sources and relevant information for people who are interested in replicating our dataset or accessing the news reports used in our analysis. Please note that due to copyright regulations, the texts cannot be shared. However, you can refer to the links provided in these files to access the news articles and Wikipedia pages. Some links stopped working while we were conducting this study, and others may become unreachable in the future.
Class 2 – This contains the results from a lexical analysis of the corpus. The HTML page allows you to visualize each result interactively through the online VOSviewer app (you need to download the file and open it in a browser, since Zenodo does not recognize it as a link). The VOSviewer app service may be discontinued at some point in the future. PNG images of the lexical maps are therefore available for download in a ZIP archive, although they do not allow interactive access. If you plan to read our results using the offline VOSviewer software or perform a more systematic analysis, JSON files are available for each category (time period, geographical area of the reporting institution, and purpose of gathering). The same files can also be found in the ZIP archive in Class 0.
Class 3 – These are the results of the sentiment analysis. For each report, a single result is generated for the title. However, for the body, the text is divided into parts, which are analyzed independently.
Class 4 – These two files contain the Wikipedia corpus covering 68 crowd accidents that occurred between 1990 and 2019. The text for all accidents was scraped twice: on October 15th, 2022 (before the tragedy in Itaewon) and on May 25th, 2023 (after the tragedy). Sources for the Wikipedia content are listed in the Class 1 file "1_list_wiki_report.csv". More generally, accidents with dedicated Wikipedia pages listed on https://en.wikipedia.org/wiki/List_of_fatal_crowd_crushes are included in the corpus provided here (the period 1900-2019 is considered here).
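The class numbering described above can be recovered directly from the file names. The following is a minimal sketch, assuming the class number is the run of digits before the first underscore in each name; the helper `group_by_class` is hypothetical and not part of the repository.

```python
from collections import defaultdict


def group_by_class(filenames):
    """Group file names by their leading class number.

    Assumption: the class number is the digits before the first underscore.
    Names without a numeric prefix are skipped.
    """
    groups = defaultdict(list)
    for name in filenames:
        prefix, _, _ = name.partition("_")
        if prefix.isdigit():
            groups[int(prefix)].append(name)
    return dict(groups)


# File names as listed in this repository.
files = [
    "0_data_all.zip",
    "1_list_news_report.csv",
    "1_list_wiki_report.csv",
    "1_sources_info.csv",
    "2_lexical_analysis_app_VOSviewer.html",
    "2_lexical_analysis_files.zip",
    "2_lexical_analysis_maps.zip",
    "3_sentiment_body.csv",
    "3_sentiment_title.csv",
    "4_wiki_corpus_after.txt",
    "4_wiki_corpus_before.txt",
]

classes = group_by_class(files)
for class_number in sorted(classes):
    print(class_number, classes[class_number])
```

Grouping by prefix rather than by extension keeps files of the same class together even when their types differ, as in Class 2.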

The format of CSV and JSON files should be self-explanatory after reading our publication. For specific questions or queries, please contact one of the authors, and we will try to assist you.
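Since the column names are not documented on this page, a reasonable first step is to inspect the header and a few rows of a CSV file before analyzing it. The sketch below assumes the files are comma-separated and UTF-8 encoded (neither is stated here); `peek_csv` is a hypothetical helper, and no column names are assumed.

```python
import csv
import io
from itertools import islice


def peek_csv(text_stream, max_rows=3):
    """Return the header row and up to `max_rows` data rows from a CSV stream.

    Assumption: the stream is comma-separated with a header on the first line.
    """
    reader = csv.reader(text_stream)
    header = next(reader, [])
    rows = list(islice(reader, max_rows))
    return header, rows


# Stand-in data for illustration only; the real column names may differ.
sample = io.StringIO("col_a,col_b\n1,2\n3,4\n5,6\n7,8\n")
header, rows = peek_csv(sample, max_rows=2)
print(header)
print(rows)
```

For a downloaded file, the same function can be called as `peek_csv(open("3_sentiment_title.csv", newline="", encoding="utf-8"))`, adjusting the encoding if the assumption does not hold.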

Files

Files (297.6 MB)

Name	MD5	Size
0_data_all.zip	30ea10d2f06e93d51c27195011e8e6ab	2.3 MB
1_list_news_report.csv	19612684182639c75c52bb7e8b606071	63.4 kB
1_list_wiki_report.csv	5341c565fd23b6f4ec354e386d055a1e	7.0 kB
1_sources_info.csv	0b5a52d1705ae0e99ad6c3238a8b9f69	975 Bytes
2_lexical_analysis_app_VOSviewer.html	799abf5b04ffd2a2546937c87c64d349	9.0 kB
2_lexical_analysis_files.zip	e1953d39b285190b8ff2343e1d23f051	2.0 MB
2_lexical_analysis_maps.zip	f639f8662ce7dc3a923ae00e0451d198	292.3 MB
3_sentiment_body.csv	b566e958369101f15b58bef84bac7212	129.1 kB
3_sentiment_title.csv	910e0950babb0dc93ee74c7670eed93d	10.6 kB
4_wiki_corpus_after.txt	63236e37c7484b7f4d2ff4a7b0be99ad	396.4 kB
4_wiki_corpus_before.txt	1ac05dc96ea6dab0f2328158809540d6	395.2 kB
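The MD5 checksums listed above can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib`; the helper names `md5_of_file` and `verify_downloads` and the folder name `downloads` are hypothetical, and only two of the listed checksums are included for brevity.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing; extend with the remaining files as needed.
EXPECTED_MD5 = {
    "1_sources_info.csv": "0b5a52d1705ae0e99ad6c3238a8b9f69",
    "3_sentiment_title.csv": "910e0950babb0dc93ee74c7670eed93d",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(folder, expected=EXPECTED_MD5):
    """Map each expected file name to True if it exists and its MD5 matches."""
    results = {}
    for name, checksum in expected.items():
        path = Path(folder) / name
        results[name] = path.is_file() and md5_of_file(path) == checksum
    return results


if __name__ == "__main__":
    # "downloads" is a placeholder for wherever the files were saved.
    for name, ok in verify_downloads("downloads").items():
        print(name, "OK" if ok else "missing or mismatched")
```

Reading in chunks keeps memory use low for the larger archives, such as the 292.3 MB maps file.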

Additional details

Is published in: Journal article: 10.1016/j.ssci.2024.106423 (DOI)

	All versions	This version
Views	947	443
Downloads	2,395	1,183
Data volume	61.3 GB	26.9 GB
