There is a newer version of the record available.

Published September 15, 2023 | Version v1
Dataset Open

Representation of crowd accidents in popular media

  • 1. The University of Tokyo
  • 2. Eindhoven University of Technology
  • 3. The University of New South Wales

Description

This repository contains results from the analysis of a corpus of news reports covering crowd accidents. To facilitate online visualization and offline analysis, each file is assigned a class number. The numbering scheme and the details of each set of files are described below:

  • Class 0 – This contains the same files provided in this repository, but they are organized into folders to make analysis easier. If you intend to analyze the data from our lexical analysis, we suggest using this file since it is better organized and can be directly downloaded.
  • Class 1 – This contains the sources and relevant information for people who are interested in replicating our dataset or accessing the news reports used in our analysis. Please note that due to copyright regulations, the texts cannot be shared. However, you can refer to the links provided in these files to access the news articles and Wikipedia pages. Some links stopped working while we were working on this study, and others may become unreachable in the future.
  • Class 2 – This contains the results from a lexical analysis of the corpus. The HTML page allows you to visualize each result interactively through the online VOSviewer app (you need to download the file and open it in a browser, since Zenodo does not recognize it as a link). The VOSviewer app service may be discontinued at some point in the future, so PNG images of the lexical maps are also available for download in the ZIP archive, although they do not allow interactive access. If you plan to read our results using the offline VOSviewer software or perform a more systematic analysis, JSON files are available for each category (time period, geographical area of the reporting institution, and purpose of gathering). The same files can also be found in the ZIP archive in class 0.
  • Class 3 – These are the results of the sentiment analysis. For each report, a single result is generated for the title. However, for the body, the text is divided into parts, which are analyzed independently.
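As a rough illustration of the numbering scheme described above, the following Python sketch groups file names by their leading class number. It assumes that every file follows the `<number>_` prefix pattern seen in `0_data_all.zip`; apart from that one name, the file names in the example are hypothetical, and the class labels are paraphrased from the descriptions above.

```python
import re
from collections import defaultdict

# Class labels paraphrased from the record description.
CLASS_LABELS = {
    0: "all files, organized into folders",
    1: "sources and links to the news reports",
    2: "lexical analysis results",
    3: "sentiment analysis results",
}


def group_by_class(filenames):
    """Group file names by leading class number (assumed '<number>_' prefix).

    Names without such a prefix are collected under the key None.
    """
    groups = defaultdict(list)
    for name in filenames:
        match = re.match(r"(\d+)_", name)
        key = int(match.group(1)) if match else None
        groups[key].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Only 0_data_all.zip is a real name from the record; the others are
    # hypothetical placeholders used to show the grouping.
    example = [
        "0_data_all.zip",
        "2_example_map.json",
        "3_example_sentiment.csv",
        "notes.txt",
    ]
    grouped = group_by_class(example)
    for key in sorted(k for k in grouped if k is not None):
        print(key, CLASS_LABELS.get(key, "unknown class"), grouped[key])
```

If the actual file names deviate from this prefix pattern, the regular expression would need to be adjusted accordingly.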

The format of CSV and JSON files should be self-explanatory after reading our publication. For specific questions or queries, please contact one of the authors, and we will try to assist you.

Files

0_data_all.zip

Files (296.5 MB)

Checksum                                Size
md5:c30bc3937037c0a0b5b4d73236b78356    2.0 MB
md5:19612684182639c75c52bb7e8b606071    63.4 kB
md5:5341c565fd23b6f4ec354e386d055a1e    7.0 kB
md5:5831c37e9b4294368a89cd7dc81517e3    968 Bytes
md5:60d17ca5de39ce89aa4ad9c4237182d7    9.0 kB
md5:4033a119e9be585b2a93039b5ac3a72b    2.0 MB
md5:3deb959327fb83b7d7d752defa51e54b    292.3 MB
md5:b566e958369101f15b58bef84bac7212    129.1 kB
md5:910e0950babb0dc93ee74c7670eed93d    10.6 kB
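
The MD5 checksums listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_checksum` are illustrative and not part of the dataset, and the path and checksum in the usage note are placeholders, because the listing above does not show which file name belongs to which checksum.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or plain hex."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_checksum("path/to/downloaded_file", "md5:<value from the list>")` returns `True` when the file matches the listed value. A mismatch usually means the download was incomplete and should be repeated.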