10.5281/zenodo.6635559
https://zenodo.org/records/6635559
oai:zenodo.org:6635559
Nirmalya Thakur
0000-0002-3225-1870
University of Cincinnati
MonkeyPox2022Tweets: The First Public Twitter Dataset on the 2022 MonkeyPox Outbreak
Zenodo
2022
Monkeypox
Monkey Pox
Twitter
Tweets
Dataset
Social Media
Big Data
Data Mining
Data Science
2022-06-12
eng
10.5281/zenodo.6635558
https://zenodo.org/communities/genetics-datasets
https://zenodo.org/communities/twitter-datasets
https://zenodo.org/communities/datascience
1
Creative Commons Attribution 4.0 International
Please cite the following paper when using this dataset:
Nirmalya Thakur, "MonkeyPox2022Tweets: The First Public Twitter Dataset on the 2022 MonkeyPox Outbreak", Journal of Data (Paper Submitted).
The preprint of this paper is available at: https://www.preprints.org/manuscript/202206.0172/v1
Abstract
The world is currently facing an outbreak of the monkeypox virus, and confirmed cases have been reported from 28 countries. Following a recent “emergency meeting”, the World Health Organization is considering whether the outbreak should be assessed as a “potential public health emergency of international concern” or PHEIC, as was done for the COVID-19 and Ebola outbreaks in the past. During this time, people from all over the world are using social media platforms, such as Twitter, to seek and share information about the outbreak, and to familiarize themselves with the guidelines and protocols that various policy-making bodies are recommending to reduce the spread of the virus. This activity is generating large amounts of Big Data about social media behavior during the outbreak. Mining this Big Data and compiling it into a dataset can serve a wide range of use cases and applications, such as the analysis of public opinions, interests, views, perspectives, attitudes, and sentiment towards this outbreak. Therefore, this work presents MonkeyPox2022Tweets, an open-access dataset of Tweets related to the 2022 monkeypox outbreak that were posted on Twitter since the first detected case of this outbreak on May 7, 2022. The dataset is compliant with the privacy policy, developer agreement, and guidelines for content redistribution of Twitter, as well as with the FAIR principles (Findability, Accessibility, Interoperability, and Reusability) for scientific data management.
Data Description
The dataset consists of 68,934 Tweet IDs, one for each tweet about monkeypox posted on Twitter from May 7, 2022 to June 11, 2022 (the most recent date at the time of dataset upload). The Tweet IDs are split across four .txt files according to the posting dates of the associated tweets. The following table provides the details of these dataset files.
Filename            No. of Tweet IDs  Date Range of the Tweet IDs
TweetIDs_Part1.txt  19718             June 11, 2022 to June 5, 2022
TweetIDs_Part2.txt  17585             June 5, 2022 to May 27, 2022
TweetIDs_Part3.txt  17705             May 27, 2022 to May 21, 2022
TweetIDs_Part4.txt  13926             May 21, 2022 to May 7, 2022
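As a rough illustration of how the downloaded files might be checked against the counts listed above, the following Python sketch counts the non-empty lines in each file and reports any mismatch. It assumes one Tweet ID per line and that all four files sit in a single local directory; neither detail is stated in this record. The helper names read_tweet_ids and check_counts are hypothetical.

```python
from pathlib import Path

# Expected counts, copied from the table in the data description.
EXPECTED_COUNTS = {
    "TweetIDs_Part1.txt": 19718,
    "TweetIDs_Part2.txt": 17585,
    "TweetIDs_Part3.txt": 17705,
    "TweetIDs_Part4.txt": 13926,
}


def read_tweet_ids(path):
    """Return the Tweet IDs in a file (assumption: one ID per line)."""
    with open(path, encoding="utf-8") as handle:
        # Blank lines are skipped; IDs are kept as strings.
        return [line.strip() for line in handle if line.strip()]


def check_counts(directory, expected=EXPECTED_COUNTS):
    """Return {filename: (found, expected)} for files whose counts differ."""
    mismatches = {}
    for name, expected_count in expected.items():
        found = len(read_tweet_ids(Path(directory) / name))
        if found != expected_count:
            mismatches[name] = (found, expected_count)
    return mismatches
```

An empty dictionary from check_counts would indicate that every file matches the table. The four listed counts sum to 68,934, the total given in the description.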
The dataset contains only Tweet IDs, in compliance with the terms and conditions stated in the privacy policy, developer agreement, and guidelines for content redistribution of Twitter. The Tweet IDs must be hydrated, that is, the full tweet objects must be retrieved from Twitter, before they can be used. To hydrate this dataset, the Hydrator application may be used; the original record links to its download page and to a step-by-step tutorial on how to use it.
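For readers who prefer a script to the Hydrator application, hydration typically starts by grouping the IDs into fixed-size batches, since lookup endpoints accept a limited number of IDs per request. The Python sketch below shows only that batching step and makes no API calls. The default batch size of 100 is an assumption based on the per-request limit of Twitter's tweet lookup endpoint and should be checked against the current API documentation. The function name batch_ids is hypothetical.

```python
def batch_ids(tweet_ids, batch_size=100):
    """Yield consecutive lists of at most batch_size Tweet IDs.

    batch_size=100 is an assumed per-request limit, not a value
    taken from this dataset record.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(tweet_ids), batch_size):
        yield tweet_ids[start:start + batch_size]


# Placeholder IDs for illustration only; real IDs come from the .txt files.
placeholder_ids = [str(n) for n in range(250)]
batches = list(batch_ids(placeholder_ids))
```

Each batch could then be passed to whichever lookup client is used. Tweets that have been deleted or made private since collection will generally not be returned, so the hydrated set may be smaller than the list of IDs.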