Twitter cascade datasets
Description
This repository contains a set of Twitter datasets with metadata of tweets and retweets posted during specific events, such as the 2015 Nepal earthquake, IPL 2018, and the 15-M movement in Spain, as well as tweets posted by a celebrity (Lady Gaga) and her followers. The details of each dataset are as follows:
1. 2015 Nepal Earthquake: This folder contains the file "followers_network", which lists the follower IDs of one user per line. The files "timeseries.txt" and "userseries.txt" contain, with one cascade per line, the sorted timestamps of retweets and the sorted sequence of retweeting users, respectively.
2. IPL 2018: This folder contains the file "cascade-intervals-IPL.txt", which lists the sequence of inter-retweet time intervals of one cascade per line.
3. 15-M: This folder contains a .csv file and a .txt file with tweet metadata, one tweet per line, in the following format:
idt;segs;hashtags;mentions
where idt is the tweet ID, segs is the segment number of the tweet, hashtags is the set of hashtags used in the tweet text (separated by whitespace), and mentions is the list of IDs of the users mentioned in the tweet text.
4. Lady Gaga: This folder contains metadata of tweets and retweets in the following format:
User_Name
Tweet_ID
Time
Via
retweet_from
reply_to_user reply_to_tweet(if not reply, just "-1")
content
Number_of_link_in_tweet
type_of_link1 link1
type_of_link2 link2
type_of_link3 link3
...
Tweets were crawled for users related to "Lady Gaga" and for 10,000 randomly selected followers of hers, from Jan 1, 2010 to Oct, 2010 and from Oct 1, 2010 to Jan 15, 2010.
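
The 15-M and Lady Gaga formats described above could be read with something like the following Python sketch. It is not part of the repository; the class and function names are hypothetical. It assumes that mentions in the 15-M files are separated by whitespace like hashtags, that every Lady Gaga field occupies exactly one line, that blank lines may appear between records, and that a non-reply is marked by a leading "-1" on the reply line. Check these assumptions against the actual files before relying on it.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional, Tuple


@dataclass
class Tweet15M:
    idt: str
    segs: str
    hashtags: List[str]
    mentions: List[str]


def parse_15m_line(line: str) -> Tweet15M:
    """Parse one 'idt;segs;hashtags;mentions' line from the 15-M files."""
    idt, segs, hashtags, mentions = line.rstrip("\r\n").split(";")
    # Hashtags are whitespace-separated per the description;
    # splitting mentions the same way is an assumption.
    return Tweet15M(idt, segs, hashtags.split(), mentions.split())


@dataclass
class GagaTweet:
    user_name: str
    tweet_id: str
    time: str
    via: str
    retweet_from: str
    reply_to_user: Optional[str]
    reply_to_tweet: Optional[str]
    content: str
    links: List[Tuple[str, str]] = field(default_factory=list)


def parse_gaga_records(lines: Iterable[str]) -> List[GagaTweet]:
    """Read consecutive multi-line records in the Lady Gaga layout."""
    it = (ln.rstrip("\r\n") for ln in lines)
    records = []
    for first in it:
        if not first.strip():
            continue  # assumption: blank lines may separate records
        # The six fields after User_Name, one per line, in documented order.
        tweet_id, time, via, retweet_from, reply, content = [
            next(it) for _ in range(6)
        ]
        parts = reply.split()
        if not parts or parts[0] == "-1":
            reply_user = reply_tweet = None  # not a reply
        else:
            reply_user = parts[0]
            reply_tweet = parts[1] if len(parts) > 1 else None
        # Number_of_link_in_tweet, followed by that many "type link" lines.
        n_links = int(next(it).strip())
        links = []
        for _ in range(n_links):
            kind, _sep, url = next(it).strip().partition(" ")
            links.append((kind, url))
        records.append(GagaTweet(first, tweet_id, time, via, retweet_from,
                                 reply_user, reply_tweet, content, links))
    return records
```

Both functions accept plain strings, so an open file handle can be passed directly to parse_gaga_records, and each line of a 15-M file can be passed to parse_15m_line. All values are kept as strings because the exact ID and timestamp formats are not specified here.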