Evolution of Retweet Rates in Twitter User Careers: Analysis and Model
Description
About this repository
The repository contains data and code from the paper: Evolution of Retweet Rates in Twitter User Careers: Analysis and Model, accepted at the International Conference on Web and Social Media 2021.
The repository contains 4 datasets.
The filenames start with the dataset name: {verified, political, despoina and despoina_random}.
Each dataset has the following files:
## Data
1. $DATASET_NAME$_tweets_num_followers.txt.gz -- contains 5 columns (tab separated): twitter userid, tweetid, retweet_count, timestamp of tweeting, (estimated) number of followers at the time of tweeting
2. $DATASET_NAME$_follower_counts_wayback_archive.txt -- contains the raw follower count information obtained by scraping archive.org. Contains three columns (tab separated), (username, date of crawl on archive, number of followers)
3. $DATASET_NAME$_fit_functions.tar.gz -- a folder containing one pickle file per user. Each pickle file holds a polynomial function that was fit on the user's follower counts from the archive. This function can be used to estimate the number of followers the user had at any point in time.
`import pickle`
`user_func = pickle.load(open(dataset + "/fit_functions/" + user + ".pickle", "rb"))`
`num_followers = user_func(timestamp)  # estimated number of followers at the given timestamp`
4. $DATASET_NAME$_userinfo.txt.gz -- contains the user profile information crawled in October 2018. The file is tab separated and contains the following columns:
user id, twitter screen_name, user name, profile location, profile description, followers_count, friends_count, statuses_count, profile created_at, is_protected, is_verified, language, is_geo_enabled, url, timezone, source, profile_image_url
This file can be used to obtain the user id and screen name mapping. (Note that a user id for a user is fixed, while the screen name can change over time).
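
As an illustration of how the tweets file described in item 1 might be read, here is a minimal sketch using only the Python standard library. It assumes the five columns appear in the order listed above, that the file has no header row, and that the follower estimate may be fractional; the function names `read_tweets` and `per_follower_rate` are hypothetical and are not part of the repository's code.

```python
import csv
import gzip


def read_tweets(path):
    """Yield one dict per tweet from a *_tweets_num_followers.txt.gz file.

    Assumes five tab-separated columns in the documented order and no header.
    """
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) != 5:
                continue  # skip lines that do not have exactly five fields
            user_id, tweet_id, retweets, timestamp, followers = row
            yield {
                "user_id": user_id,
                "tweet_id": tweet_id,
                "retweet_count": int(retweets),
                "timestamp": timestamp,  # kept as text; its format is not assumed
                "num_followers": float(followers),
            }


def per_follower_rate(tweet):
    """Return retweets per estimated follower, or None if followers <= 0."""
    if tweet["num_followers"] <= 0:
        return None
    return tweet["retweet_count"] / tweet["num_followers"]
```

For example, `[per_follower_rate(t) for t in read_tweets("verified_tweets_num_followers.txt.gz")]` would give one normalized rate per tweet, with `None` wherever the follower estimate is zero or negative.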
## Code
5. getFollowerHistoryArchive.py -- Script to get the follower count history from archive.org
Run as `cat users.txt | python getFollowerHistoryArchive.py` (users.txt is a file containing twitter user screen names, one per line)
Abstract
We study the evolution of the number of retweets received by Twitter users over the course of their “careers” on the platform. We find that on average the number of retweets received by users tends to increase over time. This is partly expected because users tend to gradually accumulate followers. Normalizing by the number of followers, however, reveals that the relative, per-follower retweet rate tends to be non-monotonic, maximized at a “peak age” after which it does not increase, or even decreases. We develop a simple mathematical model of the process behind this phenomenon, which assumes a constantly growing number of followers, each of whom loses interest over time. We show that this model is sufficient to explain the non-monotonic nature of per-follower retweet rates, without any assumptions about the quality of content posted at different times.
Files
(2.2 GB)
MD5 checksum | Size |
---|---|
c08cda9ceb9a6e7aabdef8d5bddd99d4 | 550.4 kB |
c10de35bb35dde820be4454a5466e036 | 2.8 MB |
2408fa137623c132459a1efaeb12effe | 598.6 kB |
6f41e111fa3d846ceb512859c9b492d4 | 3.2 MB |
62f5ac26c3787129fae2d858090aba47 | 270.9 MB |
fabe96965c7933087aa7196123b23d85 | 103.0 MB |
1a1a7c6615ed1335d516ddb2a5aabb98 | 481.7 MB |
3df63391dc1434325a5e06b86c2f10d7 | 120.1 MB |
e3057cdcd168813517b98edd5e0a1063 | 5.3 kB |
a142eedadd1734c7dfa1047cc6231f79 | 2.1 MB |
b9968c3bbd294ae5a3a002abdf01afa3 | 12.0 MB |
537e99cbde5659d1d5fc493faf2b252e | 976.7 MB |
9b3e6691e33988a9542e4c1a6a4ad5bf | 77.2 MB |
d45ad1b805768b0afad6644042a8ab34 | 2.0 kB |
cb118eeaf462483be98d871e25ee2f0e | 144.3 kB |
113663e7a6bb169f628a63f7c8a6c999 | 863.6 kB |
888aedddde5f46df82c204530d1d3a1f | 117.5 MB |
ac213ebe608c145ea4d5e394dcc66b15 | 500.0 kB |