Open dataset of scholars on Twitter
Creators
Description
IMPORTANT NOTE: This dataset was created using the May 2022 OpenAlex data dump. In June 2023, OpenAlex announced the implementation of a new author disambiguation algorithm that replaced all the old IDs with new ones, essentially making this version of the dataset unusable. We have published a new version of this dataset using the current OpenAlex IDs (April 2024). The new dataset (version 2) is available here: https://zenodo.org/records/10905839
----------------------------------------------------------------------------------------------------------------------------------------------
This is a dataset of paired OpenAlex author_ids (https://docs.openalex.org/about-the-data/author) and tweeter_ids (usernames).
The dataset includes 492,124 unique author_ids and 423,920 unique tweeter_ids forming 498,672 unique author-tweeter pairs. The file contains the following columns:
Column | Description |
---|---|
author_id | author_id from OpenAlex |
tweeter_id | tweeter_id of the Twitter user |
criteria | A list of the different matching criteria that identified the pair |
valid | Indicates whether the match has been manually checked. A 0 indicates a false positive, and a 1 indicates a true positive. Rows with an empty value have not been manually validated. |
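As a minimal sketch of how these columns might be read, the Python snippet below counts unique IDs and pairs and groups rows by validation status. It assumes the CSV has a header row with exactly the column names listed above and that `valid` is stored as `0`, `1`, or an empty value; any other value (for example `NA`) is treated as not validated. The function names and the sample rows are hypothetical and are not taken from the dataset.

```python
import csv
import io
from collections import Counter


def read_pairs(handle):
    """Read all author-tweeter rows from an open CSV file handle."""
    return list(csv.DictReader(handle))


def validation_status(row):
    """Map the 'valid' column to a label (assumed encoding: 0, 1, or empty)."""
    value = (row.get("valid") or "").strip()
    if value in ("1", "1.0"):
        return "true_positive"
    if value in ("0", "0.0"):
        return "false_positive"
    return "unchecked"  # empty or any other value, e.g. "NA"


def summarize(rows):
    """Count unique IDs, unique pairs, and rows per validation status."""
    return {
        "pairs": len({(r["author_id"], r["tweeter_id"]) for r in rows}),
        "authors": len({r["author_id"] for r in rows}),
        "tweeters": len({r["tweeter_id"] for r in rows}),
        "status": Counter(validation_status(r) for r in rows),
    }


# Made-up rows for illustration only; these are not real records.
SAMPLE = """author_id,tweeter_id,criteria,valid
A1,t1,name,1
A1,t2,name,0
A2,t3,name;affiliation,
"""

sample_rows = read_pairs(io.StringIO(SAMPLE))
sample_summary = summarize(sample_rows)
print(sample_summary)

# With the real file, one would instead pass an opened file:
# with open("authors_tweeters_2022_08_21.csv", newline="", encoding="utf-8") as f:
#     rows = read_pairs(f)
```

Keeping only the rows labelled `true_positive` gives the manually confirmed matches, while the `unchecked` rows are matches that were never reviewed rather than matches known to be wrong.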
When using the dataset, please cite the following preprint which provides details about the matching process:
Mongeon, P., Bowman, T. D., & Costas, R. (2022). An open dataset of scholars on Twitter (arXiv:2208.11065). arXiv. https://doi.org/10.48550/arXiv.2208.11065
Links to R scripts can be found here: https://github.com/pmongeon/scholars-on-twitter/.
Files
Name | Size | Checksum |
---|---|---|
authors_tweeters_2022_08_21.csv | 96.5 MB | md5:b74fc8a8ed3be0efeff56935774067e1 |
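The MD5 checksum listed for the file can be used to confirm that a download is complete. The following is a sketch only: the helper names are hypothetical, and the local path passed to `file_matches` is assumed to be wherever the CSV was saved.

```python
import hashlib
import io

# Checksum as listed for authors_tweeters_2022_08_21.csv on this record.
EXPECTED_MD5 = "b74fc8a8ed3be0efeff56935774067e1"


def md5_of_stream(handle, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a binary stream, reading it in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: handle.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def file_matches(path, expected=EXPECTED_MD5):
    """Return True if the file at 'path' has the expected MD5 checksum."""
    with open(path, "rb") as handle:
        return md5_of_stream(handle) == expected


# Small in-memory demonstration that does not need the real file.
demo_digest = md5_of_stream(io.BytesIO(b"hello"))
print(demo_digest)

# With the downloaded file (path is an assumption):
# print(file_matches("authors_tweeters_2022_08_21.csv"))
```

Reading in chunks keeps memory use low, which matters little at 96.5 MB but costs nothing.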