Congressional Tweets
Description
This dataset was collected to measure Anti-Democratic Rhetoric (ADR) in Twitter posts by members of the U.S. Congress, in order to analyze one aspect of democratic backsliding in the United States. It comprises a text corpus of all tweets sent from the official accounts of sitting members of the 117th Congress between January 2020 and June 2022, a period that encompasses the 2020 election and the events of January 6, 2021.
The scraped tweets are stored in this Excel file, which has 1,048,515 rows, one per tweet. Each tweet is accompanied by Twitter-derived metadata (e.g., timestamps, hashtags, and number of replies) and by demographic data about the member who sent it (name, Twitter username, party ID, gender, state and district represented, chamber of Congress, and tenure in office), drawn from Ballotpedia.org and members’ own websites.
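Because each row pairs a tweet's timestamp with its author's party ID, a common first step is to tally tweets by party and month. The following is a minimal sketch of that tally, not the authors' analysis code. The column names `party` and `created_at`, the ISO timestamp format, and the sample rows are all assumptions made for illustration; the actual header row in the Excel file may differ.

```python
from collections import Counter
from datetime import datetime

# Hypothetical column names; check the real header row before using.
PARTY_COL = "party"
TIME_COL = "created_at"


def tweets_per_party_month(rows):
    """Count tweets per (party, 'YYYY-MM') from an iterable of row dicts."""
    counts = Counter()
    for row in rows:
        # Assumes ISO 8601 timestamps; adjust parsing to the real format.
        ts = datetime.fromisoformat(row[TIME_COL])
        counts[(row[PARTY_COL], ts.strftime("%Y-%m"))] += 1
    return counts


# Tiny made-up sample, only to show the shape of the input and output.
sample = [
    {"party": "D", "created_at": "2021-01-06T14:05:00"},
    {"party": "R", "created_at": "2021-01-06T15:30:00"},
    {"party": "D", "created_at": "2021-01-07T09:00:00"},
    {"party": "R", "created_at": "2020-11-03T20:15:00"},
]
counts = tweets_per_party_month(sample)
```

The function only needs an iterable of dictionaries, so it works with rows produced by whichever spreadsheet reader is used to open the file.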
The data-scraping code and the metadata for members of Congress are available on GitHub at https://github.com/yucongj/congressional-tweets.
Analysis of this dataset has been published in:
Miller, C. J., & Jiang, Y. (2025). Congressional rhetoric on Twitter and the crisis of democracy. Communication and Democracy, 59(1), 161–204. https://doi.org/10.1080/27671127.2025.2478863
Files
One file, 531.9 MB.
md5:518d6407e9a310d7e7401a462bbdaa12
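The MD5 checksum listed for the file can be used to confirm that a download of this size completed without corruption. A minimal sketch using only the Python standard library follows; the function names and the chunk size are illustrative choices, and the path passed in is whatever local name the downloaded file was saved under.

```python
import hashlib

# Checksum as listed on this record.
EXPECTED_MD5 = "518d6407e9a310d7e7401a462bbdaa12"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Stream the file so a 500+ MB download is never held in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path):
    """True if the file's MD5 equals the checksum listed for this dataset."""
    return md5_of_file(path) == EXPECTED_MD5
```

Calling `matches_record` on the local copy returns `False` if the file was truncated or altered in transit, in which case it should be downloaded again.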
Additional details
Related works
- Is compiled by
  - Software: https://github.com/yucongj/congressional-tweets (URL)
- Is supplement to
  - Journal article: 10.1080/27671127.2025.2478863 (DOI)