Conference paper Open Access

The Pushshift Telegram Dataset

Baumgartner, Jason; Zannettou, Savvas; Squire, Megan; Blackburn, Jeremy


Citation Style Language JSON Export

{
  "publisher": "Zenodo", 
  "DOI": "10.5281/zenodo.3607497", 
  "author": [
    {
      "family": "Baumgartner", 
      "given": "Jason"
    }, 
    {
      "family": "Zannettou", 
      "given": "Savvas"
    }, 
    {
      "family": "Squire", 
      "given": "Megan"
    }, 
    {
      "family": "Blackburn", 
      "given": "Jeremy"
    }
  ], 
  "issued": {
    "date-parts": [
      [
        2020, 
        1, 
        14
      ]
    ]
  }, 
  "abstract": "<p>The Pushshift Telegram Dataset</p>\n\n<p>The dataset consists of three files:</p>\n\n<p><em>Accounts.ndjson:</em> Provides data for the 2.2M Telegram users who were active in the channels we crawled.</p>\n\n<p><em>Channels.ndjson:</em> Provides data for the 28K Telegram channels that we crawled.</p>\n\n<p><em>Messages.ndjson:</em> Provides data for the 317M Telegram messages posted by 2.2M Telegram users in 28K Telegram channels.</p>\n\n<p>Each file is a newline-delimited JSON (NDJSON) file containing one JSON object per account, channel, or message. Each object follows the format of the Telethon API (<a href=\"https://docs.telethon.dev/en/latest/\">https://docs.telethon.dev/en/latest/</a>), a Python interface for Telegram's API.</p>", 
  "title": "The Pushshift Telegram Dataset", 
  "type": "paper-conference", 
  "id": "3607497"
}
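
The abstract above describes each file as newline-delimited JSON, so the files can be read one record at a time rather than loaded whole. The following is a minimal sketch of that idea, not the authors' own tooling. The helper names `iter_ndjson` and `count_records` are hypothetical, the sample records in the demo are invented placeholders, and no field names from the real dataset are assumed. If the downloaded files are compressed, they would need to be decompressed (or opened through a decompressing stream) first.

```python
import io
import json
from typing import Any, Dict, Iterator, TextIO


def iter_ndjson(stream: TextIO) -> Iterator[Dict[str, Any]]:
    """Yield one parsed JSON object per non-empty line of an NDJSON stream.

    Hypothetical helper: it only relies on the format stated in the
    abstract (one JSON object per line), not on any specific field.
    """
    for line in stream:
        line = line.strip()
        if not line:
            # Skip blank lines, e.g. a trailing newline at end of file.
            continue
        yield json.loads(line)


def count_records(stream: TextIO) -> int:
    """Count records without holding them all in memory."""
    return sum(1 for _ in iter_ndjson(stream))


if __name__ == "__main__":
    # Invented placeholder records, NOT taken from the dataset.
    sample = io.StringIO('{"id": 1}\n{"id": 2}\n')
    for record in iter_ndjson(sample):
        print(record)
```

For a real file, the same generator could be fed a handle such as `open("Messages.ndjson", encoding="utf-8")` (the path is an assumption about where the file was saved). Streaming matters here because the messages file is described as holding 317M records.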
                    All versions    This version
Views               1,887           1,887
Downloads           3,362           3,362
Data volume         143.5 TB        143.5 TB
Unique views        1,670           1,670
Unique downloads    1,241           1,241
