Conference paper Open Access

The Pushshift Telegram Dataset

Baumgartner, Jason; Zannettou, Savvas; Squire, Megan; Blackburn, Jeremy


Dublin Core Export

<?xml version='1.0' encoding='utf-8'?>
<oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
  <dc:creator>Baumgartner, Jason</dc:creator>
  <dc:creator>Zannettou, Savvas</dc:creator>
  <dc:creator>Squire, Megan</dc:creator>
  <dc:creator>Blackburn, Jeremy</dc:creator>
  <dc:date>2020-01-14</dc:date>
  <dc:description>The Pushshift Telegram Dataset

The dataset consists of three files:

Accounts.ndjson: Provides data for 2.2M Telegram users that were active in the channels we crawled.

Channels.ndjson: Provides data for 28K Telegram channels that we crawled.

Messages.ndjson: Provides data for 317M Telegram messages that were posted by 2.2M Telegram users in 28K Telegram channels.

Each file is a newline-delimited JSON (ndjson) file containing one JSON object per account, channel, or message. Each object follows the format used by Telethon (https://docs.telethon.dev/en/latest/), a Python interface to Telegram's API.</dc:description>
  <dc:identifier>https://zenodo.org/record/3607497</dc:identifier>
  <dc:identifier>10.5281/zenodo.3607497</dc:identifier>
  <dc:identifier>oai:zenodo.org:3607497</dc:identifier>
  <dc:relation>doi:10.5281/zenodo.3607496</dc:relation>
  <dc:rights>info:eu-repo/semantics/openAccess</dc:rights>
  <dc:rights>https://creativecommons.org/licenses/by/4.0/legalcode</dc:rights>
  <dc:subject>Telegram</dc:subject>
  <dc:subject>pushshift</dc:subject>
  <dc:title>The Pushshift Telegram Dataset</dc:title>
  <dc:type>info:eu-repo/semantics/conferencePaper</dc:type>
  <dc:type>publication-conferencepaper</dc:type>
</oai_dc:dc>
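
The description above says each file holds one JSON object per line. A minimal sketch of reading such a file in Python follows. The helper name `iter_ndjson` and the sample field names (`id`, `message`) are illustrative assumptions, not taken from the dataset documentation.

```python
import io
import json
from typing import Iterator, TextIO


def iter_ndjson(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of an ndjson stream."""
    for line in stream:
        line = line.strip()
        if line:  # skip blank lines rather than failing on them
            yield json.loads(line)


# Hypothetical two-record sample. The keys "id" and "message" are
# assumptions made for illustration only.
sample = io.StringIO(
    '{"id": 1, "message": "hello"}\n'
    '{"id": 2, "message": "world"}\n'
)
records = list(iter_ndjson(sample))
```

With an uncompressed local copy of a file such as Messages.ndjson, the same helper could be passed an open file handle (for example `open("Messages.ndjson", encoding="utf-8")`). Iterating record by record rather than collecting into a list keeps memory use roughly constant, which matters for a file with hundreds of millions of lines.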
                  All versions   This version
Views             1,873          1,873
Downloads         3,358          3,358
Data volume       143.3 TB       143.3 TB
Unique views      1,657          1,657
Unique downloads  1,238          1,238
