There is a newer version of the record available.

Published May 16, 2024 | Version v1
Dataset Open

TUApps

Description

To support research into the illegal activities of underground apps on Telegram, we created TUApps, a progressively growing dataset of underground apps collected from September 2023 to February 2024. It comprises 1,000 underground apps and 200 million messages distributed across 71,332 Telegram channels.
In the process of creating this dataset, we followed strict ethical standards to ensure the lawful use of the data and the protection of user privacy. The dataset includes the following files:
(1) dataset.zip: The underground app samples. Android app files are named after the SHA256 hash of the file, and iOS app files are named after the SHA256 hash of the publishing webpage.
(2) code.zip: The code used to crawl data from Telegram and to perform the data analysis.
(3) message.zip: The messages crawled from Telegram. Each file is named after the Telegram channel it was collected from.
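The naming rule for Android samples can be checked after extraction. The following is a minimal sketch, not part of the released code: it assumes the samples sit directly in one directory and that any file extension (such as .apk) follows the hash, neither of which is stated in this record. The function names are hypothetical. It does not apply to iOS samples, whose names are derived from the publishing webpage rather than from the file.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_misnamed(directory):
    """List files whose name (minus any extension) is not their SHA-256.

    Assumption: samples are stored flat in `directory`; the real layout
    of dataset.zip may differ.
    """
    mismatches = []
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.stem.lower() != sha256_of_file(path):
            mismatches.append(path.name)
    return mismatches
```

An empty result from `find_misnamed` would indicate that every file in the directory follows the hash-based naming rule.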
Availability of code and messages
Once our research paper is accepted, the user messages and the code used for data collection and analysis will be made available only upon request, to researchers who agree to adhere to strict ethical principles and to keep the data confidential.

Files

dataset.zip (11.8 GB)
md5:3cb4eacf6b289b8d17a36bd9ee2827c5
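
The MD5 checksum listed above can be used to confirm that a download of dataset.zip is complete. The following is a minimal sketch; the function names and the local file path are assumptions, and only the expected checksum comes from this record.

```python
import hashlib

# Checksum listed for dataset.zip in this record.
EXPECTED_MD5 = "3cb4eacf6b289b8d17a36bd9ee2827c5"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 matches the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

For example, `verify_download("dataset.zip")` would be called on the local copy; a False result suggests the archive is truncated or corrupted and should be downloaded again.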