TikTok Representative Data Release
# TikTok Data
This directory contains analysis results derived from TikTok video and comment data. The results are split by time frame (hour and day) and further broken down by topic, presence of children, and comments.
## Directory Structure
```
tiktok-hour/
├── hour/
│ ├── share_count.csv # Distribution of share counts
│ ├── digg_count.csv # Distribution of like counts
│ ├── comment_count.csv # Distribution of comment counts
│ ├── play_count.csv # Distribution of play/view counts
│ ├── create_time_count.csv # Distribution of video creation timestamps from createTime attribute
│ ├── id_timestamp_count.csv # Distribution of video creation timestamps derived from the video ID (includes videos with an error message)
│ ├── location_created_count.csv # Distribution of video creation locations
│ ├── createtime_location_created.csv # Joint distribution of creation time and location
│ ├── topics/
│ │ ├── topic_desc.parquet # Topic descriptions with video counts
│ │ ├── [topic_id]/
│ │ │ ├── play_count_dist.csv # Play count distribution for videos in this topic
│ │ │ └── location_created_dist.csv # Creation location distribution for videos in this topic
│ │
│ └── [country_name]/ # Country-specific directory
│ └── [engagement_type]_count.csv # Engagement distribution for the specific country
│
├── day/
│ ├── share_count.csv # Distribution of share counts (24-hour data)
│ ├── digg_count.csv # Distribution of like counts (24-hour data)
│ ├── comment_count.csv # Distribution of comment counts (24-hour data)
│ ├── play_count.csv # Distribution of play/view counts (24-hour data)
│ ├── create_time_count.csv # Distribution of video creation timestamps from createTime attribute (24-hour data)
│ ├── id_timestamp_count.csv # Distribution of video creation timestamps derived from the video ID (includes videos with an error message) (24-hour data)
│ ├── location_created_count.csv # Distribution of video creation locations (24-hour data)
│ ├── createtime_location_created.csv # Joint distribution of creation time and location (24-hour data)
│ ├── children/
│ │ ├── play_count_dist.csv # Play count distribution for videos with children
│ │ ├── country_share_videos_with_children.csv # Percentage of videos with children by country
│ │ └── [country_name]/
│ │ └── play_count_dist.csv # Play count distribution for videos with children in specific country
│ │
│ ├── comments/
│ │ ├── comment_language_counts.csv # Distribution of comment languages
│ │ └── create_time_dist.csv # Distribution of comment creation times
│ │
│ └── [country_name]/ # Country-specific directory
│ └── [engagement_type]_count.csv # Engagement distribution for the specific country
```
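The tree above mixes fixed analysis subdirectories (`topics/`, `children/`, `comments/`) with one directory per country. A minimal sketch for telling them apart after extracting the archive could look like the following. The helper names are hypothetical, and the rule "every other subdirectory is a country directory" is an assumption read off the tree, not something the release states.

```python
from pathlib import Path

# Subdirectory names taken from the tree above. Assumption: every other
# subdirectory directly under hour/ or day/ is a [country_name] directory.
ANALYSIS_DIRS = {"topics", "children", "comments"}


def list_country_dirs(frame_dir):
    """Return the sorted names of the country-specific subdirectories."""
    frame_dir = Path(frame_dir)
    return sorted(
        p.name
        for p in frame_dir.iterdir()
        if p.is_dir() and p.name not in ANALYSIS_DIRS
    )


def list_engagement_files(frame_dir, country):
    """Return the sorted [engagement_type]_count.csv file names for one country."""
    country_dir = Path(frame_dir) / country
    return sorted(p.name for p in country_dir.glob("*_count.csv"))
```

For example, `list_country_dirs("tiktok-hour/day")` would return the country directory names found under `day/`, assuming the archive was extracted into the current working directory.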
## File Descriptions
### Engagement Distribution Files
- **share_count.csv**: Distribution of the number of shares per video
- **digg_count.csv**: Distribution of the number of likes (diggs) per video
- **comment_count.csv**: Distribution of the number of comments per video
- **play_count.csv**: Distribution of the number of plays/views per video
### Time and Location Files
- **create_time_count.csv**: Distribution of video creation timestamps taken from the `createTime` attribute
- **id_timestamp_count.csv**: Distribution of video creation timestamps derived from the video ID (includes videos with an error message)
- **location_created_count.csv**: Distribution of locations where videos were created
- **createtime_location_created.csv**: Joint distribution of creation time and location
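This README does not say how a timestamp is derived from a video ID. One commonly reported convention, used here purely as an assumption, is that the upper 32 bits of a TikTok video ID hold the creation time in Unix seconds. Under that assumption, a decoding sketch (with the hypothetical helper `id_to_datetime`) might be:

```python
from datetime import datetime, timezone


def id_to_datetime(video_id):
    """Decode a creation time from a video ID.

    Assumption (not confirmed by this release): the bits above the lowest
    32 bits of the ID are the creation time in Unix seconds.
    """
    seconds = int(video_id) >> 32
    return datetime.fromtimestamp(seconds, tz=timezone.utc)
```

Because the ID is available even when a video's metadata request fails, a decoding of this kind would explain why the ID-based distribution can include videos that returned an error message, whereas `createTime` is only present in successfully retrieved metadata.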
### Topic-Related Files
- **topic_desc.parquet**: Topic descriptions along with the count of videos in each topic
- **play_count_dist.csv** (in topic directories): Distribution of play counts for videos in a specific topic
- **location_created_dist.csv** (in topic directories): Distribution of creation locations for videos in a specific topic
### Children-Related Files
- **play_count_dist.csv** (in children directory): Distribution of play counts for videos that contain children
- **country_share_videos_with_children.csv**: Percentage of videos with children by country
- **play_count_dist.csv** (in country subdirectories): Distribution of play counts for videos with children in specific countries
### Comment-Related Files
- **comment_language_counts.csv**: Distribution of languages used in comments
- **create_time_dist.csv**: Distribution of comment creation times
### Country-Specific Files
- **[engagement_type]_count.csv**: Distribution of engagement metrics (share, digg, comment, play) for videos from specific countries
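The column layout of the distribution files is not documented here, so it is safest to inspect the header before relying on particular column names. The following sketch uses only the standard library and makes no assumption about the columns; `read_distribution` is a hypothetical helper name.

```python
import csv


def read_distribution(path):
    """Return (column_names, rows) for one distribution CSV.

    Each row is a dict keyed by the column names found in the file's
    header; all values are returned as strings.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)
        return reader.fieldnames, rows
```

Calling `read_distribution("tiktok-hour/hour/play_count.csv")` and printing the returned column names is a quick way to check the actual schema before doing any numeric conversion.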
## Data Collection Methodology
The data were collected from TikTok videos with the following characteristics:
- Hour data: Videos created during hour 19 (7 PM)
- Day data: Videos created at minute 42 of each hour across a 24-hour period
- Child detection: Videos with children are identified using our classifier
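The two sampling windows above can be expressed as simple predicates on a creation time. This is only an illustrative sketch: the function names are hypothetical, and because the README does not state which time zone "hour 19" refers to, the caller is assumed to pass a `datetime` already expressed in the relevant zone.

```python
from datetime import datetime, timezone

HOUR_SAMPLE_HOUR = 19   # "hour 19 (7 PM)" in the list above
DAY_SAMPLE_MINUTE = 42  # "minute 42 of each hour" in the list above


def in_hour_sample(created):
    """True if the creation time falls within hour 19."""
    return created.hour == HOUR_SAMPLE_HOUR


def in_day_sample(created):
    """True if the creation time falls within minute 42 of any hour."""
    return created.minute == DAY_SAMPLE_MINUTE
```

Under this reading, the hour sample covers 60 consecutive minutes of a single hour, while the day sample covers 24 separate one-minute windows spread across the day.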
## Files
| Name | Size | MD5 |
|---|---|---|
| tiktok-hour.zip | 26.7 MB | 52d0795fe414fd72820e7bbb57e11a17 |
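The MD5 checksum listed for `tiktok-hour.zip` can be used to verify a download. A minimal sketch using Python's standard `hashlib` module follows; the helper name `file_md5` and the local file path in the usage note are assumptions.

```python
import hashlib

# Checksum as listed for tiktok-hour.zip in the file listing.
EXPECTED_MD5 = "52d0795fe414fd72820e7bbb57e11a17"


def file_md5(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If the archive was saved as `tiktok-hour.zip` in the current directory, `file_md5("tiktok-hour.zip") == EXPECTED_MD5` should evaluate to `True` for an intact download.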