youtube-kids-ads-www24
Description
While focusing on "made for kids" channels is a useful starting point for analysing ad patterns on kids' videos, it is also important to consider the wider landscape of child-oriented content on the platform, much of which remains unlabelled. To build a representative dataset of such videos, we use seed search terms reflecting popular child interests, such as "toys", "kids cartoon", and "Barbie". We then parse the search results to find popular channels with unlabelled content, applying a minimum threshold of 400,000 views.
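The channel-selection step can be sketched as a simple filter. This is a minimal illustration, not the authors' script: the record fields (`channel_id`, `video_id`, `view_count`, `made_for_kids`) are hypothetical names, and the sketch assumes the 400,000-view threshold is applied to each video's view count, which the description does not state explicitly.

```python
# Assumption: the 400,000-view threshold applies per video.
MIN_VIEWS = 400_000


def select_unlabelled_channels(videos, min_views=MIN_VIEWS):
    """Group popular, unlabelled videos by channel.

    `videos` is a list of dicts with the hypothetical keys
    channel_id, video_id, view_count and made_for_kids.
    Returns {channel_id: [video_id, ...]}.
    """
    selected = {}
    for video in videos:
        if video["made_for_kids"]:
            continue  # already labelled, so outside this part of the dataset
        if video["view_count"] < min_views:
            continue  # below the popularity threshold
        selected.setdefault(video["channel_id"], []).append(video["video_id"])
    return selected


# Illustrative records only; these are not taken from the dataset.
sample = [
    {"channel_id": "chanA", "video_id": "vid1", "view_count": 950_000, "made_for_kids": False},
    {"channel_id": "chanA", "video_id": "vid2", "view_count": 120_000, "made_for_kids": False},
    {"channel_id": "chanB", "video_id": "vid3", "view_count": 2_000_000, "made_for_kids": True},
    {"channel_id": "chanC", "video_id": "vid4", "view_count": 400_000, "made_for_kids": False},
]
channels = select_unlabelled_channels(sample)
```

If the threshold is instead meant per channel, the same filter would be applied to aggregated channel view counts rather than to individual videos.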
Next, we scrape ad data across all videos for further analysis, covering all major ad formats on the platform: (i) skippable and (ii) unskippable video ads, (iii) sidebar ads, (iv) in-feed ads, and (v) banner ads. We use a Selenium WebDriver script launched in a new logged-out Chrome window, with no previous history, cookies, or user data. As each video plays, we scrape each ad’s unique YouTube-assigned video ID and any embedded external link.
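A rough sketch of this setup follows, assuming the `selenium` Python package and a local Chrome installation. The helper names (`open_clean_chrome`, `extract_video_id`, `is_external`, `collect_external_links`) are hypothetical, and the ad-specific page selectors used by the actual script are not described here, so the sketch only shows a clean browser profile and the URL handling.

```python
import tempfile
from urllib.parse import parse_qs, urlparse

YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def open_clean_chrome():
    """Start Chrome with a throwaway profile: no history, cookies, or login."""
    # Imported lazily so the URL helpers below work without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument(f"--user-data-dir={tempfile.mkdtemp()}")  # fresh profile
    options.add_argument("--incognito")
    return webdriver.Chrome(options=options)


def extract_video_id(url):
    """Return the video ID from a YouTube watch or youtu.be URL, else None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        return parsed.path.lstrip("/") or None
    if host in YOUTUBE_HOSTS:
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None
    return None


def is_external(url):
    """True if the link points outside YouTube's own domains."""
    host = (urlparse(url).hostname or "").lower()
    return bool(host) and host not in YOUTUBE_HOSTS and not host.endswith(".youtube.com")


def collect_external_links(driver):
    """Collect every external href currently on the page.

    Assumption: this does not distinguish ad links from other external links.
    """
    from selenium.webdriver.common.by import By

    hrefs = [a.get_attribute("href") for a in driver.find_elements(By.TAG_NAME, "a")]
    return [h for h in hrefs if h and is_external(h)]
```

A typical run would call `driver = open_clean_chrome()`, then `driver.get(video_url)`, poll `collect_external_links(driver)` while the video plays, and finish with `driver.quit()`.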
Next, we use the YouTube Data API to obtain additional metadata for each video ad, such as the video title, duration, and "made for kids" label, and record the results in the dataset. The videos are played from different VPN locations to explore how the ad experience varies by geographical location.
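The metadata lookup can be sketched with the YouTube Data API v3 `videos.list` endpoint, which returns `snippet.title`, `contentDetails.duration` (an ISO 8601 duration), and `status.madeForKids`. This is a hedged illustration using only the standard library: the function names, the output field names, and the `location` parameter are assumptions, and the column layout of the published dataset may differ.

```python
import json
import re
from urllib.parse import urlencode
from urllib.request import urlopen

API_URL = "https://www.googleapis.com/youtube/v3/videos"
_DURATION = re.compile(r"^P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?$")


def build_request_url(video_ids, api_key):
    """Build a videos.list request for the given video IDs."""
    params = {"part": "snippet,contentDetails,status",
              "id": ",".join(video_ids),
              "key": api_key}
    return f"{API_URL}?{urlencode(params)}"


def duration_seconds(iso):
    """Convert an ISO 8601 duration such as 'PT4M13S' to seconds."""
    match = _DURATION.match(iso)
    if not match:
        raise ValueError(f"unrecognised duration: {iso!r}")
    days, hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def parse_item(item, location):
    """Flatten one API item; `location` is a hypothetical VPN-location tag."""
    return {"video_id": item["id"],
            "title": item["snippet"]["title"],
            "duration_s": duration_seconds(item["contentDetails"]["duration"]),
            "made_for_kids": item["status"].get("madeForKids"),
            "location": location}


def fetch_metadata(video_ids, api_key, location):
    """Query the API (network access and a valid key required)."""
    with urlopen(build_request_url(video_ids, api_key)) as response:
        payload = json.load(response)
    return [parse_item(item, location) for item in payload.get("items", [])]


# Illustrative item shaped like an API response; the values are made up.
sample_item = {"id": "abc123XYZ_0",
               "snippet": {"title": "Toy ad"},
               "contentDetails": {"duration": "PT30S"},
               "status": {"madeForKids": True}}
row = parse_item(sample_item, location="example-location")
```

Tagging each row with the location it was collected from keeps records from different VPN endpoints distinguishable when they are later compared.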
Files
Name | Size |
---|---|
data.zip (md5:96d49976648f4abd2a77ccec41b10615) | 4.0 MB |
Additional details
Software
- Repository URL: https://github.com/nsgLUMS/youtube-kids-ads-www24