Published February 20, 2024 | Version v1
Dataset Open

youtube-kids-ads-www24

Description

Our methodology for gathering "made for kids" videos on YouTube starts from the most popular children’s videos worldwide. We use data from Social Blade, a YouTube-certified user-analytics platform, which maintains a list of the most popular channels with the "made for kids" tag, ranked by total lifetime views. By querying the YouTube API, we select the 10 most popular videos from each of the 75 highest-viewed kids' channels on this list. This forms our labelled video dataset of 750 videos, capturing a broad spectrum of content and styles likely to attract a large number of young viewers worldwide.
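The per-channel selection step can be sketched as follows. This is a minimal illustration, not the authors' script: it assumes the YouTube Data API v3 `search.list` endpoint ordered by view count (the description only says "the YouTube API"), and the record fields `id` and `views` are hypothetical names chosen for the example.

```python
from urllib.parse import urlencode

# Assumption: the Data API v3 search endpoint is used for the per-channel query.
SEARCH_ENDPOINT = "https://www.googleapis.com/youtube/v3/search"


def top_videos_url(channel_id, api_key, n=10):
    """Build a search.list request for a channel's n most-viewed videos."""
    params = {
        "part": "snippet",
        "channelId": channel_id,
        "type": "video",
        "order": "viewCount",
        "maxResults": n,
        "key": api_key,
    }
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


def select_top_videos(videos_by_channel, n=10):
    """Keep the n most-viewed videos for each channel.

    videos_by_channel maps a channel ID to a list of dicts with
    hypothetical keys "id" and "views".
    """
    selected = []
    for channel_id, videos in videos_by_channel.items():
        ranked = sorted(videos, key=lambda v: v["views"], reverse=True)
        selected.extend({"channel": channel_id, **v} for v in ranked[:n])
    return selected
```

With 75 channels that each have at least 10 videos, `select_top_videos` returns 750 records, matching the size of the labelled dataset.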

While focusing on "made for kids" channels is a useful starting point for analysing ad patterns on kids' videos, it is also important to consider the wider landscape of child-oriented content on the platform, much of which remains unlabelled. To build a representative dataset of such videos, we use seed search terms reflecting popular child interests, including "toys", "kids cartoon", and "Barbie". The results are then parsed to find popular channels with unlabelled content, with a minimum threshold of 400,000 views.
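The filtering of search results might look like the sketch below. It is illustrative only: the description does not say whether the 400,000-view threshold applies per video or per channel, so the code assumes a per-video threshold, and the field names `channel_id`, `views`, and `made_for_kids` are hypothetical.

```python
# Seed terms named in the description; the full list is not given there.
SEED_TERMS = ["toys", "kids cartoon", "Barbie"]
MIN_VIEWS = 400_000


def candidate_channels(search_results, min_views=MIN_VIEWS):
    """Return channel IDs with at least one unlabelled video at or above min_views.

    Assumption: the threshold is applied to each video's view count.
    Channel IDs are returned in first-seen order without duplicates.
    """
    channels = []
    for result in search_results:
        if result["made_for_kids"]:
            continue  # already labelled, so not part of the unlabelled set
        if result["views"] < min_views:
            continue
        if result["channel_id"] not in channels:
            channels.append(result["channel_id"])
    return channels
```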

Next, we scrape ad data across all videos for further analysis, covering all major ad formats on the platform: (i) skippable and (ii) unskippable video ads, (iii) sidebar ads, (iv) in-feed ads, and (v) banner ads. We use a Selenium WebDriver script launched in a new logged-out Chrome window, with no previous history, cookies, or user data. As each video plays, we scrape each ad’s unique YouTube-assigned video ID and any embedded external link.
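A clean browser session of this kind can be approximated as below. This is a sketch under assumptions, not the authors' scraper: it relies on Selenium 4 starting Chrome with a fresh temporary profile, adds the `--incognito` flag as an extra precaution, and uses a hypothetical helper, `video_id_from_url`, to pull a video ID out of a YouTube link. How the ad elements themselves are located on the page is not described and is left out.

```python
from urllib.parse import parse_qs, urlparse


def new_clean_driver():
    """Start Chrome with no prior history, cookies, or signed-in account."""
    # Imported lazily so the URL helper below works without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--incognito")
    return webdriver.Chrome(options=options)


def video_id_from_url(url):
    """Extract a YouTube video ID from a watch or youtu.be URL, else None."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    if host == "youtu.be":
        return parsed.path.lstrip("/") or None
    if host == "youtube.com" or host.endswith(".youtube.com"):
        ids = parse_qs(parsed.query).get("v")
        return ids[0] if ids else None
    return None
```

Starting a new driver for each run, rather than reusing one, keeps state from earlier videos from influencing which ads are served.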

Next, we use the YouTube Data API to obtain additional metadata for each video ad, such as its title, duration, and "made for kids" label, and record the results in the dataset. The videos are played from different VPN locations to explore how the experience varies by geographical location.
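The metadata step can be illustrated with the sketch below, which flattens one item from a YouTube Data API v3 `videos.list` response requested with `part=snippet,contentDetails,status`. The output keys (`video_id`, `title`, `duration_s`, `made_for_kids`) are hypothetical and need not match the column names in data.zip.

```python
import re

# ISO 8601 durations as returned in contentDetails.duration, e.g. "PT1M30S".
_DURATION = re.compile(r"P(?:(\d+)D)?T?(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?")


def iso8601_seconds(duration):
    """Convert an ISO 8601 duration such as 'PT1M30S' to whole seconds."""
    match = _DURATION.fullmatch(duration)
    if match is None:
        raise ValueError(f"unrecognised duration: {duration!r}")
    days, hours, minutes, seconds = (int(g or 0) for g in match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def ad_metadata(item):
    """Flatten one videos.list item into a record for a video ad."""
    return {
        "video_id": item["id"],
        "title": item["snippet"]["title"],
        "duration_s": iso8601_seconds(item["contentDetails"]["duration"]),
        # status.madeForKids may be absent, in which case None is recorded.
        "made_for_kids": item.get("status", {}).get("madeForKids"),
    }
```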

Files

data.zip

Files (4.0 MB)

Name: data.zip
Size: 4.0 MB
md5:96d49976648f4abd2a77ccec41b10615

Additional details