Podcast annotation dataset for paper "Identifying Introductions in Podcast Episodes from Automatically Generated Transcripts"

doi:10.5281/zenodo.5762442

Published December 6, 2021 | Version 0

Dataset Open

Creator affiliation: Sirius XM

This dataset accompanies the paper "Identifying Introductions in Podcast Episodes from Automatically Generated Transcripts"; please refer to the paper for details. Compared with the dataset used in the paper, 20 of the 445 episodes have been removed because of copyright issues.

Each data file contains the following fields:

- "episode_intro_start": timestamp of the start of the episode introduction (in milliseconds)

- "episode_intro_end": timestamp of the end of the episode introduction (in milliseconds)

- "program_intro_start": timestamp of the start of the program introduction (in milliseconds)

- "program_intro_end": timestamp of the end of the program introduction (in milliseconds)

- "program_name": name of the podcast program

- "episode_name": name of the podcast episode

- "transcription": JSON string containing the transcription, including timestamps

- "annotator": anonymized annotator ID
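
The fields above could be read along the following lines. This is only a minimal sketch, not the authors' loading code. It assumes that each .tsv file starts with a header row carrying exactly these field names, that fields containing quotes are quoted in the usual CSV way, and that an introduction timestamp may be left empty when an episode has no such introduction. None of these points is stated in the description, and the helper names `read_episodes` and `intro_duration_seconds` are hypothetical.

```python
import csv
import json

# Timestamp columns named in the dataset description (values in milliseconds).
TIMESTAMP_FIELDS = (
    "episode_intro_start",
    "episode_intro_end",
    "program_intro_start",
    "program_intro_end",
)


def read_episodes(handle):
    """Yield one dict per episode from an open, tab-separated text handle.

    Assumption: the first row is a header with the documented field names.
    """
    # Transcripts can be long; raise the csv module's default field size limit.
    csv.field_size_limit(2**31 - 1)
    for row in csv.DictReader(handle, delimiter="\t"):
        for name in TIMESTAMP_FIELDS:
            value = row.get(name)
            # Assumption: an empty cell means "no annotation for this field".
            row[name] = float(value) if value not in (None, "") else None
        # The transcription is documented as a JSON string; decode it.
        row["transcription"] = json.loads(row["transcription"])
        yield row


def intro_duration_seconds(row, prefix="episode"):
    """Return the intro length in seconds, or None if either end is missing.

    prefix is "episode" or "program", matching the documented field names.
    """
    start = row[f"{prefix}_intro_start"]
    end = row[f"{prefix}_intro_end"]
    if start is None or end is None:
        return None
    return (end - start) / 1000.0
```

A file would typically be opened with `open(path, newline="", encoding="utf-8")` and passed to `read_episodes`; the encoding is likewise an assumption. The internal layout of the decoded transcription is not described here, so the sketch leaves it as whatever structure `json.loads` returns.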

Files (118.0 MB)

Name	MD5	Size
seen_program_test_episodes_202107_pub.tsv	feee4ff9e4ca5f50a8413577b6da0ba5	12.3 MB
seen_program_val_episodes_202107_pub.tsv	abc086d1a9ae257d9a5a50ba1775f015	9.3 MB
train_episodes_202107_pub.tsv	f4af18f9fb0b5735f06e02405f3410c0	91.1 MB
unseen_program_test_episodes_202107_pub.tsv	e819398ff47cb098f06bed662889e2c2	2.7 MB
unseen_program_val_episodes_202107_pub.tsv	e819398ff47cb098f06bed662889e2c2	2.7 MB
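
The MD5 checksums listed with each file can be used to check that a download completed intact. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_listing` are hypothetical, and the local path is assumed to be wherever the file was saved.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, expected_md5):
    """Compare a file's MD5 digest with the checksum shown in the file listing."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, `matches_listing("train_episodes_202107_pub.tsv", "f4af18f9fb0b5735f06e02405f3410c0")` should return True for an intact copy of the training file saved under that name in the working directory.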

Additional details

Is supplement to: Journal article: https://arxiv.org/abs/2110.07096
