University of Rochester Audio-Visual Solo Singing Performance (URSing) Dataset

Bochen Li; Yuxuan Wang; Zhiyao Duan

doi:10.5281/zenodo.6404999

Published April 1, 2022 | Version 1.0

Dataset Open

1. ByteDance
2. University of Rochester

We introduce a dataset to facilitate audio-visual analysis of singing performances. The dataset comprises a number of songs in which the singers’ solo voices are recorded in isolation. For each song, we provide high-quality audio recordings of the solo singing voice and of its mix with the accompaniment, together with a video recording of the soloist’s upper body that captures facial expressions and lip movements. We anticipate that the dataset will be useful for developing audio-visual source separation systems. Note that some of the accompaniment tracks include backing vocals, which makes audio-only singing voice separation more challenging and encourages researchers to use the soloist’s visual information to aid the separation process. We also anticipate that the dataset will be useful for other multi-modal information retrieval tasks, such as audio-visual expression analysis, audio-visual correspondence, and audio-visual lyrics transcription.

Files

Files (16.8 GB)

Name	MD5	Size
data.zip	78eaa9ae8df2e3eeb5d87b47a1b07a65	16.8 GB
Manual.pdf	6efe2304eb1316ab86294fa1c5a76a7b	453.7 kB
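
The MD5 checksums listed for the files above can be used to confirm that a download completed without corruption, which matters for an archive of this size. The following is a minimal sketch, not part of the dataset's own tooling: the helper names `md5_of_file` and `verify_download` and the local `downloads` directory are assumptions made for illustration. The file is read in chunks so that the 16.8 GB archive does not have to fit in memory.

```python
import hashlib
from pathlib import Path

# MD5 digests copied from the file listing on the dataset page.
EXPECTED_MD5 = {
    "data.zip": "78eaa9ae8df2e3eeb5d87b47a1b07a65",
    "Manual.pdf": "6efe2304eb1316ab86294fa1c5a76a7b",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Compare a file's MD5 digest with the digest listed for its file name."""
    path = Path(path)
    return md5_of_file(path) == expected[path.name]


if __name__ == "__main__":
    # "downloads" is a hypothetical local directory; adjust to your setup.
    for name in EXPECTED_MD5:
        local = Path("downloads") / name
        if not local.exists():
            print(f"{name}: not found")
        elif verify_download(local):
            print(f"{name}: checksum OK")
        else:
            print(f"{name}: checksum MISMATCH")
```

A mismatch usually indicates an incomplete or corrupted transfer, in which case downloading the file again is the simplest remedy.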

	All versions	This version
Views	1,348	1,342
Downloads	773	765
Data volume	34.3 TB	34.2 TB
