University of Rochester Audio-Visual Solo Singing Performance (URSing) Dataset
Description
We introduce a dataset to facilitate audio-visual analysis of singing performances. The dataset comprises a number of songs in which the singers’ solo voices are recorded in isolation. For each song, we provide high-quality audio recordings of the solo singing voice and of its mix with the accompaniment, together with a video recording of the soloist’s upper body that captures facial expressions and lip movements. We anticipate that the dataset will be useful for developing audio-visual source separation systems. Note that some of the accompaniment tracks include backing vocals, which makes audio-only singing voice separation more challenging and encourages researchers to integrate the soloists’ visual information into the separation process. We also anticipate that the dataset will be useful for other multi-modal information retrieval tasks, such as audio-visual expression analysis, audio-visual correspondence, and audio-visual lyrics transcription.
Files
(16.8 GB)
data.zip
Size | MD5 |
---|---|
16.8 GB | 78eaa9ae8df2e3eeb5d87b47a1b07a65 |
453.7 kB | 6efe2304eb1316ab86294fa1c5a76a7b |
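The MD5 checksums listed for the files can be used to confirm that a download completed without corruption. The following is a minimal sketch in Python using only the standard library; the function names `md5_of_file` and `verify_download` are hypothetical helpers introduced here for illustration, and the local file path is whatever location you saved the download to.

```python
import hashlib
import sys


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reading keeps memory use low for multi-gigabyte archives.
    """
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected):
    """Compare a file's MD5 digest with an expected checksum string.

    The expected value may carry an optional "md5:" prefix, as shown
    in the file listing, and is compared case-insensitively.
    """
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (assumed invocation): python verify.py <path> <checksum>
    ok = verify_download(sys.argv[1], sys.argv[2])
    print("checksum OK" if ok else "checksum MISMATCH")
```

Pass the checksum that is listed next to the file you actually downloaded; a mismatch usually indicates an incomplete or corrupted transfer, in which case downloading again is the simplest remedy.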