Published April 1, 2022 | Version 1.0
Dataset Open

University of Rochester Audio-Visual Solo Singing Performance (URSing) Dataset

  • 1. ByteDance
  • 2. University of Rochester

Description

We introduce a dataset for facilitating audio-visual analysis of singing performances. The dataset comprises a number of songs in which the singers’ solo voices are recorded in isolation. For each song, we provide high-quality audio recordings of the solo singing voice and of its mix with the accompaniment, along with a video recording of the soloist’s upper body that captures facial expressions and lip movements. We anticipate that the dataset will be useful for developing audio-visual source separation systems. Note that some of the accompaniment tracks include backing vocals, which makes audio-only singing voice separation more challenging and encourages researchers to integrate the soloist’s visual information into the separation process. We also anticipate that the dataset will be useful for other multi-modal information retrieval tasks, such as audio-visual expression analysis, audio-visual correspondence, and audio-visual lyrics transcription.

Files

data.zip

Files (16.8 GB)

Size        MD5
16.8 GB     78eaa9ae8df2e3eeb5d87b47a1b07a65
453.7 kB    6efe2304eb1316ab86294fa1c5a76a7b
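
The file listing gives an MD5 checksum for each file, so a download can be checked for corruption before it is unpacked. The following is a minimal sketch of such a check, not part of the dataset itself. The helper names `md5_of_file` and `matches_published_md5` are hypothetical, and pairing the 16.8 GB checksum with `data.zip` is an assumption, since the listing does not show file names next to the checksums.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reading keeps memory use flat, which matters for a
    multi-gigabyte archive.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_md5(path, published):
    """Compare a file's MD5 with a published value.

    Accepts either a bare hex digest or the "md5:<hex>" form used in
    the file listing; the comparison ignores letter case.
    """
    expected = published.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


# Example call (assumes the 16.8 GB file was saved as "data.zip"):
# ok = matches_published_md5("data.zip", "md5:78eaa9ae8df2e3eeb5d87b47a1b07a65")
```

If the check returns `False`, the download is most likely incomplete or corrupted and should be fetched again before extraction.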