Published January 23, 2026 | Version 0
Dataset | Open Access

EMO

  • 1. Music and AI Lab
  • 2. National Taiwan University

Description

The EMO dataset is a high-quality, paired audio corpus developed to support research in vocal timbral technique conversion, with a primary focus on the vocal fry scream. It was created to address the scarcity of paired data for extreme vocalizations in the research community.

Key Dataset Features:

  • Content: A total of 1040 high-quality clips consisting of 520 modal voice and 520 vocal fry scream pairs.

  • Duration: Approximately 42 minutes.

  • Source: Recorded by a single professional metal singer.

  • Languages: Includes vocalizations in both Chinese and English.

  • Alignment: All clips were manually aligned within a Digital Audio Workstation to ensure precise temporal consistency between the modal and scream pairs.
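Because the corpus consists of temporally aligned modal/scream pairs, a typical first step is to enumerate matching clip pairs. The sketch below is a minimal, hypothetical example: it assumes the archive unpacks into two subdirectories (here called `modal/` and `scream/`) whose matching clips share a filename stem. The actual layout and naming inside EMO.zip may differ; adjust the paths accordingly.

```python
from pathlib import Path


def pair_clips(root):
    """Pair modal-voice clips with their vocal-fry-scream counterparts.

    Assumes a hypothetical layout in which matching clips share a stem
    across two subdirectories, e.g. modal/0001.wav <-> scream/0001.wav.
    Returns a sorted list of (modal_path, scream_path) tuples for stems
    present in both folders.
    """
    modal = {p.stem: p for p in (Path(root) / "modal").glob("*.wav")}
    scream = {p.stem: p for p in (Path(root) / "scream").glob("*.wav")}
    shared = sorted(modal.keys() & scream.keys())
    return [(modal[k], scream[k]) for k in shared]
```

Pairing by shared stem (rather than zipping two directory listings) is robust to missing or extra files: only clips present in both folders are returned.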

Files

EMO.zip (207.5 MB)
md5: 604b11bf48dac8d53c083f3dbcf39733

Additional details

Related works

Is supplement to
Publication: https://alberthsu0509.github.io/FABYOL/

Dates

Submitted
2026-01-24

Software

Repository URL
https://alberthsu0509.github.io/FABYOL/
Development Status
Work in progress

References

  • Ting-Chao Hsu and Yi-Hsuan Yang. "Conditional Vocal Timbral Technique Conversion via Embedding-Guided Dual Attribute Modulation." In 1st International Workshop on Emerging AI Technologies for Music (EAIM 2026) at AAAI. 2026.