EMO
Description
The EMO dataset is a high-quality paired audio corpus developed to support research on vocal timbral technique conversion, with a primary focus on the vocal fry scream. It was created to address the scarcity of paired data for extreme vocalizations in the research community.
Key Dataset Features:
- Content: 1,040 high-quality clips in total, consisting of 520 modal voice clips and 520 vocal fry scream clips, paired.
- Duration: approximately 42 minutes.
- Source: recorded by a single professional metal singer.
- Languages: vocalizations in both Chinese and English.
- Alignment: all clips were manually aligned in a Digital Audio Workstation to ensure precise temporal consistency between the modal and scream pairs.
Files
| Name | Size | MD5 |
|---|---|---|
| EMO.zip | 207.5 MB | md5:604b11bf48dac8d53c083f3dbcf39733 |
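Since the record publishes an MD5 checksum for EMO.zip, a downloaded copy can be checked for corruption before use. A minimal sketch (the streaming-hash helper and filename are illustrative; run it only after the archive has been downloaded):

```python
import hashlib

def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, streaming in 1 MiB chunks
    so a large archive is never loaded into memory at once."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        while block := f.read(chunk_size):
            h.update(block)
    return h.hexdigest()

EXPECTED = "604b11bf48dac8d53c083f3dbcf39733"  # md5 listed on this record

# After downloading the archive:
# assert md5_of("EMO.zip") == EXPECTED, "incomplete or corrupted download"
```

A mismatch usually indicates a truncated download; re-fetching the file is the simplest fix.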
Additional details
Related works
- Is supplement to: Publication, https://alberthsu0509.github.io/FABYOL/ (URL)
Dates
- Submitted: 2026-01-24
Software
- Repository URL: https://alberthsu0509.github.io/FABYOL/
- Development Status: Work in progress
References
- Ting-Chao Hsu and Yi-Hsuan Yang. "Conditional Vocal Timbral Technique Conversion via Embedding-Guided Dual Attribute Modulation." In 1st International Workshop on Emerging AI Technologies for Music (EAIM 2026) at AAAI. 2026.