Published October 10, 2022 | Version 1.1
Dataset
Open
N20EM dataset for multimodal lyric transcription
Description
The N20EM dataset for multimodal lyric transcription was proposed in our ACM MM 2022 paper, "MM-ALT: A Multimodal Automatic Lyric Transcription System." The dataset contains recordings in three modalities: audio, video, and IMU motion signals.
Camera-ready version of the paper: https://arxiv.org/abs/2207.06127
Project website: https://n20em.github.io/
Note:
- By downloading the dataset, you are assumed to have read and agreed to the Terms and Conditions.
- Commercial use is strictly prohibited.
Please cite our work as:
@inproceedings{gu2022mm,
  title={MM-ALT: A multimodal automatic lyric transcription system},
  author={Gu, Xiangming and Ou, Longshen and Ong, Danielle and Wang, Ye},
  booktitle={Proceedings of the 30th ACM International Conference on Multimedia},
  pages={3328--3337},
  year={2022}
}
Files
accompaniment.zip
Additional details
Related works
- Is published in
- Conference paper: 10.1145/3503161.3548411 (DOI)
References
- Gu, X., Ou, L., Ong, D. and Wang, Y., 2022, October. MM-ALT: A multimodal automatic lyric transcription system. In Proceedings of the 30th ACM International Conference on Multimedia (pp. 3328-3337).