ASR Corpus Creator

Published December 4, 2022 | Version 1.5.1

Resource type: Software | Access: Open

Overview

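This app automatically builds a speech corpus for automatic speech recognition (ASR) systems using pseudo-labeling: an existing ASR model transcribes the collected audio, and those transcripts serve as the labels.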
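The pseudo-labeling idea can be sketched in a few lines. This is a minimal illustration, not the app's actual code: `pseudo_label`, `LabeledClip`, `fake_asr`, and the `min_confidence` threshold are all hypothetical names, and the confidence-based filter is an assumption about how one might discard weak transcripts.

```python
from typing import Callable, Iterable, List, NamedTuple, Tuple


class LabeledClip(NamedTuple):
    """One corpus entry: an audio file and its machine-generated label."""
    path: str
    text: str
    confidence: float


def pseudo_label(
    paths: Iterable[str],
    transcribe: Callable[[str], Tuple[str, float]],
    min_confidence: float = 0.8,  # hypothetical threshold, tune per backend
) -> List[LabeledClip]:
    """Transcribe each clip and keep only non-empty, confident transcripts."""
    corpus = []
    for path in paths:
        text, confidence = transcribe(path)
        text = text.strip()
        if text and confidence >= min_confidence:
            corpus.append(LabeledClip(path, text, confidence))
    return corpus


def fake_asr(path: str) -> Tuple[str, float]:
    """Stand-in for a real ASR backend; returns canned (text, confidence)."""
    canned = {
        "a.wav": ("hello world", 0.93),
        "b.wav": ("uh", 0.41),   # rejected: low confidence
        "c.wav": ("   ", 0.99),  # rejected: empty after stripping
    }
    return canned[path]


corpus = pseudo_label(["a.wav", "b.wav", "c.wav"], fake_asr)
print(corpus)
```

In a real setup, `transcribe` would wrap whichever ASR backend is configured; the stub only keeps the sketch runnable without model downloads.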

Features

Submit links to YouTube content
Submit direct links to video or audio files hosted on remote servers
Collect metadata
- Loudness
- Label language detection
- Audio language detection
Export labeled data from the console
Use Whisper, wav2vec2, or NeMo as the ASR backend
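As an illustration of the loudness metadata listed above, the sketch below computes the RMS level of 16-bit PCM samples in dB relative to full scale (dBFS). This is an assumption for illustration only: the record does not say which loudness measure the app uses (it may well be a perceptual one such as LUFS), and `rms_dbfs` is a hypothetical helper.

```python
import math
from typing import Sequence

FULL_SCALE_16BIT = 32768.0  # magnitude of the largest 16-bit PCM sample


def rms_dbfs(samples: Sequence[int]) -> float:
    """Return the RMS level of 16-bit PCM samples in dBFS.

    Empty input and digital silence both map to negative infinity.
    """
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 10.0 * math.log10(mean_square / FULL_SCALE_16BIT ** 2)


# Ten full cycles of a sine wave at half of full-scale amplitude.
half_scale_sine = [
    round(16384 * math.sin(2 * math.pi * n / 100)) for n in range(1000)
]
level = rms_dbfs(half_scale_sine)
silence_level = rms_dbfs([0] * 100)
print(f"sine: {level:.2f} dBFS, silence: {silence_level} dBFS")
```

A half-scale sine comes out at about -9 dBFS (-6 dB for the halved amplitude, -3 dB for the sine's RMS-to-peak ratio), which is a quick sanity check for any loudness field stored with a clip.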

More details: https://github.com/egorsmkv/asr-corpus-creator

Files

Name	Size
asr-corpus-creator-main.zip (md5: 031d10ea74287a3d01b4bd66c7b18191)	2.6 MB
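The MD5 checksum in the file listing can be used to check a downloaded copy of the archive. The following is a minimal sketch using only the Python standard library; `md5_of_file` and `matches_checksum` are hypothetical helper names, and the script assumes the archive was saved under its listed name in the current directory.

```python
import hashlib
from pathlib import Path

# Checksum as shown in the file listing for asr-corpus-creator-main.zip.
EXPECTED_MD5 = "031d10ea74287a3d01b4bd66c7b18191"


def md5_of_file(path, chunk_size=1 << 20):
    """Hash the file in chunks so large archives need not fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected=EXPECTED_MD5):
    """Compare the file's MD5 with the expected hex digest, ignoring case."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    archive = Path("asr-corpus-creator-main.zip")
    if archive.exists():
        print("checksum OK" if matches_checksum(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

MD5 is adequate here for detecting a corrupted or truncated download; it is not a defense against deliberate tampering.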

References

Radford, Alec et al. "Robust Speech Recognition via Large-Scale Weak Supervision." (2022).
Kuchaiev, O, Li, J, Nguyen, H, Hrinchuk, O, Leary, R, Ginsburg, B, Kriman, S, Beliaev, S, Lavrukhin, V, Cook, J, Castonguay, P, Popova, M, Huang, J, Cohen, J. NeMo: a toolkit for building AI applications using Neural Modules.
Baevski, A, Zhou, H, Mohamed, A, Auli, M. wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations.
Lugosch, L., Likhomanenko, T., Synnaeve, G., & Collobert, R. (2021). Pseudo-Labeling for Massively Multilingual Speech Recognition.