Published December 4, 2022 | Version 1.5.1
Software Open

ASR Corpus Creator

Authors/Creators

Description

Overview

This app is intended to automatically create a corpus for ASR systems using pseudo-labeling.
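As a rough illustration of pseudo-labeling, the sketch below runs an ASR function over audio files and keeps only transcripts that pass a confidence threshold. It is not the app's code: `pseudo_label`, `LabeledClip`, `fake_transcribe`, the `min_confidence` parameter and the file names are all hypothetical, and the stand-in transcriber returns canned results instead of calling a real backend.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class LabeledClip:
    """One corpus entry: an audio file plus its machine-generated label."""
    audio_path: str
    text: str
    confidence: float


def pseudo_label(
    audio_paths: Iterable[str],
    transcribe: Callable[[str], Tuple[str, float]],
    min_confidence: float = 0.8,  # hypothetical threshold, tune per backend
) -> List[LabeledClip]:
    """Keep clips whose transcript is non-empty and confident enough."""
    corpus: List[LabeledClip] = []
    for path in audio_paths:
        text, confidence = transcribe(path)
        text = text.strip()
        if text and confidence >= min_confidence:
            corpus.append(LabeledClip(path, text, confidence))
    return corpus


def fake_transcribe(path: str) -> Tuple[str, float]:
    """Stand-in for a real ASR backend (assumption: returns text and a score)."""
    canned = {
        "a.wav": ("hello world", 0.95),
        "b.wav": ("mumble", 0.40),   # dropped: low confidence
        "c.wav": ("   ", 0.99),      # dropped: empty transcript
    }
    return canned[path]


if __name__ == "__main__":
    for clip in pseudo_label(["a.wav", "b.wav", "c.wav"], fake_transcribe):
        print(clip)
```

In practice the `transcribe` argument would wrap whichever ASR backend is configured. How that backend reports confidence, if it does at all, varies between toolkits.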

Features

  • Submit links to YouTube content
  • Submit direct links to video/audio on remote servers
  • Collect metadata
    • Loudness
    • Label language detection
    • Audio language detection
  • Export labeled data using a console
  • Whisper, wav2vec2, or NeMo as an ASR backend
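The record does not say how the loudness metadata is measured. As one simple possibility, the sketch below computes the RMS level of 16-bit PCM samples in dBFS. The function name `rms_dbfs` and the choice of RMS (rather than a perceptual measure such as LUFS) are assumptions made for illustration only.

```python
import math
from typing import Sequence


def rms_dbfs(samples: Sequence[int], full_scale: int = 32768) -> float:
    """RMS level of signed 16-bit PCM samples, in dB relative to full scale.

    Returns -inf for empty or completely silent input.
    """
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    rms = math.sqrt(mean_square)
    return 20.0 * math.log10(rms / full_scale)


if __name__ == "__main__":
    # A constant signal at half of full scale sits about 6 dB below 0 dBFS.
    print(round(rms_dbfs([16384] * 100), 2))
```

A value like this could be stored with each clip so that very quiet or clipped recordings can be filtered out before training.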

More details: https://github.com/egorsmkv/asr-corpus-creator

Files

Files (2.6 MB)

asr-corpus-creator-main.zip (2.6 MB)
md5:031d10ea74287a3d01b4bd66c7b18191

Additional details

References

  • Radford, Alec et al. "Robust Speech Recognition via Large-Scale Weak Supervision." (2022).
  • Kuchaiev, O, Li, J, Nguyen, H, Hrinchuk, O, Leary, R, Ginsburg, B, Kriman, S, Beliaev, S, Lavrukhin, V, Cook, J, Castonguay, P, Popova, M, Huang, J, Cohen, J. NeMo: a toolkit for building AI applications using Neural Modules.
  • Baevski, A, Zhou, H, Mohamed, A, Auli, M. wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations.
  • Lugosch, L., Likhomanenko, T., Synnaeve, G., & Collobert, R. (2021). Pseudo-Labeling for Massively Multilingual Speech Recognition.