Published May 13, 2022 | Version 1.0
Other | Open

Lahjoita puhetta semisupervised baseline Kaldi ASR model

Authors/Creators

  • 1. Aalto University

Description

Lahjoita puhetta semisupervised baseline speech recognition model, built with the Kaldi toolkit. The model was trained on 100 hours of transcribed and approximately 1600 hours of untranscribed Finnish speech. It is described in more detail in the paper "Lahjoita puhetta – a large-scale corpus of spoken Finnish with some benchmarks" (https://arxiv.org/abs/2203.12906). For details on the training method, see https://github.com/aalto-speech/lahjoita-puhetta-baseline-kaldi.

Files

graph_sup100h_subword.zip

Files (828.0 MB)

Checksum Size
md5:cad14fc9c38ba6bc283271f214dc666f 745.2 MB
md5:96e27bbe60231a6412ba95aea05ae09f 82.8 MB
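
The MD5 checksums listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch, not part of the record itself: the helper names md5_of_file and matches_checksum are hypothetical, and because the listing does not state which checksum belongs to which file name, the path passed in is whatever local file you downloaded.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks.

    Chunked reading keeps memory use flat for large archives.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected_hex = expected.strip().lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]
    return md5_of_file(path) == expected_hex
```

For example, matches_checksum(local_path, "md5:cad14fc9c38ba6bc283271f214dc666f") returns True only if the file at local_path (a placeholder for your download location) is byte-identical to the 745.2 MB file in this record.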