Published May 13, 2022 | Version 1.0
Other
Open
Lahjoita puhetta semisupervised baseline Kaldi ASR model
Description
Lahjoita puhetta semisupervised baseline speech recognition model, built with the Kaldi toolkit. The model was trained on 100 hours of transcribed (supervised) Finnish speech and approximately 1600 hours of untranscribed Finnish speech. It is described in more detail in the paper "Lahjoita puhetta – a large-scale corpus of spoken Finnish with some benchmarks" (https://arxiv.org/abs/2203.12906). For details on the training method, see https://github.com/aalto-speech/lahjoita-puhetta-baseline-kaldi.
Files
graph_sup100h_subword.zip
Total size: 828.0 MB
| MD5 checksum | Size |
|---|---|
| cad14fc9c38ba6bc283271f214dc666f | 745.2 MB |
| 96e27bbe60231a6412ba95aea05ae09f | 82.8 MB |
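The MD5 checksums listed above can be used to check that a download completed without corruption. The following is a minimal sketch in Python, not part of the original record. The helper names `md5_of_file` and `matches_listing` are hypothetical, and because the listing does not state which checksum belongs to which archive, the sketch accepts either value.

```python
import hashlib

# MD5 values copied from the file listing. The listing does not say which
# archive each checksum belongs to, so either one counts as a match here.
KNOWN_MD5 = {
    "cad14fc9c38ba6bc283271f214dc666f",
    "96e27bbe60231a6412ba95aea05ae09f",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat for archives of several hundred MB.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path):
    """Return True if the file's MD5 equals one of the listed checksums."""
    return md5_of_file(path) in KNOWN_MD5
```

For example, `matches_listing("graph_sup100h_subword.zip")` should return `True` for an intact download, assuming the archive was saved under that name in the current directory.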