Lahjoita puhetta semisupervised baseline Kaldi ASR model

Tamás Grósz

doi:10.5281/zenodo.6545290

Published May 13, 2022 | Version 1.0

Other | Open access

Tamás Grósz¹

1. Aalto University

A semi-supervised baseline speech recognition model for the Lahjoita puhetta corpus, built with the Kaldi toolkit. It was trained on 100 hours of transcribed and approximately 1600 hours of untranscribed Finnish speech. The model is described in more detail in the paper "Lahjoita puhetta – a large-scale corpus of spoken Finnish with some benchmarks" (https://arxiv.org/abs/2203.12906). For details on the training method, see https://github.com/aalto-speech/lahjoita-puhetta-baseline-kaldi.

Files (828.0 MB)

Name	Size	MD5
graph_sup100h_subword.zip	745.2 MB	cad14fc9c38ba6bc283271f214dc666f
tdnn_semisup_big_model.zip	82.8 MB	96e27bbe60231a6412ba95aea05ae09f
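
The MD5 checksums listed for the two archives can be used to confirm that a download completed without corruption. The following is a minimal sketch of such a check, not part of the record itself: the function names `md5_of_file` and `verify_downloads` are hypothetical, and the script assumes both archives were saved under their original names in the directory it is pointed at.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing of this record.
EXPECTED_MD5 = {
    "graph_sup100h_subword.zip": "cad14fc9c38ba6bc283271f214dc666f",
    "tdnn_semisup_big_model.zip": "96e27bbe60231a6412ba95aea05ae09f",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each archive name to True (match), False (mismatch), or None (file missing)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        if path.is_file():
            results[name] = (md5_of_file(path) == expected)
        else:
            results[name] = None
    return results


if __name__ == "__main__":
    # Hypothetical usage: run from the directory holding the downloaded archives.
    for name, status in verify_downloads().items():
        label = {True: "OK", False: "MISMATCH", None: "not found"}[status]
        print(f"{name}: {label}")
```

A mismatch most likely indicates an interrupted or corrupted download, in which case the archive should be fetched again before unpacking.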

	All versions	This version
Views	202	201
Downloads	103	103
Data volume	45.2 GB	45.2 GB

Resource type: Other
Publisher: Zenodo

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: May 13, 2022
Modified: May 13, 2022
