Hongjie Chen; Cheung-Chi Leung; Lei Xie; Bin Ma; Haizhou Li
We investigate the extraction of bottle-neck features (BNFs) for multiple languages without access to manual transcriptions. Multilingual BNFs are derived from a multi-task learning deep neural network that is trained on unsupervised phoneme-like labels. These labels are obtained from language-dependent Dirichlet process Gaussian mixture models (DPGMMs), each trained separately on the untranscribed speech of one language.
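The structure described above can be illustrated with a minimal sketch of a multi-task network: shared layers feed a narrow bottleneck, and each language has its own softmax output over its phoneme-like labels. This is not the authors' implementation. The class name `MultiTaskBNFNet`, the layer sizes, the sigmoid activations, the linear bottleneck, and the random (untrained) weights are all assumptions made for illustration, and only the forward pass is shown.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Return (weights, biases) with small random weights (illustrative, untrained)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def affine(layer, x):
    """Compute W x + b for one input vector."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-a)) for a in v]


def softmax(v):
    m = max(v)  # subtract the maximum for numerical stability
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]


class MultiTaskBNFNet:
    """Hypothetical multi-task network: shared layers plus one output head per language."""

    def __init__(self, n_in, n_hidden, n_bottleneck, labels_per_language, seed=0):
        rng = random.Random(seed)
        # Layers shared by all languages.
        self.hidden = make_layer(n_in, n_hidden, rng)
        self.bottleneck = make_layer(n_hidden, n_bottleneck, rng)
        self.post = make_layer(n_bottleneck, n_hidden, rng)
        # One softmax head per language, sized to that language's label set
        # (in the described setup, the labels would come from that language's DPGMM).
        self.heads = {
            lang: make_layer(n_hidden, n_labels, rng)
            for lang, n_labels in labels_per_language.items()
        }

    def extract_bnf(self, frame):
        """Return the bottleneck activations for one input frame."""
        h = sigmoid(affine(self.hidden, frame))
        return affine(self.bottleneck, h)  # linear bottleneck (an assumption)

    def posteriors(self, frame, lang):
        """Return the label posteriors for one frame from one language's head."""
        h = sigmoid(affine(self.post, self.extract_bnf(frame)))
        return softmax(affine(self.heads[lang], h))


# Usage: two hypothetical languages with different label inventories.
net = MultiTaskBNFNet(n_in=39, n_hidden=64, n_bottleneck=8,
                      labels_per_language={"lang_a": 50, "lang_b": 70})
frame = [0.0] * 39
bnf = net.extract_bnf(frame)
post_a = net.posteriors(frame, "lang_a")
```

During training, each frame would update only the shared layers and the head of its own language. After training, the heads are discarded and `extract_bnf` supplies the features for any language.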
In this version, the MFCCs used as input to the DPGMMs are processed with vocal tract length normalization (VTLN).
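For readers unfamiliar with VTLN, the sketch below shows one common formulation, a piecewise-linear warp of the frequency axis applied before the mel filterbank. The record does not say which VTLN variant or warp factors were used, so the function name `vtln_warp`, the 8000 Hz upper frequency, and the 0.85 cutoff ratio are assumptions for illustration only.

```python
def vtln_warp(freq_hz, alpha, f_max_hz=8000.0, cutoff_ratio=0.85):
    """Piecewise-linear VTLN warp of a single frequency (illustrative sketch).

    Below the breakpoint, frequencies are scaled by the warp factor alpha.
    Above it, a second linear segment maps f_max_hz back to f_max_hz so the
    warped axis still covers the full band.
    """
    if not 0.0 <= freq_hz <= f_max_hz:
        raise ValueError("freq_hz must lie in [0, f_max_hz]")
    if alpha <= 0.0:
        raise ValueError("alpha must be positive")
    # Choose the breakpoint so that alpha * f0 stays below f_max_hz,
    # which keeps the warp monotonically increasing.
    f0 = cutoff_ratio * f_max_hz / max(alpha, 1.0)
    if freq_hz <= f0:
        return alpha * freq_hz
    slope = (f_max_hz - alpha * f0) / (f_max_hz - f0)
    return alpha * f0 + slope * (freq_hz - f0)


# Usage: warp a few frequencies with a hypothetical speaker warp factor.
warped = [vtln_warp(f, alpha=1.1) for f in (0.0, 1000.0, 4000.0, 8000.0)]
```

In a typical setup the warp factor is chosen per speaker, for example by a grid search over a small range around 1.0, and the warped frequencies determine where the mel filters are placed.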