Dataset Open Access
Shibata Hayato; Kato Taku; Shinozaki Takahiro; Watanabe Shinji
Deep neural networks (DNNs) were trained to extract posterior and bottleneck features from Japanese and other-language speech data. We explore various DNN types, their combinations, and dimension reduction by principal component analysis (PCA).
This version (version 2) concatenates the CSJ feature vector with a PCA-compressed feature vector derived from the attention end-to-end features.
X: CSJ feature (60-dim bottleneck; the version 1 feature)
S: Attention end-to-end feature (320-dim)
T: PCA(S) (60-dim)
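The construction above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the feature matrices are stand-ins generated with random data (real CSJ bottleneck and attention end-to-end features would be loaded from the dataset), and scikit-learn's PCA is assumed for the dimension reduction.

```python
import numpy as np
from sklearn.decomposition import PCA

# Placeholder feature matrices (rows = frames); in practice these
# come from the dataset, not from a random generator.
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 60))   # X: CSJ bottleneck feature, 60 dim
S = rng.standard_normal((1000, 320))  # S: attention end-to-end feature, 320 dim

# T = PCA(S): compress 320-dim attention features down to 60 dims
pca = PCA(n_components=60)
T = pca.fit_transform(S)

# Version 2 feature: concatenation of X and T (60 + 60 = 120 dims)
feat = np.concatenate([X, T], axis=1)
print(feat.shape)
```

The concatenated vector is 120-dimensional per frame, pairing the version 1 CSJ bottleneck feature with the compressed end-to-end representation.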