Breaking the Data Barrier: Towards Robust Speech Translation via Adversarial Stability Training

Cheng, Qiao; Fan, Meiyuan; Han, Yaqian; Huang, Jin; Duan, Yitao

doi:10.5281/zenodo.3524969

Published November 2, 2019 | Version v1

Conference paper | Open access

1. NetEase Youdao Information Technology (Beijing) Co., LTD., Beijing, China
2. NetEase Youdao Information Technology (Beijing) Co., LTD., Beijing, China

In a pipeline speech translation system, the automatic speech recognition (ASR) system passes its recognition errors on to the downstream machine translation (MT) system. A standard MT system is usually trained on a parallel corpus of clean text and performs poorly on text containing recognition noise, a gap well known in the speech translation community. In this paper, we propose a training architecture that aims to make a neural machine translation model more robust to speech recognition errors. Our approach addresses the encoder and the decoder simultaneously, using adversarial learning for the former and data augmentation for the latter. Experimental results on the IWSLT2018 speech translation task show that our approach bridges the gap between the ASR output and the MT input, outperforming the baseline by up to 2.83 BLEU on noisy ASR output while maintaining comparable performance on clean text.
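The abstract does not spell out how the noisy training data is produced, so the following is only a minimal sketch of the general idea behind noise-based data augmentation, not the authors' procedure. It assumes a hypothetical confusion table (`CONFUSIONS`) and a hypothetical helper (`add_asr_like_noise`) that perturbs clean source tokens by substitution or deletion, so that a noisy source sentence can be paired with the same clean target during training.

```python
import random


def add_asr_like_noise(tokens, confusions, rate=0.2, seed=0):
    """Return a noisy copy of `tokens` that loosely imitates ASR errors.

    Each token is perturbed with probability `rate`: it is replaced by a
    listed confusion if one exists (substitution error), otherwise it is
    dropped (deletion error). This is an illustrative assumption, not the
    noise model used in the paper.
    """
    rng = random.Random(seed)  # local generator keeps the result reproducible
    noisy = []
    for tok in tokens:
        if rng.random() < rate:
            options = confusions.get(tok)
            if options:
                noisy.append(rng.choice(options))  # substitution
            # no listed confusion: deletion, so nothing is appended
        else:
            noisy.append(tok)
    return noisy


# Hypothetical homophone table; a real one might be mined from ASR output.
CONFUSIONS = {
    "their": ["there"],
    "to": ["two", "too"],
    "weather": ["whether"],
}

clean_source = "they went to their house despite the weather".split()
noisy_source = add_asr_like_noise(clean_source, CONFUSIONS, rate=0.3, seed=1)

# Both source variants share one clean target, so the model sees the same
# translation for the clean and the perturbed input.
target = "<reference translation>"
training_pairs = [(clean_source, target), (noisy_source, target)]
```

With `rate=0` the function returns the input unchanged, which makes it easy to mix clean and perturbed examples in a single training set by varying the rate per sentence.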

Files

Name	Size
IWSLT2019_paper_6.pdf (md5:36e0bc7ac948b105985ffc27839fe087)	400.2 kB

	All versions	This version
Views	423	422
Downloads	253	253
Data volume	110.5 MB	110.5 MB
