Conference paper Open Access

Robust Neural Machine Translation for Clean and Noisy Speech Transcripts

Di Gangi, Matti; Enyedi, Robert; Brusadin, Alessandra; Federico, Marcello

Citation Style Language JSON Export

  "publisher": "Zenodo", 
  "DOI": "10.5281/zenodo.3524947", 
  "language": "eng", 
  "title": "Robust Neural Machine Translation for Clean and Noisy Speech Transcripts", 
  "issued": {
    "date-parts": [
  "abstract": "Neural machine translation models have shown to achieve high quality when trained and fed with well structured and punctuated input texts. Unfortunately, the latter condition is not met in spoken language translation, where the input is generated by an automatic speech recognition (ASR) system. In this paper, we study how to adapt a strong NMT system to make it robust to typical ASR errors. As in our application scenarios transcripts might be post-edited by human experts, we propose adaptation strategies to train a single system that can translate either clean or noisy input with no supervision on the input type. Our experimental results on a public speech translation data set show that adapting a model on a significant amount of parallel data including ASR transcripts is beneficial with test data of the same type, but produces a small degradation when translating clean text. Adapting on both clean and noisy variants of the same data leads to the best results on both input types.", 
  "author": [
    { "family": "Di Gangi, Matti" }, 
    { "family": "Enyedi, Robert" }, 
    { "family": "Brusadin, Alessandra" }, 
    { "family": "Federico, Marcello" }
  ], 
  "type": "paper-conference", 
  "id": "3524947"
All versions / This version
Views: 696 / 696
Downloads: 161 / 161
Data volume: 22.5 MB / 22.5 MB
Unique views: 666 / 666
Unique downloads: 143 / 143

