Conference paper Open Access

Robust Neural Machine Translation for Clean and Noisy Speech Transcripts

Di Gangi, Matti; Enyedi, Robert; Brusadin, Alessandra; Federico, Marcello


JSON-LD (schema.org) Export

{
  "inLanguage": {
    "alternateName": "eng", 
    "@type": "Language", 
    "name": "English"
  }, 
  "description": "<p>Neural machine translation models have shown to achieve high quality when trained and fed with well structured and punctuated input&nbsp;texts. Unfortunately, the latter condition is not met in spoken language translation, where the input is generated by an automatic speech&nbsp;recognition (ASR) system. In this paper, we study how to adapt a strong NMT system to make it robust to typical ASR errors. As in our application scenarios transcripts might be post-edited by human experts, we propose adaptation strategies to train a single system that can translate either clean or noisy input with no supervision on the input type. Our experimental results on a public speech translation data set show that adapting a model on a significant amount of parallel data including ASR transcripts is beneficial with test data of the same type, but produces a small degradation when translating clean text. Adapting on both clean and noisy variants of the same data leads to the best results on both input types.</p>", 
  "license": "https://creativecommons.org/licenses/by/4.0/legalcode", 
  "creator": [
    {
      "affiliation": "Fondazione Bruno Kessler, Trento, Italy & University of Trento, Italy", 
      "@type": "Person", 
      "name": "Di Gangi, Matti"
    }, 
    {
      "affiliation": "Amazon AI, East Palo Alto, USA", 
      "@type": "Person", 
      "name": "Enyedi, Robert"
    }, 
    {
      "affiliation": "Amazon AI, East Palo Alto, USA", 
      "@type": "Person", 
      "name": "Brusadin, Alessandra"
    }, 
    {
      "affiliation": "Amazon AI, East Palo Alto, USA", 
      "@type": "Person", 
      "name": "Federico, Marcello"
    }
  ], 
  "headline": "Robust Neural Machine Translation for Clean and Noisy Speech Transcripts", 
  "image": "https://zenodo.org/static/img/logos/zenodo-gradient-round.svg", 
  "datePublished": "2019-11-02", 
  "url": "https://zenodo.org/record/3524947", 
  "@context": "https://schema.org/", 
  "identifier": "https://doi.org/10.5281/zenodo.3524947", 
  "@id": "https://doi.org/10.5281/zenodo.3524947", 
  "@type": "ScholarlyArticle", 
  "name": "Robust Neural Machine Translation for Clean and Noisy Speech Transcripts"
}
