Published February 12, 2021 | Version v1
Journal article | Open Access

Revisiting Multi-Domain Machine Translation

  • 1. SYSTRAN, LIMSI/CNRS
  • 2. SYSTRAN
  • 3. LIMSI/CNRS

Description

When building machine translation systems, one often needs to make the best of heterogeneous sets of parallel data in training, and to robustly handle inputs from unexpected domains in testing. This multi-domain scenario has attracted much recent work, which falls under the general umbrella of transfer learning. In this study, we revisit multi-domain machine translation, with the aim of formulating the motivations for developing such systems and the associated expectations with respect to performance. Our experiments with a large sample of multi-domain systems show that most of these expectations are hardly met, and suggest that further work is needed to better analyze the current behaviour of multi-domain systems and to make them fully deliver on their promises.

Files

main-2327-PhamMinhQuang.pdf

Files (284.7 kB)

md5:c2db23c132297b90a385481171fff2c2

Additional details

Funding

ANITA – Advanced tools for fighting oNline Illegal TrAfficking 787061
European Commission