Published October 20, 2021 | Version v1
Dataset Open

The EUTRANS-I corpus

Description

EUTRANS-I is a simple translation corpus that was produced and used in the EuTrans project. It corresponds to the so-called "Traveller Task", which covers human-to-human communication situations at the front desk of a hotel. Bilingual data were produced semi-automatically in three language pairs on the basis of small "seed corpora" obtained from several traveler-oriented booklets.

In using this corpus you agree that:

1. This corpus may be used free of charge for non-commercial purposes.

2. If commercial use is intended, please first contact:
   Dr. Enrique Vidal
   Pattern Recognition and Language Technologies Research Center
   Cno. de Vera s/n
   E - 46071 Valencia
   tel: +34 / 96 387 93 51
   fax: +34 / 96 387 73 58
   e-mail: evidal@prhlt.upv.es

3. In case of redistribution, this file must be distributed as is
   along with the corpus.

4. Any publication derived from the use of this corpus must reference
   the EuTrans project ("Example-based langUage TRANslation Systems",
   EU Esprit #30268) and appropriate scholarly citation(s), such as:

   J.C. Amengual, J.M. Benedí, F. Casacuberta, A. Castaño,
   A. Castellanos, V. Jiménez, D. Llorens, A. Marzal, F. Prat,
   E. Vidal, and J.M. Vilar: "Using categories in the EuTrans
   system". In ACL-ELSNET Workshop on Spoken Language Translation,
   pages 44-53, Madrid (Spain), July 1997.

   J.C. Amengual, J.M. Benedí, F. Casacuberta, A. Castaño,
   A. Castellanos, V. Jiménez, D. Llorens, A. Marzal, M. Pastor,
   F. Prat, E. Vidal, and J.M. Vilar: "The EuTrans-I Speech
   Translation System". Machine Translation, Vol. 15, pp. 75-103,
   2001.

   F. Casacuberta, H. Ney, F.J. Och, J.M. Vilar, E. Vidal,
   S. Barrachina, I. García-Varea, C. Martínez, D. Llorens, S. Molau,
   F. Nevado, M. Pastor, D. Picó, and A. Sanchís: "Some approaches to
   statistical and finite-state speech-to-speech translation".
   Computer Speech and Language, Vol. 18, pp. 25-47, 2004.

   F. Casacuberta, E. Vidal, and D. Picó: "Inference of finite-state
   transducers from regular languages". Pattern Recognition, Vol. 38,
   pp. 1431-1443, 2005.

   F. Casacuberta and E. Vidal: "Learning finite-state models for
   machine translation". Machine Learning, Vol. 66(1), pp. 69-91, 2007.

Files

Files (267.0 kB)

md5:46c4d75b2f6d3d8cfecc84fe6f19f684
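
The MD5 checksum listed above can be used to confirm that a downloaded copy of the corpus is intact. The following is a minimal sketch in Python, not an official tool. The function names md5_of_file and verify_download are illustrative, and the path to the downloaded file is an assumption to be supplied by the user, since the file name is not given in this record.

```python
import hashlib

# Checksum published in this record for the corpus file.
EXPECTED_MD5 = "46c4d75b2f6d3d8cfecc84fe6f19f684"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`.

    The file is read in chunks so that large files do not need to
    fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches `expected`.

    The comparison ignores letter case in the expected digest.
    """
    return md5_of_file(path) == expected.lower()
```

Calling verify_download with the local path of the downloaded file returns True when the digest matches the published value. A result of False suggests an incomplete or altered download.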
