Multitask Learning For Different Subword Segmentations In Neural Machine Translation

Srinivasan, Tejas; Sanabria, Ramon; Metze, Florian

doi:10.5281/zenodo.3524988

Published November 2, 2019 | Version v1

Conference paper | Open access

1. Language Technologies Institute, Carnegie Mellon University, USA

In Neural Machine Translation (NMT) the usage of subwords and characters as source and target units offers a simple and flexible solution for translation of rare and unseen words. However, selecting the optimal subword segmentation involves a trade-off between expressiveness and flexibility, and is language and dataset-dependent. We present Block Multitask Learning (BMTL), a novel NMT architecture that predicts multiple targets of different granularities simultaneously, removing the need to search for the optimal segmentation strategy. Our multi-task model exhibits improvements of up to 1.7 BLEU points on each decoder over single-task baseline models with the same number of parameters on datasets from two language pairs of IWSLT15 and one from IWSLT19. The multiple hypotheses generated at different granularities can be combined as a post-processing step to give better translations, which improves over hypothesis combination from baseline models while using substantially fewer parameters.
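To make "targets of different granularities" concrete, the sketch below segments one target sentence at three levels: characters, toy subwords, and whole words. It is an illustration only and is not taken from the paper: the greedy longest-match splitter, the `TOY_VOCAB` set, the `_` space marker, and the `@@` continuation marker (borrowed from a common BPE output convention) are all assumptions chosen for the example, and a real system would learn its subword vocabulary from data.

```python
def char_segment(sentence):
    """Character-level units; spaces become an explicit "_" boundary marker."""
    return ["_" if ch == " " else ch for ch in sentence]


def greedy_subword_segment(word, vocab):
    """Greedy longest-match split of one word against a toy vocabulary.

    Hypothetical stand-in for a learned subword model; unknown spans fall
    back to single characters so every word can always be segmented.
    """
    pieces, start = [], 0
    while start < len(word):
        # Try the longest candidate first; a single character always matches.
        for end in range(len(word), start, -1):
            piece = word[start:end]
            if piece in vocab or end == start + 1:
                pieces.append(piece)
                start = end
                break
    return pieces


def subword_segment(sentence, vocab):
    """Subword-level units; non-final pieces of a word carry an "@@" marker."""
    units = []
    for word in sentence.split():
        pieces = greedy_subword_segment(word, vocab)
        units.extend(piece + "@@" for piece in pieces[:-1])
        units.append(pieces[-1])
    return units


# Toy vocabulary, invented for this example.
TOY_VOCAB = {"trans", "lat", "ion", "un", "seen", "word", "s"}

sentence = "translation unseen words"
targets = {
    "char": char_segment(sentence),
    "subword": subword_segment(sentence, TOY_VOCAB),
    "word": sentence.split(),
}

for granularity, units in targets.items():
    print(f"{granularity:8s} ({len(units):2d} units): {' '.join(units)}")
```

The finer the segmentation, the longer the target sequence and the smaller the vocabulary, which is the expressiveness versus flexibility trade-off the abstract refers to. A multitask model of the kind described would be trained to predict several such target sequences for the same source sentence.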

Files

Name	Size
IWSLT2019_paper_13.pdf (md5:fe1bf170efecc844b7dbee0fd4508755)	847.4 kB
