Published August 20, 2018 | Version v1
Conference paper Open

Local String Transduction as Sequence Labeling

  • 1. University of Edinburgh
  • 2. dMetrics

Description

We show that the general problem of string transduction can be reduced to the problem of sequence labeling. String transduction allows character deletions and insertions, whereas sequence labeling assigns exactly one label to each input position; we show how to overcome this difference. Our approach can be used with any sequence labeling algorithm, and it works best for problems in which string transduction imposes a strong notion of locality (no long-range dependencies). We experiment with spelling correction for social media, OCR correction, and morphological inflection, and we find that it performs better than seq2seq models and yields state-of-the-art results in several cases.
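The description does not spell out how insertions and deletions are encoded as labels, so the following is only a minimal sketch of one possible reduction, not the paper's method. It assumes that each source character (plus a hypothetical start sentinel `BOS`) is labeled with the output string it produces: an empty label acts as a deletion, and inserted characters are attached to the label of the preceding position. The names `encode`, `decode`, and `BOS` are illustrative, and the alignment comes from Python's standard `difflib.SequenceMatcher`, which is an assumption of this sketch.

```python
from difflib import SequenceMatcher

BOS = "^"  # hypothetical start-of-string sentinel; absorbs insertions at position 0


def encode(src: str, tgt: str) -> list[str]:
    """Return one label per character of BOS + src.

    Each label is the output string emitted at that position
    ("" means the source character is deleted).
    """
    labels = [""] * (len(src) + 1)  # labels[0] belongs to BOS
    matcher = SequenceMatcher(None, src, tgt, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        chunk = tgt[j1:j2]
        if tag == "insert":
            # No source character is consumed: attach the inserted text
            # to the previous position (padded index i1).
            labels[i1] += chunk
        elif tag == "delete":
            # Deleted source characters keep the empty label.
            continue
        else:  # "equal" or "replace"
            n = i2 - i1
            for k in range(n):
                # One target character per source character; the last
                # source character of the span takes any remainder.
                piece = chunk[k:k + 1] if k < n - 1 else chunk[k:]
                labels[i1 + k + 1] += piece
    return labels


def decode(labels: list[str]) -> str:
    """Invert encode(): concatenating the labels yields the target string."""
    return "".join(labels)


if __name__ == "__main__":
    labels = encode("teh", "the")
    print(list(zip(BOS + "teh", labels)))
    print(decode(labels))
```

Under this encoding the label sequence always has length `len(src) + 1`, so any sequence labeler could in principle be trained to predict it. A practical system would also need to bound the label vocabulary, which this sketch does not attempt.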

Files

C18-1115.pdf (225.9 kB)
md5:645782b62f380d38c82fc26842cc20bc

Additional details

Funding

European Commission
SUMMA - Scalable Understanding of Multilingual Media (grant 688139)