Conference paper Open Access
Lakew, Surafel M.; Karakanta, Alina; Federico, Marcello; Negri, Matteo; Turchi, Marco
<?xml version='1.0' encoding='utf-8'?> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:adms="http://www.w3.org/ns/adms#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dct="http://purl.org/dc/terms/" xmlns:dctype="http://purl.org/dc/dcmitype/" xmlns:dcat="http://www.w3.org/ns/dcat#" xmlns:duv="http://www.w3.org/ns/duv#" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:frapo="http://purl.org/cerif/frapo/" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:gsp="http://www.opengis.net/ont/geosparql#" xmlns:locn="http://www.w3.org/ns/locn#" xmlns:org="http://www.w3.org/ns/org#" xmlns:owl="http://www.w3.org/2002/07/owl#" xmlns:prov="http://www.w3.org/ns/prov#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:schema="http://schema.org/" xmlns:skos="http://www.w3.org/2004/02/skos/core#" xmlns:vcard="http://www.w3.org/2006/vcard/ns#" xmlns:wdrs="http://www.w3.org/2007/05/powder-s#"> <rdf:Description rdf:about="https://doi.org/10.5281/zenodo.3525486"> <dct:identifier rdf:datatype="http://www.w3.org/2001/XMLSchema#anyURI">https://doi.org/10.5281/zenodo.3525486</dct:identifier> <foaf:page rdf:resource="https://doi.org/10.5281/zenodo.3525486"/> <dct:creator> <rdf:Description> <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Agent"/> <foaf:name>Lakew, Surafel M.</foaf:name> <foaf:givenName>Surafel M.</foaf:givenName> <foaf:familyName>Lakew</foaf:familyName> <org:memberOf> <foaf:Organization> <foaf:name>University of Trento &amp; Fondazione Bruno Kessler, Trento, Italy</foaf:name> </foaf:Organization> </org:memberOf> </rdf:Description> </dct:creator> <dct:creator> <rdf:Description> <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Agent"/> <foaf:name>Karakanta, Alina</foaf:name> <foaf:givenName>Alina</foaf:givenName> <foaf:familyName>Karakanta</foaf:familyName> <org:memberOf> <foaf:Organization> <foaf:name>University of Trento &amp; Fondazione Bruno Kessler, Trento, Italy</foaf:name> </foaf:Organization> </org:memberOf> </rdf:Description> 
</dct:creator> <dct:creator> <rdf:Description> <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Agent"/> <foaf:name>Federico, Marcello</foaf:name> <foaf:givenName>Marcello</foaf:givenName> <foaf:familyName>Federico</foaf:familyName> <org:memberOf> <foaf:Organization> <foaf:name>Fondazione Bruno Kessler, Trento, Italy</foaf:name> </foaf:Organization> </org:memberOf> </rdf:Description> </dct:creator> <dct:creator> <rdf:Description> <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Agent"/> <foaf:name>Negri, Matteo</foaf:name> <foaf:givenName>Matteo</foaf:givenName> <foaf:familyName>Negri</foaf:familyName> <org:memberOf> <foaf:Organization> <foaf:name>Fondazione Bruno Kessler, Trento, Italy</foaf:name> </foaf:Organization> </org:memberOf> </rdf:Description> </dct:creator> <dct:creator> <rdf:Description> <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Agent"/> <foaf:name>Turchi, Marco</foaf:name> <foaf:givenName>Marco</foaf:givenName> <foaf:familyName>Turchi</foaf:familyName> <org:memberOf> <foaf:Organization> <foaf:name>Fondazione Bruno Kessler, Trento, Italy</foaf:name> </foaf:Organization> </org:memberOf> </rdf:Description> </dct:creator> <dct:title>Adapting Multilingual Neural Machine Translation to Unseen Languages</dct:title> <dct:publisher> <foaf:Agent> <foaf:name>Zenodo</foaf:name> </foaf:Agent> </dct:publisher> <dct:issued rdf:datatype="http://www.w3.org/2001/XMLSchema#gYear">2019</dct:issued> <dct:issued rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2019-11-02</dct:issued> <dct:language rdf:resource="http://publications.europa.eu/resource/authority/language/ENG"/> <owl:sameAs rdf:resource="https://zenodo.org/record/3525486"/> <adms:identifier> <adms:Identifier> <skos:notation rdf:datatype="http://www.w3.org/2001/XMLSchema#anyURI">https://zenodo.org/record/3525486</skos:notation> <adms:schemeAgency>url</adms:schemeAgency> </adms:Identifier> </adms:identifier> <dct:isVersionOf rdf:resource="https://doi.org/10.5281/zenodo.3525485"/> <dct:isPartOf 
rdf:resource="https://zenodo.org/communities/iwslt2019"/> <dct:description><p>Multilingual Neural Machine Translation (MNMT) for low-resource languages (LRL) can be enhanced by the presence of related high-resource languages (HRL), but the relatedness of HRL usually relies on predefined linguistic assumptions about language similarity. Recently, adapting MNMT to an LRL has been shown to greatly improve performance. In this work, we explore the problem of adapting an MNMT model to an unseen LRL using data selection and model adaptation. In order to improve NMT for LRL, we employ perplexity to select HRL data that are most similar to the LRL on the basis of language distance. We extensively explore data selection in popular multilingual NMT settings, namely in (zero-shot) translation, and in adaptation from a multilingual pre-trained model, for both directions (LRL↔en). We further show that dynamic adaptation of the model’s vocabulary results in a more favourable segmentation for the LRL in comparison with direct adaptation. Experiments show reductions in training time and significant performance gains over LRL baselines, even with zero LRL data (+13.0 BLEU), and up to +17.0 BLEU for pre-trained multilingual model dynamic adaptation with related data selection. 
Our method outperforms current approaches, such as massively multilingual models and data augmentation, on four LRL.</p></dct:description> <dct:accessRights rdf:resource="http://publications.europa.eu/resource/authority/access-right/PUBLIC"/> <dct:accessRights> <dct:RightsStatement rdf:about="info:eu-repo/semantics/openAccess"> <rdfs:label>Open Access</rdfs:label> </dct:RightsStatement> </dct:accessRights> <dct:license rdf:resource="https://creativecommons.org/licenses/by/4.0/legalcode"/> <dcat:distribution> <dcat:Distribution> <dcat:accessURL rdf:resource="https://doi.org/10.5281/zenodo.3525486"/> <dcat:byteSize>235536</dcat:byteSize> <dcat:downloadURL rdf:resource="https://zenodo.org/record/3525486/files/IWSLT2019_paper_27.pdf"/> <dcat:mediaType>application/pdf</dcat:mediaType> </dcat:Distribution> </dcat:distribution> </rdf:Description> </rdf:RDF>
Metric | All versions | This version |
---|---|---|
Views | 176 | 176 |
Downloads | 113 | 113 |
Data volume | 26.6 MB | 26.6 MB |
Unique views | 152 | 152 |
Unique downloads | 107 | 107 |