Dataset Open Access

Catalan United Nations v1.0 test set

Marta R. Costa-jussà


DataCite XML Export

<?xml version='1.0' encoding='utf-8'?>
<resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://datacite.org/schema/kernel-4" xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4.1/metadata.xsd">
  <identifier identifierType="DOI">10.5281/zenodo.3888414</identifier>
  <creators>
    <creator>
      <creatorName>Marta R. Costa-jussà</creatorName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="http://orcid.org/">0000-0002-5703-520X</nameIdentifier>
      <affiliation>Universitat Politècnica de Catalunya</affiliation>
    </creator>
  </creators>
  <titles>
    <title>Catalan United Nations v1.0 test set</title>
  </titles>
  <publisher>Zenodo</publisher>
  <publicationYear>2020</publicationYear>
  <subjects>
    <subject>Multilingual Parallel Data</subject>
    <subject>Benchmark</subject>
    <subject>Catalan</subject>
    <subject>United Nations</subject>
  </subjects>
  <dates>
    <date dateType="Issued">2020-06-10</date>
  </dates>
  <language>ca</language>
  <resourceType resourceTypeGeneral="Dataset"/>
  <alternateIdentifiers>
    <alternateIdentifier alternateIdentifierType="url">https://zenodo.org/record/3888414</alternateIdentifier>
  </alternateIdentifiers>
  <relatedIdentifiers>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsCompiledBy" resourceTypeGeneral="Text">10.1145/3312575</relatedIdentifier>
    <relatedIdentifier relatedIdentifierType="URL" relationType="Cites" resourceTypeGeneral="Text">https://www.aclweb.org/anthology/L16-1561</relatedIdentifier>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsVersionOf">10.5281/zenodo.3888413</relatedIdentifier>
  </relatedIdentifiers>
  <rightsList>
    <rights rightsURI="https://creativecommons.org/licenses/by/4.0/legalcode">Creative Commons Attribution 4.0 International</rights>
    <rights rightsURI="info:eu-repo/semantics/openAccess">Open Access</rights>
  </rightsList>
  <descriptions>
    <description descriptionType="Abstract">&lt;p&gt;Catalan version [1] of the test set from the United Nations Parallel Corpus v1.0 [2]. The translation was performed in two steps: the Spanish version of the test set was first machine-translated into Catalan, and a professional translator then post-edited the output.&lt;/p&gt;

&lt;p&gt;[1] Marta R. Costa-Juss&amp;agrave;, No&amp;eacute; Casas, Carlos Escolano, and Jos&amp;eacute; A. R. Fonollosa. 2019. Chinese-Catalan: A Neural Machine Translation Approach Based on Pivoting and Attention Mechanisms. &lt;em&gt;ACM Trans. Asian Low-Resour. Lang. Inf. Process.&lt;/em&gt; 18, 4, Article 43 (August 2019), 8 pages. DOI:https://doi.org/10.1145/3312575&lt;/p&gt;

&lt;p&gt;[2] Michal Ziemski, Marcin Junczys-Dowmunt, and Bruno Pouliquen. 2016. The United Nations parallel corpus v1.0. In Proceedings of the LREC, 2016&lt;/p&gt;</description>
    <description descriptionType="Other">This work is supported by the Spanish Ministerio de Economía y Competitividad and the European Regional Development Fund, through the postdoctoral senior grant Ramón y Cajal.</description>
    <description descriptionType="Other">{"references": ["Costa-juss\u00e0, M.R., Casas, N., Escolano, C. and Fonollosa, J.A.R., Chinese-Catalan: A Neural Machine Translation Approach based on Pivoting and Attention Mechanisms, ACM Transactions on Asian and Low-Resource Language Information Processing, Vol 18, No 4, Art. 43, 2019", "Michal Ziemski, Marcin Junczys-Dowmunt, and Bruno Pouliquen. 2016. The United Nations parallel corpus v1.0. In Proceedings of the LREC, 2016"]}</description>
  </descriptions>
</resource>
