Conference paper Open Access

SIMPITIKI: A Simplification Corpus for Italian

Tonelli, Sara; Palmero Aprosio, Alessio; Saltori, Francesca


JSON-LD (schema.org) Export

{
  "description": "<p>In this work, we analyse whether Wikipedia can be used to leverage simplification pairs instead of Simple Wikipedia, which has proved unreliable for assessing automatic simplification systems and is available only in English. We focus on sentence pairs in which the target sentence is the outcome of a Wikipedia edit marked as ‘simplified’, and manually annotate simplification phenomena following an existing scheme proposed for previous simplification corpora in Italian. The outcome of this work is the SIMPITIKI corpus, which we make freely available, with pairs of sentences extracted from Wikipedia edits and annotated with simplification types. The resource also contains another corpus with roughly the same number of simplifications, which was manually created by simplifying documents in the administrative domain.</p>", 
  "license": "https://creativecommons.org/licenses/by/4.0/legalcode", 
  "creator": [
    {
      "affiliation": "Fondazione Bruno Kessler", 
      "@type": "Person", 
      "name": "Tonelli, Sara"
    }, 
    {
      "affiliation": "Fondazione Bruno Kessler", 
      "@type": "Person", 
      "name": "Palmero Aprosio, Alessio"
    }, 
    {
      "affiliation": "Fondazione Bruno Kessler", 
      "@type": "Person", 
      "name": "Saltori, Francesca"
    }
  ], 
  "headline": "SIMPITIKI: A Simplification Corpus for Italian", 
  "image": "https://zenodo.org/static/img/logos/zenodo-gradient-round.svg", 
  "datePublished": "2019-01-08", 
  "url": "https://zenodo.org/record/2534132", 
  "version": "2.", 
  "@type": "ScholarlyArticle", 
  "@context": "https://schema.org/", 
  "identifier": "https://doi.org/10.5281/zenodo.2534132", 
  "@id": "https://doi.org/10.5281/zenodo.2534132", 
  "workFeatured": {
    "alternateName": "CLIC-it", 
    "location": "Naples, Italy", 
    "@type": "Event", 
    "name": "Third Italian Conference on Computational Linguistics"
  }, 
  "name": "SIMPITIKI: A Simplification Corpus for Italian"
}
