Published January 8, 2019 | Version v1
Dataset
Open
SIMPITIKI corpus for simplification in Italian
Description
SIMPITIKI is a simplification corpus for Italian. It consists of two sets of simplified sentence pairs: the first was harvested semi-automatically from the Italian Wikipedia; the second was manually annotated, sentence by sentence, from documents in the administrative domain.
For more details, see https://github.com/dhfbk/simpitiki
Files
Name | Size | Checksum |
---|---|---|
simpitiki.zip | 365.6 kB | md5:0925a605d41219bf6196696dba5ab147 |
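The MD5 checksum listed for simpitiki.zip can be used to confirm that a download is intact. The following is a minimal sketch, assuming the archive has already been saved locally; the helper names `md5_of_file` and `matches_checksum` are hypothetical and not part of the SIMPITIKI distribution.

```python
import hashlib

# Checksum as listed in the Files section of this record.
EXPECTED_MD5 = "md5:0925a605d41219bf6196696dba5ab147"


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected=EXPECTED_MD5):
    """Compare a file's MD5 with an expected value, with or without the 'md5:' prefix."""
    expected_hex = expected.lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]
    return md5_of_file(path) == expected_hex
```

Calling `matches_checksum("simpitiki.zip")` should return `True` for an unmodified copy of the archive, assuming it sits in the current working directory under that name.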
Additional details
References
- Sara Tonelli, Alessio Palmero Aprosio, Francesca Saltori. SIMPITIKI: a Simplification corpus for Italian extracted from Wikipedia. In Proceedings of the Third Italian Conference on Computational Linguistics, Naples, Italy.