Published January 24, 2019 | Version v1
Dataset Open

Simple Italian sentences ranked by readability

  • 1. FBK

Description

The dataset contains 500,000 sentences extracted from the Paisà corpus (https://www.corpusitaliano.it/), selected as easy to read according to four parameters: number of tokens, average word length, depth of the parse tree, and verb "arity". The sentences are ranked by readability.
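To make the selection parameters more concrete, here is a minimal sketch that computes two of the four (number of tokens and average word length) and orders sentences by them. It is an illustration only: the whitespace tokenization, the helper names `surface_features` and `rank_by_simplicity`, and the sort order are assumptions, not the procedure used to build the dataset. Parse tree depth and verb "arity" need a syntactic parser and are left out.

```python
def surface_features(sentence: str) -> tuple[int, float]:
    """Return (token count, average word length) for one sentence.

    Assumption: tokens are whitespace-separated; surrounding punctuation
    is stripped before measuring word length.
    """
    tokens = sentence.split()
    words = [t.strip(".,;:!?\"'()") for t in tokens]
    words = [w for w in words if w]
    avg_len = sum(len(w) for w in words) / len(words) if words else 0.0
    return len(tokens), avg_len


def rank_by_simplicity(sentences: list[str]) -> list[str]:
    # Hypothetical ordering: fewer tokens first, ties broken by
    # shorter average word length. The dataset's real ranking is not
    # specified on this page.
    return sorted(sentences, key=surface_features)


examples = [
    "Il gatto dorme.",
    "L'amministrazione comunale ha pubblicato il regolamento aggiornato.",
    "Oggi piove.",
]
ranked = rank_by_simplicity(examples)
for sentence in ranked:
    print(surface_features(sentence), sentence)
```

A lexicographic sort on the feature tuple keeps the sketch short; a weighted score over all four parameters would be an equally plausible design.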

Files

Files (27.0 MB)

Name: IT-simple-monolingual.txt
Size: 27.0 MB
md5: e352f51f8d6032176c5b0f4402f7e446
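The md5 checksum listed for the file can be used to confirm that a download is intact. The sketch below assumes the file was saved locally as `IT-simple-monolingual.txt` in the working directory; the helper names `md5_of_file` and `verify_download` are hypothetical.

```python
import hashlib
from pathlib import Path

# Checksum as listed on the record page.
EXPECTED_MD5 = "e352f51f8d6032176c5b0f4402f7e446"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's md5 matches the expected digest."""
    return md5_of_file(path) == expected.lower()


# Assumed local path; adjust to wherever the file was saved.
local_copy = Path("IT-simple-monolingual.txt")
if local_copy.exists():
    print("checksum OK" if verify_download(local_copy) else "checksum mismatch")
```

Reading in chunks avoids loading the whole 27.0 MB file into memory at once, although at this size either approach would work.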

Additional details

Funding

European Commission: SIMPATICO – SIMplifying the interaction with Public Administration Through Information technology for Citizens and cOmpanies (grant 692819)