TRUNAJOD: A text complexity library to enhance natural language processing
- 1. University of Concepción
Description
We present TRUNAJOD, a text complexity analysis tool that includes a wide variety of linguistic measurements that can be extracted from texts as approximations of readability, coherence, and cohesion. The features that TRUNAJOD can extract from a text are based on the literature and fall into the following categories: discourse markers, emotions, entity grid-based measurements, givenness, lexical-semantic norms, semantic measures, and surface proxies, among others. This first version of TRUNAJOD mainly supports Spanish, but several features work for any language with suitable POS tagging and dependency parsing support. Finally, we show how TRUNAJOD could be used in applied research.
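To make the "surface proxies" category concrete, here is a minimal sketch of a few surface-level measurements (average sentence length, average word length, type-token ratio). It is an illustration only and does not use TRUNAJOD's API: the function name `surface_proxies`, the regex-based tokenization, and the choice of metrics are all assumptions made for this example.

```python
import re


def surface_proxies(text: str) -> dict:
    """Compute a few simple surface-level complexity proxies.

    Hypothetical helper for illustration; not part of TRUNAJOD.
    """
    # Naive sentence split on ., ! and ? (a real pipeline would use a parser).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Naive tokenization: runs of Unicode word characters, lowercased.
    words = re.findall(r"\w+", text.lower())

    if not sentences or not words:
        return {
            "sentence_count": 0,
            "word_count": 0,
            "avg_sentence_length": 0.0,
            "avg_word_length": 0.0,
            "type_token_ratio": 0.0,
        }

    return {
        "sentence_count": len(sentences),
        "word_count": len(words),
        # Mean number of words per sentence.
        "avg_sentence_length": len(words) / len(sentences),
        # Mean number of characters per word.
        "avg_word_length": sum(len(w) for w in words) / len(words),
        # Distinct words divided by total words (lexical diversity).
        "type_token_ratio": len(set(words)) / len(words),
    }


if __name__ == "__main__":
    sample = "El gato duerme. El perro corre en el parque."
    for name, value in surface_proxies(sample).items():
        print(f"{name}: {value}")
```

Proxies of this kind need no linguistic resources, which is why they transfer across languages more easily than measurements that depend on POS tags or dependency parses.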
Files
Checksum | Size |
---|---|
md5:c50f02cfe58963277accb2e51d199d0f | 23.7 MB |