Published April 21, 2021 | Version v0.1.2
Software Open

TRUNAJOD: A text complexity library to enhance natural language processing

  • 1. University of Concepción

Description

We present TRUNAJOD, a text complexity analysis tool that extracts a wide variety of linguistic measurements from text as proxies for readability, coherence, and cohesion. The features that TRUNAJOD extracts are grounded in the literature and fall into categories that include discourse markers, emotions, entity grid-based measurements, givenness, lexical-semantic norms, semantic measures, and surface proxies. This first version of TRUNAJOD mainly supports Spanish, but several features work for any language with adequate natural language processing tools for part-of-speech (POS) tagging and dependency parsing. Finally, we show how TRUNAJOD can be used in applied research.
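To give a rough sense of what a "surface proxy" can look like, the sketch below computes a few simple text statistics in plain Python. It is only an illustration under stated assumptions: the function name `surface_proxies`, the regex-based sentence and word splitting, and the choice of metrics are hypothetical and are not taken from TRUNAJOD's API, which relies on a proper NLP pipeline for tagging and parsing.

```python
import re


def surface_proxies(text: str) -> dict:
    """Compute a few illustrative surface-level complexity proxies.

    Hypothetical helper for illustration only; not part of TRUNAJOD.
    """
    # Naive sentence split on runs of ., ! or ? (a real tool would use a parser).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Unicode-aware word tokens, lowercased so "El" and "el" count as one type.
    words = re.findall(r"\w+", text.lower())
    if not sentences or not words:
        return {
            "sentence_count": 0,
            "word_count": 0,
            "avg_sentence_length": 0.0,
            "type_token_ratio": 0.0,
        }
    return {
        "sentence_count": len(sentences),
        "word_count": len(words),
        # Mean number of words per sentence.
        "avg_sentence_length": len(words) / len(sentences),
        # Distinct words divided by total words (lexical diversity).
        "type_token_ratio": len(set(words)) / len(words),
    }


sample = "El gato duerme. El perro ladra y el gato despierta."
metrics = surface_proxies(sample)
print(metrics)
```

Longer average sentences and a higher type-token ratio are commonly read as signs of more demanding text, although such counts are only coarse approximations of readability.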

Notes

https://github.com/dpalmasan/TRUNAJOD2.0

Files

Files (23.7 MB)

md5:c50f02cfe58963277accb2e51d199d0f
23.7 MB