Towards ELTeC-LLOD: European Literary Text Collection Linguistic Linked Open Data
Authors/Creators
Description
This paper describes a case study on generating Linked Data text corpora using the NLP Interchange Format (NIF). The research is based on a subset of the ELTeC corpus consisting of 900 novels from the period 1840-1920 in 9 European languages. The annotated version of the novels, in the so-called TEI level-2 format, was transformed into NIF, an RDF/OWL-based format that aims to achieve interoperability between NLP tools, language resources, and annotations. In this paper, we present our transformation approach and the implemented pipeline, and offer the code and results for reuse in similar use cases.
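To make the kind of transformation described above more concrete, the following is a minimal sketch, not the pipeline from the paper. It assumes that tokens are encoded as TEI `<w>` elements carrying `lemma` and `pos` attributes, that the context string can be rebuilt by joining tokens with single spaces, and that `http://example.org/novel1` is a placeholder base URI. The function name `tei_sentence_to_nif` and the sample sentence are hypothetical. Each token becomes a `nif:Word` addressed by a `#char=begin,end` offset fragment and linked to a `nif:Context` that holds the full string.

```python
import xml.etree.ElementTree as ET

TEI_NS = "{http://www.tei-c.org/ns/1.0}"
NIF = "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#"
XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical sample; the <w lemma pos> encoding is an assumption here.
SAMPLE = """<s xmlns="http://www.tei-c.org/ns/1.0">
  <w lemma="the" pos="DET">The</w>
  <w lemma="novel" pos="NOUN">novels</w>
  <w lemma="be" pos="AUX">were</w>
  <w lemma="annotate" pos="VERB">annotated</w>
</s>"""


def escape(value):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def tei_sentence_to_nif(xml_text, base_uri):
    """Serialize one annotated sentence as NIF triples in Turtle syntax."""
    root = ET.fromstring(xml_text)
    words = list(root.iter(TEI_NS + "w"))
    tokens = [(w.text or "").strip() for w in words]

    # Assumption: tokens are separated by exactly one space in the context.
    text = " ".join(tokens)
    context = f"<{base_uri}#char=0,{len(text)}>"

    lines = [
        f"@prefix nif: <{NIF}> .",
        f"@prefix xsd: <{XSD}> .",
        "",
        f"{context} a nif:Context ;",
        f'    nif:isString "{escape(text)}" ;',
        '    nif:beginIndex "0"^^xsd:nonNegativeInteger ;',
        f'    nif:endIndex "{len(text)}"^^xsd:nonNegativeInteger .',
    ]

    offset = 0
    for w, token in zip(words, tokens):
        begin, end = offset, offset + len(token)
        lemma = escape(w.get("lemma", ""))
        pos = escape(w.get("pos", ""))
        lines += [
            "",
            f"<{base_uri}#char={begin},{end}> a nif:Word ;",
            f"    nif:referenceContext {context} ;",
            f'    nif:anchorOf "{escape(token)}" ;',
            f'    nif:beginIndex "{begin}"^^xsd:nonNegativeInteger ;',
            f'    nif:endIndex "{end}"^^xsd:nonNegativeInteger ;',
            f'    nif:lemma "{lemma}" ;',
            f'    nif:posTag "{pos}" .',
        ]
        offset = end + 1  # skip the single separating space

    return "\n".join(lines)


if __name__ == "__main__":
    print(tei_sentence_to_nif(SAMPLE, "http://example.org/novel1"))
```

A real conversion would take the context string from the source text rather than rebuilding it from tokens, and would more likely use an RDF library than string formatting. The sketch only illustrates how offset-based NIF URIs relate tokens to their context.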
Files
| Name | Size | Checksum |
|---|---|---|
| 2023.ldk-1.16.pdf | 407.3 kB | md5:fad65ef8ab44d410cc779f03d3415402 |