CLS INFRA D8.3 Report on Applied NLP Named Entity Recognition
Authors/Creators
Description
The following report documents the work of Work Package 8 in the CLS Infrastructure Project. The general goals of this work package are to increase the ease of access to and application of NLP tools, including for less-well-resourced languages, as well as their standardization. The report is organized as follows: a general explanation of named entity recognition tasks, their technical boundaries, the challenges they pose for literary scholars (and/or those working with unstructured texts), and the tools proposed for these tasks. These range from a machine learning pipeline for automatically extracting pre-defined mentions of known objects, such as people, places or organizations, to generative AI solutions, in multiple languages and applicable to a wide set of scholars. The tools integrate work from both WP 6 and WP 7, which facilitates integration of the pipeline from data preparation and programmable corpora to analysis and back.
This research was conducted within the framework of the Computational Literary Studies Infrastructure project (CLS INFRA, https://clsinfra.io/), which aims to build a shared and sustainable infrastructure for literary studies with digital tools. The project is funded by the European Union’s Horizon 2020 research and innovation programme under grant agreement No 101004984.
Files
| Name | Size | Checksum |
|---|---|---|
| D8.3 Report on NLP for NER.pdf | 1.4 MB | md5:97a95df2febc5607e20610ab2c0b3c62 |
Additional details
Software
- Repository URL: https://github.com/GhentCDH/CLSinfra/tree/main