Published May 14, 2021 | Version 2.0.0
Dataset
Open
AnCora Catalan 2.0.0
Description
AnCora Catalan 2.0.0 consists of 500,000 words. The corpus is annotated at different levels:
- Lemma and Part of Speech
- Syntactic constituents and functions
- Argument structure and thematic roles
- Semantic classes of the verb
- Nouns related to WordNet synsets
- Named Entities
- Coreference relations
AnCora Catalan 2.0.0 is mainly based on journalistic texts. For more information, see AnCora-corpus.
The annotators of AnCora Catalan 2.0.0 are:
Oriol Borrega, Isabel Briz, Núria Bufí, Montserrat Civit, María Jesús Díaz, Silvia Garcia, Raquel Hernández, Marina Lloberes, Raquel Marcos, Difda Monterde, Montserrat Nofre, Aina Peris, Lourdes Puiggròs, Marta Recasens, Bàrbara Soriano, Rita Zaragoza.
Files
Name | Size | Checksum |
---|---|---|
AnCora Catalan 2.0.0.zip | 11.7 MB | md5:86cb31b3cd56a261e8b4be836756400a |
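The MD5 checksum listed for the archive can be used to confirm that a download is intact. The following is a minimal sketch, assuming the archive has been saved locally under the name shown in the file listing; the function names `md5_of_file` and `verify_archive` and the local path are illustrative assumptions, not part of the dataset or its tooling.

```python
import hashlib
from pathlib import Path

# Checksum as published in the file listing for the archive.
EXPECTED_MD5 = "86cb31b3cd56a261e8b4be836756400a"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local path: adjust to wherever the archive was saved.
    archive = Path("AnCora Catalan 2.0.0.zip")
    if archive.exists():
        print("checksum OK" if verify_archive(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found")
```

A mismatch usually indicates an incomplete or corrupted download, in which case fetching the archive again is the simplest remedy.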