Published May 14, 2021 | Version 2.0.0
Dataset Open

AnCora Catalan 2.0.0

  • 1. Universitat de Barcelona

Description

AnCora Catalan 2.0.0 consists of 500,000 words. The corpus is annotated at different levels:

  • Lemma and Part of Speech
  • Syntactic constituents and functions
  • Argument structure and thematic roles
  • Semantic classes of the verb
  • Nouns related to WordNet synsets
  • Named Entities
  • Coreference relations

AnCora Catalan 2.0.0 is mainly based on journalistic texts. For more information, see AnCora-corpus.

The annotators of AnCora Catalan 2.0.0 are:

Oriol Borrega, Isabel Briz, Núria Bufí, Montserrat Civit, María Jesús Díaz, Silvia Garcia, Raquel Hernández, Marina Lloberes, Raquel Marcos, Difda Monterde, Montserrat Nofre, Aina Peris, Lourdes Puiggròs, Marta Recasens, Bàrbara Soriano, Rita Zaragoza.

Files

Files (11.7 MB)

AnCora Catalan 2.0.0.zip (11.7 MB)
md5:86cb31b3cd56a261e8b4be836756400a
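
The MD5 checksum listed for the archive can be used to confirm that a download is intact. The following is a minimal sketch in Python, assuming the archive has been saved locally under its listed name, AnCora Catalan 2.0.0.zip. The function names md5_of_file and verify_archive are illustrative and are not part of any AnCora tooling.

```python
import hashlib
from pathlib import Path

# Checksum as published on this record page.
EXPECTED_MD5 = "86cb31b3cd56a261e8b4be836756400a"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so the whole archive is never held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local file name; adjust to wherever the archive was saved.
    archive = Path("AnCora Catalan 2.0.0.zip")
    if archive.exists():
        print("checksum OK" if verify_archive(archive) else "checksum mismatch")
    else:
        print(f"{archive} not found")
```

A mismatch usually points to an incomplete or corrupted download, in which case fetching the archive again is the simplest remedy.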