Published March 1, 2018 | Version v1
Dataset
Open
Datasets of "An Automatically Generated Annotated Corpus for Albanian Named Entity Recognition"
Description
This is an automatically generated (silver-standard) annotated corpus for Albanian named entity recognition, built from Wikipedia and Wikidata. It is provided in the Apache OpenNLP annotation format.
The generation approach is described in the accompanying paper: https://doi.org/10.2478/cait-2018-0009
Also attached are the gazetteer of Albanian named entities, in JSON format, and the files used to generate it.
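For readers unfamiliar with the Apache OpenNLP name finder format mentioned above, the following is a minimal Python sketch of how one annotated sentence might be read. In that format each sentence sits on its own line, tokens are separated by whitespace, and each entity is wrapped in `<START:type> ... <END>` markers. The sample sentence and the labels `person` and `location` are invented for illustration; the labels actually used in this corpus should be checked against the files and the paper. The helper names are hypothetical.

```python
import re

# Matches one typed OpenNLP name-finder span: <START:type> tokens <END>
# (untyped <START> markers are not handled in this sketch).
SPAN_RE = re.compile(r"<START:(?P<type>[^>\s]+)>\s+(?P<text>.*?)\s+<END>")


def extract_entities(line):
    """Return (entity_type, surface_text) pairs found in one annotated sentence."""
    return [(m.group("type"), m.group("text")) for m in SPAN_RE.finditer(line)]


def strip_annotations(line):
    """Return the sentence with the span markers removed, keeping the tokens."""
    return SPAN_RE.sub(lambda m: m.group("text"), line)


# Invented sample line; the entity labels are assumptions, not taken from the corpus.
sample = (
    "<START:person> Ismail Kadare <END> lindi në "
    "<START:location> Gjirokastër <END> ."
)

print(extract_entities(sample))
print(strip_annotations(sample))
```

The same two helpers can be applied line by line to a corpus file opened with `encoding="utf-8"`, which matters for Albanian characters such as `ë` and `ç`.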
Files
albanian-ne-gazetteer.zip
Total size: 19.0 MB
Checksum | Size |
---|---|
md5:a02e364d6f7be6a71f9b365887b27226 | 370.1 kB |
md5:d2df897f047f8227246bee5fa862faad | 18.6 MB |
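The MD5 checksums listed above can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module; the function names are hypothetical, and since the listing does not say which file name belongs to which checksum, the pairing must be checked against the record itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a listed value such as 'md5:a02e...'."""
    # Drop an optional 'md5:' prefix and normalise case before comparing.
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listed_md5("downloaded-file.zip", "md5:d2df897f047f8227246bee5fa862faad")` would return `True` only if the local file (the path here is a placeholder) matches the second listed checksum.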
Additional details
Related works
- Is part of: Journal article, 10.2478/cait-2018-0009 (DOI)