Semantically tagged Europarl-it.v7
Description
More than 54.2 million lines
Lexical coverage of the tagging: 94.06%
No semantic disambiguation: all candidate tags are listed for each word
POS tagging for the semantic tagging was performed with TreeTagger: https://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/
Output is given in base form
Documentation of the semantic tagger (written for Finnish, but the same principles hold for Italian):
https://www.aclweb.org/anthology/W19-0306/
https://zenodo.org/record/3676372#.YFNwIa8zY2w
Semantic tagging
The data was tagged in the Puhti computing environment of CSC – IT Center for Science Ltd. https://research.csc.fi/-/puhti
Format: one token per line, given as base form, POS tag, semantic tag(s)
Unknown words are marked with the tag Z99
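
The format above can be read with a few lines of Python. This is a minimal sketch, not part of the dataset's tooling: it assumes the fields are separated by whitespace, that the base form itself contains no spaces, and that everything after the POS tag is a semantic tag. The names `TaggedToken`, `parse_line` and `is_unknown` are hypothetical.

```python
from typing import NamedTuple, Tuple


class TaggedToken(NamedTuple):
    lemma: str                 # base form of the word
    pos: str                   # POS tag assigned before semantic tagging
    semtags: Tuple[str, ...]   # zero or more semantic tags, all candidates kept


def parse_line(line: str) -> TaggedToken:
    """Split one line into base form, POS tag and semantic tags.

    Assumption: fields are whitespace-delimited. Lines such as
    "PON PUNCT" carry no semantic tag, so the tag tuple may be empty.
    """
    fields = line.split()
    if len(fields) < 2:
        raise ValueError(f"expected at least a base form and a POS tag: {line!r}")
    return TaggedToken(fields[0], fields[1], tuple(fields[2:]))


def is_unknown(token: TaggedToken) -> bool:
    """Return True if the token carries the unknown-word tag Z99."""
    return "Z99" in token.semtags
```

Because no disambiguation is applied, a consumer of the data has to decide how to treat tokens with several tags, for example by keeping only the first tag or by counting all of them.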
Example output
ripresa noun I1.1+
del art Z5
sessione noun Q3 T1.3
dichiarare verb Q2.2
riprendere verb M2
IL noun Z5
sessione noun Q3 T1.3
del art Z5
parlamento noun G2.1
europeo adj Z2
PON PUNCT
interrompere verb T2-
Venerdì abr T1.3
17 NUMB
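
As a sketch of how such output might be summarized, the snippet below counts semantic tags and the share of tagged tokens that are not marked Z99. The record does not say how the 94.06% coverage figure was computed, so the definition used here (tokens without any semantic tag, such as PUNCT and NUMB lines, are skipped) is an assumption, and `tag_statistics` is a hypothetical helper.

```python
from collections import Counter
from typing import Iterable, Tuple


def tag_statistics(lines: Iterable[str]) -> Tuple[Counter, float]:
    """Count semantic tags and estimate coverage over whitespace-delimited lines.

    Assumed coverage definition: tokens with at least one semantic tag and
    no Z99 tag, divided by all tokens with at least one semantic tag.
    """
    tag_counts: Counter = Counter()
    tagged = 0
    known = 0
    for line in lines:
        tags = line.split()[2:]   # everything after base form and POS tag
        if not tags:
            continue              # blank line or token without a semantic tag
        tagged += 1
        tag_counts.update(tags)   # ambiguous tokens contribute every candidate
        if "Z99" not in tags:
            known += 1
    coverage = known / tagged if tagged else 0.0
    return tag_counts, coverage


# The first lines of the example output, copied verbatim.
sample = [
    "ripresa noun I1.1+",
    "del art Z5",
    "sessione noun Q3 T1.3",
    "dichiarare verb Q2.2",
    "PON PUNCT",
    "17 NUMB",
]
counts, coverage = tag_statistics(sample)
```

Passing an open file object instead of a list streams the corpus line by line, which matters for a file of this size.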
Files
Size: 902.3 MB
md5:f6164577e3440a66891e21d9f647212e