Semantically tagged Europarl-cs.v7
Description
Semantically tagged Europarl-cs.v7
More than 14.9 million lines
Lexical coverage of the tagging: 83.90%
No semantic disambiguation: all candidate tags are listed
POS tagging for the semantic tagging performed with TreeTagger: https://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/
Output in base form
Documentation of the semantic tagger (written for Finnish, but the same principles hold for Czech):
https://www.aclweb.org/anthology/W19-0306/
https://zenodo.org/record/3676372#.YFNwIa8zY2w
Format: one token per line, as base form, POS tag, semantic tag(s)
Unknown words are marked with the tag Z99
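The line format described above can be read with a few lines of Python. The following is only a minimal sketch, not part of the dataset's tooling. It assumes that fields are separated by whitespace and that a semantic tag always starts with an uppercase letter followed by a digit (as in X4.2, T2+ or Z99), which is how the sketch tells a POS tag from a semantic tag on lines that carry only one of the two. The names `TaggedToken`, `SEMTAG_RE` and `parse_line` are hypothetical.

```python
import re
from typing import List, NamedTuple, Optional

# Assumption: semantic tags look like "X4.2", "T2+", "Z99", i.e. an
# uppercase letter immediately followed by a digit. POS tags in the
# example output ("N", "A", "PUNCT") do not match this pattern.
SEMTAG_RE = re.compile(r"^[A-Z]\d")


class TaggedToken(NamedTuple):
    base_form: str
    pos: Optional[str]      # None when the line carries no POS tag
    semtags: List[str]      # all candidate tags, not disambiguated


def parse_line(line: str) -> TaggedToken:
    """Split one line into base form, POS tag and semantic tags."""
    fields = line.split()
    if not fields:
        raise ValueError("empty line")
    base_form, rest = fields[0], fields[1:]
    # The second field is treated as a POS tag unless it looks like a
    # semantic tag.
    if rest and not SEMTAG_RE.match(rest[0]):
        return TaggedToken(base_form, rest[0], rest[1:])
    return TaggedToken(base_form, None, rest)


def is_unknown(token: TaggedToken) -> bool:
    """True if the only semantic tag is Z99 (unknown word)."""
    return token.semtags == ["Z99"]
```

For example, `parse_line("základ N A2.2 T2+ X4.2")` yields the base form `základ`, the POS tag `N` and three candidate semantic tags, and a punctuation line such as `: PUNCT` yields an empty tag list.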
Semantic tagging
Tagging of the data was performed in the Puhti computing environment of CSC – IT Center for Science Ltd. https://research.csc.fi/-/puhti
Example output
následný A N4
postup N X4.2
na R Z5
základ N A2.2 T2+ X4.2
usnesení N X6+ X9.2+
parlament# Z99
: PUNCT
viz V X3.4 X2.1 S1.1.1 X2.5+ X2.3+ X3 A7+ Z4 S3.2
zápis N M7 Q1.2 S7.3 M1 T2+
předložení N A9- A2.2 Q2.2 S1.1.3+ Q4.3 O4.1 K4
dokument N Q1.2 X2.2+ Y2
: PUNCT
viz V X3.4 X2.1 S1.1.1 X2.5+ X2.3+ X3 A7+ Z4 S3.2
zápis N M7 Q1.2 S7.3 M1 T2+
písemný A Q1.2
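The share of tokens that received a real semantic tag can be estimated from output like the above. The record does not state how the reported lexical coverage was computed, so the definition used here is an assumption: among tokens that carry at least one semantic tag, count those with any tag other than Z99. The function name `lexical_coverage` and the tag pattern are hypothetical; the sample is the first seven lines of the example output.

```python
import re

# Assumption: a semantic tag is an uppercase letter followed by a digit.
SEMTAG_RE = re.compile(r"^[A-Z]\d")


def lexical_coverage(lines):
    """Fraction of semantically tagged tokens that are not Z99-only.

    Lines without any semantic tag (e.g. punctuation) are skipped.
    This is one plausible definition, not necessarily the one behind
    the figure reported for the dataset.
    """
    tagged = known = 0
    for line in lines:
        tags = [f for f in line.split()[1:] if SEMTAG_RE.match(f)]
        if not tags:
            continue
        tagged += 1
        if any(tag != "Z99" for tag in tags):
            known += 1
    return known / tagged if tagged else 0.0


SAMPLE = """\
následný A N4
postup N X4.2
na R Z5
základ N A2.2 T2+ X4.2
usnesení N X6+ X9.2+
parlament# Z99
: PUNCT
"""

print(f"{lexical_coverage(SAMPLE.splitlines()):.2%}")
```

In this excerpt six tokens carry semantic tags and one of them (`parlament#`) is tagged Z99 only, so five of six count as covered under this definition.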
Files
Name | Size |
---|---|
md5:5ef2aa183deccefc95e0e69556223779 | 251.1 MB |
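A downloaded copy can be checked against the MD5 checksum listed above. The sketch below uses Python's standard `hashlib` module and reads the file in chunks so that the 251.1 MB file does not have to fit in memory. The file name is not given on this page, so the path is taken from the command line; `md5_of_file` and `EXPECTED_MD5` are hypothetical names.

```python
import hashlib
import sys

# Checksum as listed in the file table of this record.
EXPECTED_MD5 = "5ef2aa183deccefc95e0e69556223779"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (path is whatever name the download was saved under):
    #   python verify_md5.py <path-to-downloaded-file>
    actual = md5_of_file(sys.argv[1])
    print("OK" if actual == EXPECTED_MD5 else f"MISMATCH: {actual}")
```

MD5 is used here only because it is the checksum the record publishes; it detects corrupted downloads but is not a security guarantee.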