Semantically tagged Europarl-sv.v7
Description
More than 45.6 million lines
Lexical coverage of the tagging: 83.90%
No semantic disambiguation is performed; all candidate tags are listed for each word
POS tagging for the semantic tagger was performed with TreeTagger: https://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/
Output in base form
Documentation of the semantic tagger (written for Finnish, but the same principles hold for Swedish):
https://www.aclweb.org/anthology/W19-0306/
https://zenodo.org/record/3676372#.YFNwIa8zY2w
Semantic tagging
The data was tagged in the Puhti computing environment of CSC – IT Center for Science Ltd. https://research.csc.fi/-/puhti
Format of each line: base form, POS tag, semantic tag(s)
Unknown words are marked with the tag Z99
Example output:
Återupptagande# Z99
av pp Z5
sessionen# Z99
jag nn S1.2.3+ Q4.1
förklara vb Q2.2 K5.1%
Europaparlamentets# Z99
session nn T1.3
återuppta# Z99
efter av X9.1-
avbrottet# Z99
en nl N1 Z8
17 NUMB
december nn T1.3
. PUNCT
jag nn S1.2.3+ Q4.1
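The line format shown above can be read with a few lines of Python. The following is only a minimal sketch based on the example output, not the tagger's own tooling. It assumes that fields are separated by whitespace, that POS tags are lowercase alphabetic strings (nn, vb, pp, ...), that a trailing "#" on the first field marks a word the tagger did not recognise, and that NUMB and PUNCT stand in the tag position. The names TaggedToken, parse_line and unknown_share are hypothetical, and the share computed here is not necessarily how the lexical coverage figure in the description was calculated.

```python
from typing import Iterable, List, NamedTuple, Optional


class TaggedToken(NamedTuple):
    base_form: str
    pos: Optional[str]   # None when the line carries no POS field
    semtags: List[str]   # all candidate tags, in file order
    unknown: bool        # True when the Z99 tag is present


def parse_line(line: str) -> TaggedToken:
    """Parse one line of the tagged corpus (assumed whitespace-separated)."""
    fields = line.split()
    if not fields:
        raise ValueError("empty line")
    base_form, rest = fields[0], fields[1:]

    # Assumption: a lowercase alphabetic second field is the POS tag.
    # Lines such as "17 NUMB" or "sessionen# Z99" have no POS field.
    pos = None
    if rest and rest[0].isalpha() and rest[0].islower():
        pos, rest = rest[0], rest[1:]

    # Assumption: the trailing "#" is a marker, not part of the word.
    return TaggedToken(base_form.rstrip("#"), pos, rest, "Z99" in rest)


def unknown_share(lines: Iterable[str]) -> float:
    """Fraction of non-empty lines tagged Z99 (0.0 for empty input)."""
    tokens = [parse_line(line) for line in lines if line.strip()]
    if not tokens:
        return 0.0
    return sum(token.unknown for token in tokens) / len(tokens)
```

Because the file has tens of millions of lines, it is probably better to iterate over the open file object line by line than to read it into memory at once.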
Files
Total size: 843.9 MB
MD5 checksum: 867643ea6b035cd4f2a50b444a2dd37c
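After downloading, the file can be checked against the MD5 checksum listed above. The snippet below is a small sketch using Python's standard hashlib module; the function name md5_of_file is hypothetical, and the path to the downloaded file has to be supplied by the user, since the file name is not given here.

```python
import hashlib

# Checksum as listed in the file table above.
EXPECTED_MD5 = "867643ea6b035cd4f2a50b444a2dd37c"


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so the 843.9 MB file is never held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A download is intact if md5_of_file(path) == EXPECTED_MD5 for the path where the file was saved.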