SVKCorp: Corpus of Debates in the National Council of the Slovak Republic
Description
This repository holds a corpus of transcripts of parliamentary debates in the National Council of the Slovak Republic (https://www.nrsr.sk/web/). Transcripts of speeches were text-mined and cleaned from machine-readable Word documents and the official online database of the Parliament. The corpus covers the period 1994-2020 and comprises seven complete terms with 375 024 speeches. The repository contains two types of data files. The main corpus is stored on a per-term basis in .RDS files named according to the pattern SK_term_*.RDS. For further information on the data structure, see the accompanying codebook. Apart from the main corpus, the repository also contains annotated speeches in the full CoNLL-U format (see SK_speeches_anno_*.RDS). The annotation was done using the Trankit analytical pipeline with the default Slovak language model.
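For readers unfamiliar with CoNLL-U, the following is a minimal sketch of how the ten standard columns of that format can be read in Python. It assumes the annotation has been exported to plain CoNLL-U text. How the annotation is actually laid out inside the SK_speeches_anno_*.RDS files is described in the codebook, not here. The function name `parse_conllu` and the sample sentence are hypothetical and are not taken from the corpus.

```python
# The ten tab-separated columns defined by the CoNLL-U format.
FIELDS = ("id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc")


def parse_conllu(text):
    """Split CoNLL-U text into sentences, each a list of token dicts."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            # Comment lines (e.g. "# text = ...") carry sentence metadata.
            continue
        columns = line.split("\t")
        if len(columns) != len(FIELDS):
            raise ValueError(f"expected 10 columns, got {len(columns)}")
        current.append(dict(zip(FIELDS, columns)))
    if current:
        sentences.append(current)
    return sentences


# Hypothetical two-token sample, invented for illustration only.
SAMPLE = "\n".join([
    "# text = Ďakujem.",
    "\t".join(["1", "Ďakujem", "ďakovať", "VERB", "_", "_", "0", "root", "_", "_"]),
    "\t".join(["2", ".", ".", "PUNCT", "_", "_", "1", "punct", "_", "_"]),
    "",
])

if __name__ == "__main__":
    for sentence in parse_conllu(SAMPLE):
        for token in sentence:
            print(token["form"], token["lemma"], token["upos"], token["deprel"])
```

The sketch skips comment lines and treats every remaining line as a token row. Multiword-token ranges and empty nodes, which CoNLL-U also allows, are kept as ordinary rows and would need separate handling in real use.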
Files
Total size: 2.0 GB

Checksum | Size |
---|---|
md5:e1bc0aebc03ac3c10627c46514ac10e2 | 178.1 MB |
md5:9dd497380a50e97968a001f20965befe | 275.9 MB |
md5:d2cdb81f209ae66c36cb0d07e7d1c659 | 179.5 MB |
md5:51f281c99e6ca9e5e6e54663de4c7d2a | 217.8 MB |
md5:c49d50b92f60ce5ae2ed7ebe9e49f572 | 99.8 MB |
md5:96e2f08c67d6b94cb4204f06873cba81 | 413.6 MB |
md5:fb7d773777194d17dd0d5494d402bda5 | 292.6 MB |
md5:6755bbbfece56bc2f19d67e3fa3f2000 | 33.1 MB |
md5:b2a1edcdb0a0e0ee7a5ef54549eb386b | 50.7 MB |
md5:9491865113d42a68a018467108675f43 | 31.4 MB |
md5:2ec388d440c4f4c52be84284cdc667ef | 39.9 MB |
md5:3721f137ea4d1676a1d841f27b196661 | 18.4 MB |
md5:a31a6d59eea898d5febf5491cfd05f0c | 75.9 MB |
md5:7c34a07cbd8be60f10f7cabaf47267fe | 53.6 MB |
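The md5 values listed for each file can be used to check that a download is complete and uncorrupted. Below is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_listed_checksum` are hypothetical, and the file name in the usage part is only a placeholder following the SK_term_*.RDS pattern, since this listing does not show which checksum belongs to which file.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or plain hex."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


if __name__ == "__main__":
    # Placeholder file name and checksum pairing (assumption): substitute the
    # file you downloaded and the md5 value shown next to it on the record page.
    downloaded = Path("SK_term_1.RDS")
    listed = "md5:e1bc0aebc03ac3c10627c46514ac10e2"
    if downloaded.exists():
        ok = matches_listed_checksum(downloaded, listed)
        print("checksum OK" if ok else "checksum MISMATCH")
    else:
        print(f"{downloaded} not found; nothing to verify")
```

Reading in chunks keeps memory use flat even for the largest files in the listing, which exceed 400 MB.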