There is a newer version of the record available.

Published March 23, 2024 | Version 1.3.3
Dataset | Open

PaRuS

Description

PaRuS is a morphologically tagged and dependency-parsed corpus of Russian sentences containing 2.5 billion tokens. It consists of more than 150 million isolated sentences taken from open-source texts. The annotation scheme follows that of the SynTagRus corpus.

Notes

See https://parus-proj.github.io/PaRuS for more details. This work was funded by RFBR under research project No. 19-07-00779.

Files (32.0 GB)

MD5 checksum | Size
md5:cec185542e4a2f49aa33423962cbccb8 | 15.2 GB
md5:2d65f881f35932b4e1c8c1271108b52e | 16.8 GB
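
Because each file is large, it may be worth checking a download against the MD5 checksums listed above before unpacking it. The following is a minimal sketch in Python, not part of the record itself. The helper names `md5_of_file` and `is_published_file` and the 1 MiB chunk size are illustrative assumptions, and the file names are not given on this page, so any path passed in is whatever name the download was saved under.

```python
import hashlib

# MD5 checksums copied from the file listing of this record.
PUBLISHED_MD5 = {
    "cec185542e4a2f49aa33423962cbccb8",  # 15.2 GB file
    "2d65f881f35932b4e1c8c1271108b52e",  # 16.8 GB file
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks.

    Chunked reading keeps memory use constant even for multi-gigabyte files.
    The default chunk size (1 MiB) is an arbitrary choice for this sketch.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_published_file(path):
    """Return True if the file's MD5 matches one of the listed checksums."""
    return md5_of_file(path) in PUBLISHED_MD5
```

Calling `is_published_file` on the local path of a downloaded file returns `True` only if its digest equals one of the two listed values; a `False` result suggests an incomplete or corrupted download, or a file from a different version of the record.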

Additional details

Dates

Created: 2024-03