Dataset Open Access

PrevDistro - Preverb Distributions in Hungarian

Kalivoda, Ágnes

JSON-LD Export

{
  "inLanguage": {
    "alternateName": "hun",
    "@type": "Language",
    "name": "Hungarian"
  },
  "description": "<p>PrevDistro (Preverb Distributions) is an open-source dataset containing 41.5 million corpus occurrences of 49 preverb-verb construction types. It consists of the following columns:</p>\n\n<ul>\n\t<li>1 <em>sid</em>: ID</li>\n\t<li>2 <em>constype</em>: construction type</li>\n\t<li>3 <em>subtype</em>: construction subtype</li>\n\t<li>4 <em>prevpos</em>: preverb position</li>\n\t<li>5 <em>prev</em>: preverb</li>\n\t<li>6 <em>verb</em>: verb lemma</li>\n\t<li>7 <em>intervening</em>: intervening words (as lemmas)</li>\n\t<li>8 <em>actform</em>: actual form (the same content as in column 10, but this column is lowercase)</li>\n\t<li>9 <em>left</em>: left context</li>\n\t<li>10 <em>kwic</em>: keyword in context</li>\n\t<li>11 <em>right</em>: right context</li>\n\t<li>12 <em>docid</em>: document ID from the Hungarian Gigaword Corpus</li>\n\t<li>13 <em>title</em>: document title</li>\n\t<li>14 <em>style</em>: document style (e.g. official, press, ...)</li>\n\t<li>15 <em>region</em>: document region (e.g. Transylvania, Subcarpathia, ...)</li>\n\t<li>16 <em>year</em>: year of publication (sometimes several years can be found in one document)</li>\n</ul>\n\n<p>The first row is the header. If a cell&#39;s value is unspecified, it is marked with an underscore (_).</p>",
  "license": "",
  "creator": [
    {
      "affiliation": "Hungarian Research Centre for Linguistics",
      "@id": "",
      "@type": "Person",
      "name": "Kalivoda, \u00c1gnes"
    }
  ],
  "url": "",
  "datePublished": "2021-06-21",
  "version": "2.0.0",
  "keywords": [
    "preverb constructions",
    "verbal prefix",
    "verbal particle"
  ],
  "@context": "",
  "distribution": [
    {
      "contentUrl": "",
      "encodingFormat": "tsv",
      "@type": "DataDownload"
    }
  ],
  "identifier": "",
  "@id": "",
  "@type": "Dataset",
  "name": "PrevDistro - Preverb Distributions in Hungarian"
}

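The column layout in the description (a header row, 16 named columns, and an underscore for unspecified cells) suggests a simple way to read the data. The following is a minimal sketch in Python, not the author's own tooling. It assumes the file is tab-separated without quoting, which is inferred only from the "tsv" encoding format. The function name read_prevdistro and the sample row are invented for illustration and are not taken from the dataset.

```python
import csv
import io

# Column names as listed in the dataset description (the header row).
COLUMNS = [
    "sid", "constype", "subtype", "prevpos", "prev", "verb", "intervening",
    "actform", "left", "kwic", "right", "docid", "title", "style", "region",
    "year",
]


def read_prevdistro(stream):
    """Yield one dict per data row, mapping the '_' placeholder to None.

    Assumption: tab-separated fields, no quoting, first row is the header.
    """
    reader = csv.DictReader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        yield {key: (None if value == "_" else value) for key, value in row.items()}


# Invented one-row sample standing in for the real file; only the column
# names come from the dataset description.
values = ["_"] * len(COLUMNS)
values[COLUMNS.index("sid")] = "1"
values[COLUMNS.index("prev")] = "meg"
values[COLUMNS.index("verb")] = "néz"
values[COLUMNS.index("actform")] = "megnézte"
values[COLUMNS.index("kwic")] = "Megnézte"
sample = "\t".join(COLUMNS) + "\n" + "\t".join(values) + "\n"

rows = list(read_prevdistro(io.StringIO(sample)))
print(rows[0]["prev"], rows[0]["verb"], rows[0]["title"])
```

For the actual download, the same function could be given a file object opened in text mode (UTF-8 is an assumption here). Because the generator yields rows one at a time, the 41.5 million occurrences never need to be held in memory at once.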
