Dataset Open Access

PrevDistro - Preverb Distributions in Hungarian

Kalivoda, Ágnes


JSON Export

{
  "files": [
    {
      "links": {
        "self": "https://zenodo.org/api/files/e58d41de-bb59-46b0-91c0-9d2321db9a87/PrevDistro.tsv"
      }, 
      "checksum": "md5:686521c26f1fbbc473e210946a4ab0cb", 
      "bucket": "e58d41de-bb59-46b0-91c0-9d2321db9a87", 
      "key": "PrevDistro.tsv", 
      "type": "tsv", 
      "size": 13236091245
    }
  ], 
  "owners": [
    317404
  ], 
  "doi": "10.5281/zenodo.6349410", 
  "stats": {
    "version_unique_downloads": 3.0, 
    "unique_views": 28.0, 
    "views": 39.0, 
    "version_views": 39.0, 
    "unique_downloads": 3.0, 
    "version_unique_views": 28.0, 
    "volume": 39708273735.0, 
    "version_downloads": 3.0, 
    "downloads": 3.0, 
    "version_volume": 39708273735.0
  }, 
  "links": {
    "doi": "https://doi.org/10.5281/zenodo.6349410", 
    "conceptdoi": "https://doi.org/10.5281/zenodo.6349409", 
    "bucket": "https://zenodo.org/api/files/e58d41de-bb59-46b0-91c0-9d2321db9a87", 
    "conceptbadge": "https://zenodo.org/badge/doi/10.5281/zenodo.6349409.svg", 
    "html": "https://zenodo.org/record/6349410", 
    "latest_html": "https://zenodo.org/record/6349410", 
    "badge": "https://zenodo.org/badge/doi/10.5281/zenodo.6349410.svg", 
    "latest": "https://zenodo.org/api/records/6349410"
  }, 
  "conceptdoi": "10.5281/zenodo.6349409", 
  "created": "2022-03-12T18:39:05.935118+00:00", 
  "updated": "2022-03-13T01:49:02.321394+00:00", 
  "conceptrecid": "6349409", 
  "revision": 2, 
  "id": 6349410, 
  "metadata": {
    "access_right_category": "success", 
    "doi": "10.5281/zenodo.6349410", 
    "description": "<p>PrevDistro (Preverb Distributions) is an open-source dataset containing 41.5 million corpus occurrences of 49 preverb-verb construction types. It consists of the following columns:</p>\n\n<ul>\n\t<li>1 <em>sid</em>: ID</li>\n\t<li>2 <em>constype</em>: construction type</li>\n\t<li>3 <em>subtype</em>: construction subtype</li>\n\t<li>4 <em>prevpos</em>: preverb position</li>\n\t<li>5 <em>prev</em>: preverb</li>\n\t<li>6 <em>verb</em>: verb lemma</li>\n\t<li>7 <em>intervening</em>: intervening words (as lemmas)</li>\n\t<li>8 <em>actform</em>: actual form (the same content as in column 10, but this column is lowercase)</li>\n\t<li>9 <em>left</em>: left context</li>\n\t<li>10 <em>kwic</em>: keyword in context</li>\n\t<li>11 <em>right</em>: right context</li>\n\t<li>12 <em>docid</em>: document ID from the Hungarian Gigaword Corpus</li>\n\t<li>13 <em>title</em>: document title</li>\n\t<li>14 <em>style</em>: document style (e.g. official, press, ...)</li>\n\t<li>15 <em>region</em>: document region (e.g. Transylvania, Subcarpathia, ...)</li>\n\t<li>16 <em>year</em>: year of publication (sometimes several years can be found in one document)</li>\n</ul>\n\n<p>The first row is the header. If a cell&#39;s value is unspecified, it is marked with an underscore (_).</p>", 
    "language": "hun", 
    "title": "PrevDistro - Preverb Distributions in Hungarian", 
    "license": {
      "id": "GPL-3.0-or-later"
    }, 
    "notes": "PrevDistro 1.0.0 (deprecated) can be found at https://science-data.hu/dataset.xhtml?persistentId=doi:10.5072/FK2/TRSD50\nIn PrevDistro 2.0.0, several new columns were added and the already existing data has undergone some fixes as well.", 
    "relations": {
      "version": [
        {
          "count": 1, 
          "index": 0, 
          "parent": {
            "pid_type": "recid", 
            "pid_value": "6349409"
          }, 
          "is_last": true, 
          "last_child": {
            "pid_type": "recid", 
            "pid_value": "6349410"
          }
        }
      ]
    }, 
    "version": "2.0.0", 
    "keywords": [
      "linguistics", 
      "Hungarian", 
      "preverb constructions", 
      "preverb", 
      "verbal prefix", 
      "verbal particle", 
      "construction"
    ], 
    "publication_date": "2021-06-21", 
    "creators": [
      {
        "orcid": "0000-0003-2520-5523", 
        "affiliation": "Hungarian Research Centre for Linguistics", 
        "name": "Kalivoda, \u00c1gnes"
      }
    ], 
    "access_right": "open", 
    "resource_type": {
      "type": "dataset", 
      "title": "Dataset"
    }, 
    "related_identifiers": [
      {
        "scheme": "doi", 
        "identifier": "10.15774/PPKE.BTK.2021.019", 
        "relation": "isNewVersionOf", 
        "resource_type": "publication-thesis"
      }, 
      {
        "scheme": "doi", 
        "identifier": "10.5281/zenodo.6349409", 
        "relation": "isVersionOf"
      }
    ]
  }
}
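
The record above lists a single 13 GB file with an MD5 checksum and describes a 16-column, tab-separated layout with a header row and "_" for unspecified cells. The following is a minimal sketch of how a downloaded copy might be checked and read row by row without loading it into memory. The function names md5_of_file and iter_rows are hypothetical, and the sketch assumes the file is UTF-8 encoded, contains no quoted fields, and keeps the columns in the order given in the description; none of this is confirmed by the record itself.

```python
import csv
import hashlib
from typing import Dict, Iterable, Iterator, Optional

# Column names in the order given in the record description (columns 1-16).
COLUMNS = [
    "sid", "constype", "subtype", "prevpos", "prev", "verb",
    "intervening", "actform", "left", "kwic", "right",
    "docid", "title", "style", "region", "year",
]


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return 'md5:<hex digest>' for a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return "md5:" + digest.hexdigest()


def iter_rows(lines: Iterable[str]) -> Iterator[Dict[str, Optional[str]]]:
    """Yield one dict per data row, mapping the '_' placeholder to None."""
    # Assumption: fields are plain tab-separated text, so quoting is disabled.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    next(reader, None)  # skip the header row
    for row in reader:
        if len(row) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} fields, got {len(row)}")
        yield {
            name: (None if value == "_" else value)
            for name, value in zip(COLUMNS, row)
        }
```

A typical use would be to compare md5_of_file("PrevDistro.tsv") with the "checksum" value in the record, then open the file with open("PrevDistro.tsv", encoding="utf-8", newline="") and pass the file object to iter_rows. Streaming keeps memory use flat, which matters for a file of this size.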
Statistics          All versions    This version
Views               39              39
Downloads           3               3
Data volume         39.7 GB         39.7 GB
Unique views        28              28
Unique downloads    3               3
