Dataset Open Access

Webis-Simple-Sentences-17 Corpus

Kiesel, Johannes; Stein, Benno; Lucks, Stefan


JSON Export

{
  "files": [
    {
      "links": {
        "self": "https://zenodo.org/api/files/e7ec33e3-6d4e-43cc-b199-c2d2a3a9e0f4/webis-simple-sentences-17-corpus-test.txt.gz"
      }, 
      "checksum": "md5:06099a2ec0e941080c37c8cf12bd7f75", 
      "bucket": "e7ec33e3-6d4e-43cc-b199-c2d2a3a9e0f4", 
      "key": "webis-simple-sentences-17-corpus-test.txt.gz", 
      "type": "gz", 
      "size": 1062566945
    }, 
    {
      "links": {
        "self": "https://zenodo.org/api/files/e7ec33e3-6d4e-43cc-b199-c2d2a3a9e0f4/webis-simple-sentences-17-corpus-training.txt.gz"
      }, 
      "checksum": "md5:7b3047871ad00bb2a83a5402f8237445", 
      "bucket": "e7ec33e3-6d4e-43cc-b199-c2d2a3a9e0f4", 
      "key": "webis-simple-sentences-17-corpus-training.txt.gz", 
      "type": "gz", 
      "size": 11588106487
    }
  ], 
  "owners": [
    65747
  ], 
  "doi": "10.5281/zenodo.205950", 
  "stats": {
    "version_unique_downloads": 208.0, 
    "unique_views": 530.0, 
    "views": 580.0, 
    "version_views": 579.0, 
    "unique_downloads": 208.0, 
    "version_unique_views": 529.0, 
    "volume": 2049409095984.0, 
    "version_downloads": 324.0, 
    "downloads": 324.0, 
    "version_volume": 2049409095984.0
  }, 
  "links": {
    "doi": "https://doi.org/10.5281/zenodo.205950", 
    "latest_html": "https://zenodo.org/record/205950", 
    "bucket": "https://zenodo.org/api/files/e7ec33e3-6d4e-43cc-b199-c2d2a3a9e0f4", 
    "badge": "https://zenodo.org/badge/doi/10.5281/zenodo.205950.svg", 
    "html": "https://zenodo.org/record/205950", 
    "latest": "https://zenodo.org/api/records/205950"
  }, 
  "created": "2016-12-16T16:01:10.966570+00:00", 
  "updated": "2020-01-24T19:26:05.198676+00:00", 
  "conceptrecid": "698138", 
  "revision": 15, 
  "id": 205950, 
  "metadata": {
    "access_right_category": "success", 
    "doi": "10.5281/zenodo.205950", 
    "description": "<p>A corpus of 471,085,690 English sentences extracted from the ClueWeb12 Web Crawl. The sentences were sampled from a larger corpus to achieve a level of sentence complexity similar to the one of sentences that humans make up as a memory aid for remembering passwords. Sentence complexity was determined by syllables per word.</p>\n\n<p>The corpus is split in training and test set as it is used in the associated publication.&nbsp; The test set is extracted from part 00 of the ClueWeb12, while the training set is extracted from the other parts.</p>\n\n<p>More information on the corpus can be found on the corpus web page at our university (listed under documented by).</p>", 
    "license": {
      "id": "CC-BY-4.0"
    }, 
    "title": "Webis-Simple-Sentences-17 Corpus", 
    "relations": {
      "version": [
        {
          "count": 1, 
          "index": 0, 
          "parent": {
            "pid_type": "recid", 
            "pid_value": "698138"
          }, 
          "is_last": true, 
          "last_child": {
            "pid_type": "recid", 
            "pid_value": "205950"
          }
        }
      ]
    }, 
    "communities": [
      {
        "id": "webis"
      }
    ], 
    "references": [
      "Johannes Kiesel, Benno Stein, and Stefan Lucks (2017). A Large-scale Analysis of the Mnemonic Password Advice. In Proceedings of the 24th Annual Network and Distributed System Security Symposium (NDSS 17)."
    ], 
    "keywords": [
      "Web Crawl", 
      "Sentence", 
      "Readability", 
      "Password", 
      "Password Mnemonic", 
      "Mnemonic", 
      "Web"
    ], 
    "publication_date": "2017-02-27", 
    "creators": [
      {
        "orcid": "0000-0002-1617-6508", 
        "affiliation": "Bauhaus-Universit\u00e4t Weimar", 
        "name": "Kiesel, Johannes"
      }, 
      {
        "orcid": "0000-0001-9033-2217", 
        "affiliation": "Bauhaus-Universit\u00e4t Weimar", 
        "name": "Stein, Benno"
      }, 
      {
        "affiliation": "Bauhaus-Universit\u00e4t Weimar", 
        "name": "Lucks, Stefan"
      }
    ], 
    "meeting": {
      "acronym": "NDSS 2017", 
      "url": "http://www.internetsociety.org/events/ndss-symposium/ndss-symposium-2017", 
      "dates": "February 26 - March 1, 2017", 
      "place": "San Diego, California.", 
      "title": "Network and Distributed System Security Symposium 2017"
    }, 
    "access_right": "open", 
    "resource_type": {
      "type": "dataset", 
      "title": "Dataset"
    }, 
    "related_identifiers": [
      {
        "scheme": "doi", 
        "identifier": "10.14722/ndss.2017.23077", 
        "relation": "isCompiledBy"
      }, 
      {
        "scheme": "url", 
        "identifier": "http://www.uni-weimar.de/en/media/chairs/webis/corpora/corpus-webis-sentences-17/", 
        "relation": "isDocumentedBy"
      }, 
      {
        "scheme": "doi", 
        "identifier": "10.5281/zenodo.398838", 
        "relation": "isSupplementedBy"
      }, 
      {
        "scheme": "doi", 
        "identifier": "10.5281/zenodo.398837", 
        "relation": "isSupplementedBy"
      }
    ]
  }
}
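
The record above lists an MD5 checksum and a byte size for each of the two gzip archives. A downloaded file could be checked against those values along the following lines. This is a minimal sketch, not part of the record: the function names are hypothetical, and the assumption that the archives decompress to UTF-8 text is not stated in the metadata.

```python
import gzip
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record_checksum(path, record_checksum):
    """Compare a local file with a checksum string in the record's 'md5:<hex>' form."""
    algorithm, _, expected = record_checksum.partition(":")
    if algorithm != "md5":
        raise ValueError(f"unsupported checksum algorithm: {algorithm}")
    return md5_of_file(path) == expected.lower()


def iter_lines(path):
    """Yield text lines from a gzip archive without decompressing it to disk.

    Assumption: the content is UTF-8 text; the record does not say so.
    """
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            yield line.rstrip("\n")
```

For the test set, the call would be `matches_record_checksum("webis-simple-sentences-17-corpus-test.txt.gz", "md5:06099a2ec0e941080c37c8cf12bd7f75")`, using the `key` and `checksum` values from the `files` entry. Chunked reading matters here because the training archive is about 11.6 GB according to its `size` field.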