There is a newer version of the record available.

Published April 28, 2026 | Version 20260217v2
Dataset Open

Lexical frequency data from DWDS corpora for a case study on regional variation in German

  • 1. Berlin-Brandenburgische Akademie der Wissenschaften

Description

This dataset contains lexical frequency data from three DWDS corpora – the ZDL-Regionalkorpus, the Webmonitor-Korpus, and the Regionalliteratur-Korpus (not publicly accessible) – for a case study comparing the regional distribution of the following German nouns:

  1. “Samstag” and “Sonnabend”,
  2. “Komma” and “Beistrich”,
  3. “Fahrrad”, “Radl”, and “Velo”,
  4. “Aufzug”, “Fahrstuhl”, and “Lift”.

The dataset contains the following data types:

  • TXT files recording the corpus queries,
  • CSV tables with absolute and relative hit frequencies, obtained on 17 Feb. 2026 from the DWDS API,
  • SVG maps derived from the relative hit frequencies.
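To illustrate how relative hit frequencies might be compared across competing variants, the sketch below converts the frequencies of a variant set (for example “Samstag” and “Sonnabend”) within one region into shares that sum to 1. The function name `variant_shares` and the input values are hypothetical; the actual column names, units, and normalization used in the CSV tables are not described here and should be checked against the files themselves.

```python
def variant_shares(freqs):
    """Return each variant's share of the summed frequency in one region.

    `freqs` maps a variant (e.g. "Samstag") to its relative hit frequency.
    The layout of the real CSV tables is an assumption, not documented here.
    """
    total = sum(freqs.values())
    if total == 0:
        # No hits for any variant in this region: report zero shares.
        return {variant: 0.0 for variant in freqs}
    return {variant: value / total for variant, value in freqs.items()}


# Made-up illustrative values, NOT taken from the dataset.
example_region = {"Samstag": 30.0, "Sonnabend": 10.0}
shares = variant_shares(example_region)
print(shares)  # {'Samstag': 0.75, 'Sonnabend': 0.25}
```

Working with shares rather than raw frequencies makes regions with very different corpus sizes directly comparable, which is one plausible basis for a map, though not necessarily the one used for the SVG maps in this dataset.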

Files

dataset.zip (962.0 kB)
md5:7a784f3b3420595a3c8d8c4ad98f8126
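
After downloading, the archive can be checked against the MD5 checksum listed above. The following is a minimal sketch using Python's standard `hashlib` module; the helper name `md5_of_file` and the local path `dataset.zip` are assumptions about where and how the file was saved.

```python
import hashlib
from pathlib import Path

# Checksum as listed on the record page for dataset.zip.
EXPECTED_MD5 = "7a784f3b3420595a3c8d8c4ad98f8126"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    archive = Path("dataset.zip")  # assumed local download location
    if archive.exists():
        ok = md5_of_file(archive) == EXPECTED_MD5
        print("checksum OK" if ok else "checksum MISMATCH")
    else:
        print("dataset.zip not found in the current directory")
```

A mismatch usually points to an incomplete download or to a different version of the record, so it is worth confirming the version number before re-downloading.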