Published November 30, 2022 | Version 1.1.1
Dataset Open

DiscoWUG: Discovered Diachronic Word Usage Graphs for German

  • 1. University of Stuttgart
  • 2. Leibniz-Institut für Deutsche Sprache

Description

This data collection contains discovered diachronic Word Usage Graphs (WUGs) for German. Find a description of the data format, code to process the data and further datasets on the WUGsite.

Note:

  • The date given for each word use does not correspond to the exact date of the document from which the use was sampled but only to the midpoint of the respective time period (1800-1899, 1946-1990), as the exact date was not available in the SemEval corpora.
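As a rough illustration of the note above, the sketch below computes the midpoint of each time period and maps a date value back to its period. The names `PERIODS`, `midpoint`, and `period_of` are hypothetical, and how the dataset actually stores or rounds the midpoint is an assumption that should be checked against the files themselves.

```python
# Time periods as stated in the note; the labels are hypothetical.
PERIODS = {
    "period 1": (1800, 1899),
    "period 2": (1946, 1990),
}


def midpoint(start, end):
    """Return the arithmetic midpoint of an inclusive year range."""
    return (start + end) / 2


def period_of(date):
    """Return the label of the period containing `date`, or None."""
    for name, (start, end) in PERIODS.items():
        if start <= date <= end:
            return name
    return None


for name, (start, end) in PERIODS.items():
    print(name, midpoint(start, end))
```

Because every use in a period shares the same date value, the date is only useful for telling the two periods apart, not for ordering uses within a period.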

Please find more information on the provided data in the paper referenced below.

Version: 1.1.1, 30.11.2022. Noise uses are now assigned the cluster label '-1' instead of being removed.
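Since noise uses are kept with the cluster label '-1', analyses that only want clustered uses may need to filter them out. The following is a minimal sketch of such a filter. The tab-separated layout, the column names `identifier` and `cluster`, and the sample rows are assumptions made for illustration, not taken from the dataset; see the WUGsite for the actual data format.

```python
import csv
import io

NOISE_LABEL = "-1"  # cluster label used for noise uses, as stated above


def split_noise(rows, label_key="cluster"):
    """Split rows into (kept, noise) according to the cluster label."""
    kept, noise = [], []
    for row in rows:
        if row[label_key].strip() == NOISE_LABEL:
            noise.append(row)
        else:
            kept.append(row)
    return kept, noise


# Hypothetical in-memory sample standing in for a cluster file.
sample = "identifier\tcluster\nuse_a\t0\nuse_b\t-1\nuse_c\t1\n"
rows = list(csv.DictReader(io.StringIO(sample), delimiter="\t"))

kept, noise = split_noise(rows)
print(len(kept), "clustered uses,", len(noise), "noise uses")
```

Returning the noise rows separately, rather than discarding them, keeps them available for inspection.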

Reference

Sinan Kurtyigit, Maike Park, Dominik Schlechtweg, Jonas Kuhn, Sabine Schulte im Walde. 2021. Lexical Semantic Change Discovery. Proceedings of the Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing.

Files

discowug.zip (5.5 MB)
md5:395fe892e57a42c099662c5403914331
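The MD5 checksum listed for discowug.zip can be used to check that a download is intact. The sketch below uses Python's standard `hashlib` module; the function name `md5_of_file` and the assumption that the archive sits in the current working directory are illustrative.

```python
import hashlib
from pathlib import Path

# Checksum as listed for discowug.zip on this page.
EXPECTED_MD5 = "395fe892e57a42c099662c5403914331"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


archive = Path("discowug.zip")  # assumed download location
if archive.exists():
    actual = md5_of_file(archive)
    print("checksum OK" if actual == EXPECTED_MD5 else "checksum mismatch: " + actual)
else:
    print("archive not found:", archive)
```

A mismatch usually indicates an incomplete or corrupted download, in which case the file should be fetched again.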

Additional details

Related works

Continues
Dataset: 10.5281/zenodo.5541274 (DOI)
Dataset: 10.5281/zenodo.5544198 (DOI)
Is published in
Conference paper: 10.18653/v1/2021.acl-long.543 (DOI)