Published November 1, 2024 | Version 2.0.0
Dataset Open

DiscoWUG: Discovered Diachronic Word Usage Graphs for German

  • 1. University of Stuttgart
  • 2. Leibniz-Institut für Deutsche Sprache

Description

This data collection contains discovered diachronic Word Usage Graphs (WUGs) for German. A description of the data format, code to process the data, and further datasets are available on the WUGsite.

Note:

  • The date given for each word use does not correspond to the exact date of the document from which the use was sampled but only to the midpoint of the respective time period (1800-1899, 1946-1990), as the exact date was not available in the SemEval corpora.
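
Because the date field only identifies a time period, it can be convenient to map it back to a period label when processing the data. The following is a minimal sketch, not part of the dataset's own tooling. It assumes the date is available as a numeric year; the labels "old" and "new" and the function names are hypothetical, and the exact value stored in the files (for example, how the midpoint is rounded) is not specified here.

```python
from typing import Optional

# Inclusive year ranges of the two time periods named in the note above.
# The labels "old" and "new" are hypothetical and used only in this sketch.
PERIODS = {
    "old": (1800, 1899),
    "new": (1946, 1990),
}


def period_midpoint(start: int, end: int) -> float:
    """Return the arithmetic midpoint of an inclusive year range."""
    return (start + end) / 2


def period_for_year(year: float) -> Optional[str]:
    """Return the label of the period containing `year`, or None if no period matches."""
    for label, (start, end) in PERIODS.items():
        if start <= year <= end:
            return label
    return None
```

Looking up the containing range rather than comparing against a fixed midpoint value keeps the sketch independent of how the midpoint was rounded in the data.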

Please find more information on the provided data in the papers referenced below.

References

Sinan Kurtyigit, Maike Park, Dominik Schlechtweg, Jonas Kuhn, Sabine Schulte im Walde. 2021. Lexical Semantic Change Discovery. Proceedings of the Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing.

Dominik Schlechtweg, Pierluigi Cassotti, Bill Noble, David Alfter, Sabine Schulte im Walde, Nina Tahmasebi. 2024. More DWUGs: Extending and Evaluating Word Usage Graph Datasets in Multiple Languages. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing.

Notes

Extends previous versions with one more annotation round and new clusterings.

Republication and redistribution are prohibited.

Files

discowug.zip (9.9 MB)
md5:eb4cec0638b539ae7225950e09452fb2
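
The MD5 checksum listed for discowug.zip can be used to check that a downloaded copy is complete and unmodified. The following is a minimal sketch using only the Python standard library; the function names and the local file path are assumptions made for illustration.

```python
import hashlib

# Checksum as listed for discowug.zip in this record.
EXPECTED_MD5 = "eb4cec0638b539ae7225950e09452fb2"


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, expected: str = EXPECTED_MD5) -> bool:
    """Return True if the file's MD5 digest equals the expected hex string."""
    return md5_of_file(path) == expected.lower()
```

Assuming the archive was saved as discowug.zip in the working directory, matches_checksum("discowug.zip") should return True for an intact download of this version.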

Additional details

Related works

Continues
Dataset: 10.5281/zenodo.5541274 (DOI)
Dataset: 10.5281/zenodo.5544198 (DOI)
Is published in
Conference paper: 10.18653/v1/2021.acl-long.543 (DOI)