There is a newer version of the record available.

Published November 5, 2022 | Version 2.1.0
Dataset Open

DWUG DE: Diachronic Word Usage Graphs for German

  • 1. University of Stuttgart
  • 2. King's College London, The Alan Turing Institute
  • 3. University of Gothenburg
  • 4. University of Cambridge

Description

This data collection contains diachronic Word Usage Graphs (WUGs) for German. Find a description of the data format, code to process the data and further datasets on the WUGsite.

See previous versions for additional test sets.

Please find more information on the provided data in the paper referenced below.

Version: 2.1.0, 05.11.2022. Contains spelling normalization for uses from 1800-1899. Important: Version 2.0.0 extends previous versions with one more annotation round and new clusterings.

Reference

Dominik Schlechtweg, Nina Tahmasebi, Simon Hengchen, Haim Dubossarsky, Barbara McGillivray. 2021. DWUG: A large Resource of Diachronic Word Usage Graphs in Four Languages. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing.

Notes

Contains additional spelling normalization for uses from 1800-1899. Important: Version 2.0.0 extends previous versions with one more annotation round and new clusterings.

Republication and redistribution are prohibited.

Files

dwug_de.zip (13.9 MB)
md5:64c92229ad92561da4403fa19edab350
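The MD5 checksum listed above can be used to confirm that a downloaded copy of the archive is intact. The following is a minimal sketch, assuming the file has been saved locally as dwug_de.zip; the function names md5_of_file and verify_archive are illustrative and not part of any tooling released with the dataset.

```python
import hashlib

# Checksum as listed on this record for dwug_de.zip.
EXPECTED_MD5 = "64c92229ad92561da4403fa19edab350"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()
```

Calling verify_archive("dwug_de.zip") returns True only when the local file matches the published checksum; a False result suggests an incomplete download or a different version of the record.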

Additional details

Related works

Continues
Dataset: 10.5281/zenodo.5541274 (DOI)
Is published in
Conference paper: arXiv:2104.08540 (arXiv)
Is supplement to
Dataset: 10.5281/zenodo.5255227 (DOI)
Dataset: 10.5281/zenodo.5090647 (DOI)
Dataset: 10.5281/zenodo.5544443 (DOI)