Published November 30, 2022 | Version 2.0.1
Dataset Open

DWUG SV: Diachronic Word Usage Graphs for Swedish

  • 1. University of Gothenburg
  • 2. University of Stuttgart
  • 3. University of Cambridge

Description

This data collection contains diachronic Word Usage Graphs (WUGs) for Swedish. Find a description of the data format, code to process the data and further datasets on the WUGsite.

See previous versions for additional test sets.

Please find more information on the provided data in the paper referenced below.

Version: 2.0.1, 30.11.2022. Noise uses are now assigned the cluster label '-1' instead of being removed. One erroneous judgment of 5 was removed from 'vegetation'. Important: Version 2.0.0 extends previous versions with one more annotation round and new clusterings.
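Because noise uses stay in the data with the cluster label '-1', analyses that should ignore them need to filter them out explicitly. The following is a minimal sketch of such a filter. It assumes a tab-separated cluster table with the columns `identifier` and `cluster`; these column names, the sample rows and the helper names `read_cluster_rows` and `split_noise` are illustrative assumptions, so check the data format description on the WUGsite for the actual layout.

```python
import csv
import io

# Noise label as stated in the version note above.
NOISE_LABEL = "-1"

# Hypothetical sample rows; the real files may use other columns or identifiers.
SAMPLE = "identifier\tcluster\nuse_a\t0\nuse_b\t-1\nuse_c\t1\nuse_d\t0\n"


def read_cluster_rows(handle):
    """Read a tab-separated cluster table into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


def split_noise(rows, cluster_field="cluster"):
    """Return (clustered_rows, noise_rows) based on the noise label."""
    kept, noise = [], []
    for row in rows:
        if row[cluster_field].strip() == NOISE_LABEL:
            noise.append(row)
        else:
            kept.append(row)
    return kept, noise


rows = read_cluster_rows(io.StringIO(SAMPLE))
kept, noise = split_noise(rows)
print(f"{len(kept)} clustered uses, {len(noise)} noise uses")
```

To apply the same filter to a file on disk, pass an open file handle to `read_cluster_rows` instead of the in-memory sample.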

Reference

Dominik Schlechtweg, Nina Tahmasebi, Simon Hengchen, Haim Dubossarsky, Barbara McGillivray. 2021. DWUG: A large Resource of Diachronic Word Usage Graphs in Four Languages. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing.

Files

dwug_sv.zip (12.2 MB)
md5:015646282f3a697d27c4ec4ba345a3fb
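The MD5 checksum listed for dwug_sv.zip can be used to confirm that a download is complete and unmodified. The following is a minimal sketch using Python's standard `hashlib` module. The function names `md5_of_file` and `matches_expected` and the local path in the usage note are illustrative assumptions, not part of the dataset.

```python
import hashlib

# Checksum as listed for dwug_sv.zip on this record.
EXPECTED_MD5 = "015646282f3a697d27c4ec4ba345a3fb"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected value."""
    return md5_of_file(path) == expected.lower()
```

Assuming the archive was saved as `dwug_sv.zip` in the working directory, `matches_expected("dwug_sv.zip")` should return `True` for an intact download. MD5 is adequate for detecting transfer errors but is not a safeguard against deliberate tampering.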

Additional details

Related works

Continues
Dataset: 10.5281/zenodo.5541274 (DOI)
Is published in
Conference paper: arXiv:2104.08540 (arXiv)
Is supplemented by
Dataset: 10.5281/zenodo.5255227 (DOI)
Dataset: 10.5281/zenodo.5544198 (DOI)
Dataset: 10.5281/zenodo.5544443 (DOI)

References

  • Dominik Schlechtweg, Nina Tahmasebi, Simon Hengchen, Haim Dubossarsky, Barbara McGillivray. 2021. DWUG: A large Resource of Diachronic Word Usage Graphs in Four Languages. https://arxiv.org/abs/2104.08540