Dataset Open Access
Giulianelli, Mario; Del Tredici, Marco; Fernández, Raquel
The DUPS (Diachronic Usage Pair Similarity) dataset contains human similarity judgements for pairs of English word usages drawn from different time periods, as described in the paper cited below.
The WUG version of the DUPS dataset (version 2.0.0) contains diachronic Word Usage Graphs constructed from the similarity judgements of English word usage pairs contained in DUPS. In a word usage graph, the usages of a word are represented as nodes connected by edges weighted according to (human-annotated) semantic proximity. A description of the data format as well as the code used to generate the graphs from DUPS can be found at https://www.ims.uni-stuttgart.de/data/wugs.
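As a rough illustration of the graph structure described above, the following Python sketch builds a small weighted usage graph from pairwise judgements. It is not the code used to generate the released graphs and does not reflect the actual file format: the usage identifiers, the 1 to 4 rating scale, the `build_usage_graph` and `neighbours` helpers, and the choice of averaging repeated judgements are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical judgements for one target word: (usage_a, usage_b, score).
# The identifiers and the 1-4 scale are illustrative assumptions only.
judgements = [
    ("u1", "u2", 4),
    ("u1", "u2", 3),  # a second annotator rating the same pair
    ("u1", "u3", 1),
    ("u2", "u3", 2),
]


def build_usage_graph(judgements):
    """Return an undirected weighted graph as {frozenset({a, b}): weight}.

    Repeated judgements of the same pair are averaged (an assumption;
    other aggregation rules, such as the median, are equally plausible).
    """
    scores = defaultdict(list)
    for a, b, score in judgements:
        scores[frozenset((a, b))].append(score)
    return {pair: mean(values) for pair, values in scores.items()}


def neighbours(graph, node):
    """Return {other_usage: weight} for every edge touching `node`."""
    result = {}
    for pair, weight in graph.items():
        if node in pair and len(pair) == 2:
            (other,) = pair - {node}
            result[other] = weight
    return result


graph = build_usage_graph(judgements)
print(neighbours(graph, "u1"))
```

Here each usage is a node and each judged pair is an edge whose weight summarises the annotators' ratings, so a higher weight means the two usages were judged closer in meaning.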
Both versions of the DUPS dataset can be downloaded from the Files section of this web page.
Please cite this paper if you use any version of the dataset in your work:
Mario Giulianelli, Marco Del Tredici, and Raquel Fernández. 2020. Analysing Lexical Semantic Change with Contextualised Word Representations. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL-2020). Association for Computational Linguistics.