There is a newer version of this record available.

Dataset Open Access

Post-Evaluation Data for SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection

Schlechtweg, Dominik; McGillivray, Barbara; Hengchen, Simon; Dubossarsky, Haim; Tahmasebi, Nina

Authors

Dominik Schlechtweg, Barbara McGillivray, Simon Hengchen, Haim Dubossarsky, and Nina Tahmasebi

Description

This data collection contains the post-evaluation data for SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection:

  • the starting kit for downloading the data, with examples for competing in the CodaLab challenge, including baselines,
  • the true binary change scores of the targets for Subtask 1, and their true graded change scores for Subtask 2 (test_data_truth/),
  • the scoring program used to score submissions against the true test data in the evaluation and post-evaluation phase (scoring_program/),
  • the results of the evaluation phase including
    • the final rankings of the participating teams by their best submission (results/rankings_teams.csv),
    • the submitted files of each team (results/submissions/),
    • an overview of the results for each submission ordered by team (results/submissions_results.csv),
    • analysis plots (plots/) displaying the results:
      • under per_target/ we provide the gold change scores and the normalized prediction error of target words plotted against their frequency and polysemy statistics,
      • under per_team/ we provide the model predictions from the best submission per team (per subtask) plotted against frequency/polysemy statistics and performance on gold data (gray lines give the correlation with the respective variable in the gold data); we also provide plots visualizing the similarities between the teams' predictions.
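
As a rough illustration of what scoring against the truth files involves: the task paper evaluates Subtask 1 by accuracy and Subtask 2 by Spearman's rank correlation. The sketch below is not the code shipped in scoring_program/. It assumes that gold and predicted scores have already been loaded into dictionaries mapping each target word to its score; the function names are hypothetical.

```python
from statistics import mean


def accuracy(gold, pred):
    """Share of target words whose predicted binary label equals the gold label."""
    return mean(1.0 if pred[word] == gold[word] else 0.0 for word in gold)


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(gold, pred):
    """Spearman's rho: Pearson correlation computed on the two rank vectors.

    Raises ZeroDivisionError if either side is constant (rho is undefined).
    """
    words = sorted(gold)
    x = average_ranks([gold[w] for w in words])
    y = average_ranks([pred[w] for w in words])
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5
```

Both functions iterate over the gold target words, so a prediction missing a target raises a KeyError rather than being silently skipped.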

Some remarks:

  • the paper referenced below remains the only source for the rankings between teams,
  • some teams were disqualified, and are thus removed from the analyses and the rankings presented in the paper,
  • some teams have changed names, resulting in a discrepancy between team names under results/ and team names in the paper. The paper contains a key to match old names with new names.

Test Data for SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection can be found using the links below:

Please find more information on the provided data in the paper referenced below.

Reference

Dominik Schlechtweg, Barbara McGillivray, Simon Hengchen, Haim Dubossarsky and Nina Tahmasebi. 2020. SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection. SemEval@COLING2020.

The resources are freely available for education, research and other non-commercial purposes.

@inproceedings{schlechtweg2020semeval,
title = "{S}em{E}val-2020 {T}ask 1: {U}nsupervised {L}exical {S}emantic {C}hange {D}etection",
author = "Schlechtweg, Dominik and McGillivray, Barbara and Hengchen, Simon and Dubossarsky, Haim and Tahmasebi, Nina",
booktitle = "To appear in Proceedings of the 14th International Workshop on Semantic Evaluation",
year = "2020",
address = "Barcelona, Spain",
publisher = "Association for Computational Linguistics"}

 

The authors would like to thank Diana McCarthy for her valuable input to the genesis of this task. DS was supported by the Konrad Adenauer Foundation and the CRETA center funded by the German Ministry for Education and Research (BMBF) during the conduct of this study. This task has been funded in part by the project 'Towards Computational Lexical Semantic Change Detection' supported by the Swedish Research Council (2019–2022; dnr 2018-01184), and Nationella språkbanken (the Swedish National Language Bank) -- jointly funded by (2018–2024; dnr 2017-00626) and its 10 partner institutions, to NT. The Swedish list of potential change words was provided by the research group at the Department of Swedish, University of Gothenburg, that works with the Contemporary Dictionary of the Swedish Academy. This work was supported by The Alan Turing Institute under the EPSRC grant EP/N510129/1, to BMcG. Additional thanks go to the annotators of our datasets, and an anonymous donor.

Files (4.2 MB)

Name                            Size    Checksum
semeval2020_ulscd_posteval.zip  4.2 MB  md5:1a97bf696c4c56e8ed0071c51be1e9fb
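
The md5 checksum listed above can be used to check that the archive was downloaded intact. A minimal sketch using only the Python standard library; the helper names are hypothetical, and it assumes the archive has been saved locally under its original file name.

```python
import hashlib

# Checksum as listed in the Files table of this record.
EXPECTED_MD5 = "1a97bf696c4c56e8ed0071c51be1e9fb"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """True if the file's MD5 digest equals the expected hex string."""
    return md5_of_file(path) == expected.lower()
```

For example, matches_expected("semeval2020_ulscd_posteval.zip") should return True for an uncorrupted download of this version of the record.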
  • Dominik Schlechtweg, Barbara McGillivray, Simon Hengchen, Haim Dubossarsky and Nina Tahmasebi. 2020. SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection. SemEval@COLING2020. https://languagechange.org/semeval
