Dataset Open Access

SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection

Schlechtweg, Dominik; McGillivray, Barbara; Hengchen, Simon; Dubossarsky, Haim; Tahmasebi, Nina


DataCite XML Export

<?xml version='1.0' encoding='utf-8'?>
<resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://datacite.org/schema/kernel-4" xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4.1/metadata.xsd">
  <identifier identifierType="DOI">10.5281/zenodo.3931969</identifier>
  <creators>
    <creator>
      <creatorName>Schlechtweg, Dominik</creatorName>
      <givenName>Dominik</givenName>
      <familyName>Schlechtweg</familyName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="http://orcid.org/">0000-0002-0685-2576</nameIdentifier>
      <affiliation>IMS, University of Stuttgart</affiliation>
    </creator>
    <creator>
      <creatorName>McGillivray, Barbara</creatorName>
      <givenName>Barbara</givenName>
      <familyName>McGillivray</familyName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="http://orcid.org/">0000-0003-3426-8200</nameIdentifier>
      <affiliation>The Alan Turing Institute and University of Cambridge</affiliation>
    </creator>
    <creator>
      <creatorName>Hengchen, Simon</creatorName>
      <givenName>Simon</givenName>
      <familyName>Hengchen</familyName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="http://orcid.org/">0000-0002-8453-7221</nameIdentifier>
      <affiliation>Språkbanken, University of Gothenburg</affiliation>
    </creator>
    <creator>
      <creatorName>Dubossarsky, Haim</creatorName>
      <givenName>Haim</givenName>
      <familyName>Dubossarsky</familyName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="http://orcid.org/">0000-0002-2818-6113</nameIdentifier>
      <affiliation>University of Cambridge</affiliation>
    </creator>
    <creator>
      <creatorName>Tahmasebi, Nina</creatorName>
      <givenName>Nina</givenName>
      <familyName>Tahmasebi</familyName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="http://orcid.org/">0000-0003-1688-1845</nameIdentifier>
      <affiliation>Språkbanken, University of Gothenburg</affiliation>
    </creator>
  </creators>
  <titles>
    <title>SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection</title>
  </titles>
  <publisher>Zenodo</publisher>
  <publicationYear>2020</publicationYear>
  <subjects>
    <subject>unsupervised lexical semantic change detection</subject>
    <subject>semantic change</subject>
    <subject>SemEval2020 Task1</subject>
    <subject>English</subject>
    <subject>German</subject>
    <subject>Latin</subject>
    <subject>Swedish</subject>
  </subjects>
  <dates>
    <date dateType="Issued">2020-05-27</date>
  </dates>
  <language>en</language>
  <resourceType resourceTypeGeneral="Dataset"/>
  <alternateIdentifiers>
    <alternateIdentifier alternateIdentifierType="url">https://zenodo.org/record/3931969</alternateIdentifier>
  </alternateIdentifiers>
  <relatedIdentifiers>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsVersionOf">10.5281/zenodo.3921904</relatedIdentifier>
    <relatedIdentifier relatedIdentifierType="URL" relationType="IsPartOf">https://zenodo.org/communities/semeval</relatedIdentifier>
  </relatedIdentifiers>
  <version>v1</version>
  <rightsList>
    <rights rightsURI="https://creativecommons.org/licenses/by/4.0/legalcode">Creative Commons Attribution 4.0 International</rights>
    <rights rightsURI="info:eu-repo/semantics/openAccess">Open Access</rights>
  </rightsList>
  <descriptions>
    <description descriptionType="Abstract">&lt;p&gt;&lt;strong&gt;Authors&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Dominik Schlechtweg, Barbara McGillivray, Simon Hengchen, Haim Dubossarsky, and Nina Tahmasebi&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Description&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This data collection contains the &lt;strong&gt;post-evaluation&lt;/strong&gt; data for &lt;a href="https://languagechange.org/semeval"&gt;SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection&lt;/a&gt;:&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;the starting kit for downloading the data, and examples for competing in the CodaLab challenge, including baselines,&lt;/li&gt;
	&lt;li&gt;the true binary change scores of the targets for Subtask 1, and their true graded change scores for Subtask 2 (&lt;code&gt;test_data_truth/&lt;/code&gt;),&lt;/li&gt;
	&lt;li&gt;the scoring program used to score submissions against the true test data in the evaluation and post-evaluation phase (&lt;code&gt;scoring_program/&lt;/code&gt;),&lt;/li&gt;
	&lt;li&gt;the results of the evaluation phase including
	&lt;ul&gt;
		&lt;li&gt;the final rankings of the participating teams by their best submission (&lt;code&gt;results/rankings_teams.csv&lt;/code&gt;),&lt;/li&gt;
		&lt;li&gt;the submitted files of each team (&lt;code&gt;results/submissions/&lt;/code&gt;),&lt;/li&gt;
		&lt;li&gt;an overview of the results for each submission ordered by team (&lt;code&gt;results/submissions_results.csv&lt;/code&gt;),&lt;/li&gt;
		&lt;li&gt;analysis plots (&lt;code&gt;plots/&lt;/code&gt;) displaying the results:
		&lt;ul&gt;
			&lt;li&gt;under &lt;code&gt;per_target/&lt;/code&gt; we provide the gold change scores and the normalized prediction error of target words plotted against their frequency and polysemy statistics,&lt;/li&gt;
			&lt;li&gt;under &lt;code&gt;per_team/&lt;/code&gt; we provide the model predictions from the best submission per team (per subtask) plotted against frequency/polysemy statistics and performance on gold data (gray lines give the correlation with the respective variable in the gold data); we also provide plots visualizing the similarities between the teams&amp;#39; predictions.&lt;/li&gt;
		&lt;/ul&gt;
		&lt;/li&gt;
	&lt;/ul&gt;
	&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Some remarks:&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;the paper referenced below remains the only source for the rankings between teams,&lt;/li&gt;
	&lt;li&gt;some teams were disqualified, and were thus removed from the analyses and the rankings presented in the paper,&lt;/li&gt;
	&lt;li&gt;some teams have changed names, resulting in a discrepancy between team names under &lt;code&gt;results/&lt;/code&gt; and team names in the paper. The paper contains a key to match old names with new names.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Test Data&lt;/strong&gt; for SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection can be found using the links below:&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;&lt;a href="https://www.ims.uni-stuttgart.de/en/research/resources/corpora/sem-eval-ulscd-eng/"&gt;English&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href="https://www.ims.uni-stuttgart.de/en/research/resources/corpora/sem-eval-ulscd-ger/"&gt;German&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href="https://zenodo.org/record/3734089"&gt;Latin&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href="https://zenodo.org/record/3730550"&gt;Swedish&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Please find more information on the provided data in the paper referenced below.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reference&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Dominik Schlechtweg, Barbara McGillivray, Simon Hengchen, Haim Dubossarsky and Nina Tahmasebi. 2020. &lt;a href="https://languagechange.org/semeval"&gt;SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection&lt;/a&gt;. SemEval@COLING2020.&lt;/p&gt;

&lt;p&gt;The resources are freely available for education, research and other non-commercial purposes.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;@inproceedings{schlechtweg2020semeval,
title = "{S}em{E}val-2020 {T}ask 1: {U}nsupervised {L}exical {S}emantic {C}hange {D}etection",
author = "Schlechtweg, Dominik and McGillivray, Barbara and Hengchen, Simon and Dubossarsky, Haim and Tahmasebi, Nina",
booktitle = "To appear in Proceedings of the 14th International Workshop on Semantic Evaluation",
year = "2020",
address = "Barcelona, Spain",
publisher = "Association for Computational Linguistics"}&lt;/code&gt;&lt;/pre&gt;

</description>
    <description descriptionType="Other">The authors would like to thank Diana McCarthy for her valuable input to the genesis of this task. DS was supported by the Konrad Adenauer Foundation and the CRETA center funded by the German Ministry for Education and Research (BMBF) during the conduct of this study. This task has been funded in part by the project 'Towards Computational Lexical Semantic Change Detection' supported by the Swedish Research Council (2019–2022; dnr 2018-01184), and Nationella språkbanken (the Swedish National Language Bank), jointly funded by (2018–2024; dnr 2017-00626) and its 10 partner institutions, to NT. The Swedish list of potential change words was provided by the research group at the Department of Swedish, University of Gothenburg, that works with the Contemporary Dictionary of the Swedish Academy. This work was supported by The Alan Turing Institute under the EPSRC grant EP/N510129/1, to BMcG. Additional thanks go to the annotators of our datasets, and an anonymous donor.</description>
    <description descriptionType="Other">{"references": ["Dominik Schlechtweg, Barbara McGillivray, Simon Hengchen, Haim Dubossarsky and Nina Tahmasebi. 2020. SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection. SemEval@COLING2020. https://languagechange.org/semeval"]}</description>
  </descriptions>
</resource>
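
The export above is a namespaced DataCite record, so element lookups need the `http://datacite.org/schema/kernel-4` namespace. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree`, of how the DOI, title, creators and subjects could be read from such a record. The helper name `summarize_datacite`, the returned dictionary layout, and the shortened `SAMPLE` record are illustrative assumptions, not part of the Zenodo export or of DataCite tooling.

```python
import xml.etree.ElementTree as ET

# Default namespace declared on the <resource> element of the export.
NS = {"dc": "http://datacite.org/schema/kernel-4"}

# Shortened stand-in for the record above (hypothetical excerpt for illustration).
SAMPLE = """<resource xmlns="http://datacite.org/schema/kernel-4">
  <identifier identifierType="DOI">10.5281/zenodo.3931969</identifier>
  <creators>
    <creator><creatorName>Schlechtweg, Dominik</creatorName></creator>
    <creator><creatorName>McGillivray, Barbara</creatorName></creator>
  </creators>
  <titles>
    <title>SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection</title>
  </titles>
  <subjects>
    <subject>English</subject>
    <subject>German</subject>
  </subjects>
</resource>"""


def summarize_datacite(xml_text):
    """Collect a few core fields from a DataCite kernel-4 record."""
    root = ET.fromstring(xml_text)
    creators = root.findall("dc:creators/dc:creator/dc:creatorName", NS)
    subjects = root.findall("dc:subjects/dc:subject", NS)
    return {
        "doi": root.findtext("dc:identifier", namespaces=NS),
        "title": root.findtext("dc:titles/dc:title", namespaces=NS),
        "creators": [c.text for c in creators],
        "subjects": [s.text for s in subjects],
    }


if __name__ == "__main__":
    for key, value in summarize_datacite(SAMPLE).items():
        print(f"{key}: {value}")
```

Passing the namespace map explicitly keeps the paths readable; without it, every tag would have to be written in the `{http://datacite.org/schema/kernel-4}identifier` form.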