Studying Taxonomy Enrichment on Diachronic WordNet Versions

Irina Nikishina; Alexander Panchenko; Varvara Logacheva; Natalia Loukachevitch

doi:10.5281/zenodo.4279821

Published November 12, 2020 | Version v2

Dataset Open

1. Skolkovo Institute of Science and Technology, Moscow, Russia
2. Research Computing Center, Lomonosov Moscow State University, Moscow, Russia

We choose two versions of WordNet and select the words that appear only in the newer version. For each such word, we take its hypernyms from the newer WordNet version and treat them as gold-standard hypernyms. We add a word to the dataset only if its hypernyms appear in both versions. We do not consider adjectives and adverbs, because they often introduce abstract concepts and are difficult to interpret from context.
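The selection procedure described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the `Taxonomy` representation, the function name `select_new_words`, the toy data, and the part-of-speech codes (`"a"` for adjectives, `"r"` for adverbs) are all hypothetical, and the sketch assumes that every gold hypernym of a word must be present in both versions.

```python
from typing import Dict, FrozenSet, Set, Tuple

# Hypothetical representation: word -> (part of speech, direct hypernyms).
Taxonomy = Dict[str, Tuple[str, Set[str]]]


def select_new_words(
    old: Taxonomy,
    new: Taxonomy,
    skipped_pos: FrozenSet[str] = frozenset({"a", "r"}),
) -> Dict[str, Set[str]]:
    """Return words found only in `new`, mapped to their gold hypernyms."""
    dataset: Dict[str, Set[str]] = {}
    for word, (pos, hypernyms) in new.items():
        if word in old:
            continue  # not a new word
        if pos in skipped_pos:
            continue  # adjectives and adverbs are not considered
        # Assumption: all hypernyms must exist in both versions.
        if hypernyms and all(h in old and h in new for h in hypernyms):
            dataset[word] = set(hypernyms)
    return dataset


# Toy "older" and "newer" taxonomy versions (invented for illustration).
OLD: Taxonomy = {
    "entity": ("n", set()),
    "animal": ("n", {"entity"}),
    "dog": ("n", {"animal"}),
}
NEW: Taxonomy = {
    **OLD,
    "labradoodle": ("n", {"dog"}),   # kept: hypernym exists in both versions
    "device": ("n", {"entity"}),     # kept: hypernym exists in both versions
    "gadget": ("n", {"device"}),     # dropped: "device" is missing from OLD
    "bloggy": ("a", {"entity"}),     # dropped: adjective
}

if __name__ == "__main__":
    print(select_new_words(OLD, NEW))
```

In this toy example only "labradoodle" and "device" are selected, since "gadget" attaches to a hypernym that is itself new and "bloggy" is an adjective.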

The previous dataset (RUSSE'2020) excludes short words (fewer than 4 symbols), diminutives, and named entities, and applies other constraints described in the shared task paper. We remove those constraints and present an unrestricted Russian dataset and a symmetrical English dataset built from the WordNet database.

Files

Name	Size
datasets.zip md5:cc053dc6fd255044c0085c0c52ed8086	1.2 MB
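The MD5 checksum listed above can be used to check a downloaded copy of the archive. The snippet below is a small sketch that assumes the file has been saved locally as `datasets.zip`; the helper name `md5_of_file` is hypothetical.

```python
import hashlib

# Checksum as listed on the record page for datasets.zip.
EXPECTED_MD5 = "cc053dc6fd255044c0085c0c52ed8086"


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage (assumes datasets.zip is in the current directory):
#   assert md5_of_file("datasets.zip") == EXPECTED_MD5
```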

	All versions	This version
Views	910	639
Downloads	129	102
Data volume	141.4 MB	127.4 MB

Resource type

Dataset

Publisher

Zenodo

Languages

Russian

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: November 18, 2020
Modified: November 19, 2020
