Published February 27, 2024 | Version 1.0.0 | Dataset | Open
NRS EN/SV: Automatically detected non-recorded word senses in English and Swedish
Authors/Creators
Description
This data collection contains English and Swedish use-sense instances annotated with binary labels. Annotators were asked to judge whether the respective sense (gloss) describes the meaning of the target word in the respective use well. We provide the following files:
- data/: uses, senses, instances and judgments for randomly sampled uses (phase 1) and for uses predicted to be missing from the respective dictionary (phase 2). Instances for phase 2 are missing but can be easily reconstructed by combining each use with each sense of the lemma for that use. We further provide assigned and unassigned usages aggregated over the three annotators as described in the paper below. The tutorial used for training annotators is available in the annotation_standardization repository.
- guidelines/: the guidelines used for annotator training.
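The reconstruction of phase 2 instances described above can be sketched as follows. This is a minimal illustration, not the authors' script: the field names `identifier` and `lemma` and the toy records are assumptions made for the example, and the actual column names in the files under data/ may differ.

```python
from collections import defaultdict


def reconstruct_instances(uses, senses, lemma_key="lemma", id_key="identifier"):
    """Pair every use with every sense recorded for the same lemma.

    `uses` and `senses` are lists of dicts. The key names are hypothetical
    and should be adapted to the real column names in the data files.
    """
    # Group sense identifiers by lemma so each use only sees its own senses.
    senses_by_lemma = defaultdict(list)
    for sense in senses:
        senses_by_lemma[sense[lemma_key]].append(sense[id_key])

    instances = []
    for use in uses:
        lemma = use[lemma_key]
        # A lemma without any recorded sense yields no instances.
        for sense_id in senses_by_lemma.get(lemma, []):
            instances.append({"lemma": lemma, "use": use[id_key], "sense": sense_id})
    return instances


# Toy records (invented for illustration, not taken from the dataset).
uses = [
    {"identifier": "u1", "lemma": "bank"},
    {"identifier": "u2", "lemma": "bank"},
    {"identifier": "u3", "lemma": "mouse"},
]
senses = [
    {"identifier": "bank-1", "lemma": "bank"},
    {"identifier": "bank-2", "lemma": "bank"},
    {"identifier": "mouse-1", "lemma": "mouse"},
]

instances = reconstruct_instances(uses, senses)
for instance in instances:
    print(instance["use"], instance["sense"])
```

Each resulting pair corresponds to one use-sense instance that an annotator would judge with a binary label. In practice the records would be read from the use and sense files in data/ rather than defined inline.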
Please find more information including limitations on the data in the paper referenced below.
Version: 1.0.0, 27.02.2024.
Reference
Jonathan Lautenschlager, Emma Sköldberg, Simon Hengchen, Dominik Schlechtweg. 2024. Detection of non-recorded word senses in English and Swedish.
Files
| Name | Size | Checksum |
|---|---|---|
| nrs_en_sv.zip | 821.9 kB | md5:ce0ca6b4798b772ec53f6f33bc775c81 |
Additional details
Funding
- Stiftelsen Riksbankens Jubileumsfond
- Change is Key! M21-0021
Software
- Repository URL: https://github.com/ChangeIsKey/non-recorded-sense-detection