FrameNet Semantic Frame Disambiguation with CrowdTruth
Description
This repository contains a ground truth corpus for semantic frame disambiguation, acquired with crowdsourcing and processed with CrowdTruth metrics that capture ambiguity in annotations by measuring inter-annotator disagreement.
The dataset contains annotations for 433 sentence-word pairs from the FrameNet corpus v.1.7, with each sentence-word pair annotated for frame disambiguation by 15 workers. The crowdsourced data was collected from Amazon Mechanical Turk.
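To illustrate how disagreement among the 15 workers can be kept as a signal rather than discarded, the sketch below computes a simplified, unweighted score for each frame of one sentence-word pair: the cosine similarity between the summed worker votes and the unit vector for that frame. This is an assumption-laden illustration, not the CrowdTruth package's implementation. The function name `frame_scores`, the frame labels, and the vote counts are all hypothetical.

```python
from collections import Counter
from math import sqrt


def frame_scores(worker_choices):
    """Return a simplified per-frame score for one sentence-word pair.

    worker_choices: list of sets of frame names, one set per worker.
    The score is the cosine between the summed vote vector and the
    unit vector of each frame (an unweighted illustration only).
    """
    # Count how many workers selected each frame.
    votes = Counter(frame for choice in worker_choices for frame in choice)
    # Euclidean norm of the summed vote vector.
    norm = sqrt(sum(count * count for count in votes.values()))
    if norm == 0:
        return {}
    return {frame: count / norm for frame, count in votes.items()}


# Hypothetical annotations from 15 workers (not taken from the corpus).
choices = (
    [{"Motion"}] * 9
    + [{"Self_motion"}] * 4
    + [{"Motion", "Self_motion"}] * 2
)
scores = frame_scores(choices)
for frame, score in sorted(scores.items()):
    print(f"{frame}: {score:.3f}")
```

A pair where one frame scores close to 1 shows strong agreement, while several frames with comparable scores suggest that the sentence-word pair is ambiguous.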
The corpus has been referenced in the following paper:
- Anca Dumitrache, Lora Aroyo and Chris Welty: Capturing and Interpreting Ambiguity in Crowdsourcing Frame Disambiguation. HCOMP 2018.
To replicate the data processing from the paper, use the Jupyter Notebook file CrowdTruth metrics.ipynb. It requires the installation of the CrowdTruth metrics Python package (v >= 2.0).
The data aggregated with CrowdTruth metrics is available in the folder data/output/.
The raw crowdsourcing data is available in the folder data/input/.
If you find this data useful in your research, please consider citing:
@inproceedings{dumitrache2018frames,
Author = {Anca Dumitrache and Lora Aroyo and Chris Welty},
Title = {Capturing Ambiguity in Crowdsourcing Frame Disambiguation},
Booktitle = {The sixth AAAI Conference on Human Computation and Crowdsourcing},
Year = {2018}
}
Files
Name | Size | MD5 |
---|---|---|
CrowdTruth/FrameDisambiguation-v.1.0.zip | 3.2 MB | 469e846309e70004ef985444e5a99b02 |
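The MD5 checksum listed for the archive can be used to check that a download is intact. The following is a minimal sketch using only the Python standard library. The local file path and the helper names `md5_of_file` and `verify_archive` are assumptions made for illustration.

```python
import hashlib
from pathlib import Path

# Checksum as listed for the archive in this record.
EXPECTED_MD5 = "469e846309e70004ef985444e5a99b02"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local download location; adjust to where the file was saved.
    archive = Path("FrameDisambiguation-v.1.0.zip")
    if archive.exists():
        print("checksum OK" if verify_archive(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```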
Additional details
Related works
- Is supplement to: https://github.com/CrowdTruth/FrameDisambiguation/tree/v.1.0 (URL)
References
- Anca Dumitrache, Lora Aroyo and Chris Welty: Capturing and Interpreting Ambiguity in Crowdsourcing Frame Disambiguation. HCOMP 2018. arXiv:1805.00270