Medical-Relation-Extraction: CrowdTruth Ground Truth for Medical Relation Extraction

doi:10.5281/zenodo.50222

Published April 22, 2016 | Version 2.0

Dataset Open

1. Vrije Universiteit Amsterdam
2. Google Research

The lack of annotated datasets for training and benchmarking is one of the main challenges of Clinical Natural Language Processing. In addition, current methods for collecting annotations attempt to minimize disagreement between annotators, and therefore fail to model the ambiguity inherent in language. We propose the CrowdTruth method for collecting medical ground truth through crowdsourcing, based on the observation that disagreement between annotators can signal ambiguity in the text, target semantics, or the worker's interpretation.

This repository contains a dataset of 3,984 English sentences for medical relation extraction, centering on the cause and treat medical relations, that have been processed with CrowdTruth disagreement analytics to capture ambiguity. In addition, we provide the raw crowdsourcing data used to compile this ground truth, as well as the task templates used to collect the data on CrowdFlower.

Files

Files (6.4 MB)

Name	Size	MD5
Medical-Relation-Extraction-2.0.zip	6.4 MB	57b829c48432a02b73aca7474ca01538
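The MD5 checksum listed for the archive can be used to confirm that a download is intact. The following is a minimal sketch, assuming the archive has already been saved locally; the function names `md5_of_file` and `matches_checksum` are illustrative and are not part of the dataset or its repository.

```python
import hashlib

# MD5 checksum as listed on this record for Medical-Relation-Extraction-2.0.zip.
EXPECTED_MD5 = "57b829c48432a02b73aca7474ca01538"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected hex string."""
    return md5_of_file(path) == expected.lower()
```

Calling `matches_checksum("Medical-Relation-Extraction-2.0.zip")` on the downloaded file (the local path is an assumption) should return `True` when the archive matches the listed checksum. MD5 is used here only because it is the checksum published on the record; it detects accidental corruption, not deliberate tampering.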

Additional details

Is supplement to: https://github.com/CrowdTruth/Medical-Relation-Extraction/tree/2.0

	All versions	This version
Views	715	119
Downloads	64	12
Data volume	388.4 MB	83.7 MB
