Aggregating Labels in Crowdsourcing Data
Creators
- 1. CERN openlab Summer Student
- 2. Summer Student Supervisor
Description
Project Specification
Crowdsourcing is gaining popularity in academia with the launch of crowdsourcing platforms such as Crowdcrafting [Lombraña, 2015] and GeoTagX [UNOSAT, 2015]. A number of algorithms have been proposed for inferring the true labels and a confusion matrix from crowdsourced ordinal, nominal and binary labels.
The work here consists of an implementation of the Dawid-Skene [Dawid 1979] adaptation of the Expectation-Maximization algorithm [Dempster 1977] for extracting true labels from binary data.
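As an illustration of the approach named above, the following is a minimal sketch of Dawid-Skene style EM for binary labels. It is not the project's actual implementation: the function name `dawid_skene_binary`, the `(task, worker, label)` input format, the fixed iteration count and the small `smoothing` constant (added here only to avoid zero probabilities) are all assumptions made for the example.

```python
import math


def dawid_skene_binary(votes, n_iter=50, smoothing=0.01):
    """Estimate P(true label = 1) per task from (task, worker, label) votes.

    `votes` is a list of (task_id, worker_id, label) with label in {0, 1}.
    Returns (posteriors, confusion), where posteriors[task] is P(label = 1)
    and confusion[worker][k][l] is P(worker answers l | true label is k).
    """
    tasks = sorted({t for t, _, _ in votes})
    workers = sorted({w for _, w, _ in votes})

    # Initialise posteriors with the fraction of votes for each class.
    counts = {t: [0.0, 0.0] for t in tasks}
    for t, _, l in votes:
        counts[t][l] += 1.0
    post = {t: [c / sum(cs) for c in cs] for t, cs in counts.items()}

    conf = {}
    for _ in range(n_iter):
        # M-step: class priors and per-worker confusion matrices.
        prior = [sum(post[t][k] for t in tasks) / len(tasks) for k in (0, 1)]
        conf = {w: [[smoothing, smoothing], [smoothing, smoothing]]
                for w in workers}
        for t, w, l in votes:
            for k in (0, 1):
                conf[w][k][l] += post[t][k]
        for w in workers:
            for k in (0, 1):
                row_sum = sum(conf[w][k])
                conf[w][k] = [v / row_sum for v in conf[w][k]]

        # E-step: posterior over the true label, computed in log space.
        log_post = {t: [math.log(max(prior[k], 1e-12)) for k in (0, 1)]
                    for t in tasks}
        for t, w, l in votes:
            for k in (0, 1):
                log_post[t][k] += math.log(conf[w][k][l])
        for t in tasks:
            m = max(log_post[t])
            unnorm = [math.exp(v - m) for v in log_post[t]]
            z = sum(unnorm)
            post[t] = [v / z for v in unnorm]

    return {t: post[t][1] for t in tasks}, conf
```

A call such as `dawid_skene_binary([(0, "a", 1), (0, "b", 1), (0, "c", 0)])` returns the estimated probability that each task's true label is 1, together with one 2x2 confusion matrix per worker. Initialising from the vote fractions ties the two latent classes to the observed label values, which is one common way to start the iteration.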
The second part of the project is the planning of the 2015 edition of the CERN Webfest, a coding event for CERN Summer Students that promotes open source.
Abstract
Crowdsourcing is a method in which multiple individuals, possibly with no prior knowledge of the field, solve a number of tasks. The solutions they give are then aggregated to infer the true solution from their combined knowledge.
In this paper, we give a short overview of some of the aggregation methods and hybrid crowdsourcing solutions in use. We then implement the label aggregation model proposed by Dawid and Skene [Dawid 1979] for open source and open science websites such as Crowdcrafting.org [Lombraña, 2015] and the UNOSAT project GeoTagX [UNOSAT, 2015].
Finally, we discuss the organization and results of the CERN Webfest 2015, a hackathon for CERN Summer Students.
Files
Checksum | Size |
---|---|
md5:8cb2b622d9f4691215bc60e0cd7f87cc | 606.7 kB |