QASCA: A Quality-Aware Task Assignment System for Crowdsourcing Applications

A crowdsourcing system, such as Amazon Mechanical Turk (AMT), provides a platform for a large number of questions to be answered by Internet workers. Such systems have been shown to be useful for solving problems that are difficult for computers, including entity resolution, sentiment analysis, and image recognition. In this project, we investigate the online task assignment problem: given a pool of n questions, which k questions should be assigned to a worker? A poor assignment may not only waste time and money, but may also hurt the quality of a crowdsourcing application that depends on the workers’ inputs.

We propose to consider quality measures (also known as evaluation metrics) that are relevant to an application during the task assignment process. In particular, we explore how Accuracy and F-score, two widely used evaluation metrics for crowdsourcing applications, can facilitate task assignment. Since these two metrics assume that the ground truth of a question is known, we study their variants that make use of the probability distributions of workers’ answers. We further investigate online assignment strategies, which enable optimal task assignments. Since these algorithms are expensive, we propose solutions that attain high quality in linear time. We develop a system called the Quality-Aware Task Assignment System for Crowdsourcing Applications (QASCA) on top of AMT. We evaluate our approaches on five real crowdsourcing applications. We found that QASCA is efficient and attains better result quality (an improvement of more than 8%) than existing methods.
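To make the idea of a ground-truth-free metric concrete, here is a minimal Python sketch of an expected Accuracy computed from per-question probability distributions. It is one plausible reading of the variant described above, not the code used in QASCA; the function names `expected_accuracy` and `best_results` and the list-of-lists layout are assumptions made for illustration.

```python
from typing import List, Sequence


def expected_accuracy(dist: Sequence[Sequence[float]], results: Sequence[int]) -> float:
    """Average probability that each returned label is the true one.

    dist[i][j] is the estimated probability that label j is correct for
    question i; results[i] is the label returned for question i.
    """
    if len(dist) != len(results):
        raise ValueError("dist and results must have the same length")
    if not dist:
        raise ValueError("at least one question is required")
    return sum(row[label] for row, label in zip(dist, results)) / len(dist)


def best_results(dist: Sequence[Sequence[float]]) -> List[int]:
    """Pick the most probable label per question, which maximizes the
    expected accuracy defined above."""
    return [max(range(len(row)), key=row.__getitem__) for row in dist]


# Two questions with two labels each (made-up numbers).
dist = [[0.8, 0.2], [0.4, 0.6]]
results = best_results(dist)
score = expected_accuracy(dist, results)
```

Under this definition each question contributes independently, so choosing the most probable label per question is enough to maximize the metric. F-score does not decompose this way, which is one reason it needs separate treatment.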


After installing the required software listed above, deploying a real application takes two steps:
(1) configure the "config.ini" file in the publish folder, which contains the database, log, and MTurk information;
(2) create a new folder in the apps folder containing three kinds of files to configure: the questions file ("questions.json"), the HTML template files ("view.html", "accept.html"), and the configuration file ("config.ini").

(a) "questions.json" contains the questions to be published, organized in JSON format;

(b) "view.html" is a static HTML file that workers see in view mode at AMT;

(c) "accept.html" is a Django template file that workers see when they accept a HIT at AMT;

(d) "config.ini" contains parameters related to your deployed app.
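
Before publishing, it can help to check that a new app folder actually contains the files listed above. The following Python sketch is not part of QASCA; the helper name `missing_app_files` and the example path `apps/my_app` are hypothetical, and the file names are taken from step (2).

```python
from pathlib import Path
from typing import List

# File names as listed in step (2) of the deployment instructions.
REQUIRED_FILES = ("questions.json", "view.html", "accept.html", "config.ini")


def missing_app_files(app_dir: str) -> List[str]:
    """Return the required file names that are absent from app_dir."""
    app_path = Path(app_dir)
    return [name for name in REQUIRED_FILES if not (app_path / name).is_file()]


if __name__ == "__main__":
    # "apps/my_app" is a placeholder; replace it with your own app folder.
    missing = missing_app_files("apps/my_app")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All required files are present.")
```

The check only confirms that the files exist. It does not validate the JSON structure of the questions or the keys in "config.ini", since those formats are defined by the project itself.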