Dynamic Crowdsourcing Consensus Tasks with Workers That Can Learn
Crowdsourcing has become one of the most popular topics in both academia and industry in the past few years. By hiring workers online, task assigners can take advantage of the wisdom of the crowd and solve problems that used to be solvable only by experts or that are too hard for automated algorithms. Consensus tasks, such as labeling, classification, and pattern recognition, are among the most common tasks in the crowdsourcing domain. To solve a consensus task, a group of workers is hired to report possible outcomes, and their opinions are aggregated to produce a final prediction. Information about crowd workers’ quality is valuable to system designers, who can use it to find better workers and to devise opinion aggregation approaches that produce more accurate predictions. Traditional approaches in the research literature assume either that every worker has a fixed quality or that the worker population as a whole has a fixed quality. They calculate worker quality from historical data, assume it stays constant, and use it to estimate existing workers’ future performance and future workers’ starting performance. We argue that this assumption does not hold, especially for more complicated tasks where workers may need time and experience to grasp the task and become proficient. We propose that the learning curve model used in factories and plants to model traditional workers’ productivity and quality can be adapted to a crowdsourcing context, where workers’ quality should likewise improve as they work on more tasks and gain experience. We modified the hyperbolic learning curve and set up a sleep spindle detection experiment on Amazon Mechanical Turk to validate our argument. The results support our hypothesis that crowd workers achieve higher quality as they work on more tasks, and that different workers have different quality and learning speed.
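To make the learning curve idea concrete, the following is a minimal sketch that assumes the standard three-parameter hyperbolic form, quality = k (x + p) / (x + p + r). The modified curve used in this work is not specified in the abstract, so the symbols k (quality ceiling), p (prior experience), and r (learning rate), and both function names, are illustrative assumptions. Because 1/quality is linear in 1/(x + p), the curve can be fit with ordinary linear regression once p is fixed.

```python
def hyperbolic_quality(x, k, p, r):
    """Assumed standard hyperbolic curve: quality after x completed tasks."""
    return k * (x + p) / (x + p + r)


def fit_hyperbolic(xs, ys, p):
    """Estimate k and r for a fixed p using the linearized form
    1/y = 1/k + (r/k) * 1/(x + p), solved by ordinary least squares.
    Requires x + p > 0 and y > 0 for every observation."""
    u = [1.0 / (x + p) for x in xs]   # regressor
    v = [1.0 / y for y in ys]         # response
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    sxx = sum((a - mean_u) ** 2 for a in u)
    sxy = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    slope = sxy / sxx                 # equals r / k
    intercept = mean_v - slope * mean_u  # equals 1 / k
    k = 1.0 / intercept
    r = slope * k
    return k, r


# Hypothetical, noiseless observations for one worker (not data from the study).
tasks_done = list(range(50))
observed = [hyperbolic_quality(x, 0.9, 1.0, 4.0) for x in tasks_done]
k_hat, r_hat = fit_hyperbolic(tasks_done, observed, p=1.0)
```

With the fitted k and r, `hyperbolic_quality` can project a worker's quality at a future task count, which is the kind of estimate a hiring policy would need.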
Moreover, our linear regression test shows that the hyperbolic learning curve is a good fit for modeling crowd workers’ learning process. This gives us a more accurate way to estimate and project crowd workers’ quality. With the help of the learning curve model, we design a new dynamic hiring system that makes tradeoffs between workers’ current quality and future potential, as well as between the accuracy of the aggregated prediction and hiring cost. It can take advantage of crowd workers’ quality improvement; find, train, and retain high-potential workers; and reduce the number of hired workers when some of the top performers stand out. Our system models the hiring process as a Markov Decision Process and uses Monte Carlo Planning to find the hiring path that optimizes the utility of both prediction accuracy and worker training. We compared our proposed hiring mechanism against other common methods, such as randomly selecting workers and a bandit-based approach that lets all workers work on a small training set first and then picks the best-performing ones. In both sleep spindle detection tasks and simulated tasks, our proposed dynamic hiring system yields competitive or better performance with significant savings in hiring cost. Moreover, to handle situations where crowd workers may not always be available and hiring them to work on tutorial tasks is not free, we extend our dynamic hiring mechanism to incorporate randomization, so that the system can choose between utilizing known high-quality workers and exploring unknown workers. In our simulation tests, the extended hiring system works as expected and can save even more in hiring cost while still providing high-quality predictions. Our dynamic hiring system especially outperforms traditional hiring approaches when the worker population is mixed, with some workers having higher availability or learning speed.
Overall, our dynamic hiring system is able to find and take advantage of high-potential workers and to make hiring decisions adaptively. Our work opens up a new direction for hiring mechanism design in crowdsourcing settings, where the use of learning curves enables us to distinguish crowd workers and to model them more accurately.
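The tradeoff between aggregated accuracy and hiring cost can be illustrated with a small sketch. It is not the Markov Decision Process or the Monte Carlo planner described above; it only assumes binary tasks, independent workers, a simple majority vote with fair tie-breaking, and a linear per-worker cost. The function names and the cost value are hypothetical.

```python
def majority_vote_accuracy(qualities):
    """Probability that a simple majority vote is correct on a binary task,
    given each worker's independent probability of answering correctly.
    Ties (possible with an even number of workers) are broken by a fair coin."""
    # dist[c] = probability that exactly c of the workers seen so far are correct
    dist = [1.0]
    for q in qualities:
        nxt = [0.0] * (len(dist) + 1)
        for c, prob in enumerate(dist):
            nxt[c] += prob * (1.0 - q)   # this worker is wrong
            nxt[c + 1] += prob * q       # this worker is right
        dist = nxt
    n = len(qualities)
    accuracy = 0.0
    for c, prob in enumerate(dist):
        if 2 * c > n:
            accuracy += prob
        elif 2 * c == n:
            accuracy += 0.5 * prob
    return accuracy


def hiring_utility(qualities, cost_per_worker):
    """Assumed utility: aggregated accuracy minus a linear hiring cost."""
    return majority_vote_accuracy(qualities) - cost_per_worker * len(qualities)


# Hypothetical comparison: one strong worker versus three mediocre ones.
one_expert = hiring_utility([0.95], cost_per_worker=0.02)
three_average = hiring_utility([0.7, 0.7, 0.7], cost_per_worker=0.02)
```

Under these assumptions, three workers at quality 0.7 reach a majority-vote accuracy of 0.784, so a single worker at 0.95 gives both higher accuracy and lower cost. This is the situation in which reducing the number of hired workers pays off once a top performer stands out.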