Execution Control for Crowd-Sourcing

Daniel S. Weld, Mausam, Peng Dai
Computer Science & Engineering
University of Washington
Seattle, WA 98195
{weld, mausam, daipeng}@cs.washington.edu

ABSTRACT
Crowdsourcing marketplaces enable a wide range of applications, but constructing any new application is challenging, usually requiring a complex, self-managing workflow in order to guarantee quality results. We report on the CLOWDER project, which uses machine learning to continually refine models of worker performance and task difficulty. We present decision-theoretic optimization techniques that can select the best parameters for a range of workflows. Initial experiments show our optimized workflows are significantly more economical than those with manually set parameters.

ACM Classification Keywords: H5.2. Information interfaces and presentation: User Interfaces.
General terms: Algorithms, performance, experimentation.
Author Keywords: Human computation, decision-theory.

INTRODUCTION
Amazon Mechanical Turk and similar crowd-sourcing marketplaces enable applications that seamlessly mix human computation with AI and other automated techniques. Example applications already span the range from product categorization and photo tagging to A/V transcription and interlingual translation. In order to guarantee quality results from variable-competency workers, most applications use complex, self-managing workflows with independent production and review stages. For example, iterative improvement [7] and find-fix-verify workflows [2] are popular patterns. But devising these patterns and adapting them to a new task is both complex and time consuming. Existing development environments, e.g.
TurKit [7], simplify important issues, such as control flow and debugging, but many challenges remain. In order to craft an effective application, the designer must:
- Choose between alternative workflows for the same task.
- Optimize the parameters for a selected workflow.
- Create tuned interfaces for the expected workers.
- Control execution of the final workflow.

We argue that AI methods such as machine learning, decision theory, and optimization can solve these problems, facilitating the rapid construction of effective crowd-sourced workflows. Our first system, TURKONTROL [4, 5], uses decision-theoretic control to optimize iterative workflows on Amazon Mechanical Turk. It automatically learns task-dependent models of typical workers and refines these models for individuals over the course of interaction. More recently, we have presented the architecture of a successor system, CLOWDER (Figure 1), which we are starting to implement [8]. This poster focuses on our methods for optimizing execution of two new workflow patterns: find-fix-verify [2] and the retainer-bonus model [1] for real-time crowd creation [3].

[Figure 1: Architecture of the CLOWDER system. Components shown: HTN library, DT planner, user models, task models, worker marketplace, renderer, rendered job, learner.]

Copyright is held by the author/owner(s). UIST'11, October 16-19, 2011, Santa Barbara, CA, USA. ACM 978-1-4503-1014-7/11/10.

OPTIMAL WORKFLOW CONTROL
Decision-theoretic modeling of workflows allows CLOWDER to automatically control the different pieces of the task and dynamically allocate resources to the sub-tasks that are expected to yield the largest benefits. The benefits are evaluated in terms of the utility function that the requester supplies as input. For example, in a find-fix-verify workflow invoked by Soylent, the user's utility function would reward the absence of errors and the quality of the repairing prose.
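As a rough illustration of allocating effort to the sub-task with the largest expected benefit, the sketch below scores each candidate action by its expected requester utility minus its cost and picks the maximum. It is not the CLOWDER controller: the `Action` class, the action names, and all probabilities, utilities, and costs are hypothetical values chosen only for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Action:
    """A candidate sub-task (hypothetical structure, not from the paper)."""
    name: str
    cost: float                          # payment to workers, in utility units
    outcomes: List[Tuple[float, float]]  # (probability, requester utility) pairs


def expected_net_utility(action: Action) -> float:
    """Expected requester utility over the action's outcomes, minus its cost."""
    return sum(p * u for p, u in action.outcomes) - action.cost


def choose_action(actions: List[Action]) -> Action:
    """Return the action with the highest expected net utility."""
    return max(actions, key=expected_net_utility)


# Made-up numbers for three possible next steps in a find-fix-verify run.
candidates = [
    Action("find_more_flaws", cost=0.05, outcomes=[(0.3, 1.0), (0.7, 0.0)]),
    Action("fix_known_flaw", cost=0.10, outcomes=[(0.8, 1.0), (0.2, 0.0)]),
    Action("submit", cost=0.0, outcomes=[(1.0, 0.5)]),
]
best = choose_action(candidates)
print(best.name)
```

With these assumed numbers, fixing a known flaw has the highest expected net utility, so that sub-task would receive the next unit of budget. A full controller would recompute these estimates after every worker response.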
In a retainer-bonus workflow designed for real-time response, the utility function might be a step function, such that answers delivered more than 2-3 seconds after the request have low utility.

CLOWDER extends the decision-theoretic control methodology used in TURKONTROL [4]. Each controller runs a partially observable Markov decision process (POMDP). The agent seeks to execute actions that maximize utility based on its current belief, a probability distribution over possible world states, since the true world state is hidden. For example, in an iterative improvement workflow, the (unseen) world state comprises the quality of the current artifact (e.g., an English description of a picture), the quality of a modified artifact (e.g., a potentially improved description just returned by a worker), and the accuracy levels of the workers involved. By executing ballot actions (where a potentially fallible worker reports which artifact is better), the system updates its belief estimates.

For a find-fix-verify workflow, a world state includes the number of flaws and quality estimates of proposed repairs as

[Figure 2: Decision-theoretic computations needed to control the find-fix-verify workflow. Flowchart nodes: initial artifact; generate Find HIT; find more flaws?; update posterior of flaw; pick a flaw to fix; generate Fix HIT; fix more flaws?; generate Verify HIT; update posteriors; more verification needed?; submit the best combination.]

Activat