
Deadline-Aware Fair Scheduling for Multi-Tenant Crowd-Powered Systems

Published: 21 February 2019

Abstract

Crowdsourcing has become an integral part of many systems and services that deliver high-quality results for complex tasks such as data linkage, schema matching, and content annotation. A standard function of such crowd-powered systems is to publish a batch of tasks on a crowdsourcing platform automatically and to collect the results once the workers complete them. Currently, these systems provide limited guarantees over the execution time, which is problematic for many applications. Timely completion may even be impossible to guarantee due to factors specific to the crowdsourcing platform, such as the availability of workers and concurrent tasks. In our previous work, we presented the architecture of a crowd-powered system that reshapes the interaction mechanism with the crowd. Specifically, we studied a push-crowdsourcing model whereby the workers receive tasks instead of selecting them from a portal. Based on this interaction model, we employed scheduling techniques similar to those found in distributed computing infrastructures to automate the task assignment process. In this work, we first devise a generic scheduling strategy that supports both fairness and deadline-awareness. Second, to complement the proof-of-concept experiments previously performed with the crowd, we present an extensive set of simulations meant to analyze the properties of the proposed scheduling algorithms in an environment with thousands of workers and tasks. Our experimental results show that, by accounting for human factors, micro-task scheduling can achieve fairness for best-effort batches and boost the completion of production batches.


Cited By

  • A Survey on Task Assignment in Crowdsourcing. ACM Computing Surveys 55, 3 (2023), 1--35. DOI: 10.1145/3494522
  • Accurate inference of crowdsourcing properties when using efficient allocation strategies. Scientific Reports 12, 1 (2022). DOI: 10.1038/s41598-022-10794-9
  • Making AI Machines Work for Humans in FoW. ACM SIGMOD Record 49, 2 (2020), 30--35. DOI: 10.1145/3442322.3442327
  • Task recommendation in crowdsourcing systems: A bibliometric analysis. Technology in Society 63 (2020), 101337. DOI: 10.1016/j.techsoc.2020.101337


Published In

ACM Transactions on Social Computing  Volume 2, Issue 1
March 2019
105 pages
EISSN: 2469-7826
DOI: 10.1145/3309716

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 21 February 2019
Accepted: 01 December 2018
Revised: 01 December 2018
Received: 01 November 2017
Published in TSC Volume 2, Issue 1


Author Tags

  1. Crowdsourcing systems
  2. deadline
  3. human factors
  4. priority
  5. task scheduling

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • EU's H2020 programme
  • Swiss National Science Foundation
  • Google
