DOI: 10.1145/3459637.3482032

Aggregation Techniques in Crowdsourcing: Multiple Choice Questions and Beyond

Published: 30 October 2021

Abstract

Crowdsourcing has been leveraged in a variety of tasks and applications, primarily to gather information from human annotators in exchange for a monetary reward. The main challenge associated with crowdsourcing is the low quality of the results, which can stem from multiple causes, including bias, error, and adversarial behavior. Researchers and practitioners can apply quality control methods to prevent and detect low-quality responses. For example, worker selection methods use qualifications and attention-check questions before assigning a task. Similarly, task routing applies recommender-system techniques to identify the workers who can provide a more accurate response to a given task type. In practice, posterior quality control methods are the most common way to deal with noisy labels once they have been obtained. Such methods require task repetition, i.e., assigning the task to multiple crowd workers, followed by an aggregation mechanism (also known as truth inference) that selects the most likely answer or requests an additional label. A large number of aggregation techniques have been proposed for crowdsourcing, covering a range of task types. This tutorial presents common and recent label aggregation techniques for multiple-choice questions, multi-class labels, ratings, pairwise comparisons, and image/text annotation. We believe that the audience will benefit from the focus on this specific research area and will learn about the best techniques to apply in their own crowdsourcing projects.
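
As a concrete illustration of the aggregation step described above, the sketch below implements plain majority voting over repeated labels, the simplest baseline for multiple-choice and multi-class tasks; more elaborate truth-inference methods instead weight each worker's votes by an estimated reliability. This is a minimal sketch: the (task_id, worker_id, label) input layout, the function name, and the tie-breaking behavior are illustrative assumptions, not part of the tutorial material.

# Minimal sketch (illustrative only): majority-vote aggregation of crowd labels.
# The input layout -- a list of (task_id, worker_id, label) tuples -- is assumed here.
from collections import Counter, defaultdict

def majority_vote(annotations):
    """Return, for each task, the label chosen by the largest number of workers."""
    labels_per_task = defaultdict(list)
    for task_id, _worker_id, label in annotations:
        labels_per_task[task_id].append(label)
    # Counter.most_common(1) breaks ties by first-seen order, i.e., arbitrarily.
    return {task: Counter(labels).most_common(1)[0][0]
            for task, labels in labels_per_task.items()}

# Example: task "q1" is labeled by three workers and resolves to "B" (two votes to one).
votes = [("q1", "w1", "B"), ("q1", "w2", "B"), ("q1", "w3", "A"),
         ("q2", "w1", "C"), ("q2", "w2", "C")]
print(majority_vote(votes))  # {'q1': 'B', 'q2': 'C'}

Replacing the uniform vote count with per-worker weights learned from agreement patterns leads to the confusion-matrix and worker-reliability models that the truth-inference literature builds on.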


Cited By

  • (2024) What Matters in a Measure? A Perspective from Large-Scale Search Evaluation. In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 282-292. https://doi.org/10.1145/3626772.3657845
  • (2023) Crowdsourcing Truth Inference via Reliability-Driven Multi-View Graph Embedding. ACM Transactions on Knowledge Discovery from Data 17(5), 1-26. https://doi.org/10.1145/3565576
  • (2023) Extending Label Aggregation Models with a Gaussian Process to Denoise Crowdsourcing Labels. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 729-738. https://doi.org/10.1145/3539618.3591685

Published In

CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
October 2021
4966 pages
ISBN:9781450384469
DOI:10.1145/3459637

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. crowdsourcing
  2. label aggregation
  3. pairwise comparison
  4. quality control
  5. rating aggregations
  6. truth inference

Qualifiers

  • Tutorial

Conference

CIKM '21

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
