Abstract
Crowdsourcing is a powerful paradigm that harnesses human intelligence to solve problems in many fields, most notably machine learning, where it makes it possible to collect training labels for supervised algorithms quickly and cheaply. Its main challenge is that the quality of the contributions is not guaranteed, because participants differ in expertise. A basic strategy for overcoming this problem is to assign each task to multiple workers and then combine their answers into a single reliable one. This paper proposes a new iterative approach that aggregates imperfect labels under the evidence theory, supervised by a few gold labels. Besides inferring the consensus answers, the approach also estimates the workers’ accuracies and the questions’ difficulties. A comparative evaluation on synthetic and real datasets confirms the effectiveness of our semi-supervised approach over the baselines.
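To make the general idea concrete, the following is a minimal, non-iterative sketch of gold-supervised label aggregation with belief functions. It is an illustration under stated assumptions and not the algorithm of the paper: labels are assumed binary, each worker's reliability is assumed to be a Laplace-smoothed accuracy on the gold questions, each answer is assumed to be modelled as a simple mass function discounted by that reliability, and the masses are combined with Dempster's rule. All function names (`estimate_accuracy`, `simple_mass`, `dempster`, `aggregate`) are hypothetical, and question difficulty is not modelled.

```python
from functools import reduce


def estimate_accuracy(answers, gold):
    """Laplace-smoothed accuracy of each worker on the gold questions.

    Assumption of this sketch: smoothing keeps every reliability strictly
    between 0 and 1, so Dempster's rule never meets total conflict.
    """
    accuracy = {}
    for worker, labels in answers.items():
        scored = [q for q in labels if q in gold]
        correct = sum(labels[q] == gold[q] for q in scored)
        accuracy[worker] = (correct + 1) / (len(scored) + 2)
    return accuracy


def simple_mass(label, reliability):
    """Mass (m({0}), m({1}), m(Omega)) for one binary answer.

    The worker's answer receives mass `reliability`; the remainder goes
    to the whole frame Omega = {0, 1}, i.e. to ignorance.
    """
    if label == 0:
        return (reliability, 0.0, 1.0 - reliability)
    return (0.0, reliability, 1.0 - reliability)


def dempster(m1, m2):
    """Dempster's rule of combination on the binary frame {0, 1}."""
    a0, a1, a_omega = m1
    b0, b1, b_omega = m2
    conflict = a0 * b1 + a1 * b0  # mass assigned to the empty set
    norm = 1.0 - conflict
    return (
        (a0 * b0 + a0 * b_omega + a_omega * b0) / norm,
        (a1 * b1 + a1 * b_omega + a_omega * b1) / norm,
        (a_omega * b_omega) / norm,
    )


def aggregate(answers, gold):
    """Return one aggregated label per question (ties resolved as 0)."""
    accuracy = estimate_accuracy(answers, gold)
    questions = sorted({q for labels in answers.values() for q in labels})
    result = {}
    for q in questions:
        masses = [
            simple_mass(labels[q], accuracy[worker])
            for worker, labels in answers.items()
            if q in labels
        ]
        m0, m1, _ = reduce(dempster, masses)
        result[q] = 0 if m0 >= m1 else 1
    return result
```

Here `answers` maps each worker to a dictionary of question identifiers and binary labels, and `gold` maps the gold questions to their known labels. Because reliabilities come from the gold questions, a single worker who does well on them can outweigh several workers who do poorly, which plain majority voting cannot express.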
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Abassi, L., Boukhris, I. (2019). An Evidential Semi-supervised Label Aggregation Approach. In: Douligeris, C., Karagiannis, D., Apostolou, D. (eds) Knowledge Science, Engineering and Management. KSEM 2019. Lecture Notes in Computer Science, vol 11775. Springer, Cham. https://doi.org/10.1007/978-3-030-29551-6_60
DOI: https://doi.org/10.1007/978-3-030-29551-6_60
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29550-9
Online ISBN: 978-3-030-29551-6
eBook Packages: Computer Science, Computer Science (R0)