DOI: 10.1145/3387168.3387242

A Crowdsourcing Repeated Annotations System for Visual Object Detection

Published: 25 May 2020
Abstract

    As a fundamental task in computer vision, object detection has developed rapidly, driven by deep learning. The lack of large numbers of images with ground-truth annotations has become a chief obstacle to object detection applications in many fields. Eliciting labels from crowds is a promising way to obtain large labeled datasets. Nonetheless, existing crowdsourcing platforms, e.g., Amazon Mechanical Turk (MTurk), often fail to guarantee the quality of the annotations, which degrades the accuracy of the deep detector. A variety of methods have been developed for ground-truth inference and learning from crowds. In this paper, we study strategies for crowdsourcing repeated labels in support of these methods. The core challenge in building such a system is to reduce the difficulty of annotating multiple objects of interest while improving data quality as much as possible. We present a system that adopts a turn-based annotation mechanism and consists of three simple sub-tasks: single-object annotation, quality verification, and coverage verification. Experimental results demonstrate that our system is scalable and accurate, and helps the detector achieve higher accuracy.
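
    The turn-based mechanism described above can be sketched as a loop over the three sub-tasks. This is an illustrative sketch only: the `Box` type, the callback names, and the simulated workers below are assumptions for exposition, not the authors' implementation.

    ```python
    from dataclasses import dataclass
    from typing import Callable, List

    @dataclass(frozen=True)
    class Box:
        """Axis-aligned bounding box (x, y, width, height)."""
        x: int
        y: int
        w: int
        h: int

    def collect_annotations(
        draw_box: Callable[[List[Box]], Box],      # single-object annotation sub-task
        verify_quality: Callable[[Box], bool],     # quality verification sub-task
        all_covered: Callable[[List[Box]], bool],  # coverage verification sub-task
        max_turns: int = 20,
    ) -> List[Box]:
        """Turn-based loop: each turn elicits one box, verifies its quality,
        and stops once a verifier judges every object in the image covered."""
        boxes: List[Box] = []
        for _ in range(max_turns):
            if boxes and all_covered(boxes):
                break                          # coverage verified: image is done
            candidate = draw_box(boxes)        # annotator sees boxes drawn so far
            if verify_quality(candidate):      # reject low-quality boxes early
                boxes.append(candidate)
        return boxes

    # Simulated crowd: two ground-truth objects; workers answer from them.
    truth = [Box(10, 10, 30, 30), Box(60, 20, 25, 25)]
    draw = lambda done: next(b for b in truth if b not in done)
    boxes = collect_annotations(draw,
                                lambda b: b.w > 0 and b.h > 0,
                                lambda done: len(done) == len(truth))
    ```

    Splitting annotation into one-object-per-turn plus separate verification turns keeps each crowd task simple, which is the paper's stated route to higher-quality repeated labels.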


    Published In

    ICVISP 2019: Proceedings of the 3rd International Conference on Vision, Image and Signal Processing
    August 2019
    584 pages
    ISBN:9781450376259
    DOI:10.1145/3387168

    Publisher

    Association for Computing Machinery

    New York, NY, United States



    Author Tags

    1. Crowdsourcing
    2. Image annotation
    3. Large scale annotation
    4. Object detection
    5. Repeated labels

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Funding Sources

    • National Natural Science Foundation of China
    • National Key R&D Program of China

    Acceptance Rates

    ICVISP 2019 Paper Acceptance Rate 126 of 277 submissions, 45%;
    Overall Acceptance Rate 186 of 424 submissions, 44%
