
One-Bit Supervision for Image Classification: Problem, Solution, and Beyond

Published: 11 January 2024
Abstract

This article presents one-bit supervision, a novel setting of learning with fewer labels, for image classification. Instead of training the model with the accurate label of each sample, our setting requires the model to interact with the system: it predicts the class label of each sample and learns from the answer whether the guess is correct, which provides one bit (yes or no) of information. An intriguing property of the setting is that the annotation burden is largely alleviated compared with providing accurate labels. There are two keys to one-bit supervision: (i) improving the guess accuracy and (ii) making good use of the incorrect guesses. To achieve these goals, we propose a multi-stage training paradigm and incorporate negative label suppression into an off-the-shelf semi-supervised learning algorithm. Theoretical analysis shows that one-bit annotation is more efficient than full-bit annotation in most cases and gives the conditions for combining our approach with active learning. Inspired by this, we further integrate the one-bit supervision framework into a self-supervised learning algorithm, which yields an even more efficient training schedule. Unlike training from scratch, when self-supervised learning is used for initialization, both hard example mining and class balance prove effective in boosting learning performance. However, both frameworks still require full-bit labels in the initial stage. To remove this burden, we use unsupervised domain adaptation to train the initial model and perform pure one-bit annotation on the target dataset. On multiple benchmarks, the learning efficiency of the proposed approach surpasses that of full-bit, semi-supervised supervision.
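To make the annotation protocol concrete, below is a minimal NumPy sketch of one query round and of negative label suppression. It is a sketch under stated assumptions: the function names, the dictionary bookkeeping, and the renormalization step are illustrative choices, not the paper's implementation.

```python
import numpy as np

def one_bit_round(model_probs, oracle_labels):
    """One round of one-bit annotation (illustrative sketch, not the
    authors' exact procedure). For each sample the model guesses a class;
    the annotator answers yes/no, i.e., exactly one bit."""
    guesses = model_probs.argmax(axis=1)   # model's best guess per sample
    positive, negative = {}, {}
    for i, g in enumerate(guesses):
        if g == oracle_labels[i]:
            positive[i] = int(g)           # correct guess: full label recovered
        else:
            negative[i] = int(g)           # incorrect guess: only know "not class g"
    return positive, negative

def suppress_negative(probs, negative):
    """Negative label suppression (sketch): zero out each rejected class
    and renormalize, so later pseudo-labeling avoids known-wrong classes."""
    probs = probs.copy()
    for i, wrong in negative.items():
        probs[i, wrong] = 0.0
        probs[i] /= probs[i].sum()         # assumes some mass remains on other classes
    return probs

# Toy usage: 4 samples, 3 classes.
rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(3), size=4)
labels = np.array([0, 1, 2, 1])
pos, neg = one_bit_round(probs, labels)
cleaned = suppress_negative(probs, neg)
```

Intuitively, a full label over C classes carries about log2(C) bits while each yes/no answer carries at most one bit, so the setting pays off when guesses are right often enough; this is why the two keys above are improving guess accuracy and exploiting the incorrect guesses.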


    Cited By

• (2024) Domain-Agnostic Priors for Semantic Segmentation Under Unsupervised Domain Adaptation and Domain Generalization. International Journal of Computer Vision. https://doi.org/10.1007/s11263-024-02041-7. Online publication date: 27 April 2024.


Information

      Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 20, Issue 4
      April 2024
      676 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/3613617
Editor: Abdulmotaleb El Saddik

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 11 January 2024
      Online AM: 24 November 2023
      Accepted: 14 November 2023
      Revised: 20 September 2023
      Received: 18 March 2023
      Published in TOMM Volume 20, Issue 4


      Author Tags

      1. One-bit supervision
      2. semi-supervised learning
      3. active learning
      4. self-supervised learning
      5. unsupervised domain adaptation

      Qualifiers

      • Research-article

      Funding Sources

      • National Key Research and Development Program of China
      • National Natural Science Foundation of China



Bibliometrics

Article Metrics

• Downloads (last 12 months): 166
• Downloads (last 6 weeks): 4

Reflects downloads up to 27 July 2024

