
One-Bit Supervision for Image Classification: Problem, Solution, and Beyond

Published: 11 January 2024
Abstract

This article presents one-bit supervision, a novel setting of learning with fewer labels, for image classification. Instead of training the model with the accurate label of each sample, our setting requires the model to interact with the system: it predicts the class label of each sample and learns from the answer whether the guess is correct, which provides one bit (yes or no) of information. An intriguing property of the setting is that the annotation burden is largely alleviated compared with providing accurate labels. There are two keys to one-bit supervision: (i) improving the guess accuracy and (ii) making good use of the incorrect guesses. To achieve these goals, we propose a multi-stage training paradigm and incorporate negative label suppression into an off-the-shelf semi-supervised learning algorithm. Theoretical analysis shows that one-bit annotation is more efficient than full-bit annotation in most cases and gives the conditions for combining our approach with active learning. Inspired by this, we further integrate the one-bit supervision framework into a self-supervised learning algorithm, which yields an even more efficient training schedule. Unlike training from scratch, when self-supervised learning is used for initialization, both hard example mining and class balance prove effective in boosting learning performance. However, both frameworks still require full-bit labels in the initial stage. To remove this burden, we use unsupervised domain adaptation to train the initial model and perform pure one-bit annotation on the target dataset. On multiple benchmarks, the learning efficiency of the proposed approach surpasses that of full-bit, semi-supervised supervision.
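To make the annotation protocol concrete, below is a minimal NumPy sketch of one query round and of negative label suppression. It is a sketch under stated assumptions: the function names, the dictionary bookkeeping, and the renormalization step are illustrative choices, not the paper's implementation.

```python
import numpy as np

def one_bit_round(model_probs, oracle_labels):
    """One round of one-bit annotation (illustrative sketch, not the
    authors' exact procedure). For each sample the model guesses a class;
    the annotator answers yes/no, i.e., exactly one bit."""
    guesses = model_probs.argmax(axis=1)   # model's best guess per sample
    positive, negative = {}, {}
    for i, g in enumerate(guesses):
        if g == oracle_labels[i]:
            positive[i] = int(g)           # correct guess: full label recovered
        else:
            negative[i] = int(g)           # incorrect guess: only know "not class g"
    return positive, negative

def suppress_negative(probs, negative):
    """Negative label suppression (sketch): zero out each rejected class
    and renormalize, so later pseudo-labeling avoids known-wrong classes."""
    probs = probs.copy()
    for i, wrong in negative.items():
        probs[i, wrong] = 0.0
        probs[i] /= probs[i].sum()         # assumes some mass remains on other classes
    return probs

# Toy usage: 4 samples, 3 classes.
rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(3), size=4)
labels = np.array([0, 1, 2, 1])
pos, neg = one_bit_round(probs, labels)
cleaned = suppress_negative(probs, neg)
```

Intuitively, a full label over C classes carries about log2(C) bits while each yes/no answer carries at most one bit, so the setting pays off when guesses are right often enough; this is why the two keys above are improving guess accuracy and exploiting the incorrect guesses.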


    Cited By

• (2024) Domain-Agnostic Priors for Semantic Segmentation Under Unsupervised Domain Adaptation and Domain Generalization. International Journal of Computer Vision. https://doi.org/10.1007/s11263-024-02041-7. Online publication date: 27 April 2024.


Information

      Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 20, Issue 4
      April 2024
      676 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/3613617
Editor: Abdulmotaleb El Saddik

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 11 January 2024
      Online AM: 24 November 2023
      Accepted: 14 November 2023
      Revised: 20 September 2023
      Received: 18 March 2023
      Published in TOMM Volume 20, Issue 4


      Author Tags

      1. One-bit supervision
      2. semi-supervised learning
      3. active learning
      4. self-supervised learning
      5. unsupervised domain adaptation

      Qualifiers

      • Research-article

      Funding Sources

      • National Key Research and Development Program of China
      • National Natural Science Foundation of China



Bibliometrics

Article Metrics

• Downloads (last 12 months): 166
• Downloads (last 6 weeks): 4

Reflects downloads up to 27 July 2024

