
Semi-supervised transformable architecture search for feature distillation

  • Short Paper
  • Published:
Pattern Analysis and Applications

Abstract

The proposed method aims to perform image classification efficiently and accurately. It differs from traditional CNN-based image classification methods, which are strongly affected by the number of labels and the depth of the network: although a deeper network can improve accuracy, its training is usually time-consuming and laborious. We show how to use only a small number of labels, design a more flexible network architecture, and combine it with a feature distillation method to improve model efficiency while maintaining high accuracy. Specifically, we decompose different network structures into independent individuals so that they can be used more flexibly. Building on knowledge distillation, we extract channel features and establish a feature distillation connection from the teacher network to the student network. Comparisons with other popular related methods on commonly used datasets demonstrate the effectiveness of the method. The code is available at https://github.com/ZhangXinba/Semi_FD.
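The abstract describes a channel-level feature-distillation connection from a teacher network to a student network. As a rough illustration only (not the authors' implementation; the module name, the 1x1 alignment convolution, and the pooled-channel L2 loss are assumptions), the sketch below shows one common way such a distillation term can be written in PyTorch and combined with the supervised loss on the labelled subset.

```python
# Minimal sketch (assumptions, not the paper's exact method): a channel-level
# feature-distillation loss between a teacher and a student feature map.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelFeatureDistiller(nn.Module):
    def __init__(self, student_channels: int, teacher_channels: int):
        super().__init__()
        # Project student features to the teacher's channel dimension.
        self.align = nn.Conv2d(student_channels, teacher_channels, kernel_size=1)

    def forward(self, student_feat: torch.Tensor, teacher_feat: torch.Tensor) -> torch.Tensor:
        s = self.align(student_feat)
        # Match spatial resolution if the two feature maps differ in size.
        if s.shape[-2:] != teacher_feat.shape[-2:]:
            s = F.interpolate(s, size=teacher_feat.shape[-2:], mode="bilinear", align_corners=False)
        # Channel descriptors via global average pooling: shape (batch, channels).
        s_desc = s.mean(dim=(2, 3))
        t_desc = teacher_feat.mean(dim=(2, 3))
        # L2 distance between channel descriptors; the teacher is detached so
        # gradients only update the student.
        return F.mse_loss(s_desc, t_desc.detach())

# Hypothetical usage: total loss = supervised cross-entropy on labelled data
# plus a weighted distillation term (lambda_fd is an assumed hyperparameter).
# distiller = ChannelFeatureDistiller(student_channels=64, teacher_channels=256)
# loss = F.cross_entropy(student_logits, labels) + lambda_fd * distiller(s_feat, t_feat)
```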




Acknowledgements

This work was funded by the National Natural Science Foundation of China (Grant Nos. 62272461 and 62276266), the Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX22_2566), and the Graduate Innovation Program of China University of Mining and Technology (2022WLKXJ116).

Author information


Corresponding author

Correspondence to Yong Zhou.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Zhang, M., Zhou, Y., Liu, B. et al. Semi-supervised transformable architecture search for feature distillation. Pattern Anal Applic 26, 669–677 (2023). https://doi.org/10.1007/s10044-022-01122-y

