Abstract
The proposed method aims to perform image classification efficiently and accurately. Traditional CNN-based image classification methods depend heavily on the number of labeled examples and on network depth: deeper networks can improve accuracy, but their training is usually time-consuming and laborious. We show how to use only a small number of labels, design a more flexible network architecture, and combine it with feature distillation to improve model efficiency while maintaining high accuracy. Specifically, we integrate different network structures into independent individuals so that the architecture can be used more flexibly. Building on knowledge distillation, we extract channel features and establish a feature distillation connection from the teacher network to the student network. Comparisons with related popular methods on commonly used datasets demonstrate the effectiveness of the proposed method. The code is available at https://github.com/ZhangXinba/Semi_FD.
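To make the feature distillation connection concrete, below is a minimal PyTorch sketch of a channel-level distillation loss between a teacher and a student feature map. This is an illustrative assumption of how such a connection could be wired up, not the authors' implementation (see the repository above); the class name ChannelDistillLoss, the 1x1 alignment convolution, and the weighting factor beta are all hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelDistillLoss(nn.Module):
    """Match student feature maps to teacher feature maps at the channel level."""

    def __init__(self, student_channels: int, teacher_channels: int):
        super().__init__()
        # A 1x1 convolution aligns the student's channel count with the teacher's.
        self.align = nn.Conv2d(student_channels, teacher_channels, kernel_size=1)

    def forward(self, f_student: torch.Tensor, f_teacher: torch.Tensor) -> torch.Tensor:
        # Inputs are (N, C, H, W) feature maps taken from matching stages.
        f_s = self.align(f_student)
        # Pool each channel to a scalar descriptor, normalize, and compare.
        s_desc = F.normalize(f_s.mean(dim=(2, 3)), dim=1)        # (N, C_teacher)
        t_desc = F.normalize(f_teacher.mean(dim=(2, 3)), dim=1)  # (N, C_teacher)
        return F.mse_loss(s_desc, t_desc)


# Usage sketch: in a semi-supervised setting, the distillation term can be
# applied to all images, while the cross-entropy term uses only the few
# labeled examples. The teacher features are detached so that only the
# student is updated.
# distill = ChannelDistillLoss(student_channels=64, teacher_channels=256)
# loss = F.cross_entropy(student_logits, labels) + beta * distill(f_s, f_t.detach())
```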
Acknowledgements
This work was funded by the National Natural Science Foundation of China (Grants No. 62272461 and No. 62276266), the Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX22_2566), and the Graduate Innovation Program of China University of Mining and Technology (2022WLKXJ116).