
Semi-supervised transformable architecture search for feature distillation

  • Short Paper
  • Published:
Pattern Analysis and Applications

Abstract

The proposed method aims to perform image classification efficiently and accurately. It differs from traditional CNN-based image classification methods, which are strongly affected by the number of labels and the depth of the network: although a deeper network can improve accuracy, its training is usually time-consuming and laborious. We show how to use only a small number of labels, design a more flexible network architecture, and combine it with a feature distillation method to improve model efficiency while maintaining high accuracy. Specifically, we decompose different network structures into independent individuals so that they can be used more flexibly. Building on knowledge distillation, we extract channel features and establish a feature distillation connection from the teacher network to the student network. Comparisons with other popular related methods on commonly used datasets demonstrate the effectiveness of the method. The code is available at https://github.com/ZhangXinba/Semi_FD.
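The abstract describes a channel-level feature-distillation connection from a teacher network to a student network. As a rough illustration only (not the authors' implementation; the module name, the 1x1 alignment convolution, and the pooled-channel L2 loss are assumptions), the sketch below shows one common way such a distillation term can be written in PyTorch and combined with the supervised loss on the labelled subset.

```python
# Minimal sketch (assumptions, not the paper's exact method): a channel-level
# feature-distillation loss between a teacher and a student feature map.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelFeatureDistiller(nn.Module):
    def __init__(self, student_channels: int, teacher_channels: int):
        super().__init__()
        # Project student features to the teacher's channel dimension.
        self.align = nn.Conv2d(student_channels, teacher_channels, kernel_size=1)

    def forward(self, student_feat: torch.Tensor, teacher_feat: torch.Tensor) -> torch.Tensor:
        s = self.align(student_feat)
        # Match spatial resolution if the two feature maps differ in size.
        if s.shape[-2:] != teacher_feat.shape[-2:]:
            s = F.interpolate(s, size=teacher_feat.shape[-2:], mode="bilinear", align_corners=False)
        # Channel descriptors via global average pooling: shape (batch, channels).
        s_desc = s.mean(dim=(2, 3))
        t_desc = teacher_feat.mean(dim=(2, 3))
        # L2 distance between channel descriptors; the teacher is detached so
        # gradients only update the student.
        return F.mse_loss(s_desc, t_desc.detach())

# Hypothetical usage: total loss = supervised cross-entropy on labelled data
# plus a weighted distillation term (lambda_fd is an assumed hyperparameter).
# distiller = ChannelFeatureDistiller(student_channels=64, teacher_channels=256)
# loss = F.cross_entropy(student_logits, labels) + lambda_fd * distiller(s_feat, t_feat)
```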




Acknowledgements

This work was funded by the National Natural Science Foundation of China (Grant Nos. 62272461 and 62276266), the Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX22_2566), and the Graduate Innovation Program of China University of Mining and Technology (2022WLKXJ116).

Author information


Corresponding author

Correspondence to Yong Zhou.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Zhang, M., Zhou, Y., Liu, B. et al. Semi-supervised transformable architecture search for feature distillation. Pattern Anal Applic 26, 669–677 (2023). https://doi.org/10.1007/s10044-022-01122-y

