DOI: 10.1145/3532213.3532225

Sparse DARTS with Various Recovery Algorithms

Published: 13 July 2022

Abstract

Designing an efficient neural architecture search (NAS) method has been an open and challenging problem in recent years. A typical and well-performing strategy is the gradient-based approach, exemplified by Differentiable Architecture Search (DARTS), which searches for a sparse target child graph within a trainable dense super graph. However, during the search phase, training the dense super graph requires excessive computational resources: the training itself is inefficient and the memory consumption is prohibitively high. To alleviate this shortcoming, the Iterative Shrinkage-Thresholding Algorithm (ISTA), a sparse coding recovery algorithm, has recently been applied to DARTS; it directly optimizes a compressed representation of the super graph and thereby saves memory and time. ISTA, however, is only one of several such sparse coding recovery algorithms, and it is not the best in terms of recovery efficiency and effectiveness. To investigate how different sparse coding recovery algorithms affect the performance of DARTS and to provide some insights, we first design several sparse DARTS variants based on different recovery algorithms (namely LISTA, CoD, and Lars). We then conduct a series of controlled experiments on the selected algorithms, collecting and comparing the accuracy, search time, and other indicators of the resulting models. Theoretical analysis and experimental exploration reveal that the different recovery algorithms exhibit different characteristics in sparse DARTS. Specifically, Lars-NAS tends to choose operations with fewer parameters; CoD-NAS is the simplest of the four recovery algorithms and its time consumption is very short, but the resulting model is unstable; and LISTA-NAS achieves accurate results with stable recovery time. Thus, each recovery algorithm can be utilized according to the environment and requirements at hand.
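The building block the abstract refers to is sparse coding recovery: solving min_z 0.5*||x - D z||^2 + lambda*||z||_1 so that a compressed representation can be mapped back to a sparse one. As a minimal, illustrative sketch of that step (plain ISTA on a toy problem, not the authors' ISTA-NAS, LISTA-NAS, CoD-NAS, or Lars-NAS code; the dictionary D, the penalty lam, and the toy data are assumptions for demonstration only), the Python snippet below shows the soft-thresholding update that the compared recovery algorithms solve or accelerate in different ways.

```python
# Minimal ISTA sketch for sparse coding recovery:
#   min_z  0.5 * ||x - D z||_2^2 + lam * ||z||_1
# Illustrative only; not the paper's NAS implementation.
import numpy as np

def soft_threshold(v, tau):
    """Element-wise soft-thresholding, the proximal operator of the l1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista(x, D, lam=0.1, n_iters=100):
    """Recover a sparse code z such that D @ z approximates x."""
    z = np.zeros(D.shape[1])
    L = np.linalg.norm(D, ord=2) ** 2          # Lipschitz constant of the gradient
    for _ in range(n_iters):
        grad = D.T @ (D @ z - x)               # gradient of the smooth data term
        z = soft_threshold(z - grad / L, lam / L)
    return z

# Toy usage (hypothetical data): recover a 3-sparse code from a 20x50 dictionary.
rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
z_true = np.zeros(50)
z_true[[3, 17, 42]] = [1.5, -2.0, 0.8]
x = D @ z_true
z_hat = ista(x, D, lam=0.05, n_iters=500)
print("largest recovered entries:", np.argsort(-np.abs(z_hat))[:3])
```

In this framing, LISTA replaces the fixed iteration above with a small number of learned unfolded steps, CoD updates one coordinate of z at a time, and Lars traces the lasso solution path, which is why the variants differ in speed, stability, and the sparsity patterns they favor.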

References

[1] Bowen Baker, Otkrist Gupta, Nikhil Naik, and Ramesh Raskar. 2016. Designing neural network architectures using reinforcement learning. arXiv preprint arXiv:1611.02167 (2016).
[2] Amir Beck and Marc Teboulle. 2009. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences 2, 1 (2009), 183–202.
[3] F. Guillaume Blanchet, Pierre Legendre, and Daniel Borcard. 2008. Forward selection of explanatory variables. Ecology 89, 9 (2008), 2623–2632.
[4] Han Cai, Ligeng Zhu, and Song Han. 2018. ProxylessNAS: Direct neural architecture search on target task and hardware. arXiv preprint arXiv:1812.00332 (2018).
[5] Liang-Chieh Chen, Maxwell Collins, Yukun Zhu, George Papandreou, Barret Zoph, Florian Schroff, Hartwig Adam, and Jon Shlens. 2018. Searching for efficient multi-scale architectures for dense image prediction. In Advances in Neural Information Processing Systems. 8699–8710.
[6] Xin Chen, Lingxi Xie, Jun Wu, and Qi Tian. 2019. Progressive differentiable architecture search: Bridging the depth gap between search and evaluation. In Proceedings of the IEEE International Conference on Computer Vision. 1294–1303.
[7] Ingrid Daubechies, Michel Defrise, and Christine De Mol. 2004. An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Communications on Pure and Applied Mathematics 57, 11 (2004), 1413–1457.
[8] Bradley Efron, Trevor Hastie, Iain Johnstone, and Robert Tibshirani. 2004. Least angle regression. The Annals of Statistics 32, 2 (2004), 407–499.
[9] Thomas Elsken, Jan Hendrik Metzen, and Frank Hutter. 2018. Neural architecture search: A survey. arXiv (2018).
[10] Karol Gregor and Yann LeCun. 2010. Learning fast approximations of sparse coding. In Proceedings of the 27th International Conference on Machine Learning. 399–406.
[11] Trevor Hastie, Jonathan Taylor, Robert Tibshirani, and Guenther Walther. 2007. Forward stagewise regression and the monotone lasso. Electronic Journal of Statistics 1 (2007), 1–29.
[12] Xin He, Kaiyong Zhao, and Xiaowen Chu. 2020. AutoML: A survey of the state-of-the-art. Knowledge-Based Systems (2020), 106622.
[13] Kirthevasan Kandasamy, Willie Neiswanger, Jeff Schneider, Barnabas Poczos, and Eric P. Xing. 2018. Neural architecture search with Bayesian optimisation and optimal transport. Advances in Neural Information Processing Systems 31 (2018), 2016–2025.
[14] Diederik P. Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).
[15] Hanxiao Liu, Karen Simonyan, and Yiming Yang. 2018. DARTS: Differentiable architecture search. arXiv preprint arXiv:1806.09055 (2018).
[16] Yurii Nesterov. 2013. Introductory Lectures on Convex Optimization: A Basic Course. Vol. 87. Springer Science & Business Media.
[17] Esteban Real, Alok Aggarwal, Yanping Huang, and Quoc V. Le. 2019. Regularized evolution for image classifier architecture search. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33. 4780–4789.
[18] Yanan Sun, Bing Xue, Mengjie Zhang, and Gary G. Yen. 2017. Evolving deep convolutional neural networks for image classification. (2017).
[19] Yanan Sun, Bing Xue, Mengjie Zhang, Gary G. Yen, and Jiancheng Lv. 2020. Automatically designing CNN architectures using the genetic algorithm for image classification. IEEE Transactions on Cybernetics PP, 99 (2020), 1–15.
[20] Robert Tibshirani. 1996. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58, 1 (1996), 267–288.
[21] Stephen J. Wright. 2015. Coordinate descent algorithms. Mathematical Programming 151, 1 (2015), 3–34.
[22] Sirui Xie, Hehui Zheng, Chunxiao Liu, and Liang Lin. 2018. SNAS: Stochastic neural architecture search. arXiv preprint arXiv:1812.09926 (2018).
[23] Yong Xu, David Zhang, Jian Yang, and Jing-Yu Yang. 2011. A two-phase test sample sparse representation method for use with face recognition. IEEE Transactions on Circuits and Systems for Video Technology 21, 9 (2011), 1255–1262.
[24] Yibo Yang, Hongyang Li, Shan You, Fei Wang, Chen Qian, and Zhouchen Lin. 2020. ISTA-NAS: Efficient and consistent neural architecture search by sparse coding. Advances in Neural Information Processing Systems 33 (2020).
[25] Quanming Yao, Ju Xu, Wei-Wei Tu, and Zhanxing Zhu. 2020. Efficient neural architecture search via proximal iterations. In AAAI. 6664–6671.
[26] Zheng Zhang, Yong Xu, Jian Yang, Xuelong Li, and David Zhang. 2015. A survey of sparse representation: Algorithms and applications. IEEE Access 3 (2015), 490–530.
[27] Hongpeng Zhou, Minghao Yang, Jun Wang, and Wei Pan. 2019. BayesNAS: A Bayesian approach for neural architecture search. arXiv preprint arXiv:1905.04919 (2019).
[28] Barret Zoph and Quoc V. Le. 2016. Neural architecture search with reinforcement learning. arXiv preprint arXiv:1611.01578 (2016).
[29] Barret Zoph, Vijay Vasudevan, Jonathon Shlens, and Quoc V. Le. 2018. Learning transferable architectures for scalable image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 8697–8710.

Cited By

  • (2023) DASS: Differentiable Architecture Search for Sparse Neural Networks. ACM Transactions on Embedded Computing Systems 22, 5s (2023), 1–21. DOI: 10.1145/3609385. Online publication date: 9 September 2023.

Published In

ICCAI '22: Proceedings of the 8th International Conference on Computing and Artificial Intelligence, March 2022, 809 pages. ISBN: 9781450396110. DOI: 10.1145/3532213

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Differentiable Architecture Search
  2. Neural architecture search
  3. Sparse coding recovery

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICCAI '22
