
A Comprehensive Survey of Neural Architecture Search: Challenges and Solutions

Published: 24 May 2021

Abstract

Deep learning has made substantial breakthroughs in many fields due to its powerful automatic representation capabilities, and it has been proven that neural architecture design is crucial to the feature representation of data and to final performance. However, neural architecture design still relies heavily on researchers’ prior knowledge and experience, and because of the limits of that inherent knowledge it is difficult for designers to step outside their original thinking paradigm and devise an optimal model. An intuitive idea, therefore, is to reduce human intervention as much as possible and let an algorithm design the neural architecture automatically. Neural Architecture Search (NAS) is just such a revolutionary approach, and the related body of research is large and varied, so a comprehensive and systematic survey of NAS is essential. Previous surveys classify existing work mainly according to the key components of NAS: search space, search strategy, and evaluation strategy. Although this classification is intuitive, it makes it difficult for readers to grasp the underlying challenges and the landmark work that addresses them. In this survey, we therefore take a different perspective: we begin with an overview of the characteristics of the earliest NAS algorithms, summarize the problems these early algorithms exhibit, and then present the solutions proposed by subsequent research. In addition, we conduct a detailed and comprehensive analysis, comparison, and summary of these works. Finally, we outline some possible directions for future research.
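For readers new to NAS, the three components named above interact in a simple loop: the search strategy proposes a candidate architecture from the search space, and the evaluation strategy scores it. The Python sketch below is our own minimal illustration of that loop, not code from the survey: it uses plain random search as the search strategy and a deterministic dummy score in place of real training, and every slot and operation name in it is hypothetical.

```python
import random

# Hypothetical modular search space: each slot in a cell chooses one
# candidate operation. All names below are illustrative only.
SEARCH_SPACE = {
    "node_1": ["conv_3x3", "conv_5x5", "max_pool_3x3"],
    "node_2": ["conv_3x3", "sep_conv_3x3", "identity"],
    "node_3": ["conv_3x3", "dil_conv_3x3", "avg_pool_3x3"],
}

def sample_architecture(space):
    """Search strategy: plain random search over the space."""
    return {slot: random.choice(ops) for slot, ops in space.items()}

def evaluate(arch):
    """Evaluation strategy (stand-in): a real NAS system would train the
    candidate network, possibly with weight sharing or early stopping,
    and return its validation accuracy. A deterministic dummy score
    replaces that expensive step here."""
    return random.Random(str(sorted(arch.items()))).random()

best_arch, best_score = None, float("-inf")
for _ in range(20):  # fixed search budget
    arch = sample_architecture(SEARCH_SPACE)
    score = evaluate(arch)
    if score > best_score:
        best_arch, best_score = arch, score

print(f"Best architecture: {best_arch} (proxy score {best_score:.3f})")
```

Swapping sample_architecture for a learned controller or evolutionary mutation, and evaluate for weight sharing, early stopping, or a performance predictor, yields the main families of NAS methods that the survey organizes and compares.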




    Published In

    ACM Computing Surveys, Volume 54, Issue 4
    May 2022, 782 pages
    ISSN: 0360-0300
    EISSN: 1557-7341
    DOI: 10.1145/3464463
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 24 May 2021
    Accepted: 01 January 2021
    Revised: 01 January 2021
    Received: 01 June 2020
    Published in CSUR Volume 54, Issue 4


    Author Tags

    1. AutoDL
    2. Neural architecture search
    3. continuous search strategy
    4. incomplete training
    5. modular search space
    6. neural architecture recycle

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • ARC DECRA


    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 1,630
    • Downloads (Last 6 weeks): 127
    Reflects downloads up to 12 Sep 2024


    Citations

    Cited By

    • (2024) Designing optimal behavioral experiments using machine learning. eLife 13. DOI: 10.7554/eLife.86224. Online publication date: 23-Jan-2024.
    • (2024) Similarity surrogate-assisted evolutionary neural architecture search with dual encoding strategy. Electronic Research Archive 32, 2, 1017–1043. DOI: 10.3934/era.2024050. Online publication date: 2024.
    • (2024) Deep Learning and Neural Architecture Search for Optimizing Binary Neural Network Image Super Resolution. Biomimetics 9, 6, 369. DOI: 10.3390/biomimetics9060369. Online publication date: 18-Jun-2024.
    • (2024) A review of differentiable digital signal processing for music and speech synthesis. Frontiers in Signal Processing 3. DOI: 10.3389/frsip.2023.1284100. Online publication date: 11-Jan-2024.
    • (2024) Review on scene graph generation methods. Multiagent and Grid Systems 20, 2, 129–160. DOI: 10.3233/MGS-230132. Online publication date: 12-Aug-2024.
    • (2024) Optimizing the Structures of Transformer Neural Networks Using Parallel Simulated Annealing. Journal of Artificial Intelligence and Soft Computing Research 14, 3, 267–282. DOI: 10.2478/jaiscr-2024-0015. Online publication date: 11-Jun-2024.
    • (2024) An Intelligent End-to-End Neural Architecture Search Framework for Electricity Forecasting Model Development. INFORMS Journal on Computing. DOI: 10.1287/ijoc.2023.0034. Online publication date: 30-May-2024.
    • (2024) AutoPEFT: Automatic Configuration Search for Parameter-Efficient Fine-Tuning. Transactions of the Association for Computational Linguistics 12, 525–542. DOI: 10.1162/tacl_a_00662. Online publication date: 3-May-2024.
    • (2024) Lightweight Deep Learning for Resource-Constrained Environments: A Survey. ACM Computing Surveys 56, 10, 1–42. DOI: 10.1145/3657282. Online publication date: 24-Jun-2024.
    • (2024) NetLLM: Adapting Large Language Models for Networking. In Proceedings of the ACM SIGCOMM 2024 Conference, 661–678. DOI: 10.1145/3651890.3672268. Online publication date: 4-Aug-2024.
