Bn-nas: Neural architecture search with batch normalization

B Chen, P Li, B Li, C Lin, C Li, M Sun, J Yan, W Ouyang
Proceedings of the IEEE/CVF International Conference on …, 2021 · openaccess.thecvf.com
Abstract
Model training and evaluation are the two main time-consuming processes in neural architecture search (NAS). Although weight-sharing-based methods have been proposed to reduce the number of networks that must be trained, these methods still need to train the supernet for hundreds of epochs and evaluate thousands of subnets to find the optimal network architecture. In this paper, we propose NAS with Batch Normalization (BN), which we refer to as BN-NAS, to accelerate both the evaluation and the training process. For fast evaluation, we propose a novel BN-based indicator that predicts subnet performance at a very early training stage. We further improve training efficiency by training only the BN parameters during supernet training. This is based on our observation that training the whole supernet is not necessary, while training only the BN parameters accelerates network convergence for architecture search. Extensive experiments show that our method can reduce supernet training time by more than 10x and subnet evaluation cost by more than 600,000x without losing accuracy.
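The abstract describes a BN-based indicator that scores subnets from their batch-normalization parameters instead of running a full evaluation. The exact scoring formula is not given in this excerpt; the sketch below is one plausible reading, assuming the indicator aggregates the magnitude of the learned BN scale parameters (gamma) of the layers a subnet selects. All names (`bn_indicator`, the candidate dict) are illustrative, not from the authors' code.

```python
# Hedged sketch of a BN-based subnet performance indicator.
# Assumption: a subnet's score is the mean absolute value of the BN
# scale parameters (gamma) over the BN layers it selects; larger
# scales are taken as a proxy for a stronger subnet.

def bn_indicator(bn_gammas):
    """Score a subnet from its BN scale parameters.

    bn_gammas: list of per-layer gamma vectors (lists of floats),
    one entry per BN layer chosen by the subnet.
    """
    total, count = 0.0, 0
    for gammas in bn_gammas:
        total += sum(abs(g) for g in gammas)
        count += len(gammas)
    return total / count if count else 0.0

# Rank candidate subnets by the indicator instead of fully evaluating
# each one (the gamma values here are made up for illustration).
candidates = {
    "subnet_a": [[0.9, 1.1], [0.8]],
    "subnet_b": [[0.1, 0.2], [0.05]],
}
ranked = sorted(candidates, key=lambda k: bn_indicator(candidates[k]),
                reverse=True)
```

Because such a score is read directly off the trained supernet's parameters, ranking thousands of subnets becomes a cheap lookup rather than thousands of forward passes, which is consistent with the large evaluation speedup the abstract reports.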