Abstract
The computational complexity of state-of-the-art Convolutional Neural Networks (CNNs) makes their integration into embedded systems with low power consumption requirements a challenging task. Such integration requires the joint design and adaptation of hardware and algorithms. In this paper, we propose a new general CNN compression method that reduces both the number of parameters and the number of operations. To this end, we introduce a new Principal Component Analysis (PCA) based compression, which relies on an optimal transformation (in the mean squared error sense) of the filters of each layer into a new representation space where the convolutions are then applied. Compression is achieved by dimensioning this new representation space, with an arbitrarily controlled accuracy degradation of the new CNN. PCA compression is evaluated on multiple state-of-the-art networks and datasets and is applied to a binary face classification network. To show the versatility of the method and its usefulness for adapting a CNN to a hardware computing system, the compressed face classification network is implemented and evaluated on a custom embedded multiprocessor. Results show that, for example, an overall compression rate of 2x can be achieved on a compact ResNet-32 model on the CIFAR-10 dataset with a loss of only 2% in network accuracy, while compression rates of up to 11x can be achieved on specific layers with negligible accuracy loss.
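The idea of projecting a layer's filters onto a reduced PCA basis can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `pca_compress_filters` and `apply_compressed`, the layer dimensions, the synthetic near-low-rank weights, and the handling of the filter mean as one extra basis filter are all assumptions made for illustration. Each filter is flattened to a row vector, the convolution at one spatial position is modelled as a dot product with an input patch, and the compressed layer first computes responses for the retained principal filters and then linearly recombines them.

```python
import numpy as np

def pca_compress_filters(weights, k):
    """Hypothetical helper: PCA of a (n_filters, d) matrix of flattened filters.

    Returns the mean filter (d,), the k principal filters (k, d), and the
    coefficients (n_filters, k) that rebuild each filter from that basis.
    """
    mean = weights.mean(axis=0)
    centered = weights - mean
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:k]
    coeffs = centered @ basis.T
    return mean, basis, coeffs

def apply_compressed(patches, mean, basis, coeffs):
    """Hypothetical helper: filter responses computed in the reduced space.

    patches has shape (n_patches, d); the result has shape (n_patches, n_filters).
    """
    projected = patches @ basis.T   # responses of the k principal filters
    mean_resp = patches @ mean      # response of the mean filter
    return projected @ coeffs.T + mean_resp[:, None]

rng = np.random.default_rng(0)
# Assumed layer: 64 filters of size 3x3x16 (d = 144), synthetically near rank 8.
latent = rng.normal(size=(64, 8))
mixing = rng.normal(size=(8, 144))
weights = latent @ mixing + 0.01 * rng.normal(size=(64, 144))
patches = rng.normal(size=(100, 144))

exact = patches @ weights.T
mean, basis, coeffs = pca_compress_filters(weights, k=8)
approx = apply_compressed(patches, mean, basis, coeffs)
rel_err = np.linalg.norm(exact - approx) / np.linalg.norm(exact)
print(f"relative output error with k=8: {rel_err:.4f}")
```

In this sketch the number of retained components `k` plays the role of the dimension of the new representation space: a smaller `k` means fewer basis convolutions and fewer stored parameters, at the cost of a larger reconstruction error.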
About this article
Cite this article
Fernández Brillet, L., Leclaire, N., Mancini, S. et al. Compression and Speed-up of Convolutional Neural Networks Through Dimensionality Reduction for Efficient Inference on Embedded Multiprocessor. J Sign Process Syst 94, 263–281 (2022). https://doi.org/10.1007/s11265-020-01616-0
DOI: https://doi.org/10.1007/s11265-020-01616-0