Abstract
Constructive algorithms build a neural network gradually by adding hidden nodes, starting from zero, so that the network can determine its own structure independently and efficiently. This mechanism has an essential weakness, however: adding nodes one at a time is too greedy to keep construction efficient, and the globally optimal solution may be missed. This paper therefore proposes a novel grafting mechanism that adds a block of any number of nodes by training a sub-network during construction. A fast training approach for the added block of neurons is then presented, which selects a small sub-network from a large initialized network, and the corresponding grafting constructive algorithm (GCA) is established. To obtain a compact network structure, a fine-tuning scheme based on GCA is developed to adjust all parameters in a hybrid fashion, and the hidden weights are extended to handle matrix inputs in image classification. Experimental results on regression and classification tasks demonstrate that the proposed GCA achieves a more compact network than other constructive algorithms and a faster error convergence rate than traditional gradient-based optimization algorithms.
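The block-wise construction idea can be illustrated with a minimal sketch. This is not the authors' GCA: it assumes a single hidden layer with tanh nodes, a randomly initialized candidate pool standing in for the "large initialized network", a residual-correlation score for choosing the block, and least-squares output weights. The names `graft_blocks`, `hidden_output`, `block_size` and `pool_size` are hypothetical and chosen only for illustration.

```python
import numpy as np


def hidden_output(X, W, b):
    """Hidden-layer outputs for inputs X (n, d), weights W (d, m), biases b (m,)."""
    return np.tanh(X @ W + b)


def graft_blocks(X, y, block_size=5, pool_size=200, max_blocks=10,
                 tol=1e-4, seed=0):
    """Grow a single-hidden-layer network one block of nodes at a time.

    Assumption (not from the paper): candidates are drawn at random and the
    block is chosen by correlation with the current residual.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = np.empty((d, 0))          # hidden weights, starts with zero nodes
    b = np.empty(0)               # hidden biases
    beta = np.empty(0)            # output weights
    residual = y.copy()
    history = [float(np.mean(residual ** 2))]

    for _ in range(max_blocks):
        if history[-1] < tol:
            break
        # Large randomly initialized candidate network (the pool).
        W_pool = rng.uniform(-5.0, 5.0, size=(d, pool_size))
        b_pool = rng.uniform(-5.0, 5.0, size=pool_size)
        H_pool = hidden_output(X, W_pool, b_pool)

        # Score each candidate node by how well it aligns with the residual.
        norms = np.linalg.norm(H_pool, axis=0) + 1e-12
        scores = np.abs(H_pool.T @ residual) / norms
        keep = np.argsort(scores)[-block_size:]   # small sub-network to graft

        # Graft the selected block onto the existing hidden layer.
        W = np.hstack([W, W_pool[:, keep]])
        b = np.concatenate([b, b_pool[keep]])

        # Refit all output weights by least squares and update the residual.
        H = hidden_output(X, W, b)
        beta, *_ = np.linalg.lstsq(H, y, rcond=None)
        residual = y - H @ beta
        history.append(float(np.mean(residual ** 2)))

    return W, b, beta, history


# Toy regression problem: fit y = sin(3x) on [-1, 1].
X = np.linspace(-1.0, 1.0, 200).reshape(-1, 1)
y = np.sin(3.0 * X[:, 0])
W, b, beta, history = graft_blocks(X, y)
print("hidden nodes:", W.shape[1])
print("training MSE per block:", history)
```

Setting `block_size=1` reduces the sketch to one-node-at-a-time construction, which makes it easy to compare how many refits each strategy needs to reach the same training error on a toy problem.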
Ethics declarations
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Zhang, S., Xie, L. Grafting constructive algorithm in feedforward neural network learning. Appl Intell 53, 11553–11570 (2023). https://doi.org/10.1007/s10489-022-04082-2
DOI: https://doi.org/10.1007/s10489-022-04082-2