Abstract
In machine learning, the loss function measures the discrepancy between the values predicted by a regression or classification model and the observed values. Regularization also plays an important role: it can mitigate overfitting, perform variable selection, and produce sparse models. The hyperparameter in these models controls the trade-off between the loss function and the regularization term, and hence the bias-variance trade-off. Because the choice of hyperparameter influences model performance, it must be tuned for effective learning from data. In some machine learning models, the optimal estimated coefficients are piecewise linear with respect to the hyperparameter. For such models, efficient algorithms, called solution path algorithms, can compute the solutions for all values of the hyperparameter. They can significantly reduce the effort required for cross-validation and greatly speed up hyperparameter tuning. In this paper, we review the solution path algorithms widely used in regression and classification problems in machine learning.
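As a small illustration of the piecewise-linear property mentioned in the abstract (a sketch that is not taken from the paper), consider the lasso under the simplifying assumption of an orthonormal design. In that special case each coefficient is the soft-thresholded value of z_j, where z = Xᵀy, so the path has breakpoints (knots) only at the values |z_j|. The function names and the numbers in `z` below are hypothetical and chosen only for this example.

```python
from bisect import bisect_left


def soft_threshold(z, lam):
    """Closed-form lasso coefficient for one feature (orthonormal design assumed)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_solution(z, lam):
    """Solve the lasso directly at a single value of the hyperparameter lam."""
    return [soft_threshold(zj, lam) for zj in z]


def lasso_path_knots(z):
    """Return the sorted knots of the path and the solution at each knot."""
    knots = sorted({0.0} | {abs(zj) for zj in z})
    return knots, [lasso_solution(z, k) for k in knots]


def interpolate_path(knots, sols, lam):
    """Recover the solution at any lam >= 0 by linear interpolation between knots."""
    if lam < 0:
        raise ValueError("lam must be non-negative")
    if lam >= knots[-1]:
        # Beyond the largest knot every coefficient is exactly zero.
        return [0.0] * len(sols[-1])
    i = bisect_left(knots, lam)
    if knots[i] == lam:
        return list(sols[i])
    lo, hi = knots[i - 1], knots[i]
    t = (lam - lo) / (hi - lo)
    return [(1 - t) * a + t * b for a, b in zip(sols[i - 1], sols[i])]


if __name__ == "__main__":
    z = [3.0, -1.5, 0.5]  # hypothetical values of X^T y
    knots, sols = lasso_path_knots(z)
    for lam in (0.2, 1.0, 2.9):
        print(lam, interpolate_path(knots, sols, lam), lasso_solution(z, lam))
```

Storing the solutions at the knots is enough to evaluate the model at every hyperparameter value, which is why path algorithms can make cross-validation cheaper than refitting on a grid. General path algorithms for non-orthonormal designs, and for the SVM variants covered in the survey, must track active sets and are considerably more involved than this sketch.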
Data Availability
Not applicable.
Code Availability
Not applicable.
Funding
No funding received.
Author information
Contributions
GT conducted the literature review and wrote the manuscript. NF reviewed the work and edited the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Conflicts of interest
The authors declare that they have no conflict of interest.
Ethical Approval
This article does not contain any studies with human participants by any of the authors.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Tang, G., Fan, N. A Survey of Solution Path Algorithms for Regression and Classification Models. Ann. Data. Sci. 9, 749–789 (2022). https://doi.org/10.1007/s40745-022-00386-9
DOI: https://doi.org/10.1007/s40745-022-00386-9