A Survey of Solution Path Algorithms for Regression and Classification Models

Annals of Data Science

Abstract

In machine learning, the loss function measures the difference between the values predicted by a regression or classification model and the observed values. Regularization also plays an important role: it can mitigate overfitting, perform variable selection, and produce sparse models. The hyperparameter in these models controls the trade-off between the loss function and the regularization term, and with it the bias-variance trade-off. Because the choice of hyperparameter influences model performance, it must be tuned for effective learning from data. In some machine learning models, the optimal coefficient estimates are piecewise linear in the hyperparameter. For such models, efficient algorithms can compute the solutions for all values of the hyperparameter; these methods are called solution path algorithms. They can significantly reduce the effort required for cross-validation and greatly speed up hyperparameter tuning. In this paper, we review the solution path algorithms widely used for regression and classification problems in machine learning.
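
The piecewise-linear property can be illustrated with a minimal sketch that is not taken from the paper. It assumes the simplest possible setting: a lasso problem, (1/2)||y - Xb||^2 + lam * ||b||_1, with an orthonormal design (X^T X = I), where each coefficient is the soft-thresholded value of z_j = x_j^T y. The function names (`soft_threshold`, `lasso_path_orthonormal`, `interpolate`) and the example vector `z` are hypothetical, chosen only for illustration. Under these assumptions the path only needs to be stored at its breakpoints, and the solution at any other hyperparameter value follows by linear interpolation.

```python
def soft_threshold(z, lam):
    """Closed-form lasso coefficient under the assumed orthonormal design."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_path_orthonormal(z):
    """Return the breakpoints (descending lambda) and the coefficients at each one.

    z is the vector X^T y; a coefficient enters the model when lambda
    drops below |z_j|, so the breakpoints are the distinct |z_j| plus 0.
    """
    knots = sorted({abs(v) for v in z} | {0.0}, reverse=True)
    coefs = [[soft_threshold(v, lam) for v in z] for lam in knots]
    return knots, coefs


def interpolate(knots, coefs, lam):
    """Recover the solution at any lambda >= 0 by linear interpolation."""
    if lam >= knots[0]:
        # Above the largest breakpoint every coefficient is zero.
        return [0.0] * len(coefs[0])
    for k in range(len(knots) - 1):
        hi, lo = knots[k], knots[k + 1]
        if lo <= lam <= hi:
            t = (hi - lam) / (hi - lo)
            return [a + t * (b - a) for a, b in zip(coefs[k], coefs[k + 1])]
    raise ValueError("lam must be non-negative")


# Hypothetical example values for z = X^T y.
z = [3.0, -1.5, 0.5]
knots, coefs = lasso_path_orthonormal(z)

# The interpolated solution agrees with solving the problem directly at lam = 1.0.
print(interpolate(knots, coefs, 1.0))
print([soft_threshold(v, 1.0) for v in z])
```

In this toy case the whole path costs one pass over the breakpoints, after which every candidate hyperparameter value in a cross-validation grid is obtained without re-solving the optimization problem. General solution path algorithms must additionally track how the active set changes between breakpoints, which this sketch does not attempt.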

Data Availability

Not applicable.

Code Availability

Not applicable.

Funding

No funding received.

Author information

Contributions

GT conducted the literature review and wrote the manuscript. NF reviewed the work and edited the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Neng Fan.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Ethical Approval

This article does not contain any studies with human participants by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Tang, G., Fan, N. A Survey of Solution Path Algorithms for Regression and Classification Models. Ann. Data. Sci. 9, 749–789 (2022). https://doi.org/10.1007/s40745-022-00386-9

  • DOI: https://doi.org/10.1007/s40745-022-00386-9
