
A naïve Bayes regularized logistic regression estimator for low-dimensional classification

Published: 01 September 2024

Abstract

To reduce estimator variance and prevent overfitting, regularization techniques have attracted great interest from the statistics and machine learning communities. Most existing regularized methods rely on the sparsity assumption that a model with fewer parameters predicts better than one with many parameters. This assumption works particularly well in high-dimensional problems. However, the sparsity assumption may not be necessary when the number of predictors is relatively small compared to the number of training instances. This paper argues that shrinking the coefficients towards a low-variance data-driven estimate could be a better regularization strategy for such situations. For low-dimensional classification problems, we propose a naïve Bayes regularized logistic regression (NBRLR) that shrinks the logistic regression coefficients toward the naïve Bayes estimate to reduce variance. Our approach is primarily motivated by the fact that naïve Bayes is functionally equivalent to logistic regression if naïve Bayes' conditional independence assumption holds. Under standard conditions, we prove the consistency of the NBRLR estimator. Extensive simulation and empirical experimental results show that NBRLR is a competitive alternative to various state-of-the-art classifiers, especially on low-dimensional datasets.
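To make the shrinkage idea concrete, the sketch below (not the authors' implementation) illustrates an NBRLR-style estimator for binary classification with continuous features. It assumes a Gaussian naïve Bayes model with pooled per-feature variances, whose implied log-odds are linear in x, and a ridge-style penalty lam * ||beta - beta_NB||^2 that shrinks the logistic regression coefficients toward the naïve Bayes estimate rather than toward zero; the paper's exact penalty form and estimation details may differ. The function names (`gaussian_nb_coefficients`, `fit_nbrlr`) are hypothetical.

```python
# Illustrative sketch of an NBRLR-style estimator (not the paper's code).
# Assumptions: y is a numpy array of 0/1 labels, features are continuous,
# naive Bayes is Gaussian with pooled per-feature variances, and the penalty
# is lam * ||beta - beta_NB||^2, shrinking toward the naive Bayes estimate.
import numpy as np
from scipy.optimize import minimize

def gaussian_nb_coefficients(X, y):
    """Logistic-regression weights implied by Gaussian naive Bayes.

    With a shared per-feature variance, log P(y=1|x) - log P(y=0|x)
    is linear in x, so naive Bayes defines a point in coefficient space.
    """
    X1, X0 = X[y == 1], X[y == 0]
    mu1, mu0 = X1.mean(axis=0), X0.mean(axis=0)
    n1, n0 = len(X1), len(X0)
    # Pooled within-class variance; small constant avoids division by zero.
    var = (n1 * X1.var(axis=0) + n0 * X0.var(axis=0)) / (n1 + n0) + 1e-9
    beta = (mu1 - mu0) / var
    intercept = np.log(n1 / n0) + ((mu0**2 - mu1**2) / (2.0 * var)).sum()
    return intercept, beta

def fit_nbrlr(X, y, lam=1.0):
    """Minimize mean logistic loss + lam * ||beta - beta_NB||^2."""
    b0_nb, beta_nb = gaussian_nb_coefficients(X, y)

    def objective(w):
        b0, beta = w[0], w[1:]
        z = b0 + X @ beta
        # Stable negative log-likelihood: log(1+e^{-z}) + (1-y)*z.
        nll = np.mean(np.logaddexp(0.0, -z) + (1.0 - y) * z)
        return nll + lam * np.sum((beta - beta_nb) ** 2)

    w0 = np.concatenate([[b0_nb], beta_nb])  # warm start at naive Bayes
    res = minimize(objective, w0, method="L-BFGS-B")
    return res.x[0], res.x[1:]
```

As lam tends to 0 this recovers the unregularized logistic regression MLE, and as lam tends to infinity it recovers the naïve Bayes solution, the low-variance data-driven anchor the abstract describes; intermediate values of lam trade bias against variance between the two.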

Highlights

Propose a novel regularization method for classification problems.
Provide theoretical results, including consistency of the proposed estimator.
Provide extensive simulation and empirical experimental results that support the competitiveness of the proposed estimator.

Published In

International Journal of Approximate Reasoning, Volume 172, Issue C, September 2024, 314 pages

Publisher

Elsevier Science Inc., United States

Author Tags

  1. Classification
  2. Regularization method
  3. Logistic regression
  4. Naïve Bayes
  5. Data-driven shrinkage

Qualifiers

  • Research-article
