Robust Bayesian Linear Classifier Ensembles

Cerquides, Jesús; López de Mántaras, Ramon

doi:10.1007/11564096_12

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3720)

Included in the following conference series:

European Conference on Machine Learning

Abstract

Ensemble classifiers combine the classification results of several classifiers. Simple ensemble methods, such as uniform averaging over a set of models, usually improve on selecting the single best model. Probabilistic classifiers usually restrict the set of models that can be learnt in order to lower computational cost. In these restricted spaces, where incorrect modeling assumptions may be made, uniform averaging sometimes performs even better than Bayesian model averaging. Linear mixtures over sets of models provide a space that includes uniform averaging as a particular case. We develop two algorithms for learning maximum a posteriori weights for linear mixtures, one based on expectation maximization and one on constrained optimization. We provide a nontrivial example of the utility of these two algorithms by applying them to one-dependence estimators. We develop the conjugate distribution for one-dependence estimators and show empirically that uniform averaging is clearly superior to Bayesian model averaging for this family of models. We then show empirically that the maximum a posteriori linear mixture weights improve accuracy significantly over uniform aggregation.
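
The abstract does not give the update equations, so the following is only a minimal sketch of what expectation maximization for maximum a posteriori linear mixture weights can look like in the generic case. It assumes a fixed set of already-trained models, a symmetric Dirichlet prior on the weights, and a joint-likelihood objective; the function name `em_map_mixture_weights`, its parameters, and the prior are illustrative assumptions, not the authors' formulation.

```python
def em_map_mixture_weights(likelihoods, alpha=1.0, n_iter=200, tol=1e-10):
    """Estimate MAP weights of a linear mixture of fixed models (hypothetical helper).

    likelihoods[i][k] is the probability that model k assigns to training
    example i. alpha is the hyperparameter of an assumed symmetric Dirichlet
    prior; alpha >= 1 keeps every update non-negative, and alpha == 1
    reduces to maximum likelihood.
    """
    if alpha < 1.0:
        raise ValueError("this sketch assumes alpha >= 1")
    n = len(likelihoods)
    k = len(likelihoods[0])
    weights = [1.0 / k] * k  # start from uniform averaging

    for _ in range(n_iter):
        # E-step: expected number of examples attributed to each model.
        counts = [0.0] * k
        for row in likelihoods:
            mix = sum(w * p for w, p in zip(weights, row))
            if mix <= 0.0:
                raise ValueError("an example has zero probability under the mixture")
            for j in range(k):
                counts[j] += weights[j] * row[j] / mix

        # M-step: mode of the Dirichlet posterior given the expected counts.
        denom = n + k * (alpha - 1.0)
        new_weights = [(counts[j] + alpha - 1.0) / denom for j in range(k)]

        delta = max(abs(a - b) for a, b in zip(new_weights, weights))
        weights = new_weights
        if delta < tol:
            break
    return weights


def mixture_probability(weights, model_probs):
    """Linear mixture prediction: weighted sum of the models' probabilities."""
    return sum(w * p for w, p in zip(weights, model_probs))
```

With equal weights, `mixture_probability` is exactly uniform averaging, which is why the linear mixture space contains it as a particular case. Larger values of `alpha` pull the learnt weights back toward uniform.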

Author information

Authors and Affiliations

Dept. de Matemática Aplicada i Análisi, Universitat de Barcelona,
Jesús Cerquides
Artificial Intelligence Research Institute – IIIA, Spanish Council for Scientific Research – CSIC,
Ramon López de Mántaras

Editor information

Editors and Affiliations

Faculty of Economics of the University of Porto, Portugal
João Gama
Faculdade de Engenharia & LIAAD, Universidade do Porto, Portugal
Rui Camacho
LIAAD-INESC Porto L.A./Faculty of Economics, University of Porto, Rua de Ceuta, 118-6, 4050-190, Porto, Portugal
Pavel B. Brazdil
LIACC/FEP, Universidade do Porto, Portugal
Alípio Mário Jorge
LIAAD-INESC Porto LA / FEP, University of Porto, R. de Ceuta, 118, 6., 4050-190, Porto, Portugal
Luís Torgo

About this paper

Cite this paper

Cerquides, J., López de Mántaras, R. (2005). Robust Bayesian Linear Classifier Ensembles. In: Gama, J., Camacho, R., Brazdil, P.B., Jorge, A.M., Torgo, L. (eds) Machine Learning: ECML 2005. Lecture Notes in Computer Science, vol 3720. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11564096_12

DOI: https://doi.org/10.1007/11564096_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29243-2
Online ISBN: 978-3-540-31692-3
eBook Packages: Computer Science (R0)
