Abstract
The prior distribution of an attribute in a naïve Bayesian classifier is typically assumed to be Dirichlet; this is known as the Dirichlet assumption. The variables in a Dirichlet random vector can never be positively correlated, and they must all have the same confidence level as measured by normalized variance. Both the generalized Dirichlet and the Liouville distributions include the Dirichlet distribution as a special case. These two multivariate distributions, also defined on the unit simplex, are employed to investigate the impact of the Dirichlet assumption in naïve Bayesian classifiers. We propose methods to construct appropriate generalized Dirichlet and Liouville priors for naïve Bayesian classifiers. Our experimental results on 18 data sets reveal that the generalized Dirichlet distribution performs best among the three distribution families. Not only is the Dirichlet assumption inappropriate; forcing all the variables in a prior to be positively correlated can also degrade the performance of the naïve Bayesian classifier.
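The two properties of Dirichlet priors stated above (no positive correlation between components, and a common confidence level measured by normalized variance) can be checked empirically. The following sketch uses arbitrary illustrative parameters, not values from the paper: it samples a Dirichlet vector with NumPy and verifies that every pairwise correlation is negative and that each component's normalized variance Var(X_i)/(E[X_i](1 − E[X_i])) equals 1/(α₀ + 1), where α₀ is the sum of the Dirichlet parameters.

```python
import numpy as np

# Hedged illustration; the parameter vector alpha is arbitrary, not from the paper.
# For X ~ Dirichlet(alpha): Cov(X_i, X_j) < 0 for i != j, and the normalized
# variance Var(X_i) / (E[X_i] * (1 - E[X_i])) equals 1 / (a0 + 1) for every
# component, where a0 = sum(alpha).
alpha = np.array([2.0, 3.0, 5.0])
a0 = alpha.sum()

rng = np.random.default_rng(0)
samples = rng.dirichlet(alpha, size=200_000)

# Every off-diagonal entry of the correlation matrix is negative.
corr = np.corrcoef(samples, rowvar=False)
off_diag = corr[~np.eye(len(alpha), dtype=bool)]
assert off_diag.max() < 0

# All components share the same normalized variance, 1 / (a0 + 1).
mean = samples.mean(axis=0)
norm_var = samples.var(axis=0, ddof=1) / (mean * (1.0 - mean))
assert np.allclose(norm_var, 1.0 / (a0 + 1.0), atol=0.005)
```

This is exactly the restriction the generalized Dirichlet and Liouville families relax: they admit priors whose components need not be uniformly negatively correlated or share one confidence level.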
Responsible editor: Charles Elkan.
Wong, TT. Alternative prior assumptions for improving the performance of naïve Bayesian classifiers. Data Min Knowl Disc 18, 183–213 (2009). https://doi.org/10.1007/s10618-008-0101-6