Abstract
Smoothing the parameters of multinomial distributions is an important concern in statistical inference tasks. In this paper, we present a new smoothing prior for the Multinomial Naive Bayes classifier. Our approach takes advantage of the Beta-Liouville distribution for estimating the multinomial parameters. To handle sparse documents, we exploit vocabulary knowledge to define two distinct priors over the “observed” and the “unseen” words. We address the problem of large-scale, sparse data by enhancing the Multinomial Naive Bayes classifier with Beta-Liouville-based smoothing of the word probability estimates. Our approach is evaluated on two challenging applications involving sparse and large-scale documents: emotion intensity analysis and hate speech detection. Experiments on real-world datasets show the effectiveness of the proposed classifier compared to related methods.
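To make the smoothing idea concrete, the following is a minimal Python/NumPy sketch of a Multinomial Naive Bayes classifier whose per-class word probabilities receive two distinct pseudo-count priors, one for vocabulary words observed in the class and one for words never seen in it. This is an illustration only, under assumed names and values (SmoothedMultinomialNB, alpha_seen, alpha_unseen); it does not reproduce the paper's Beta-Liouville estimator.

import numpy as np

class SmoothedMultinomialNB:
    def __init__(self, alpha_seen=1.0, alpha_unseen=0.1):
        # Hypothetical pseudo-counts: larger prior mass for observed words,
        # smaller for vocabulary words unseen in a given class.
        self.alpha_seen = alpha_seen
        self.alpha_unseen = alpha_unseen

    def fit(self, X, y):
        # X: (n_docs, vocab_size) term-count matrix; y: class labels.
        self.classes_ = np.unique(y)
        n_classes, vocab = len(self.classes_), X.shape[1]
        self.log_prior_ = np.empty(n_classes)
        self.log_word_prob_ = np.empty((n_classes, vocab))
        for i, c in enumerate(self.classes_):
            Xc = X[y == c]
            counts = Xc.sum(axis=0)
            # Two distinct priors over observed vs. unseen words.
            alpha = np.where(counts > 0, self.alpha_seen, self.alpha_unseen)
            smoothed = counts + alpha
            self.log_word_prob_[i] = np.log(smoothed / smoothed.sum())
            self.log_prior_[i] = np.log(Xc.shape[0] / X.shape[0])
        return self

    def predict(self, X):
        # Log-posterior (up to a constant) of each class for each document.
        scores = X @ self.log_word_prob_.T + self.log_prior_
        return self.classes_[np.argmax(scores, axis=1)]

# Toy usage with a 5-word vocabulary and two classes.
X = np.array([[3, 0, 1, 0, 0],
              [2, 1, 0, 0, 0],
              [0, 0, 0, 4, 1],
              [0, 1, 0, 2, 3]])
y = np.array([0, 0, 1, 1])
clf = SmoothedMultinomialNB().fit(X, y)
print(clf.predict(np.array([[1, 0, 2, 0, 0]])))  # expected: class 0

The design choice being illustrated is that unseen words are not all penalized with the same flat Laplace constant; the prior mass assigned to them is controlled separately from that of observed words, which is the role the Beta-Liouville prior plays in the paper.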
Cite this paper
Najar, F., Bouguila, N. (2021). Sparse Document Analysis Using Beta-Liouville Naive Bayes with Vocabulary Knowledge. In: Lladós, J., Lopresti, D., Uchida, S. (eds.) Document Analysis and Recognition – ICDAR 2021. Lecture Notes in Computer Science, vol. 12822. Springer, Cham. https://doi.org/10.1007/978-3-030-86331-9_23