Abstract
We propose a method for inferring the distribution of any performance indicator computed from a confusion matrix. This makes it possible to quantify the variability of an indicator and to assess whether an observed difference between two performance indicators is meaningful. We assume that the values in a confusion matrix are observations drawn from a multinomial distribution. The method follows a Bayesian approach in which the unknown parameters of the multinomial probability function are themselves treated as a random vector, and we show that these parameters follow a Dirichlet distribution. The Bayesian approach also offers an elegant way of injecting prior knowledge into the distributions. Experiments on real and synthetic data sets assess the method’s ability to construct accurate distributions.
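The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a binary confusion matrix ordered as (TP, FP, FN, TN), a uniform Dirichlet prior (all pseudo-counts equal to 1), and Monte Carlo sampling with NumPy. The function name `sample_indicator_posterior` and the example counts are hypothetical. Because the Dirichlet distribution is conjugate to the multinomial, the posterior over the cell probabilities is a Dirichlet whose parameters are the observed counts plus the prior pseudo-counts.

```python
import numpy as np


def sample_indicator_posterior(counts, prior=None, n_samples=10_000, seed=0):
    """Draw posterior samples of indicators derived from a confusion matrix.

    counts: observed cell counts in the (assumed) order TP, FP, FN, TN.
    prior:  Dirichlet pseudo-counts; defaults to a uniform prior (assumption).
    """
    counts = np.asarray(counts, dtype=float)
    if prior is None:
        prior = np.ones_like(counts)  # uniform prior, an illustrative choice
    rng = np.random.default_rng(seed)

    # Conjugacy: multinomial likelihood + Dirichlet prior -> Dirichlet posterior.
    theta = rng.dirichlet(counts + np.asarray(prior, dtype=float), size=n_samples)
    tp, fp, fn, tn = theta.T  # each is a vector of n_samples cell probabilities

    # Any indicator that is a function of the cell probabilities can be
    # evaluated on every posterior draw, giving a distribution per indicator.
    return {
        "accuracy": tp + tn,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }


# Hypothetical counts for two classifiers evaluated on 100 examples each.
post_a = sample_indicator_posterior([40, 10, 5, 45], seed=1)
post_b = sample_indicator_posterior([35, 15, 10, 40], seed=2)

# 95% credible interval for the accuracy of classifier A.
low, high = np.quantile(post_a["accuracy"], [0.025, 0.975])

# Estimated posterior probability that A has higher accuracy than B.
p_a_better = np.mean(post_a["accuracy"] > post_b["accuracy"])
```

Prior knowledge can be injected in this sketch by passing different pseudo-counts through `prior`; larger pseudo-counts pull the posterior more strongly toward the prior belief.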
Cite this article
Caelen, O. A Bayesian interpretation of the confusion matrix. Ann Math Artif Intell 81, 429–450 (2017). https://doi.org/10.1007/s10472-017-9564-8