
Better Classifier Calibration for Small Datasets

Published: 13 May 2020

Abstract

Classifier calibration does not always go hand in hand with a classifier's ability to separate the classes. In some applications, good calibration, i.e., the ability to produce accurate probability estimates, is more important than class separation. When the amount of training data is limited, the traditional approach to improving calibration starts to crumble. In this article, we show how generating more data for calibration improves calibration algorithm performance in many cases where a classifier does not naturally produce well-calibrated outputs and the traditional approach fails. The proposed approach adds computational cost, but because the main use case is small datasets, this extra cost stays insignificant, and prediction time is comparable to that of other methods. Among the tested classifiers, the largest improvement was observed with the random forest and naive Bayes classifiers. The proposed approach can therefore be recommended at least for those classifiers when the amount of training data is limited and good calibration is essential.
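The "traditional approach" the abstract refers to is a post-hoc calibration map such as isotonic regression fitted on held-out (score, label) pairs. As a minimal, illustrative sketch (not the article's implementation), the core pool-adjacent-violators (PAV) step of isotonic regression can be written in pure Python, assuming binary labels already sorted by classifier score; the function name and interface are hypothetical:

```python
def pav(labels):
    """Isotonic regression via pool-adjacent-violators.

    `labels` are 0/1 outcomes sorted by ascending classifier score;
    the return value is the non-decreasing sequence of calibrated
    probability estimates, one per input example.
    """
    # Each block holds [sum of labels, count of examples].
    merged = []
    for y in labels:
        merged.append([y, 1])
        # While the newest block's mean is below the previous block's
        # mean, the monotonicity constraint is violated: pool them.
        # Compare means via cross-multiplication to avoid division.
        while (len(merged) > 1
               and merged[-1][0] * merged[-2][1] < merged[-2][0] * merged[-1][1]):
            s, c = merged.pop()
            merged[-1][0] += s
            merged[-1][1] += c
    # Expand each block's mean back to per-example probabilities.
    out = []
    for s, c in merged:
        out.extend([s / c] * c)
    return out
```

With only a handful of calibration pairs, the fitted step function has very few levels and easily overfits; that is the small-data failure mode the article targets, and its proposed remedy is to enlarge the calibration set with generated data before fitting a map like this.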


Cited By

  • (2024) The development of fragility curves using calibrated probabilistic classifiers. Structures 64, 106618. DOI: 10.1016/j.istruc.2024.106618. Online publication date: Jun-2024.
  • (2024) On application of machine learning classifiers in evaluating liquefaction potential of civil infrastructure. In Interpretable Machine Learning for the Analysis, Design, Assessment, and Informed Decision Making for Civil Infrastructure, 205-227. DOI: 10.1016/B978-0-12-824073-1.00015-0. Online publication date: 2024.
  • (2022) Machine Learning Experiments with Artificially Generated Big Data from Small Immunotherapy Datasets. In 2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA), 986-991. DOI: 10.1109/ICMLA55696.2022.00165. Online publication date: Dec-2022.
  • (2022) Telecommunication Network Interference Analysis Using Naive Bayes Classifier Algorithm. In Recent Advances in Soft Computing and Data Mining, 171-183. DOI: 10.1007/978-3-031-00828-3_17. Online publication date: 4-May-2022.
  • (2020) A Critical Reassessment of the Saerens-Latinne-Decaestecker Algorithm for Posterior Probability Adjustment. ACM Transactions on Information Systems 39, 2, 1-34. DOI: 10.1145/3433164. Online publication date: 31-Dec-2020.

Published In

ACM Transactions on Knowledge Discovery from Data, Volume 14, Issue 3
June 2020, 381 pages
ISSN: 1556-4681
EISSN: 1556-472X
DOI: 10.1145/3388473

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 13 May 2020
Online AM: 07 May 2020
Accepted: 01 February 2020
Revised: 01 January 2020
Received: 01 July 2019
Published in TKDD Volume 14, Issue 3


Author Tags

  1. Calibration
  2. overfitting
  3. small datasets

Qualifiers

  • Research-article
  • Research
  • Refereed
