An Improved Model Selection Heuristic for AUC

Wu, Shaomin; Flach, Peter; Ferri, Cèsar

doi:10.1007/978-3-540-74958-5_44

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4701)

Included in the following conference series:

European Conference on Machine Learning

Abstract

The area under the ROC curve (AUC) has been widely used to measure ranking performance for binary classification tasks. AUC uses the classifier’s scores only to rank the test instances; it therefore ignores other valuable information conveyed by the scores, such as sensitivity to small differences in the score values. However, as such differences are inevitable across samples, ignoring them may lead to overfitting the validation set when selecting models with high AUC. This paper tackles that problem. On the basis of ranks as well as scores, we introduce a new metric called scored AUC (sAUC), which is the area under the sROC curve. The sROC curve measures how quickly AUC deteriorates if positive scores are decreased. We study the interpretation and statistical properties of sAUC. Experimental results on UCI data sets demonstrate the effectiveness of the new metric for classifier evaluation and selection when validation data are limited.
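
The idea in the abstract can be illustrated with a minimal Python sketch. It is one reading of the abstract, not the paper's own definition: it assumes scores lie in [0, 1], that the sROC curve plots AUC against the amount `tau` subtracted from every positive score, and that sAUC is the area under that curve for `tau` in [0, 1]. The function names `auc_with_margin` and `scored_auc` and the example scores are hypothetical.

```python
from itertools import product


def auc_with_margin(pos_scores, neg_scores, tau=0.0):
    """Fraction of (positive, negative) pairs still ranked correctly
    after every positive score is decreased by tau.

    With tau = 0 this is the usual pairwise AUC estimate; ties are
    counted as errors here for simplicity (an assumption of this sketch).
    """
    pairs = list(product(pos_scores, neg_scores))
    correct = sum(1 for p, n in pairs if p - tau > n)
    return correct / len(pairs)


def scored_auc(pos_scores, neg_scores):
    """Area under the AUC-versus-tau curve for tau in [0, 1].

    Assuming scores in [0, 1], a pair (p, n) stays correctly ranked
    exactly while tau < p - n, so it contributes max(p - n, 0) to the area.
    """
    pairs = list(product(pos_scores, neg_scores))
    return sum(max(p - n, 0.0) for p, n in pairs) / len(pairs)


# Two hypothetical models that rank a tiny validation set identically
# but separate the classes by very different score margins.
pos_a, neg_a = [0.9, 0.8], [0.2, 0.1]
pos_b, neg_b = [0.55, 0.52], [0.5, 0.48]

print(auc_with_margin(pos_a, neg_a), auc_with_margin(pos_b, neg_b))
print(scored_auc(pos_a, neg_a), scored_auc(pos_b, neg_b))
```

Both hypothetical models reach an AUC of 1.0, so AUC alone cannot separate them, while the score-aware quantity is far larger for the first model (about 0.7 against about 0.045). This is the kind of distinction the abstract argues matters when the validation set is small.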

Author information

Authors and Affiliations

Cranfield University, United Kingdom
Shaomin Wu
University of Bristol, United Kingdom
Peter Flach
Universitat Politècnica de València, Spain
Cèsar Ferri

Editor information

Joost N. Kok, Jacek Koronacki, Ramon Lopez de Mantaras, Stan Matwin, Dunja Mladenič, Andrzej Skowron

About this paper

Cite this paper

Wu, S., Flach, P., Ferri, C. (2007). An Improved Model Selection Heuristic for AUC. In: Kok, J.N., Koronacki, J., Mantaras, R.L.d., Matwin, S., Mladenič, D., Skowron, A. (eds) Machine Learning: ECML 2007. ECML 2007. Lecture Notes in Computer Science, vol 4701. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74958-5_44

DOI: https://doi.org/10.1007/978-3-540-74958-5_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74957-8
Online ISBN: 978-3-540-74958-5
eBook Packages: Computer Science, Computer Science (R0)
