research-article

Small-sample precision of ROC-related estimates

Published: 01 March 2010

Abstract

Motivation: Receiver operating characteristic (ROC) curves are commonly used in biomedical applications to judge the performance of a discriminant across varying decision thresholds. The estimated ROC curve depends on the true positive rate (TPR) and false positive rate (FPR), with the key metric being the area under the curve (AUC). With small samples, these rates need to be estimated from the training data, so a natural question arises: how well do the estimates of the AUC, TPR and FPR compare with the true metrics?
Results: Through a simulation study using data models and analysis of real microarray data, we show that (i) for small samples the root mean square differences of the estimated and true metrics are considerable; (ii) even for large samples, there is only weak correlation between the true and estimated metrics; and (iii) generally, there is weak regression of the true metric on the estimated metric. For classification rules, we consider linear discriminant analysis, linear support vector machine (SVM) and radial basis function SVM. For error estimation, we consider resubstitution, three kinds of cross-validation and bootstrap. Using resampling, we show the unreliability of some published ROC results.
Availability: Companion web site at http://compbio.tgen.org/paper_supp/ROC/roc.html
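
The kind of comparison described in the Results can be illustrated with a minimal sketch. This is not the authors' code or data model: it assumes two spherical Gaussian classes, a mean-difference discriminant (a simplification of linear discriminant analysis with identity covariance), and resubstitution as the only estimator. All function names and parameter values are hypothetical. Under these assumptions the true AUC of a fitted direction has a closed form, so the estimated and true values can be compared directly.

```python
import math
import random


def empirical_auc(pos_scores, neg_scores):
    """Mann-Whitney estimate of AUC: P(pos score > neg score), ties count 1/2."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def sample_class(rng, n, dim, shift):
    """Draw n points from N(shift * 1, I) in `dim` dimensions (assumed model)."""
    return [[rng.gauss(shift, 1.0) for _ in range(dim)] for _ in range(n)]


def fit_mean_difference(pos, neg):
    """Direction w = mean(pos) - mean(neg); stands in for LDA in this sketch."""
    dim = len(pos[0])
    return [
        sum(x[j] for x in pos) / len(pos) - sum(x[j] for x in neg) / len(neg)
        for j in range(dim)
    ]


def project(w, points):
    """Discriminant score w . x for each point."""
    return [sum(wj * xj for wj, xj in zip(w, x)) for x in points]


def true_auc(w, shift):
    """Exact AUC of direction w when classes are N(+shift*1, I) and N(-shift*1, I)."""
    norm = math.sqrt(sum(wj * wj for wj in w))
    if norm == 0.0:
        return 0.5  # degenerate direction carries no information
    return 0.5 * (1.0 + math.erf(shift * sum(w) / norm))


def pearson(xs, ys):
    """Sample correlation; NaN if either sequence is constant."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        return float("nan")
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / math.sqrt(sxx * syy)


def simulate(n_per_class=10, dim=5, shift=0.3, reps=200, seed=0):
    """Repeat: draw a small training set, fit, compare resubstitution AUC to true AUC."""
    rng = random.Random(seed)
    est, tru = [], []
    for _ in range(reps):
        pos = sample_class(rng, n_per_class, dim, +shift)
        neg = sample_class(rng, n_per_class, dim, -shift)
        w = fit_mean_difference(pos, neg)
        # Resubstitution: score the same points that were used for fitting.
        est.append(empirical_auc(project(w, pos), project(w, neg)))
        tru.append(true_auc(w, shift))
    rms = math.sqrt(sum((e - t) ** 2 for e, t in zip(est, tru)) / reps)
    return est, tru, rms, pearson(est, tru)


if __name__ == "__main__":
    est, tru, rms, corr = simulate()
    print("mean estimated AUC:", sum(est) / len(est))
    print("mean true AUC:     ", sum(tru) / len(tru))
    print("RMS difference:    ", rms)
    print("correlation:       ", corr)
```

Varying `n_per_class` gives a rough feel for how the RMS difference and the correlation between estimated and true AUC change with sample size. Cross-validation or bootstrap estimators could be swapped in where the resubstitution estimate is computed.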

Information

Published In

Bioinformatics, Volume 26, Issue 6
March 2010
146 pages

Publisher

Oxford University Press, Inc.

United States
