Precision-Recall-Gain curves: PR analysis done right

Published: 07 December 2015, in NIPS'15: Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 1. MIT Press, Cambridge, MA, United States.

Abstract

Precision-Recall analysis abounds in applications of binary classification where true negatives do not add value and hence should not affect assessment of the classifier's performance. Perhaps inspired by the many advantages of receiver operating characteristic (ROC) curves and the area under such curves for accuracy-based performance assessment, many researchers have taken to reporting Precision-Recall (PR) curves and associated areas as a performance metric. We demonstrate in this paper that this practice is fraught with difficulties, mainly because of incoherent scale assumptions: for example, the area under a PR curve takes the arithmetic mean of precision values, whereas the Fβ score applies the harmonic mean. We show how to fix this by plotting PR curves in a different coordinate system, and demonstrate that the new Precision-Recall-Gain curves inherit all key advantages of ROC curves. In particular, the area under Precision-Recall-Gain curves conveys an expected F1 score on a harmonic scale, and the convex hull of a Precision-Recall-Gain curve allows us to calibrate the classifier's scores so as to determine, for each operating point on the convex hull, the interval of β values for which the point optimises Fβ. We demonstrate experimentally that the area under traditional PR curves can easily favour models with lower expected F1 score than others, and so the use of Precision-Recall-Gain curves will result in better model selection.
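To make the coordinate change concrete, below is a minimal Python sketch of computing a Precision-Recall-Gain curve from classifier scores. It assumes the gain mapping x ↦ (x − π) / ((1 − π)·x), with π the proportion of positives, applied to both precision and recall; this matches the transformation the paper describes, but the helper name prg_curve, the use of scikit-learn's precision_recall_curve, the clipping of points below the prior, and the trapezoidal AUPRG estimate at the end are all illustrative choices, not the authors' code. For reference, Fβ = (1 + β²)·prec·rec / (β²·prec + rec), the harmonic-style combination of precision and recall mentioned in the abstract.

```python
# Illustrative sketch (not the authors' implementation): map an empirical
# PR curve into Precision-Recall-Gain space. Assumes scikit-learn is
# available; prg_curve is a hypothetical helper name.
import numpy as np
from sklearn.metrics import precision_recall_curve

def gain(x, pi):
    """Gain transform: maps values in [pi, 1] onto [0, 1]."""
    return (x - pi) / ((1.0 - pi) * x)

def prg_curve(y_true, y_score):
    """Return (recall-gain, precision-gain) points of the PRG curve."""
    y_true = np.asarray(y_true)
    pi = y_true.mean()  # proportion of positives
    prec, rec, _ = precision_recall_curve(y_true, y_score)
    with np.errstate(divide="ignore", invalid="ignore"):
        prec_gain = gain(prec, pi)
        rec_gain = gain(rec, pi)
    # Keep only points inside the unit square (precision and recall >= pi);
    # operating points below the prior have negative gain and are clipped
    # here purely for plotting convenience.
    keep = (prec_gain >= 0) & (rec_gain >= 0)
    return rec_gain[keep], prec_gain[keep]

# Toy usage: crude trapezoidal estimate of the area under the PRG curve.
rng = np.random.default_rng(0)
y = rng.binomial(1, 0.2, size=1000)  # imbalanced labels, pi ~ 0.2
s = np.where(y == 1, rng.normal(1.0, 1.0, 1000), rng.normal(0.0, 1.0, 1000))
rg, pg = prg_curve(y, s)
order = np.argsort(rg)
x, f = rg[order], pg[order]
auprg = float(np.sum(np.diff(x) * (f[1:] + f[:-1]) / 2.0))
print(f"AUPRG ~ {auprg:.3f}")
```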


Cited By

  • (2019) A Semi-Boosted Nested Model With Sensitivity-Based Weighted Binarization for Multi-Domain Network Intrusion Detection. ACM Transactions on Intelligent Systems and Technology, 10(3):1-27. https://doi.org/10.1145/3313778
  • (2018) Smoky vehicle detection based on multi-feature fusion and ensemble neural networks. Multimedia Tools and Applications, 77(24):32153-32177. https://doi.org/10.1007/s11042-018-6248-2
  • (2017) Designing multi-label classifiers that maximize F measures. Pattern Recognition, 61(C):394-404. https://doi.org/10.1016/j.patcog.2016.08.008