Theory of Optimizing Pseudolinear Performance Measures: Application to F-measure

Parambath, Shameem A Puthiya; Usunier, Nicolas; Grandvalet, Yves

arXiv:1505.00199v4 (cs)

[Submitted on 1 May 2015 (v1), last revised 1 Jan 2018 (this version, v4)]

Abstract: Non-linear performance measures are widely used to evaluate learning algorithms. For example, the $F$-measure is a commonly used performance measure for classification problems in the machine learning and information retrieval communities. We study the theoretical properties of a subset of non-linear performance measures, called pseudo-linear performance measures, which includes the $F$-measure and the Jaccard index, among many others. We establish that many notions of $F$-measure and Jaccard index are pseudo-linear functions of the per-class false negatives and false positives for binary, multiclass and multilabel classification. Based on this observation, we present a general reduction of such performance measure optimization problems to cost-sensitive classification problems with unknown costs. We then propose an algorithm with provable guarantees that obtains an approximately optimal classifier for the $F$-measure by solving a series of cost-sensitive classification problems. The strength of our analysis is that it is valid for any dataset and any class of classifiers, extending the existing theoretical results on pseudo-linear measures, which are asymptotic in nature. We also establish the multi-objective nature of the $F$-score maximization problem by linking the algorithm with the weighted-sum approach used in multi-objective optimization. We present numerical experiments that illustrate the relative importance of cost asymmetry and thresholding when learning linear classifiers on various $F$-measure optimization tasks.
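The reduction described in the abstract can be illustrated with a small sketch. It is not the authors' implementation; it only assumes the standard identity $F_\beta = (1+\beta^2)(P - FN) / ((1+\beta^2)P - FN + FP)$, where $P$ is the number of positive examples, which makes $F_\beta$ a ratio of affine functions of the false negatives and false positives. Setting $F_\beta = t$ and rearranging suggests the cost pair $(1+\beta^2 - t)$ for a false negative and $t$ for a false positive, with the right value of $t$ unknown in advance, so the sketch tries a grid of values. The toy learner (a threshold on one-dimensional scores), the grid size, the data, and all function names are hypothetical choices made for illustration.

```python
def f_beta(fn, fp, n_pos, beta=1.0):
    """F_beta as a ratio of two affine functions of (FN, FP)."""
    b2 = beta * beta
    denom = (1 + b2) * n_pos - fn + fp
    return (1 + b2) * (n_pos - fn) / denom if denom > 0 else 0.0


def errors(scores, labels, threshold):
    """Count (FN, FP) for the rule 'predict positive when score >= threshold'."""
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return fn, fp


def cost_sensitive_fit(scores, labels, cost_fn, cost_fp):
    """Toy cost-sensitive learner (an assumption of this sketch):
    return the threshold that minimizes cost_fn * FN + cost_fp * FP."""
    candidates = sorted(set(scores)) + [float("inf")]

    def weighted_error(threshold):
        fn, fp = errors(scores, labels, threshold)
        return cost_fn * fn + cost_fp * fp

    return min(candidates, key=weighted_error)


def maximize_f_beta(scores, labels, beta=1.0, grid_size=20):
    """Solve one cost-sensitive problem per grid value t in [0, 1] and
    keep the classifier with the highest F_beta on the given data."""
    b2 = beta * beta
    n_pos = sum(labels)
    best_f, best_t, best_threshold = -1.0, None, None
    for i in range(grid_size + 1):
        t = i / grid_size
        # Costs implied by the level set F_beta = t: FN costs 1 + b2 - t, FP costs t.
        threshold = cost_sensitive_fit(scores, labels, 1 + b2 - t, t)
        fn, fp = errors(scores, labels, threshold)
        f = f_beta(fn, fp, n_pos, beta)
        if f > best_f:
            best_f, best_t, best_threshold = f, t, threshold
    return best_f, best_t, best_threshold


# Hypothetical toy data: one score per example, label 1 marks a positive.
SCORES = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
LABELS = [0, 0, 1, 0, 0, 1, 0, 1, 1]

if __name__ == "__main__":
    f, t, threshold = maximize_f_beta(SCORES, LABELS)
    print(f"best F1 = {f:.3f} at cost parameter t = {t:.2f}, threshold = {threshold}")
```

In a realistic setting the toy learner would be replaced by any cost-sensitive classifier, for example a linear model trained with class-dependent weights, and the selection step would use held-out data rather than the training set.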

Comments:	Extended Version of the NIPS 2014 Paper
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:1505.00199 [cs.LG]
	(or arXiv:1505.00199v4 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1505.00199

Submission history

From: Shameem Puthiya Parambath
[v1] Fri, 1 May 2015 15:25:59 UTC (955 KB)
[v2] Tue, 19 May 2015 19:21:58 UTC (1 KB) (withdrawn)
[v3] Mon, 17 Aug 2015 18:53:59 UTC (951 KB)
[v4] Mon, 1 Jan 2018 06:34:30 UTC (955 KB)
