High Dimensional Classification via Empirical Risk Minimization: Improvements and Optimality

Mai, Xiaoyi; Liao, Zhenyu


arXiv:1905.13742v1 (stat)

[Submitted on 31 May 2019 (this version), latest version 25 Nov 2020 (v2)]


Abstract: In this article, we investigate a family of classification algorithms defined by the principle of empirical risk minimization, in the high dimensional regime where the feature dimension $p$ and the number of data samples $n$ are both large and comparable. Based on recent advances in high dimensional statistics and random matrix theory, we provide, under a mixture data model, a unified stochastic characterization of classifiers learned with different loss functions. Our results are instrumental to an in-depth understanding of, as well as practical improvements to, this fundamental classification approach. As the main outcome, we demonstrate the existence of a universally optimal loss function which yields the best high dimensional performance at any given $n/p$ ratio.
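
To make the setting concrete, the following is a minimal illustrative sketch, not the authors' code or experiment. It assumes a symmetric two-class Gaussian mixture $x = y\mu + z$ with identity covariance and a linear classifier fitted by a fixed number of plain gradient steps on the empirical risk $\frac{1}{n}\sum_i L(y_i w^\top x_i)$. The function names (`sample_mixture`, `fit_erm`, `compare_losses`), the choice of logistic and square losses, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def sample_mixture(n, p, mu, rng):
    """Draw n points from an assumed symmetric mixture: x = y * mu + z, z ~ N(0, I_p)."""
    y = rng.choice([-1.0, 1.0], size=n)
    X = y[:, None] * mu[None, :] + rng.standard_normal((n, p))
    return X, y


def logistic_grad(t):
    # Derivative of the logistic loss log(1 + exp(-t)) with respect to the margin t.
    return -1.0 / (1.0 + np.exp(np.clip(t, -50.0, 50.0)))


def square_grad(t):
    # Derivative of the square loss (1 - t)^2 / 2 with respect to the margin t.
    return t - 1.0


def fit_erm(X, y, loss_grad, lr=0.1, steps=500):
    """Approximately minimize (1/n) * sum_i L(y_i * w^T x_i) by gradient descent."""
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(steps):
        margins = y * (X @ w)
        # Chain rule: d/dw L(y_i w^T x_i) = L'(margin_i) * y_i * x_i
        grad = X.T @ (loss_grad(margins) * y) / n
        w -= lr * grad
    return w


def compare_losses(n=400, p=100, signal=2.0, n_test=2000, seed=0):
    """Return test accuracy per loss at ratio n/p, for a mean vector of norm `signal`."""
    rng = np.random.default_rng(seed)
    mu = np.full(p, signal / np.sqrt(p))
    X, y = sample_mixture(n, p, mu, rng)
    X_test, y_test = sample_mixture(n_test, p, mu, rng)
    results = {}
    for name, grad in [("logistic", logistic_grad), ("square", square_grad)]:
        w = fit_erm(X, y, grad)
        results[name] = float(np.mean(np.sign(X_test @ w) == y_test))
    return results


if __name__ == "__main__":
    for name, acc in compare_losses().items():
        print(f"{name} loss: test accuracy {acc:.3f}")
```

Varying `n` and `p` while keeping their ratio fixed gives a rough empirical feel for how the choice of loss affects accuracy when the two are comparable; it does not reproduce the paper's asymptotic characterization or its optimal loss.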

Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:1905.13742 [stat.ML]
	(or arXiv:1905.13742v1 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.1905.13742

Submission history

From: Zhenyu Liao
[v1] Fri, 31 May 2019 17:52:26 UTC (28 KB)
[v2] Wed, 25 Nov 2020 02:51:19 UTC (46 KB)
