Let the Data Choose its Features: Differentiable Unsupervised Feature Selection

Lindenbaum, Ofir; Shaham, Uri; Svirsky, Jonathan; Peterfreund, Erez; Kluger, Yuval

arXiv:2007.04728v2 (cs)

[Submitted on 9 Jul 2020 (v1), revised 11 Jul 2020 (this version, v2), latest version 9 Nov 2020 (v3)]

Abstract: Scientific observations often consist of a large number of variables (features). Identifying a subset of meaningful features is often ignored in unsupervised learning, despite its potential for unraveling clear patterns hidden in the ambient space. In this paper, we present a method for unsupervised feature selection, tailored for the task of clustering. We propose a differentiable loss function which combines the graph Laplacian with a gating mechanism based on continuous approximation of Bernoulli random variables. The Laplacian is used to define a scoring term that favors low-frequency features, while the parameters of the Bernoulli variables are trained to enable selection of the most informative features. We mathematically motivate the proposed approach and demonstrate that in the high noise regime, it is crucial to compute the Laplacian on the gated inputs, rather than on the full feature set. Experimental demonstration of the efficacy of the proposed approach and its advantage over current baselines is provided using several real-world examples.
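
The loss described in the abstract can be illustrated with a minimal NumPy sketch of a forward pass. This is not the authors' implementation. The abstract does not specify the relaxation, the kernel, or the regularizer, so the following are all assumptions made for illustration: the clipped-Gaussian relaxation of the Bernoulli gates, the Gaussian kernel with a fixed `bandwidth`, the row-normalized (random-walk) graph matrix, the regularization weight `lam`, and every function name. The one property taken directly from the abstract is that the graph is built on the gated inputs rather than on the full feature set.

```python
import math
import numpy as np


def sample_gates(mu, sigma=0.5, rng=None):
    """Relaxed Bernoulli gates (assumed form): clip(mu + Gaussian noise, 0, 1)."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(0.0, sigma, size=mu.shape)
    return np.clip(mu + noise, 0.0, 1.0)


def random_walk_matrix(X, bandwidth=1.0):
    """Row-normalized Gaussian kernel on the rows (samples) of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    K = np.exp(-d2 / (2.0 * bandwidth ** 2))
    return K / K.sum(axis=1, keepdims=True)


def gated_laplacian_loss(X, mu, lam=1.0, sigma=0.5, rng=None):
    """Smoothness score on gated inputs plus a penalty on open gates."""
    z = sample_gates(mu, sigma, rng)
    Xg = X * z                      # gate each feature (column) of X
    P = random_walk_matrix(Xg)      # graph computed on the gated inputs
    # Features that vary slowly over the graph (low frequency) score high.
    smoothness = np.trace(Xg.T @ P @ Xg) / X.shape[0]
    # Probability that each gate is open: P(mu + noise > 0) = Phi(mu / sigma).
    open_prob = np.array(
        [0.5 * (1.0 + math.erf(m / (sigma * math.sqrt(2.0)))) for m in mu]
    )
    return -smoothness + lam * open_prob.sum()
```

In this sketch, `mu` plays the role of the trainable gate parameters. Minimizing the loss over `mu` with an automatic-differentiation framework would push gates open for smooth features and closed for the rest, with `lam` controlling how many features remain selected.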

Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as: arXiv:2007.04728 [cs.LG] (or arXiv:2007.04728v2 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.2007.04728

Submission history

From: Ofir Lindenbaum
[v1] Thu, 9 Jul 2020 11:58:16 UTC (8,057 KB)
[v2] Sat, 11 Jul 2020 18:28:35 UTC (8,057 KB)
[v3] Mon, 9 Nov 2020 11:23:01 UTC (7,974 KB)
