DOI: 10.1145/1835804.1835848

Unsupervised feature selection for multi-cluster data

Published: 25 July 2010

Abstract

In many data analysis tasks, one is often confronted with very high dimensional data. Feature selection techniques are designed to find a relevant subset of the original features that can facilitate clustering, classification, and retrieval. In this paper, we consider the feature selection problem in the unsupervised learning scenario, which is particularly difficult due to the absence of class labels that would guide the search for relevant information. Feature selection is essentially a combinatorial optimization problem and is therefore computationally expensive. Traditional unsupervised feature selection methods address this issue by selecting the top-ranked features based on scores computed independently for each feature. These approaches neglect possible correlations between different features and thus cannot produce an optimal feature subset. Inspired by recent developments in manifold learning and L1-regularized models for subset selection, we propose in this paper a new approach, called Multi-Cluster Feature Selection (MCFS), for unsupervised feature selection. Specifically, we select those features such that the multi-cluster structure of the data is best preserved. The corresponding optimization problem can be solved efficiently, since it involves only a sparse eigen-problem and an L1-regularized least squares problem. Extensive experimental results on various real-life data sets demonstrate the superiority of the proposed algorithm.
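The two computational steps named in the abstract (a sparse eigen-problem over a neighborhood graph, followed by one L1-regularized least squares fit per eigenvector) can be sketched with standard tools. Below is a minimal, illustrative Python sketch, not the authors' implementation: the function name mcfs_scores and the parameters n_clusters, n_neighbors, and n_nonzero are assumptions, and a binary k-NN graph stands in for whatever affinity weighting the paper actually uses.

```python
import numpy as np
from scipy.sparse.csgraph import laplacian
from scipy.sparse.linalg import eigsh
from sklearn.linear_model import Lars
from sklearn.neighbors import kneighbors_graph

def mcfs_scores(X, n_clusters=5, n_neighbors=5, n_nonzero=20):
    """Score each feature by how well it predicts the spectral embedding."""
    # Step 1: k-NN graph capturing the manifold / cluster structure.
    W = kneighbors_graph(X, n_neighbors, mode="connectivity", include_self=False)
    W = 0.5 * (W + W.T)                      # symmetrize the adjacency
    L = laplacian(W, normed=True)
    # Step 2: sparse eigen-problem. The bottom eigenvectors of the
    # normalized Laplacian encode the multi-cluster structure.
    _, Y = eigsh(L, k=n_clusters + 1, which="SM")
    Y = Y[:, 1:]                             # drop the trivial eigenvector
    # Step 3: an L1-regularized least squares fit (here via LARS) per
    # eigenvector; only a few features receive nonzero coefficients.
    scores = np.zeros(X.shape[1])
    for k in range(Y.shape[1]):
        coef = Lars(n_nonzero_coefs=n_nonzero).fit(X, Y[:, k]).coef_
        scores = np.maximum(scores, np.abs(coef))
    return scores  # select the d features with the largest scores
```

LARS is a natural solver here because it directly controls the number of nonzero coefficients per eigenvector, and taking the maximum across eigenvectors scores a feature by its relevance to any of the cluster directions rather than only the dominant one.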

Supplementary Material

JPG File (kdd2010_cai_ufsm_01.jpg)
MOV File (kdd2010_cai_ufsm_01.mov)



Published In

KDD '10: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
July 2010, 1240 pages
ISBN: 9781450300551
DOI: 10.1145/1835804

Publisher

Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. clustering
    2. feature selection
    3. unsupervised

Qualifiers

• Research-article

Conference

KDD '10
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%


