Online Feature Selection with Group Structure Analysis

Published: 01 November 2015

Abstract

Online selection of dynamic features has attracted intense interest in recent years. However, existing online feature selection methods evaluate features individually and ignore the underlying structure of a feature stream. For instance, in image analysis, features are generated in groups that represent color, texture, and other visual information. Simply breaking the group structure in feature selection may degrade performance. Motivated by this observation, we formulate the problem as online group feature selection. The problem assumes that features are generated individually but that there are group structures in the feature stream. To the best of our knowledge, this is the first time that the correlation among streaming features has been considered in the online feature selection process. To solve this problem, we develop a novel online group feature selection method named OGFS. Our proposed approach consists of two stages: online intra-group selection and online inter-group selection. In the intra-group selection, we design a criterion based on spectral analysis to select discriminative features in each group. In the inter-group selection, we utilize a linear regression model to select an optimal subset. This two-stage procedure continues until no more features arrive or some predefined stopping conditions are met. Finally, we apply our method to multiple tasks including image classification and face verification. Extensive empirical studies on real-world and benchmark data sets demonstrate that our method outperforms other state-of-the-art online feature selection methods.
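The abstract describes a two-stage pipeline: an intra-group step that scores the features of each arriving group with a spectral, class-separability criterion, and an inter-group step that refits a sparse linear regression over everything kept so far. The sketch below (Python with NumPy and scikit-learn) is only an illustration of that two-stage shape, not the authors' OGFS algorithm: the Fisher-style score stands in for the paper's spectral criterion, a Lasso refit stands in for its regression-based inter-group selection, and the names and parameters (fisher_score, intra_keep, alpha) are hypothetical.

```python
import numpy as np
from sklearn.linear_model import Lasso


def fisher_score(X, y):
    """Per-feature Fisher-style separability score (a stand-in for the
    paper's spectral criterion): between-class scatter of the feature
    means divided by the pooled within-class variance."""
    overall_mean = X.mean(axis=0)
    num = np.zeros(X.shape[1])
    den = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        num += Xc.shape[0] * (Xc.mean(axis=0) - overall_mean) ** 2
        den += Xc.shape[0] * Xc.var(axis=0)
    return num / (den + 1e-12)


def online_group_feature_selection(groups, y, intra_keep=0.5, alpha=0.05):
    """Toy two-stage loop over a stream of feature groups.

    groups : iterable of (group_id, X_group) pairs arriving over time,
             where X_group has shape (n_samples, n_group_features)
    y      : numeric class labels for the fixed set of training samples
    Returns the selected feature matrix and the (group_id, column) pairs
    that survived both stages.
    """
    selected_blocks, selected_index = [], []
    X_sel = np.empty((len(y), 0))
    for gid, Xg in groups:
        # Stage 1: intra-group selection -- keep the top fraction of this
        # group's features according to the separability score.
        scores = fisher_score(Xg, y)
        k = max(1, int(np.ceil(intra_keep * Xg.shape[1])))
        keep = np.argsort(scores)[::-1][:k]
        selected_blocks.append(Xg[:, keep])
        selected_index.extend((gid, int(j)) for j in keep)

        # Stage 2: inter-group selection -- refit a sparse linear model on
        # everything kept so far and drop features whose weight is zero.
        X_sel = np.hstack(selected_blocks)
        w = Lasso(alpha=alpha).fit(X_sel, y).coef_
        nz = np.flatnonzero(np.abs(w) > 1e-8)
        if nz.size == 0:                        # guard: keep the current set
            nz = np.arange(X_sel.shape[1])      # if Lasso zeroes every weight
        X_sel = X_sel[:, nz]
        selected_index = [selected_index[i] for i in nz]
        selected_blocks = [X_sel]   # collapse the kept columns into one block
    return X_sel, selected_index
```

For image data, each (group_id, X_group) pair could be, for example, the color-histogram block or the texture-descriptor block computed for the training images at one step of the feature stream.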

          Published In

          IEEE Transactions on Knowledge and Data Engineering, Volume 27, Issue 11
          Nov. 2015
          287 pages

          Publisher

          IEEE Educational Activities Department

          United States

          Publication History

          Published: 01 November 2015

          Author Tags

          1. verification
          2. Online feature selection
          3. streaming feature
          4. group structure
          5. classification

          Qualifiers

          • Research-article

          Cited By

          • (2024) "Concept Evolution Detecting over Feature Streams," ACM Transactions on Knowledge Discovery from Data, vol. 18, no. 8, pp. 1–32. DOI: 10.1145/3678012. Online publication date: 13 Jul 2024.
          • (2024) "Online Feature Selection With Varying Feature Spaces," IEEE Transactions on Knowledge and Data Engineering, vol. 36, no. 9, pp. 4806–4819. DOI: 10.1109/TKDE.2024.3377243. Online publication date: 1 Sep 2024.
          • (2024) "Online group streaming feature selection based on fuzzy neighborhood granular ball rough sets," Expert Systems with Applications, vol. 249, part C. DOI: 10.1016/j.eswa.2024.123778. Online publication date: 17 Jul 2024.
          • (2024) "Cost-sensitive sparse group online learning for imbalanced data streams," Machine Learning, vol. 113, no. 7, pp. 4407–4444. DOI: 10.1007/s10994-023-06403-z. Online publication date: 1 Jul 2024.
          • (2023) "When online learning meets ODE," Proceedings of the AAAI Conference on Artificial Intelligence, pp. 8545–8553. DOI: 10.1609/aaai.v37i7.26029. Online publication date: 7 Feb 2023.
          • (2023) "Online and offline streaming feature selection methods with bat algorithm for redundancy analysis," Pattern Recognition, vol. 133, part C. DOI: 10.1016/j.patcog.2022.109007. Online publication date: 1 Jan 2023.
          • (2023) "Learning framework based on ER Rule for data streams with generalized feature spaces," Information Sciences, vol. 649, part C. DOI: 10.1016/j.ins.2023.119604. Online publication date: 1 Nov 2023.
          • (2023) "Feature subset selection for data and feature streams: a review," Artificial Intelligence Review, vol. 56, suppl. 1, pp. 1011–1062. DOI: 10.1007/s10462-023-10546-9. Online publication date: 13 Jul 2023.
          • (2022) "A dynamic feature selection and intelligent model serving for hybrid batch-stream processing," Knowledge-Based Systems, vol. 256, part C. DOI: 10.1016/j.knosys.2022.109749. Online publication date: 28 Nov 2022.
          • (2022) "Streaming feature selection via graph diffusion," Information Sciences, vol. 618, part C, pp. 150–168. DOI: 10.1016/j.ins.2022.10.087. Online publication date: 1 Dec 2022.
