Online Feature Selection with Group Structure Analysis

Published: 01 November 2015

Abstract

Online selection of dynamic features has attracted intense interest in recent years. However, existing online feature selection methods evaluate features individually and ignore the underlying structure of a feature stream. For instance, in image analysis, features are generated in groups that represent color, texture, and other visual information. Simply breaking the group structure in feature selection may degrade performance. Motivated by this observation, we formulate the problem as online group feature selection. The problem assumes that features are generated individually but that there are group structures in the feature stream. To the best of our knowledge, this is the first time that the correlation among streaming features has been considered in the online feature selection process. To solve this problem, we develop a novel online group feature selection method named OGFS. Our proposed approach consists of two stages: online intra-group selection and online inter-group selection. In the intra-group selection, we design a criterion based on spectral analysis to select discriminative features in each group. In the inter-group selection, we utilize a linear regression model to select an optimal subset. This two-stage procedure continues until no more features arrive or some predefined stopping conditions are met. Finally, we apply our method to multiple tasks including image classification and face verification. Extensive empirical studies on real-world and benchmark data sets demonstrate that our method outperforms other state-of-the-art online feature selection methods.
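The abstract describes a two-stage pipeline: an intra-group step that scores the features of each arriving group with a spectral, class-separability criterion, and an inter-group step that refits a sparse linear regression over everything kept so far. The sketch below (Python with NumPy and scikit-learn) is only an illustration of that two-stage shape, not the authors' OGFS algorithm: the Fisher-style score stands in for the paper's spectral criterion, a Lasso refit stands in for its regression-based inter-group selection, and the names and parameters (fisher_score, intra_keep, alpha) are hypothetical.

```python
import numpy as np
from sklearn.linear_model import Lasso


def fisher_score(X, y):
    """Per-feature Fisher-style separability score (a stand-in for the
    paper's spectral criterion): between-class scatter of the feature
    means divided by the pooled within-class variance."""
    overall_mean = X.mean(axis=0)
    num = np.zeros(X.shape[1])
    den = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        num += Xc.shape[0] * (Xc.mean(axis=0) - overall_mean) ** 2
        den += Xc.shape[0] * Xc.var(axis=0)
    return num / (den + 1e-12)


def online_group_feature_selection(groups, y, intra_keep=0.5, alpha=0.05):
    """Toy two-stage loop over a stream of feature groups.

    groups : iterable of (group_id, X_group) pairs arriving over time,
             where X_group has shape (n_samples, n_group_features)
    y      : numeric class labels for the fixed set of training samples
    Returns the selected feature matrix and the (group_id, column) pairs
    that survived both stages.
    """
    selected_blocks, selected_index = [], []
    X_sel = np.empty((len(y), 0))
    for gid, Xg in groups:
        # Stage 1: intra-group selection -- keep the top fraction of this
        # group's features according to the separability score.
        scores = fisher_score(Xg, y)
        k = max(1, int(np.ceil(intra_keep * Xg.shape[1])))
        keep = np.argsort(scores)[::-1][:k]
        selected_blocks.append(Xg[:, keep])
        selected_index.extend((gid, int(j)) for j in keep)

        # Stage 2: inter-group selection -- refit a sparse linear model on
        # everything kept so far and drop features whose weight is zero.
        X_sel = np.hstack(selected_blocks)
        w = Lasso(alpha=alpha).fit(X_sel, y).coef_
        nz = np.flatnonzero(np.abs(w) > 1e-8)
        if nz.size == 0:                        # guard: keep the current set
            nz = np.arange(X_sel.shape[1])      # if Lasso zeroes every weight
        X_sel = X_sel[:, nz]
        selected_index = [selected_index[i] for i in nz]
        selected_blocks = [X_sel]   # collapse the kept columns into one block
    return X_sel, selected_index
```

For image data, each (group_id, X_group) pair could be, for example, the color-histogram block or the texture-descriptor block computed for the training images at one step of the feature stream.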

          Published In

          IEEE Transactions on Knowledge and Data Engineering, Volume 27, Issue 11
          Nov. 2015
          287 pages

          Publisher

          IEEE Educational Activities Department

          United States

          Publication History

          Published: 01 November 2015

          Author Tags

          1. verification
          2. Online feature selection
          3. streaming feature
          4. group structure
          5. classification

          Qualifiers

          • Research-article

          Cited By

          • (2024) "Concept Evolution Detecting over Feature Streams," ACM Transactions on Knowledge Discovery from Data, vol. 18, no. 8, pp. 1–32. DOI: 10.1145/3678012. Online publication date: 13 Jul 2024.
          • (2024) "Online Feature Selection With Varying Feature Spaces," IEEE Transactions on Knowledge and Data Engineering, vol. 36, no. 9, pp. 4806–4819. DOI: 10.1109/TKDE.2024.3377243. Online publication date: 1 Sep 2024.
          • (2024) "Online group streaming feature selection based on fuzzy neighborhood granular ball rough sets," Expert Systems with Applications, vol. 249, part C. DOI: 10.1016/j.eswa.2024.123778. Online publication date: 17 Jul 2024.
          • (2024) "Cost-sensitive sparse group online learning for imbalanced data streams," Machine Learning, vol. 113, no. 7, pp. 4407–4444. DOI: 10.1007/s10994-023-06403-z. Online publication date: 1 Jul 2024.
          • (2023) "When online learning meets ODE," Proceedings of the AAAI Conference on Artificial Intelligence, pp. 8545–8553. DOI: 10.1609/aaai.v37i7.26029. Online publication date: 7 Feb 2023.
          • (2023) "Online and offline streaming feature selection methods with bat algorithm for redundancy analysis," Pattern Recognition, vol. 133, part C. DOI: 10.1016/j.patcog.2022.109007. Online publication date: 1 Jan 2023.
          • (2023) "Learning framework based on ER Rule for data streams with generalized feature spaces," Information Sciences, vol. 649, part C. DOI: 10.1016/j.ins.2023.119604. Online publication date: 1 Nov 2023.
          • (2023) "Feature subset selection for data and feature streams: a review," Artificial Intelligence Review, vol. 56, suppl. 1, pp. 1011–1062. DOI: 10.1007/s10462-023-10546-9. Online publication date: 13 Jul 2023.
          • (2022) "A dynamic feature selection and intelligent model serving for hybrid batch-stream processing," Knowledge-Based Systems, vol. 256, part C. DOI: 10.1016/j.knosys.2022.109749. Online publication date: 28 Nov 2022.
          • (2022) "Streaming feature selection via graph diffusion," Information Sciences, vol. 618, part C, pp. 150–168. DOI: 10.1016/j.ins.2022.10.087. Online publication date: 1 Dec 2022.
