Feature Selection: A Data Perspective

Published: 06 December 2017

Abstract

    Feature selection, as a data preprocessing strategy, has proven effective and efficient in preparing data (especially high-dimensional data) for various data-mining and machine-learning problems. The objectives of feature selection include building simpler and more comprehensible models, improving data-mining performance, and preparing clean, understandable data. The recent proliferation of big data has presented substantial challenges and opportunities for feature selection. In this survey, we provide a comprehensive and structured overview of recent advances in feature selection research. Motivated by current challenges and opportunities in the era of big data, we revisit feature selection research from a data perspective and review representative feature selection algorithms for conventional data, structured data, heterogeneous data, and streaming data. Methodologically, to emphasize the differences and similarities of most existing feature selection algorithms for conventional data, we categorize them into four main groups: similarity-based, information-theoretic, sparse-learning-based, and statistical methods. To facilitate and promote research in this community, we also present an open-source feature selection repository that contains most of the popular feature selection algorithms (http://featureselection.asu.edu/), and we use it to illustrate how feature selection algorithms can be evaluated. We conclude with a discussion of open problems and challenges that deserve more attention in future research.
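    To make the evaluation workflow mentioned in the abstract concrete, the snippet below is a minimal sketch of our own (not taken from the survey or its repository) of the standard filter-style protocol, written with scikit-learn: score features on the training split only, keep the top k, and judge selection quality by the accuracy of a downstream classifier on held-out data. The dataset, the choice of k = 10, the mutual-information criterion, and the logistic-regression classifier are all illustrative assumptions.

    ```python
    # Hypothetical evaluation sketch: select features on the training split,
    # then measure downstream classification accuracy on a held-out split.
    from sklearn.datasets import load_breast_cancer
    from sklearn.feature_selection import SelectKBest, mutual_info_classif
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Keep the 10 features with the highest estimated mutual information with
    # the labels (an information-theoretic filter criterion), then fit a
    # simple classifier on the reduced feature set.
    pipe = make_pipeline(
        SelectKBest(mutual_info_classif, k=10),
        LogisticRegression(max_iter=5000),
    )
    pipe.fit(X_train, y_train)
    print("held-out accuracy, 10 selected features:", pipe.score(X_test, y_test))
    ```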

    Published In

    ACM Computing Surveys, Volume 50, Issue 6 (November 2018), 752 pages
    ISSN: 0360-0300; EISSN: 1557-7341
    DOI: 10.1145/3161158
    Editor: Sartaj Sahni

    Publisher

    Association for Computing Machinery, New York, NY, United States

    Publication History

    • Received: 01 September 2016
    • Revised: 01 July 2017
    • Accepted: 01 August 2017
    • Published: 06 December 2017, in CSUR Volume 50, Issue 6

    Author Tags

    • Feature selection

    Qualifiers

    • Survey
    • Research
    • Refereed
