Fast, Accurate, and Stable Feature Selection Using Neural Networks

Deraeve, James; Alexander, William H.

doi:10.1007/s12021-018-9371-3

Original Article
Published: 21 March 2018

Neuroinformatics, Volume 16, pages 253–268 (2018)

James Deraeve¹ &
William H. Alexander¹

Abstract

Multi-voxel pattern analysis often necessitates feature selection due to the high-dimensional nature of neuroimaging data. In this context, feature selection techniques serve the dual purpose of potentially increasing classification accuracy and revealing sets of features that best discriminate between classes. However, feature selection techniques in current, widespread use in the literature suffer from a number of deficits, including the need for extended computational time, lack of consistency in selecting features relevant to classification, and only marginal increases in classifier accuracy. In this paper we present a novel method for feature selection based on a single-layer neural network which incorporates cross-validation during feature selection and stability selection through iterative subsampling. Comparing our approach to popular alternative feature selection methods, we find increased classifier accuracy, reduced computational cost, and greater consistency with which relevant features are selected. Furthermore, we demonstrate that importance mapping, a technique used to identify voxels relevant to classification, can lead to the selection of irrelevant voxels due to shared activation patterns across categories. Our method, owing to its relatively simple architecture, flexibility, and speed, can provide a viable alternative for researchers to identify sets of features that best discriminate classes.
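
The general idea named in the abstract (a single-layer network combined with held-out validation and stability selection through iterative subsampling) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the logistic output unit, the learning rate, the L2 penalty, the top-k selection rule, and the synthetic data are all assumptions made for illustration only.

```python
import numpy as np


def train_single_layer(X, y, epochs=200, lr=0.1, l2=1e-2):
    """Fit a single-layer network (one logistic unit) by batch gradient descent.

    Hypothetical helper; hyperparameters are illustrative assumptions.
    """
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid output
        err = p - y                              # gradient of log-loss w.r.t. logit
        w -= lr * (X.T @ err / n + l2 * w)
        b -= lr * err.mean()
    return w, b


def stability_select(X, y, n_iter=30, train_frac=0.5, top_k=4, seed=0):
    """Count how often each feature ranks in the top_k by |weight|
    across random subsamples, and track held-out accuracy."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    counts = np.zeros(d)
    accs = []
    n_train = int(train_frac * n)
    for _ in range(n_iter):
        perm = rng.permutation(n)
        train, held = perm[:n_train], perm[n_train:]
        w, b = train_single_layer(X[train], y[train])
        # Validate on the samples left out of this subsample.
        pred = (X[held] @ w + b > 0).astype(float)
        accs.append(np.mean(pred == y[held]))
        # Record the top_k features by absolute weight.
        counts[np.argsort(np.abs(w))[-top_k:]] += 1
    return counts / n_iter, float(np.mean(accs))


# Synthetic example (assumption): only the first 4 of 40 features carry signal.
rng = np.random.default_rng(42)
X = rng.standard_normal((400, 40))
y = (X[:, :4].sum(axis=1) > 0).astype(float)

freq, acc = stability_select(X, y)
print("most stable features:", np.argsort(freq)[-4:])
print("mean held-out accuracy:", round(acc, 2))
```

Features with a selection frequency near 1 are chosen consistently across subsamples, which is the sense of "stability" used here; the held-out accuracy gives a rough check that the selected weights generalize.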

Acknowledgments

This research was supported by FWO-Flanders Odysseus II Award #G.OC44.13 N to WHA.

Author information

Authors and Affiliations

Department of Experimental Psychology, Ghent University, Henri Dunantlaan 2, B-9000, Ghent, Belgium
James Deraeve & William H. Alexander

Corresponding author

Correspondence to James Deraeve.

Ethics declarations

Conflict of Interest

We report no conflicts of interest.

About this article

Cite this article

Deraeve, J., Alexander, W.H. Fast, Accurate, and Stable Feature Selection Using Neural Networks. Neuroinform 16, 253–268 (2018). https://doi.org/10.1007/s12021-018-9371-3

Published: 21 March 2018
Issue Date: April 2018
DOI: https://doi.org/10.1007/s12021-018-9371-3
