research-article

Robust biomarker identification for cancer diagnosis with ensemble feature selection methods

Published: 01 February 2010

Abstract

Motivation: Biomarker discovery is an important topic in biomedical applications of computational biology, including applications such as gene and SNP selection from high-dimensional data. Surprisingly, the robustness of such selection processes, that is, their stability with respect to sampling variation, has received attention only recently. However, robustness of biomarkers is an important issue, as it may greatly influence subsequent biological validations. In addition, a more robust set of markers may strengthen the confidence of an expert in the results of a selection method.
Results: Our first contribution is a general framework for the analysis of the robustness of a biomarker selection algorithm. Secondly, we conducted a large-scale analysis of the recently introduced concept of ensemble feature selection, where multiple feature selections are combined in order to increase the robustness of the final set of selected features. We focus on selection methods that are embedded in the estimation of support vector machines (SVMs). SVMs are powerful classification models that have shown state-of-the-art performance on several diagnosis and prognosis tasks on biological data. Their feature selection extensions have also offered good results for gene selection tasks. We show that the robustness of SVMs for biomarker discovery can be substantially increased by using ensemble feature selection techniques, while at the same time improving classification performance. The proposed methodology is evaluated on four microarray datasets, showing increases of up to almost 30% in robustness of the selected biomarkers, along with an improvement of ~15% in classification performance. The stability improvement with ensemble methods is particularly noticeable for small signature sizes (a few tens of genes), which is most relevant for the design of a diagnosis or prognosis model from a gene signature.
Supplementary information: Supplementary data are available at Bioinformatics online.
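
The ensemble idea described in the Results can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not confirm: bootstrap resampling to build the ensemble, summing of per-bootstrap ranks to aggregate, a simple class-mean difference as a stand-in for the SVM-based base selector, and average pairwise Jaccard overlap as the stability measure. All function names are hypothetical.

```python
import random
from itertools import combinations
from statistics import mean


def mean_difference_scores(X, y):
    """Stand-in base ranker (assumption, NOT the paper's SVM-based selector):
    score each feature by |mean over class 1 - mean over class 0|."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    return [abs(mean(r[j] for r in pos) - mean(r[j] for r in neg))
            for j in range(len(X[0]))]


def ranks_from_scores(scores):
    """Convert scores to ranks; rank 1 goes to the highest score."""
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    ranks = [0] * len(scores)
    for position, j in enumerate(order, start=1):
        ranks[j] = position
    return ranks


def ensemble_select(X, y, k, n_bootstraps=20,
                    scorer=mean_difference_scores, seed=0):
    """Rank features on several bootstrap samples, sum the ranks,
    and return the indices of the k features with the lowest total rank."""
    rng = random.Random(seed)
    n = len(X)
    total_ranks = [0] * len(X[0])
    done = 0
    while done < n_bootstraps:
        idx = [rng.randrange(n) for _ in range(n)]
        if len({y[i] for i in idx}) < 2:
            continue  # redraw if the bootstrap sample lost a class
        Xb = [X[i] for i in idx]
        yb = [y[i] for i in idx]
        for j, r in enumerate(ranks_from_scores(scorer(Xb, yb))):
            total_ranks[j] += r
        done += 1
    order = sorted(range(len(total_ranks)), key=lambda j: total_ranks[j])
    return set(order[:k])


def jaccard_stability(signatures):
    """Average pairwise Jaccard overlap between selected feature sets
    (an assumed stability measure; 1.0 means identical signatures)."""
    return mean(len(a & b) / len(a | b)
                for a, b in combinations(signatures, 2))
```

To estimate robustness in this sketch, one would call `ensemble_select` on several random subsamples of the data and pass the resulting signatures to `jaccard_stability`. Swapping `scorer` for a function that returns the absolute weights of a trained linear SVM would bring the sketch closer to the embedded selectors the abstract refers to.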


Published In

Bioinformatics, Volume 26, Issue 3
February 2010
149 pages

Publisher

Oxford University Press, Inc.
United States

Cited By

  • (2024) Explainable feature selection and ensemble classification via feature polarity. Information Sciences: an International Journal. DOI: 10.1016/j.ins.2024.120818. 676:C. Online publication date: 1-Aug-2024
  • (2024) Stable feature selection based on probability estimation in gene expression datasets. Expert Systems with Applications: An International Journal. DOI: 10.1016/j.eswa.2024.123372. 248:C. Online publication date: 15-Aug-2024
  • (2023) Cancer Prognosis and Diagnosis Methods Based on Ensemble Learning. ACM Computing Surveys. DOI: 10.1145/3580218. 55:12 (1-34). Online publication date: 3-Mar-2023
  • (2023) A Fast Hybrid Feature Selection Method Based on Dynamic Clustering and Improved Particle Swarm Optimization for High-Dimensional Health Care Data. IEEE Transactions on Consumer Electronics. DOI: 10.1109/TCE.2023.3334373. 70:1 (2447-2459). Online publication date: 28-Nov-2023
  • (2023) ECM-EFS. Pattern Recognition. DOI: 10.1016/j.patcog.2023.109449. 139:C. Online publication date: 1-Jul-2023
  • (2023) Screening model of candidate drugs for breast cancer based on ensemble learning algorithm and molecular descriptor. Expert Systems with Applications: An International Journal. DOI: 10.1016/j.eswa.2022.119185. 213:PC. Online publication date: 1-Mar-2023
  • (2023) Deep learning approach for predicting lymph node metastasis in non-small cell lung cancer by fusing image–gene data. Engineering Applications of Artificial Intelligence. DOI: 10.1016/j.engappai.2023.106140. 122:C. Online publication date: 1-Jun-2023
  • (2023) SEQENS. Computers in Biology and Medicine. DOI: 10.1016/j.compbiomed.2022.106413. 152:C. Online publication date: 1-Jan-2023
  • (2023) Improved intelligent water drop-based hybrid feature selection method for microarray data processing. Computational Biology and Chemistry. DOI: 10.1016/j.compbiolchem.2022.107809. 103:C. Online publication date: 1-Apr-2023
  • (2023) Investigation of ensemble methods in terms of statistics: TIMMS 2019 example. Neural Computing and Applications. DOI: 10.1007/s00521-023-08969-0. 35:32 (23507-23520). Online publication date: 6-Sep-2023
