research-article

Introducing a Stable Bootstrap Validation Framework for Reliable Genomic Signature Extraction

Published: 01 January 2018
    Abstract

    Applying machine learning methods to identify candidate genes responsible for phenotypes of interest, such as cancer, is a major challenge in bioinformatics. The resulting lists of genes are often called genomic signatures, and linking them to phenotypes may be a significant step toward discovering causal relationships between genotypes and phenotypes. Traditional methods that produce genomic signatures from DNA microarray data tend to extract significantly different lists under relatively small variations of the training data. This instability undermines the validity of research findings and raises skepticism about the reliability of such methods. In this study, a complete framework for the extraction of stable and reliable lists of candidate genes is presented. The proposed methodology enforces stability of results at the validation step and, as a result, is independent of the feature selection and classification methods used. Furthermore, two different statistical tests are performed to assess the statistical significance of the observed results, and the consistency of the signatures extracted by independent executions of the proposed method is evaluated. The results of this study highlight the importance of stability in genomic signatures, beyond their predictive capability.
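
    The instability described above can be made concrete with a small experiment. The following is a minimal sketch under stated assumptions, not the framework proposed in the article: the synthetic data, the mean-difference gene ranking, the use of the Jaccard index as a stability score, and every function name are illustrative choices that do not come from the paper. It resamples the training data with replacement, extracts a top-k gene list from each resample, and reports how much the lists agree.

```python
import random
from itertools import combinations


def make_toy_data(n_samples, n_genes, informative, shift, rng):
    """Hypothetical expression matrix: Gaussian noise, with the informative
    genes shifted upward in class 1. Classes alternate sample by sample."""
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        if label == 1:
            for g in informative:
                row[g] += shift
        X.append(row)
        y.append(label)
    return X, y


def top_k_genes(X, y, k):
    """Rank genes by absolute difference of class means (an assumed, very
    simple filter) and return the indices of the k best as a signature."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    n_genes = len(X[0])
    scores = []
    for g in range(n_genes):
        mean_pos = sum(r[g] for r in pos) / len(pos)
        mean_neg = sum(r[g] for r in neg) / len(neg)
        scores.append(abs(mean_pos - mean_neg))
    ranked = sorted(range(n_genes), key=lambda g: scores[g], reverse=True)
    return frozenset(ranked[:k])


def bootstrap_signatures(X, y, k, n_boot, rng):
    """Draw n_boot bootstrap resamples and extract one signature from each."""
    n = len(X)
    signatures = []
    while len(signatures) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        yb = [y[i] for i in idx]
        if len(set(yb)) < 2:
            continue  # skip resamples that lost one of the two classes
        Xb = [X[i] for i in idx]
        signatures.append(top_k_genes(Xb, yb, k))
    return signatures


def mean_pairwise_jaccard(signatures):
    """Average Jaccard similarity over all pairs of signatures (1.0 means
    every resample produced the same gene list)."""
    pairs = list(combinations(signatures, 2))
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)


def selection_frequency(signatures, n_genes):
    """Fraction of resamples in which each gene was selected."""
    return [sum(g in s for s in signatures) / len(signatures)
            for g in range(n_genes)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
X, y = make_toy_data(40, 200, informative=range(5), shift=3.0, rng=rng)
sigs = bootstrap_signatures(X, y, k=5, n_boot=50, rng=rng)
stability = mean_pairwise_jaccard(sigs)
freq = selection_frequency(sigs, 200)
print(f"mean pairwise Jaccard over {len(sigs)} resamples: {stability:.3f}")
```

    Lowering `shift` or the number of samples in this toy setup is a quick way to see the agreement between resampled gene lists degrade, which is the kind of behavior the abstract attributes to traditional signature extraction methods.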

    Cited By

    • (2023) A constraint score guided meta-heuristic searching to attribute reduction. Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology, 44:3 (4779-4800). DOI: 10.3233/JIFS-222832. Online publication date: 1-Jan-2023
    • (2019) A Novel Weighted Combination Method for Feature Selection using Fuzzy Sets. 2019 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE) (1-6). DOI: 10.1109/FUZZ-IEEE.2019.8858890. Online publication date: 23-Jun-2019

    Published In

    IEEE/ACM Transactions on Computational Biology and Bioinformatics, Volume 15, Issue 1
    January 2018
    352 pages

    Publisher

    IEEE Computer Society Press

    Washington, DC, United States

    Qualifiers

    • Research-article
