DOI: 10.1145/3307339.3342139
research-article

Integrative Feature Ranking by Applying Deep Learning on Multi Source Genomic Data

Published: 04 September 2019

    Abstract

    Extracting cancer-related information from genomic data, especially from multi-source datasets, has been an ever-growing challenge in recent years. Identifying subtype-specific genomic markers can lead to sounder diagnosis and treatment. While several algorithms have been proposed for feature extraction, to the best of our knowledge, none of them considers between-modality relations to discover modular disease-associated biomarkers. In this paper, we present an integrative deep learning approach that identifies modular, subtype-associated critical genes from three input modalities for better diagnosis of cancer subtypes. First, we train deep classifiers with different integration stages and different numbers of input modalities to predict cancer subtypes. Next, we use the optimized weight matrices of the best-performing classifier to extract interacting top-ranked features across all input modalities. Lastly, we compare those rankings with other feature scoring methods according to their classification performance after feature extraction. Our results and analysis illustrate that the modular candidate biomarkers can be useful for cancer subtype detection.
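
The weight-based ranking step can be illustrated with a minimal sketch. The abstract does not state the scoring formula, so everything below is an assumption: it uses one common heuristic (multiplying the absolute values of a trained network's weight matrices layer by layer and summing each input feature's influence over the outputs), and the names `rank_features`, `weights`, and `modality_sizes` are hypothetical. It assumes the modalities are concatenated into a single input vector.

```python
import numpy as np

def rank_features(weights, modality_sizes):
    """Rank input features by propagated absolute weight (illustrative heuristic).

    weights: list of 2-D arrays; weights[k] has shape (n_in_k, n_out_k),
             ordered from the input layer to the output layer (assumed layout).
    modality_sizes: number of features contributed by each concatenated modality.
    """
    # Propagate absolute weights from the inputs to the outputs.
    influence = np.abs(weights[0])
    for w in weights[1:]:
        influence = influence @ np.abs(w)

    # One score per input feature: total influence over all output units.
    scores = influence.sum(axis=1)

    # Feature indices sorted from highest to lowest score (stable for ties).
    order = np.argsort(-scores, kind="stable")

    # Map each ranked feature index back to the modality it came from.
    bounds = np.cumsum(modality_sizes)
    modality = np.searchsorted(bounds, order, side="right")
    return scores, order, modality
```

Calling `rank_features` on the weight matrices of a trained classifier returns the per-feature scores, the feature indices in descending score order, and the modality index of each ranked feature, which makes it easy to see whether top-ranked features come from several modalities at once.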

    Cited By

    • (2023) Machine Learning from Multi-omics: Applications and Data Integration. Machine Learning Methods for Multi-Omics Data Integration, 13-21. DOI: 10.1007/978-3-031-36502-7_2. Online publication date: 14-Nov-2023
    • (2021) Deep Learning in Multi-Omics Data Integration in Cancer Diagnostic. Deep Learning for Biomedical Data Analysis, 255-271. DOI: 10.1007/978-3-030-71676-9_11. Online publication date: 17-Mar-2021

    Published In

    BCB '19: Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics
    September 2019
    716 pages
    ISBN:9781450366663
    DOI:10.1145/3307339
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. biomarker discovery
    2. data integration
    3. deep learning
    4. feature ranking
    5. genomic data
    6. neural networks

    Qualifiers

    • Research-article

    Conference

    BCB '19

    Acceptance Rates

    BCB '19 Paper Acceptance Rate 42 of 157 submissions, 27%;
    Overall Acceptance Rate 254 of 885 submissions, 29%

    Article Metrics

    • Downloads (Last 12 months): 9
    • Downloads (Last 6 weeks): 0
    Reflects downloads up to 27 Jul 2024