
Molecular Pattern Discovery Based on Penalized Matrix Decomposition

Published: 01 November 2011

    Abstract

    Reliable and precise identification of tumor type is crucial to the effective treatment of cancer. With the rapid development of microarray technologies, tumor clustering based on gene expression data is becoming a powerful approach to cancer class discovery. In this paper, we apply penalized matrix decomposition (PMD) to gene expression data to extract metasamples for clustering. The extracted metasamples capture the inherent structures of samples belonging to the same class. In turn, the PMD factors of a sample over the metasamples can be used as its class indicator. Compared with conventional methods such as hierarchical clustering (HC), self-organizing maps (SOM), affinity propagation (AP), and nonnegative matrix factorization (NMF), the proposed method can identify samples with complex classes. Moreover, the PMD factor can be used as an index to determine the number of clusters. The proposed method provides a reasonable explanation of the inconsistent classifications made by the conventional methods. In addition, it is able to discover modules in gene expression data of conterminous developmental stages. Experiments on two representative problems show that the proposed PMD-based method is promising for discovering biological phenotypes.
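The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it follows the general rank-one PMD scheme with L1 constraints (alternating soft-thresholded updates plus deflation), and several details are assumptions made for illustration only. Those assumptions are the genes-by-samples orientation of `X`, the hypothetical function names (`soft_threshold`, `l1_constrained_unit`, `pmd_rank1`, `pmd`), the constraint values `c_u` and `c_v`, the toy data, and the rule that assigns each sample to the metasample on which its factor has the largest magnitude.

```python
import numpy as np

def soft_threshold(a, delta):
    """Elementwise soft-thresholding: shrink toward zero by delta."""
    return np.sign(a) * np.maximum(np.abs(a) - delta, 0.0)

def l1_constrained_unit(a, c, n_steps=60):
    """Return a soft-thresholded copy of `a` with unit L2 norm and L1 norm <= c.

    Assumes c >= 1. The threshold is found by bisection.
    """
    norm = np.linalg.norm(a)
    if norm == 0.0:
        return a
    if np.abs(a).sum() / norm <= c:          # constraint not binding
        return a / norm
    lo, hi = 0.0, np.abs(a).max()
    for _ in range(n_steps):
        mid = 0.5 * (lo + hi)
        s = soft_threshold(a, mid)
        s_norm = np.linalg.norm(s)
        if s_norm > 0.0 and np.abs(s).sum() / s_norm > c:
            lo = mid                         # still too dense: raise threshold
        else:
            hi = mid
    s = soft_threshold(a, hi)
    if np.linalg.norm(s) == 0.0:             # ties at the maximum: keep one entry
        s = np.zeros_like(a)
        i = np.abs(a).argmax()
        s[i] = np.sign(a[i])
    return s / np.linalg.norm(s)

def pmd_rank1(X, c_u, c_v, n_iter=100):
    """One penalized rank-one factor d * u v^T of X."""
    v = np.linalg.svd(X, full_matrices=False)[2][0]   # leading right singular vector
    for _ in range(n_iter):
        u = l1_constrained_unit(X @ v, c_u)
        v = l1_constrained_unit(X.T @ u, c_v)
    return float(u @ X @ v), u, v

def pmd(X, k, c_u, c_v):
    """Extract k factors by repeated rank-one fitting and deflation."""
    R = np.array(X, dtype=float)             # residual, copied from X
    U = np.zeros((R.shape[0], k))            # columns: metasamples (assumed layout)
    D = np.zeros(k)
    V = np.zeros((R.shape[1], k))            # rows: per-sample factors
    for j in range(k):
        D[j], U[:, j], V[:, j] = pmd_rank1(R, c_u, c_v)
        R -= D[j] * np.outer(U[:, j], V[:, j])
    return U, D, V

# Toy data (assumption): 40 genes x 12 samples with two planted sample groups.
rng = np.random.default_rng(0)
X = 0.1 * rng.standard_normal((40, 12))
X[:20, :6] += 3.0     # group A: samples 0-5, genes 0-19
X[20:, 6:] += 2.0     # group B: samples 6-11, genes 20-39

U, D, V = pmd(X, k=2, c_u=0.8 * np.sqrt(40), c_v=0.8 * np.sqrt(12))
labels = np.abs(V).argmax(axis=1)   # assumed rule: largest-magnitude factor wins
print(labels, D)
```

On this toy matrix the constraints are loose, so the result is close to a truncated SVD; smaller values of `c_u` and `c_v` (each at least 1) make `u` and `v` sparser. How the constraints are tuned and how the factors are turned into cluster labels in the paper itself is not stated in the abstract, so both choices here should be read as placeholders.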


    Published In

    IEEE/ACM Transactions on Computational Biology and Bioinformatics  Volume 8, Issue 6
    November 2011
    286 pages

    Publisher

    IEEE Computer Society Press

    Washington, DC, United States

    Author Tags

    1. Tumor clustering
    2. developmental biology
    3. gene expression data
    4. metasample
    5. penalized matrix decomposition

    Qualifiers

    • Article
