research-article

Cancer Subtype Discovery Based on Integrative Model of Multigenomic Data

Published: 01 September 2017

Abstract

One major goal of large-scale cancer omics studies is to understand the molecular mechanisms of cancer and to find new biomedical targets. High-dimensional, multi-type cancer omics data (DNA methylation, mRNA expression, etc.) can yield new insight into cancer subtypes. To handle such data, clustering methods usually first find an effective low-dimensional subspace of the original data and then cluster the cancer samples in that reduced subspace. However, because of the diversity of data types and the large data volume, few methods can integrate these data and map them into an effective low-dimensional subspace. In this paper, we develop a dimension-reduction and data-integration method for identifying cancer subtypes, named Scluster. First, Scluster projects each of the original data types into its own principal subspace using an adaptive sparse reduced-rank regression method. Then, a fused patient-by-patient network is obtained from the projected data through a scaled exponential similarity kernel method. Finally, candidate cancer subtypes are identified using a spectral clustering method. We demonstrate the efficiency of Scluster on three cancers by jointly analyzing mRNA expression, miRNA expression, and DNA methylation data. The evaluation results and analyses show that Scluster is effective for predicting survival and identifies novel cancer subtypes in large-scale multi-omics data.
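
The middle step of the pipeline, building one patient-by-patient network per data type and fusing them, can be illustrated with a minimal Python sketch. It is not the Scluster implementation. The kernel follows the scaled exponential form commonly used in similarity network fusion, which is assumed here because the abstract does not give the formula. Plain averaging stands in for the actual fusion procedure, and the projection and spectral clustering steps are omitted. The function names, the parameters `k` and `mu`, and the toy data are all hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def scaled_exp_similarity(samples, k=2, mu=0.5):
    """Patient-by-patient similarity matrix for one data type.

    Assumed kernel: W[i][j] = exp(-d(i, j)^2 / (mu * eps_ij)), where eps_ij is
    the average of three terms: the mean distance from i to its k nearest
    neighbours, the same quantity for j, and d(i, j) itself.
    """
    n = len(samples)
    dist = [[euclidean(samples[i], samples[j]) for j in range(n)] for i in range(n)]

    # Mean distance from each sample to its k nearest neighbours (self excluded).
    knn_mean = []
    for i in range(n):
        nearest = sorted(dist[i][j] for j in range(n) if j != i)[:k]
        knn_mean.append(sum(nearest) / len(nearest) if nearest else 0.0)

    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            eps = (knn_mean[i] + knn_mean[j] + dist[i][j]) / 3.0
            # eps is zero only when the samples coincide; treat that as similarity 1.
            sim[i][j] = math.exp(-dist[i][j] ** 2 / (mu * eps)) if eps > 0 else 1.0
    return sim


def fuse_by_averaging(networks):
    """Simplified stand-in for network fusion: element-wise mean of the matrices."""
    n = len(networks[0])
    return [
        [sum(net[i][j] for net in networks) / len(networks) for j in range(n)]
        for i in range(n)
    ]


# Toy data (made up): four patients, two data types, two obvious groups.
mrna = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0]]
methylation = [[1.0], [1.1], [9.0], [9.2]]

fused = fuse_by_averaging([
    scaled_exp_similarity(mrna, k=1),
    scaled_exp_similarity(methylation, k=1),
])
```

In this toy example, patients 0 and 1 end up far more similar to each other than to patients 2 and 3 in the fused matrix, which is the kind of structure a subsequent spectral clustering step would pick up.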

Cited By

  • (2021) SNEMO: Spectral Clustering Based on the Neighborhood for Multi-omics Data. Intelligent Computing Theories and Application, pp. 490-498. doi:10.1007/978-3-030-84532-2_44. Online publication date: 12-Aug-2021
  • (2021) Joint Association Analysis Method to Predict Genes Related to Liver Cancer. Intelligent Computing Theories and Application, pp. 364-373. doi:10.1007/978-3-030-84532-2_33. Online publication date: 12-Aug-2021

Published In

IEEE/ACM Transactions on Computational Biology and Bioinformatics, Volume 14, Issue 5
September 2017
202 pages

Publisher

IEEE Computer Society Press

Washington, DC, United States
