Kernelized Information-Theoretic Metric Learning for Cancer Diagnosis Using High-Dimensional Molecular Profiling Data

Published: 11 May 2016

Abstract

With the advancement of genome-wide monitoring technologies, molecular expression data have become widely used for diagnosing cancer through tumor or blood samples. When mining molecular signature data, comparing samples through an adaptive distance function is fundamental but difficult, because such datasets are normally heterogeneous and high dimensional. In this article, we present kernelized information-theoretic metric learning (KITML) algorithms that optimize a distance function to tackle the cancer diagnosis problem and scale to high dimensionality. By learning a nonlinear transformation in the input space implicitly through kernelization, KITML permits efficient optimization, low storage, and improved learning of the distance metric. We propose two novel applications of KITML for diagnosing cancer using high-dimensional molecular profiling data: (1) for sample-level cancer diagnosis, the learned metric is used to improve the performance of k-nearest neighbor classification; and (2) for estimating the severity level or stage of a group of samples, we propose a novel set-based ranking approach that extends KITML. For the sample-level cancer classification task, we evaluated KITML on 14 cancer gene microarray datasets and compared it with eight other state-of-the-art approaches. The results show that our approach achieves the best overall performance for the task of molecular-expression-driven cancer sample diagnosis. For group-level cancer stage estimation, we tested the proposed set-KITML approach on three multi-stage cancer microarray datasets and correctly estimated the stages of the sample groups in all three studies.
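
To make the first application concrete, the following is a minimal sketch of how a learned metric plugs into k-nearest neighbor classification. It is not the KITML algorithm: the matrix M_learned is a hand-picked, hypothetical stand-in for a metric that a learner might return, the toy data are invented for illustration, and the function names mahalanobis_sq and knn_predict are assumptions of this sketch. The distance used is the Mahalanobis form d_M(x, y) = (x - y)^T M (x - y) with M positive semidefinite.

```python
import numpy as np

def mahalanobis_sq(x, y, M):
    """Squared distance (x - y)^T M (x - y) for a positive semidefinite M."""
    diff = x - y
    return float(diff @ M @ diff)

def knn_predict(X_train, y_train, x_query, M, k=3):
    """Majority vote among the k training samples closest to x_query under M."""
    diffs = X_train - x_query                           # shape (n, d)
    dists = np.einsum("nd,de,ne->n", diffs, M, diffs)   # squared distances, shape (n,)
    nearest = np.argsort(dists, kind="stable")[:k]
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Toy data (invented): feature 0 carries the class signal,
# feature 1 is uninformative but lives on a much larger scale.
X_train = np.array([[0.0, 0.0], [0.2, 1.0], [1.0, 9.0], [1.2, 10.0]])
y_train = np.array([0, 0, 1, 1])
x_query = np.array([0.1, 9.5])

M_euclid = np.eye(2)                 # plain Euclidean distance
M_learned = np.diag([1.0, 0.0001])   # hypothetical metric that down-weights feature 1

pred_euclid = knn_predict(X_train, y_train, x_query, M_euclid, k=3)
pred_learned = knn_predict(X_train, y_train, x_query, M_learned, k=3)
print("Euclidean vote:", pred_euclid)
print("Reweighted vote:", pred_learned)
```

On this toy data the Euclidean vote is dominated by the large-scale feature and returns class 1, while the reweighted metric returns class 0. A metric learner aims to find such a reweighting (more generally, a full matrix M or its kernelized analogue) from labeled data rather than by hand.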

        Published In

        ACM Transactions on Knowledge Discovery from Data  Volume 10, Issue 4
        Special Issue on SIGKDD 2014, Special Issue on BIGCHAT and Regular Papers
        July 2016
        417 pages
        ISSN:1556-4681
        EISSN:1556-472X
        DOI:10.1145/2936311
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        Published: 11 May 2016
        Accepted: 01 June 2015
        Revised: 01 June 2015
        Received: 01 October 2014
        Published in TKDD Volume 10, Issue 4

        Author Tags

        1. Metric learning
        2. cancer diagnosis
        3. high-dimensional data

        Qualifiers

        • Research-article
        • Research
        • Refereed

        Cited By

        • (2023) Laplace Transformation of Eigen Maps of Locally Preserving Projection (LE-LPP) Technique and Time Complexity. Proceedings of 3rd International Conference on Mathematical Modeling and Computational Science, 345-358. DOI: 10.1007/978-981-99-3611-3_28. Online publication date: 29-Aug-2023
        • (2022) On the Robustness of Metric Learning: An Adversarial Perspective. ACM Transactions on Knowledge Discovery from Data 16:5, 1-25. DOI: 10.1145/3502726. Online publication date: 5-Apr-2022
        • (2022) The Distance Function Optimization for the Near Neighbors-Based Classifiers. ACM Transactions on Knowledge Discovery from Data 16:6, 1-21. DOI: 10.1145/3434769. Online publication date: 24-Feb-2022
        • (2022) Toward Multidiversified Ensemble Clustering of High-Dimensional Data: From Subspaces to Metrics and Beyond. IEEE Transactions on Cybernetics 52:11, 12231-12244. DOI: 10.1109/TCYB.2021.3049633. Online publication date: Nov-2022
        • (2020) Learning Methods of Convolutional Neural Network Combined With Image Feature Extraction in Brain Tumor Detection. IEEE Access 8, 152659-152668. DOI: 10.1109/ACCESS.2020.3016282. Online publication date: 2020
        • (2019) Improved PSO based clustering fusion algorithm for multimedia image data projection. Multimedia Tools and Applications. DOI: 10.1007/s11042-019-08015-z. Online publication date: 23-Jul-2019
        • (2018) Mutual kNN based spectral clustering. Neural Computing and Applications 32:11, 6435-6442. DOI: 10.1007/s00521-018-3836-z. Online publication date: 2-Nov-2018
