research-article

Kernelized Information-Theoretic Metric Learning for Cancer Diagnosis Using High-Dimensional Molecular Profiling Data

Authors:

Leonid Hrebien,

Yanjun Qi

ACM Transactions on Knowledge Discovery from Data (TKDD), Volume 10, Issue 4

Article No.: 38, Pages 1 - 23

https://doi.org/10.1145/2789212

Published: 11 May 2016

Abstract

With the advancement of genome-wide monitoring technologies, molecular expression data have become widely used for diagnosing cancer from tumor or blood samples. When mining molecular signature data, comparing samples through an adaptive distance function is fundamental but difficult, because such datasets are normally heterogeneous and high dimensional. In this article, we present kernelized information-theoretic metric learning (KITML) algorithms that optimize a distance function to tackle the cancer diagnosis problem and scale to high dimensionality. By implicitly learning a nonlinear transformation of the input space through kernelization, KITML permits efficient optimization, low storage, and improved learning of the distance metric. We propose two novel applications of KITML for diagnosing cancer using high-dimensional molecular profiling data: (1) for sample-level cancer diagnosis, the learned metric is used to improve the performance of k-nearest neighbor classification; and (2) for estimating the severity level or stage of a group of samples, we propose a novel set-based ranking approach that extends KITML. For the sample-level cancer classification task, we evaluated KITML on 14 cancer gene microarray datasets and compared it with eight other state-of-the-art approaches. The results show that our approach achieves the best overall performance for the task of molecular-expression-driven cancer sample diagnosis. For group-level cancer stage estimation, we tested the proposed set-KITML approach on three multi-stage cancer microarray datasets, and it correctly estimated the stages of the sample groups in all three studies.
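To make the first application concrete, the following is a minimal sketch of k-nearest neighbor classification under a Mahalanobis-style distance. It is not the authors' KITML algorithm: the matrix `M` is picked by hand rather than learned, no kernelization is involved, and the function names and toy data are assumptions made purely for illustration.

```python
import numpy as np


def mahalanobis_sq(x, y, M):
    """Squared distance (x - y)^T M (x - y) for a positive semidefinite M."""
    d = x - y
    return float(d @ M @ d)


def knn_predict(X_train, y_train, x, M, k=3):
    """Majority vote among the k training samples closest to x under M."""
    diffs = X_train - x
    # Row-wise quadratic form: dists[i] = diffs[i]^T M diffs[i]
    dists = np.einsum("ij,jk,ik->i", diffs, M, diffs)
    nearest = np.argsort(dists, kind="stable")[:k]
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]


# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
X_train = np.array([[0.0, 0.0], [0.2, 10.0], [1.0, 5.0], [1.2, 4.0]])
y_train = np.array([0, 0, 1, 1])
query = np.array([0.1, 5.0])

M_euclidean = np.eye(2)              # plain Euclidean distance
M_weighted = np.diag([1.0, 0.001])   # hand-picked stand-in for a learned metric

pred_euclidean = knn_predict(X_train, y_train, query, M_euclidean, k=3)
pred_weighted = knn_predict(X_train, y_train, query, M_weighted, k=3)
print(pred_euclidean, pred_weighted)
```

On this toy data the Euclidean metric is dominated by the noisy second feature and votes for class 1, while the down-weighted metric votes for class 0. A learned metric aims to find such a weighting from the data instead of having it set by hand.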

Index Terms

1. Applied computing
  1. Life and medical sciences
    1. Bioinformatics
2. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Supervised learning
        Supervised learning by classification
    2. Machine learning algorithms

Published In

ACM Transactions on Knowledge Discovery from Data Volume 10, Issue 4

Special Issue on SIGKDD 2014, Special Issue on BIGCHAT and Regular Papers

July 2016

417 pages

ISSN:1556-4681

EISSN:1556-472X

DOI:10.1145/2936311

Editor:
Philip S. Yu
University of Illinois at Chicago, USA

Copyright © 2016 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 11 May 2016

Accepted: 01 June 2015

Revised: 01 June 2015

Received: 01 October 2014

Published in TKDD Volume 10, Issue 4

Qualifiers

Research-article
Research
Refereed
