Gene functional classification from heterogeneous data

Published: 22 April 2001

Abstract

In our attempts to understand cellular function at the molecular level, we must be able to synthesize information from disparate types of genomic data. We consider the problem of inferring gene functional classifications from a heterogeneous data set consisting of DNA microarray expression measurements and phylogenetic profiles from whole-genome sequence comparisons. We demonstrate the application of the support vector machine (SVM) learning algorithm to this functional inference task. Our results suggest the importance of exploiting prior information about the heterogeneity of the data. In particular, we propose an SVM kernel function that is explicitly heterogeneous. We also show how to use knowledge about heterogeneity to aid in feature selection.
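
The abstract does not spell out the form of the heterogeneous kernel. As a rough illustration only, the sketch below assumes one plausible reading: a kernel is computed separately on each data type (expression features and phylogenetic-profile features) and the two values are summed, so no cross-terms mix the data types. The function names, the polynomial kernel, its degree, the feature split `N_EXPR`, and the toy gene vectors are all hypothetical and not taken from the paper.

```python
from typing import List, Sequence

Vector = Sequence[float]


def poly_kernel(a: Vector, b: Vector, degree: int = 3) -> float:
    """Polynomial kernel (a . b + 1)^degree for two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return (dot + 1.0) ** degree


def heterogeneous_kernel(a: Vector, b: Vector, n_expr: int, degree: int = 3) -> float:
    """Sum of per-data-type kernels (an assumed form, not the paper's stated one).

    The first n_expr entries are treated as expression measurements and the
    remaining entries as a phylogenetic profile.
    """
    k_expr = poly_kernel(a[:n_expr], b[:n_expr], degree)
    k_phyl = poly_kernel(a[n_expr:], b[n_expr:], degree)
    return k_expr + k_phyl


# Synthetic toy data: 3 genes, each with 3 expression values + 2 profile bits.
N_EXPR = 3
genes: List[List[float]] = [
    [0.5, -1.0, 0.2, 1.0, 0.0],
    [0.4, -0.8, 0.1, 1.0, 1.0],
    [-0.3, 0.9, 0.0, 0.0, 1.0],
]

# Gram matrix that a kernel-based learner such as an SVM could consume.
gram = [[heterogeneous_kernel(g, h, N_EXPR) for h in genes] for g in genes]
```

A sum of valid kernels is itself a valid kernel, so a Gram matrix built this way can be passed to any SVM implementation that accepts precomputed kernels. Compared with one polynomial kernel over the concatenated vector, this form omits the product terms that combine features from different data types.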

Published In

RECOMB '01: Proceedings of the fifth annual international conference on Computational biology
April 2001
316 pages
ISBN: 1581133537
DOI: 10.1145/369133
Chairman: Thomas Lengauer

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Article

Conference

RECOMB01

Acceptance Rates

RECOMB '01 Paper Acceptance Rate 35 of 128 submissions, 27%;
Overall Acceptance Rate 148 of 538 submissions, 28%
