Abstract
This paper presents two non-parametric statistical tests, the Kolmogorov-Smirnov (KS) test and the U statistic test, for selecting the informative genes of a tumor from microarray data with the help of false discovery rate theory. To evaluate these tests, we train a support vector machine (SVM) on the identified informative genes to construct a tumor diagnosis system (i.e., a binary classifier) for the colon and leukemia data sets. Experiments show that the diagnosis systems built with either the KS or the U statistic test reach good prediction accuracy on both data sets.
This work was supported by the Natural Science Foundation of China for Project 60471054.
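The gene selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the KS p-value uses the plain asymptotic Kolmogorov series, the U test uses a normal approximation without tie correction, and the Benjamini-Hochberg step-up procedure stands in for whichever false discovery rate method the paper actually applies. The SVM classification stage is omitted.

```python
import math
from bisect import bisect_right


def ks_statistic(x, y):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    d = 0.0
    for v in xs + ys:
        fx = bisect_right(xs, v) / len(xs)
        fy = bisect_right(ys, v) / len(ys)
        d = max(d, abs(fx - fy))
    return d


def ks_pvalue(x, y):
    """Asymptotic two-sided p-value (assumption: large-sample approximation)."""
    n, m = len(x), len(y)
    lam = math.sqrt(n * m / (n + m)) * ks_statistic(x, y)
    if lam < 0.3:  # the series is numerically 1 here and converges slowly
        return 1.0
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
            for k in range(1, 101))
    return min(1.0, max(0.0, 2.0 * s))


def u_pvalue(x, y):
    """Two-sided Mann-Whitney U test, normal approximation, no tie correction."""
    n, m = len(x), len(y)
    # U counts pairs (a, b) with a > b; a tie counts as one half.
    u = sum((a > b) + 0.5 * (a == b) for a in x for b in y)
    mean = n * m / 2.0
    sd = math.sqrt(n * m * (n + m + 1) / 12.0)
    z = abs(u - mean) / sd
    return math.erfc(z / math.sqrt(2.0))


def benjamini_hochberg(pvals, q=0.05):
    """Indices of hypotheses rejected by the BH step-up rule at FDR level q."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= q * rank / m:
            cutoff = rank  # keep the largest rank that passes
    return sorted(order[:cutoff])


def select_genes(expr, labels, pvalue_fn, q=0.05):
    """expr: one list of expression values per gene; labels: 1 = tumor, 0 = normal."""
    pvals = []
    for gene in expr:
        tumor = [v for v, c in zip(gene, labels) if c == 1]
        normal = [v for v, c in zip(gene, labels) if c == 0]
        pvals.append(pvalue_fn(tumor, normal))
    return benjamini_hochberg(pvals, q)


# Toy data (made up for illustration): gene 0 separates the classes, gene 1 does not.
labels = [1] * 10 + [0] * 10
expr = [
    list(range(10, 20)) + list(range(0, 10)),        # gene 0
    list(range(0, 20, 2)) + list(range(1, 20, 2)),   # gene 1
]
selected_ks = select_genes(expr, labels, ks_pvalue)
selected_u = select_genes(expr, labels, u_pvalue)
print(selected_ks, selected_u)
```

In this sketch, the genes returned by `select_genes` would be the input features for the downstream classifier; the FDR level `q` is an illustrative default, not a value taken from the paper.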
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ma, J., Li, F., Liu, J. (2005). Non-parametric Statistical Tests for Informative Gene Selection. In: Wang, J., Liao, XF., Yi, Z. (eds) Advances in Neural Networks – ISNN 2005. ISNN 2005. Lecture Notes in Computer Science, vol 3498. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11427469_111
DOI: https://doi.org/10.1007/11427469_111
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25914-5
Online ISBN: 978-3-540-32069-2
eBook Packages: Computer Science, Computer Science (R0)