Abstract
The increased availability of biological databases containing representations of complex objects permits access to vast amounts of data. In spite of the recent renewed interest in knowledge-discovery techniques (or data mining), there is a dearth of data analysis methods intended to facilitate understanding of the represented objects and related systems by their most representative features and the relationships derived from these features (i.e., structural data). In this paper we propose a conceptual clustering methodology termed EMO-CC, for Evolutionary Multi-Objective Conceptual Clustering, that uses multi-objective and multi-modal optimization techniques based on Evolutionary Algorithms to uncover representative substructures from structural databases. In addition, EMO-CC provides annotations of the uncovered substructures and, based on them, applies an unsupervised classification approach to retrieve new members of previously discovered substructures. We apply EMO-CC to the Gene Ontology database to recover interesting substructures that describe problems from different points of view, and use them to explain immuno-inflammatory responses measured in terms of gene expression profiles derived from the analysis of longitudinal blood expression profiles of human volunteers treated with intravenous endotoxin compared to placebo.
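The multi-objective idea in the abstract can be illustrated with a minimal sketch of Pareto (non-dominated) selection over candidate substructures. The abstract does not name the objectives EMO-CC optimizes, so the two used here (`support`, the fraction of database entries a substructure covers, and `size`, its number of nodes) are assumptions chosen only for illustration, as are the candidate names and scores. This is not the authors' implementation, and it omits the evolutionary search itself.

```python
from typing import Dict, List, Tuple

# Hypothetical scores per candidate substructure: (support, size).
# Both objectives are assumed to be maximized.
Scores = Tuple[float, float]


def dominates(a: Scores, b: Scores) -> bool:
    """True if a is at least as good as b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates: Dict[str, Scores]) -> List[str]:
    """Return the names of candidates that no other candidate dominates."""
    return sorted(
        name
        for name, scores in candidates.items()
        if not any(
            dominates(other_scores, scores)
            for other, other_scores in candidates.items()
            if other != name
        )
    )


# Illustrative data only; not taken from the paper.
candidates = {
    "A": (0.90, 2),  # very general, small substructure
    "B": (0.60, 5),
    "C": (0.50, 4),  # dominated by B
    "D": (0.30, 7),  # very specific, large substructure
    "E": (0.30, 6),  # dominated by D
}

front = pareto_front(candidates)
print(front)
```

Keeping every non-dominated candidate, rather than a single best one, is what lets such a method return substructures that describe the data from different points of view: here a small general substructure and a large specific one both survive.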
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Romero-Zaliz, R., Rubio-Escudero, C., Cordón, O., Harari, O., del Val, C., Zwir, I. (2006). Mining Structural Databases: An Evolutionary Multi-Objetive Conceptual Clustering Methodology. In: Rothlauf, F., et al. Applications of Evolutionary Computing. EvoWorkshops 2006. Lecture Notes in Computer Science, vol 3907. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11732242_15
DOI: https://doi.org/10.1007/11732242_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33237-4
Online ISBN: 978-3-540-33238-1
eBook Packages: Computer Science, Computer Science (R0)