Abstract
Biclustering is a useful tool for analyzing massive gene expression data: it clusters the rows and columns of the data matrix simultaneously to find subsets of genes that are coherently expressed under subsets of conditions. In the analysis of time-series gene expression data in particular, it is meaningful to restrict biclusters to contiguous time points so that they capture coherent evolutions. In this paper, the BCCC-Bicluster is proposed as an extension of the CCC-Bicluster. An algorithm based on frequent sequential pattern mining is proposed to find all maximal BCCC-Biclusters. A newly defined Frequent-Infrequent Tree-Array (FITA) is constructed to speed up the traversal, and pruning strategies derived from the Apriori property are used to avoid redundant search. For further efficiency, the bitwise XOR operation is applied to capture identical or opposite contiguous patterns between two rows. The algorithm is tested on yeast microarray data. Experimental results show that the proposed algorithm is able to find all embedded BCCC-Biclusters, which are shown to reveal significant GO terms involved in biological processes.
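The XOR idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each gene's time series has already been discretized into a binary string (for example 1 for up-regulation and 0 for down-regulation between consecutive time points); this encoding, the function name `xor_runs`, and the parameter `min_len` are hypothetical and not taken from the paper. Under that assumption, a run of 0 bits in the XOR of two rows marks contiguous columns where the patterns are identical, and a run of 1 bits marks contiguous columns where they are opposite.

```python
def xor_runs(row_a: str, row_b: str, min_len: int = 2):
    """Find maximal contiguous column ranges where two binary rows are
    identical (XOR bits all 0) or opposite (XOR bits all 1).

    Rows are strings of '0'/'1' of equal length (an assumed encoding).
    Returns a list of (start, end_exclusive, kind) tuples.
    """
    if len(row_a) != len(row_b):
        raise ValueError("rows must have equal length")
    n = len(row_a)
    if n == 0:
        return []

    # One integer XOR compares all columns of the two rows at once.
    x = int(row_a, 2) ^ int(row_b, 2)

    # Column j (0-based, left to right) sits at bit position n - 1 - j.
    diff = [(x >> (n - 1 - j)) & 1 for j in range(n)]

    runs = []
    start = 0
    for j in range(1, n + 1):
        # Close the current run at the end of the row or when the bit flips.
        if j == n or diff[j] != diff[start]:
            if j - start >= min_len:
                kind = "opposite" if diff[start] else "identical"
                runs.append((start, j, kind))
            start = j
    return runs


if __name__ == "__main__":
    for run in xor_runs("110100", "110011"):
        print(run)
```

In this toy input the first three columns agree and the last three are inverted, so the two rows share one identical and one opposite contiguous pattern. How the paper combines such pairwise comparisons with the FITA traversal is not described in the abstract and is not modeled here.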
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Xu, H. et al. (2014). An Effective Biclustering Algorithm for Time-Series Gene Expression Data. In: Wang, X., Pedrycz, W., Chan, P., He, Q. (eds) Machine Learning and Cybernetics. ICMLC 2014. Communications in Computer and Information Science, vol 481. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45652-1_12
DOI: https://doi.org/10.1007/978-3-662-45652-1_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-45651-4
Online ISBN: 978-3-662-45652-1
eBook Packages: Computer Science, Computer Science (R0)