Abstract
Most clustering algorithms are sensitive to noise, and many of them assign every gene in the dataset to a cluster. However, only a small subset of the genes in a gene expression dataset may be involved in the biological processes relevant to a particular set of experimental conditions or samples. To identify these gene clusters, we propose a method that finds co-expressed genes, which are likely to be co-regulated, in the presence of non-functional genes and a high level of noise. The proposed method uses a column-wise distance calculation: it clusters those genes that lie within a distance threshold t of a specific gene in each experimental condition. To validate the proposed method, an experimental analysis was carried out on a real gene expression dataset, and the results show the advantage of the proposed method over the existing one.
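The abstract gives only the outline of the method, so the following is a minimal sketch of one possible reading of it, not the authors' implementation. It assumes that the "distance" in each condition is the absolute difference in expression, that seed genes are tried in row order, and that a group smaller than a hypothetical `min_size` is left unclustered (label -1) as presumed noise or non-functional genes. The function name `threshold_clusters` and the toy matrix are illustrative only.

```python
import numpy as np

def threshold_clusters(expr, t, min_size=2):
    """Group genes that stay within t of a seed gene in every condition.

    expr: 2-D array, rows are genes, columns are experimental conditions.
    t: distance threshold applied separately to each column (assumption:
       absolute difference is used as the per-condition distance).
    min_size: hypothetical cutoff; smaller groups stay unclustered (-1).
    """
    expr = np.asarray(expr, dtype=float)
    n_genes = expr.shape[0]
    labels = np.full(n_genes, -1)  # -1 marks genes not assigned to any cluster
    next_label = 0
    for seed in range(n_genes):
        if labels[seed] != -1:
            continue  # already clustered, so it is not used as a seed
        # Column-wise distances of every gene to the seed gene.
        diffs = np.abs(expr - expr[seed])
        # A gene joins only if it is within t in each condition
        # and has not been assigned to an earlier cluster.
        close = np.all(diffs <= t, axis=1) & (labels == -1)
        if close.sum() >= min_size:
            labels[close] = next_label
            next_label += 1
    return labels

# Toy data: two tight groups plus one outlier gene.
expr = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 2.9],
    [5.0, 5.0, 5.0],
    [5.2, 4.9, 5.1],
    [9.0, 0.0, 9.0],
]
labels = threshold_clusters(expr, t=0.3)
print(labels.tolist())  # [0, 0, 1, 1, -1]
```

Requiring the threshold to hold in every column, rather than on an aggregate distance over the whole row, means a single strongly deviating condition is enough to exclude a gene. The outlier in the last row therefore stays unclustered instead of being forced into a group.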
Copyright information
© 2017 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Chandra, G., Tripathi, S. (2017). A New Approach for Clustering Gene Expression Data. In: Mandal, J., Dutta, P., Mukhopadhyay, S. (eds) Computational Intelligence, Communications, and Business Analytics. CICBA 2017. Communications in Computer and Information Science, vol 776. Springer, Singapore. https://doi.org/10.1007/978-981-10-6430-2_5
DOI: https://doi.org/10.1007/978-981-10-6430-2_5
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6429-6
Online ISBN: 978-981-10-6430-2
eBook Packages: Computer Science (R0)