Discovering pan-correlation patterns from time course data sets by efficient mining algorithms

Liu, Qian; Ghosh, Shameek; Li, Jinyan; Wong, Limsoon; Ramamohanarao, Kotagiri

doi:10.1007/s00607-018-0606-9

Published: 21 March 2018

Computing, volume 100, pages 421–437 (2018)

Qian Liu¹,
Shameek Ghosh¹,
Jinyan Li¹,
Limsoon Wong² &
…
Kotagiri Ramamohanarao³

221 Accesses
4 Citations

Abstract

Time-course correlation patterns can be positive or negative, and can be time-lagged with gaps. Mining all of these correlation patterns helps to gain broad insight into variable dependencies. Here, we prove that diverse types of correlation patterns can be represented by a generalized form of positive correlation patterns. We prove a correspondence between positive correlation patterns and sequential patterns, and present an efficient single-scan algorithm for mining the correlations. Evaluations on synthetic time course data sets and on yeast cell cycle gene expression data sets indicate that: (1) the running time of the algorithm grows linearly with the number of variables; (2) negative correlation patterns are abundant in real-world data sets; and (3) correlation patterns with time lags and gaps are abundant. Existing methods have discovered only incomplete forms of many of these patterns, and have missed some important patterns completely.
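
As an illustration only, the idea that negative and time-lagged correlations can be treated as positive correlations on transformed series can be sketched as follows. This is not the algorithm of the paper: the up/down trend encoding, the function names (`trends`, `match_length`, `best_relation`) and the `max_lag` parameter are all assumptions made for this sketch, and gaps are not handled.

```python
from typing import List, Tuple

def trends(series: List[float]) -> List[int]:
    """Encode a series as step directions: +1 up, -1 down, 0 flat (assumed encoding)."""
    return [(b > a) - (b < a) for a, b in zip(series, series[1:])]

def match_length(x: List[int], y: List[int]) -> int:
    """Longest contiguous run where x and y show the same non-flat direction."""
    best = run = 0
    for a, b in zip(x, y):
        run = run + 1 if a == b and a != 0 else 0
        best = max(best, run)
    return best

def best_relation(u: List[float], v: List[float], max_lag: int = 2) -> Tuple[str, int, int]:
    """Return (kind, lag, length) of the longest matching trend run between u and v.

    A negative correlation is tested as a positive one against the negated
    trends of v; a lag is tested by dropping the first `lag` steps of v.
    """
    tu, tv = trends(u), trends(v)
    best = ("none", 0, 0)
    for lag in range(max_lag + 1):
        shifted = tv[lag:]  # v follows u by `lag` steps
        candidates = (("positive", shifted), ("negative", [-t for t in shifted]))
        for kind, cand in candidates:
            n = match_length(tu, cand)
            if n > best[2]:
                best = (kind, lag, n)
    return best

# Hypothetical toy series, not data from the paper.
u = [1, 2, 3, 2, 1, 2]
mirrored = [5, 4, 3, 4, 5, 4]   # moves opposite to u at every step
delayed = [0, 0, 1, 2, 1, 0]    # repeats the shape of u[:5] one step later

print(best_relation(u, mirrored))
print(best_relation(u[:5], delayed))
```

In this sketch a single positive-matching routine covers all three cases, because negation and shifting are applied to the input before matching rather than being handled by separate matching logic.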

Author information

Authors and Affiliations

Advanced Analytics Institute, University of Technology Sydney, Broadway, NSW, 2007, Australia
Qian Liu, Shameek Ghosh & Jinyan Li
School of Computing, National University of Singapore, 13 Computing Drive, Singapore, Singapore
Limsoon Wong
Department of Computing and Information Systems, The University of Melbourne, Parkville, VIC, 3010, Australia
Kotagiri Ramamohanarao

Corresponding author

Correspondence to Jinyan Li.

About this article

Cite this article

Liu, Q., Ghosh, S., Li, J. et al. Discovering pan-correlation patterns from time course data sets by efficient mining algorithms. Computing 100, 421–437 (2018). https://doi.org/10.1007/s00607-018-0606-9

Received: 22 February 2018
Accepted: 06 March 2018
Published: 21 March 2018
Issue Date: April 2018
DOI: https://doi.org/10.1007/s00607-018-0606-9

Keywords

Mathematics Subject Classification

68R01 (General)
