Abstract
Patterns in time-course gene expression data can represent the biological processes that are active over the measured time period. However, the orthogonality constraint in standard pattern-finding algorithms, notably principal component analysis (PCA), confounds expression changes resulting from simultaneous, non-orthogonal biological processes. Previously, we have shown that Markov chain Monte Carlo nonnegative matrix factorization algorithms are particularly adept at distinguishing such concurrent patterns. One such matrix factorization is implemented in the software package CoGAPS. We describe the application of this software and several technical considerations for identifying age-related patterns in a public prefrontal cortex gene expression dataset.
© 2014 Springer Science+Business Media, LLC
Fertig, E.J., Stein-O’Brien, G., Jaffe, A., Colantuoni, C. (2014). Pattern Identification in Time-Course Gene Expression Data with the CoGAPS Matrix Factorization. In: Ochs, M. (ed) Gene Function Analysis. Methods in Molecular Biology, vol 1101. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-721-1_6
DOI: https://doi.org/10.1007/978-1-62703-721-1_6
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-62703-720-4
Online ISBN: 978-1-62703-721-1
eBook Packages: Springer Protocols