Abstract
Genome-wide transcriptome data, combined with statistical analysis, enable us to reverse-engineer gene networks that offer a useful view of the dynamic behaviour of biological elements in cells. In this chapter, we describe statistical models for estimating gene networks from two types of microarray gene expression data: gene knock-down and time-course. In our modeling, a nonparametric regression model is combined with Bayesian networks to capture nonlinear relationships between genes, and a derived Bayesian information criterion, together with an efficient structure learning algorithm, selects the network structure. We also introduce some efficient algorithms for structure learning of Bayesian networks, a problem known to be NP-hard when optimal solutions are sought. To demonstrate the statistical gene network analysis presented in this chapter, we estimate gene networks from microarray data of human endothelial cells treated with the anti-hyperlipidaemia drug fenofibrate. Based on the constructed gene networks, we illustrate computational strategies for discovering drug target genes and pathways.
Acknowledgements
The authors wish to thank Satoru Kuhara, Kousuke Tashiro, Masao Nagasaki, Yukiko Nakanishi, Atsushi Doi, Yuki Tomiyasu, Kaori Yasuda, Cristin Print, D. Stephen Charnock-Jones, Sally Humphreys, Ben Dunmore, Deborah Sanders, and Christopher J. Savoie for their continuous efforts on our HUVEC study. Computation time was provided by the Super Computer System, Human Genome Center, Institute of Medical Science, University of Tokyo.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Imoto, S., Tamada, Y., Araki, H., Miyano, S. (2011). Computational Drug Target Pathway Discovery: A Bayesian Network Approach. In: Lu, HS., Schölkopf, B., Zhao, H. (eds) Handbook of Statistical Bioinformatics. Springer Handbooks of Computational Statistics. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16345-6_24
DOI: https://doi.org/10.1007/978-3-642-16345-6_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16344-9
Online ISBN: 978-3-642-16345-6
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)