DOI: 10.1145/2649387.2660846
research-article

On the impact of data integration and edge enrichment in mining significant signals from biological networks

Published: 20 September 2014 Publication History

Abstract

The influx of high-throughput biotechnologies has produced considerable amounts of available but untapped data, useful for both interpretation and extrapolation. Because the noise-to-signal ratio in most biological databases is non-trivial, single-source analysis techniques may suffer from relatively high false-positive and false-negative rates. In addition, a single data source does not allow the discovery of novel relationships that can only be derived from multiple sources. Recently, gene correlation networks have emerged as a way to discover previously unknown genetic relationships and to identify significant biological functions. Such networks provide a useful mechanism to model experimental results obtained from expression data and to capture a snapshot of expression as well as its temporal changes across experiments. In addition, Gene Ontology is often integrated with biological networks during analysis as a source of domain knowledge. In this project, we evaluate the use of Gene Ontology not simply as an assessment tool but as a basic component in building the correlation networks. We implemented a network integration algorithm that uses both gene expression data (experimental knowledge) and Gene Ontology data (domain knowledge) to build a biologically rich correlation model. We then analyzed the resulting networks for changes in topology and in biological significance. Our main hypothesis is that the integrated networks reduce the harmful effects of outliers from imperfect data while maintaining the high concentration of network substructures that are likely to reveal novel, biologically significant relationships. In addition, using the concept of "guilt by association", we analyzed the clusters of the integrated networks and found a significant increase in enrichment scores relative to the original networks.
We show, through motif and pathway analysis, that integrated networks tend to cluster with higher biological significance.
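
The abstract does not spell out the integration algorithm, so the following is only a minimal sketch of the general idea: blending an expression-based correlation with an ontology-based similarity when deciding which edges to keep. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the use of absolute Pearson correlation, the Jaccard overlap of GO term sets, the weight `alpha`, the cutoff `threshold`, and the toy genes with placeholder GO identifiers.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def go_similarity(terms_a, terms_b):
    """Jaccard overlap of two GO term sets (an assumed, deliberately simple measure)."""
    union = terms_a | terms_b
    return len(terms_a & terms_b) / len(union) if union else 0.0


def integrated_network(expression, annotations, alpha=0.5, threshold=0.7):
    """Return {(gene_a, gene_b): score} for pairs whose blended score passes threshold.

    score = alpha * |pearson| + (1 - alpha) * go_similarity
    alpha and threshold are illustrative parameters, not values from the paper.
    """
    edges = {}
    for a, b in combinations(sorted(expression), 2):
        r = abs(pearson(expression[a], expression[b]))
        s = go_similarity(annotations.get(a, set()), annotations.get(b, set()))
        score = alpha * r + (1 - alpha) * s
        if score >= threshold:
            edges[(a, b)] = score
    return edges


# Toy data: three hypothetical genes measured over four conditions.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],  # rises with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],  # falls as geneA rises
}
# Placeholder GO identifiers, used only to give the genes shared or distinct terms.
annotations = {
    "geneA": {"GO:0000001", "GO:0000002"},
    "geneB": {"GO:0000001", "GO:0000002"},
    "geneC": {"GO:0000003"},
}

network = integrated_network(expression, annotations)
print(network)
```

In this toy setting only the geneA-geneB pair survives: it is strongly correlated and shares annotations, whereas pairs involving geneC are strongly correlated in magnitude but have no ontology support, so their blended score falls below the cutoff. A real pipeline would more likely use a semantic similarity measure over the GO hierarchy and a statistically chosen threshold.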



      Published In

      BCB '14: Proceedings of the 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics
      September 2014
      851 pages
      ISBN:9781450328944
      DOI:10.1145/2649387
      General Chairs: Pierre Baldi, Wei Wang

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. co-regulation
      2. correlation networks
      3. data integration
      4. gene expression
      5. gene ontology
      6. hubs and clusters

      Conference

      BCB '14: ACM-BCB '14
      September 20 - 23, 2014
      Newport Beach, California

      Acceptance Rates

      Overall Acceptance Rate 254 of 885 submissions, 29%
