A common goal of biological research is to develop models of the biological processes we seek to understand. Such models take the form of biochemical pathway networks, which describe the physical interactions among a living cell's genes, transcripts, proteins, and metabolites ("Omics"), and they are accumulating in various repositories for both model and non-model organisms. This thesis presents a set of integrated statistical bioinformatics tools that address key problems in integrating large-scale Omics datasets with pathway network models. A hardware-accelerated, non-parametric Omics mining method (Monte Carlo on the GPU) allows faster screening of custom test statistics and functions. A software platform for mining pathway databases (PathwayAccess) supports knowledge integration and comparison. Omics mining and pathway mining are combined in a novel method that statistically discriminates functionally meaningful subnetworks by their interaction with lists of entities mined from Omics data, so that software can intelligently mine large and complex pathway databases to answer a wide variety of questions and generate hypotheses (Discriminating Omics Response Groups in Pathways). The method, called PathwayFlow, can discriminate pathways, reactions, metabolite classes, or any other grouping of biological entities (Response Groups), and it automatically accounts for connectivity-caused biases in the pathway network. It also differentiates between regulators (inputs) and regulatees (outputs) for a given Query List of Omics entities. The method is applied to three real datasets: a simple E. coli gene expression dataset that validates the method; a more complex Vitis gene expression dataset, where it complements functional enrichment analysis (Grapevine's Response to Short Days); and an ultra-high-throughput re-sequencing dataset for assessing genetic differences between two wine grape varieties (DNA Sequencing Appendix).