Research Interests: Algorithms, Computational Biology, Multidisciplinary, Humans, Computer Simulation, Mice, Animals, Biological Networks, Mutual Information, Gene Regulatory Networks, Biological Network, Microarray Analysis, Functional Group, Metabolic pathway, Data Cleansing, Microarray Data, Gene expression profiling, Interaction Network, Association Analysis, and Gene Expression Data
... Proteomics is the _____ 1 susan.havre@pnl.gov 2 mudita.singhal@pnl.gov 3 debbie.payne@pnl.gov 4 bobbie-jo.webb-robertson@pnl.gov ... The proteome is an essential key to understanding the complex processes of cells. ...
The use of computer tools and technologies is unavoidable when it comes to conducting mass spectrometry (MS) research at any significant level. This is mainly due to the large volume of MS data and the processing rates required. Most of the existing tools focus on one particular task: be it storing and maintaining the data or visualizing the dataset to
Scientists face an ever-increasing challenge in investigating biological systems with high throughput experimental methods such as mass spectrometry and gene arrays because of the scale and complexity of the data and the need to integrate results broadly with heterogeneous other types of information. Many analyses require merging the experimental results with datasets returned from public databases, such as those hosted
Research Interests: Genetics, Mass Spectrometry, Molecular Biophysics, Open Source, Problem Solving, Graphic User Interface Design, Data Collection, Proteins, Graphical User Interfaces, Protein Interaction, Biological systems, Very high throughput, Software Systems, Mass Spectroscopy, Application Server, Experimental Method, Biological Data, Interaction Network, and Data Retrieval
Human experts can annotate peaks in MALDI-TOF profiles of detached N-glycans with some degree of accuracy. Even though MALDI-TOF profiles give only intact masses without any fragmentation information, expert knowledge of the most common glycans and biosynthetic pathways in the biological system can point to a small set of most likely glycan structures at the "cartoon" level of detail. Cartoonist is a recently developed, fully automatic annotation tool for MALDI-TOF glycan profiles. Here we benchmark Cartoonist's automatic annotations against human expert annotations on human and mouse N-glycan data from the Consortium for Functional Glycomics. We find that Cartoonist and expert annotations largely agree, but the expert tends to annotate more specifically, meaning fewer suggested structures per peak, and Cartoonist more comprehensively, meaning more annotated peaks. On peaks for which both Cartoonist and the expert give unique cartoons, the two cartoons agree in over 90% of all cases. This article is part of a Special Issue entitled: Computational Proteomics.
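The agreement figure quoted above is computed only over peaks where both annotators propose exactly one cartoon. A minimal sketch of that calculation follows; the peak identifiers, structure labels, and the function name `unique_cartoon_agreement` are hypothetical and are not taken from Cartoonist or from the Consortium for Functional Glycomics data.

```python
def unique_cartoon_agreement(tool, expert):
    """Fraction of peaks on which two annotators agree, restricted to
    peaks where each annotator proposes exactly one structure.

    Both arguments map a peak identifier to a set of proposed structures.
    Returns None when no peak qualifies.
    """
    both_unique = [
        peak for peak in tool
        if peak in expert and len(tool[peak]) == 1 and len(expert[peak]) == 1
    ]
    if not both_unique:
        return None
    matches = sum(1 for peak in both_unique if tool[peak] == expert[peak])
    return matches / len(both_unique)


# Hypothetical toy annotations, for illustration only.
tool = {
    "p1": {"structure_A"},
    "p2": {"structure_B", "structure_C"},  # tool is ambiguous here
    "p3": {"structure_D"},
    "p4": {"structure_E"},                 # expert did not annotate p4
}
expert = {
    "p1": {"structure_A"},
    "p2": {"structure_B"},
    "p3": {"structure_F"},
}
rate = unique_cartoon_agreement(tool, expert)
```

In this toy input only `p1` and `p3` qualify, which mirrors the pattern described in the abstract: the tool annotates more peaks, the expert fewer structures per peak.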
Processing rules in a distributed active database involves evaluating distributed rules correctly. For databases whose rules are not changed very often, the performance of the distributed rule evaluation algorithm plays the key role in the overall performance of distributed active databases. In this paper, we study the performance of a distributed rule evaluation algorithm. An analytical model is developed
The Software Environment for BIological Network Inference (SEBINI) has been created to provide an interactive environment for the deployment and testing of network inference algorithms that use high-throughput expression data. Networks inferred from the SEBINI software ...
Research Interests: Genetics, Data Analysis, Mass Spectrometry, Protein-Protein Interaction, Case Study, Proteins, Protein-protein interaction networks, Biological Network, Interactive Learning Environment, Protein Interaction, Very high throughput, Bayesian Estimator, Interaction Network, and Protein-Protein interaction network analysis
Searching a large document collection to learn about a broad subject involves the iterative process of figuring out what to ask, filtering the results, identifying useful documents, and deciding when one has covered enough material to stop searching. We are calling this activity "discoverage," discovery of relevant material and tracking coverage of that material. We built a visual analytic tool called Footprints that uses multiple coordinated visualizations to help users navigate through the discoverage process. To support discovery, Footprints displays topics extracted from documents that provide an overview of the search space and are used to construct searches visuospatially. Footprints allows users to triage their search results by assigning a status to each document (To Read, Read, Useful), and those status markings are shown on interactive histograms depicting the user's coverage through the documents across dates, sources, and topics. Coverage histograms help users notice biases in their search and fill any gaps in their analytic process. To create Footprints, we used a highly iterative, user-centered approach in which we conducted many evaluations during both the design and implementation stages and continually modified the design in response to feedback.
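The coverage histograms described above amount to counting documents per date, source, or topic, broken down by triage status. The sketch below illustrates that idea only; the record layout, the `Unmarked` label for documents without a status, and the function name `coverage_histogram` are assumptions and do not reflect how Footprints is implemented.

```python
from collections import Counter, defaultdict


def coverage_histogram(documents, facet):
    """Count documents per facet value, broken down by triage status.

    `documents` is a list of dicts; `facet` is the key to group by
    (for example "topic" or "source"). Documents without a status
    are counted under the assumed label "Unmarked".
    """
    histogram = defaultdict(Counter)
    for doc in documents:
        status = doc.get("status") or "Unmarked"
        histogram[doc[facet]][status] += 1
    return {value: dict(counts) for value, counts in histogram.items()}


# Hypothetical search results, for illustration only.
documents = [
    {"topic": "proteomics", "source": "journal", "status": "Useful"},
    {"topic": "proteomics", "source": "web", "status": "To Read"},
    {"topic": "glycomics", "source": "journal", "status": None},
]
by_topic = coverage_histogram(documents, "topic")
by_source = coverage_histogram(documents, "source")
```

A facet value whose counts are dominated by `Unmarked` or `To Read` is the kind of gap or bias the abstract says the histograms help users notice.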
... Joshua N. Adkins Pacific Northwest National Lab Richland, WA, USA +1-509-371-6583 joshua.adkins@pnl.gov Roslyn Brown Pacific Northwest National Lab Richland, WA, USA +1-509-371-7629 roslyn.brown@pnl.gov ABSTRACT ...
... Anuj R. Shah, Mudita Singhal, Tara D. Gibson, Chandrika Sivaramakrishnan, Katrina M. Waters, Ian Gorton* Pacific Northwest National Laboratory ... provides a consistent framework that combines a database management system, a collection of tools to store, manage, and query ...
Research Interests: Bioinformatics, Data Analysis, Data Management, Software Architecture, Resource Allocation, System Biology, Middleware, Database Management Systems, Process Integration, eScience, Management System, Spectrum, Very high throughput, Experimental Data, Information Retrieval systems, and Tool Integration
Attaining a detailed understanding of the various biological networks in an organism lies at the core of the emerging discipline of systems biology. A precise description of the relationships formed between genes, mRNA molecules, and proteins is a necessary step toward a complete description of the dynamic behavior of an organism at the cellular level, and toward intelligent, efficient, and directed modification of an organism. The importance of understanding such regulatory, signaling, and interaction networks has fueled the development of numerous in silico inference algorithms, as well as new experimental techniques and a growing collection of public databases. The Software Environment for BIological Network Inference (SEBINI) has been created to provide an interactive environment for the deployment, evaluation, and improvement of algorithms used to reconstruct the structure of biological regulatory and interaction networks. SEBINI can be used to analyze high-throughput gene expression, protein abundance, or protein activation data via a suite of state-of-the-art network inference algorithms. It also allows algorithm developers to compare and train network inference methods on artificial networks and simulated gene expression perturbation data. SEBINI can therefore be used by software developers wishing to evaluate, refine, or combine inference techniques, as well as by bioinformaticians analyzing experimental data. Networks inferred from the SEBINI software platform can be further analyzed using the Collective Analysis of Biological Interaction Networks (CABIN) tool, which is an exploratory data analysis software that enables integration and analysis of protein-protein interaction and gene-to-gene regulatory evidence obtained from multiple sources. The collection of edges in a public database, along with the confidence held in each edge (if available), can be fed into CABIN as one "evidence network," using the Cytoscape SIF file format. 
Using CABIN, one may increase the confidence in individual edges in a network inferred by an algorithm in SEBINI, as well as extend such a network by combining it with species-specific or generic information, e.g., known protein-protein interactions or target genes identified for known transcription factors. Thus, the combined SEBINI-CABIN toolkit aids in the more accurate reconstruction of biological networks, with less effort, in less time. A demonstration web site for SEBINI can be accessed from https://www.emsl.pnl.gov/SEBINI/RootServlet . Source code and PostgreSQL database schema are available under open source license (contact: ronald.taylor@pnl.gov). For commercial use, some algorithms included in SEBINI require licensing from the original developers. CABIN can be downloaded from http://www.sysbio.org/dataresources/cabin.stm (contact: mudita.singhal@pnl.gov).
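As an illustration of the evidence-network input mentioned above, the following sketch writes a small edge list in the Cytoscape SIF layout (source node, interaction type, target node, tab-separated). SIF has no numeric column, so the sketch writes confidences to a separate tab-delimited table. That second table's layout, the node names, the interaction labels, and the function name `write_evidence_network` are all assumptions, and how CABIN itself reads confidence values is not described in the abstract.

```python
import io


def write_evidence_network(edges, sif_out, conf_out):
    """Write edges as SIF lines plus a separate confidence table.

    `edges` is an iterable of (source, interaction, target, confidence),
    where confidence may be None when no value is available.
    """
    for source, interaction, target, confidence in edges:
        # SIF line: source <TAB> interaction type <TAB> target
        sif_out.write(f"{source}\t{interaction}\t{target}\n")
        if confidence is not None:
            # Assumed layout: edge label <TAB> confidence value
            conf_out.write(f"{source} ({interaction}) {target}\t{confidence:.3f}\n")


# Hypothetical edges, for illustration only.
edges = [
    ("geneA", "pp", "geneB", 0.92),
    ("geneB", "pd", "geneC", None),
]
sif_buffer, conf_buffer = io.StringIO(), io.StringIO()
write_evidence_network(edges, sif_buffer, conf_buffer)
sif_text = sif_buffer.getvalue()
conf_text = conf_buffer.getvalue()
```

Writing to file-like objects keeps the sketch testable in memory; passing handles from `open(path, "w")` instead would produce files on disk.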
Research Interests: Algorithms, Data Analysis, Systems Biology, Software Development, Forecasting, System Biology, Gene expression, Signal Transduction, Software, Humans, Animals, Gene Regulatory Networks, Transcription Factor, Biological Network, Exploratory Data Analysis, Very high throughput, Protein Expression, Source Code, Biochemistry and cell biology, and Interaction Network
We present a platform for the reconstruction of protein-protein interaction networks inferred from Mass Spectrometry (MS) bait-prey data. The Software Environment for Biological Network Inference (SEBINI), an environment for the deployment of network inference algorithms that use high-throughput data, forms the platform core. Among the many algorithms available in SEBINI is the Bayesian Estimator of Probabilities of Protein-Protein Associations (BEPro3) algorithm, which is used to infer interaction networks from such MS affinity isolation data. Also, the pipeline incorporates the Collective Analysis of Biological Interaction Networks (CABIN) software. We have thus created a structured workflow for protein-protein network inference and supplemental analysis.
Research Interests: Algorithms, Bayesian, Computational Biology, Mass Spectrometry, Pipeline, Predator-prey interaction, Data Mining in Bioinformatics, Biological Sciences, Analysis, Software, Mathematical Sciences, Ms, Proteins, Biological Network, Interactive Learning Environment, Very high throughput, Analisis, Indexation, Bayesian Estimator, and Interaction Network
For the SC|06 analytics challenge, we demonstrate an end-to-end solution for processing data produced by high-throughput mass spectrometry (MS)-based proteomics so biological hypotheses can be explored. This approach is based on a tool called the Bioinformatics Resource Manager (BRM) which will interact with high-performance architecture and experimental data sources to provide high-throughput analytics to a specific experimental dataset. Peptide identification
Research Interests: Computational Biology, Mass Spectrometry, Flow Cytometry, Proteomics, Multidisciplinary, Signal Transduction, Protein-Protein Interaction, Humans, Platelet aggregation, Phosphorylation, PLoS one, Integrins, Protein-protein interaction networks, Resting State, Proteome, Gene Expression Regulation, Contextual Information, Signaling pathway, and Blood platelets
Research Interests: Algorithms, Artificial Intelligence, Machine Learning, Oxidative Stress, Gene expression, Gene Prediction, Biological Sciences, Mathematical Sciences, Mice, Animals, Cellular Network, Mutual Information, Lung, Gene Regulatory Networks, Classification Algorithm, Molecular Interactions, Microarray Data, Gene expression profiling, Interaction Network, and Gene Expression Data
For scientific data visualizations, real-time data streams present many interesting challenges when compared to static data. Real-time data are dynamic, transient, high-volume and temporal. Effective visualizations need to be able to accommodate dynamic data behavior as well as abstract and present the data in ways that make sense to and are usable by humans. The Visual Content Analysis of Real-Time Data Streams project at the Pacific Northwest National Laboratory is researching and prototyping dynamic visualization techniques and tools to help facilitate human understanding and comprehension of high-volume, real-time data. The general strategy of the project is to develop and evolve visual contexts that will organize and orient high-volume dynamic data in conceptual and perceptive views. The goal is to allow users to quickly grasp dynamic data in forms that are intuitive and natural without requiring intensive training in the use of specific visualization or analysis tools and methods. Thus...
Proteins play a key role in cellular processes, making proteomics central to understanding systems biology. MS techniques provide a means to observe entire proteomes at a global level. Yet, high-throughput MS proteomics techniques generate data faster than it can currently be analyzed. The success of proteomics depends on high-throughput experimental techniques coupled with sophisticated visual analysis and data-mining methods. Visual analysis has been applied successfully in a number of fields plagued with huge, complex data sets and will likely be an important tool in proteomics discovery. PQuad, a novel visualization of MS proteomics data, provides powerful analysis capabilities that support a number of proteomic data applications. In particular, PQuad supports differential proteomics by simplifying the comparison of peptide sets from different experimental conditions as well as different protein identification or confidence scoring techniques. Finally, PQuad supports data validation and quality control by providing a variety of resolutions for huge amounts of data to reveal errors undetected by other methods.
Research Interests: Algorithms, Computer Graphics, Biomedical Engineering, Systems Biology, Mass Spectrometry, Proteomics, Molecular Biophysics, System Biology, Quality Control, Data Visualisation, Software, Visual Analysis, Pacific Northwest, Peptides, Proteins, Computer User Interface Design, Protein Sequence Analysis, Data Validation, Electrical And Electronic Engineering, and Gene expression profiling
The recent advances in high-throughput data acquisition have driven a revolution in the study of human disease and determination of molecular biomarkers of disease states. It has become increasingly clear that many of the most important human diseases arise as the result of a complex interplay between several factors including environmental factors, such as exposure to toxins or pathogens, diet, lifestyle, and the genetics of the individual patient. Recent research has begun to describe these factors in the context of networks which describe relationships between biological components, such as genes, proteins and metabolites, and have made progress towards the understanding of disease as a dysfunction of the entire system, rather than, for example, mutations in single genes. We provide a summary of some of the recent work in this area, focusing on how the integration of different kinds of complementary data, and analysis of biological networks and pathways can lead to discovery of r...
Research Interests: Genetics, Data Analysis, Systems Biology, Life Sciences, System Biology, Humans, Diagnosis, Data acquisition, Signal Transduction Pathway Models, Medical Physiology, Biological Network, Very high throughput, Flux Balance Analysis, Biological markers, Topological Analysis, Data Processing, Human Disease, Networked Systems, Gene Expression Data, and Environmental factor
The importance of understanding biological interaction networks has fueled the development of numerous interaction data generation techniques, databases and prediction tools. However, not all prediction tools and databases predict interactions with one hundred percent accuracy. Generation of high-confidence interaction networks formulates the first step towards deciphering unknown protein functions, determining protein complexes and inventing drugs. The CABIN: Collective Analysis of Biological Interaction Networks software is an exploratory data analysis tool that enables analysis and integration of interactions evidence obtained from multiple sources, thereby increasing the confidence of computational predictions as well as validating experimental observations. CABIN has been written in Java and is available as a plugin for Cytoscape--an open source network visualization tool.
Research Interests: Computational Biology, Visual Analytics, Network Visualization, Open Source, Biological Sciences, Protein-protein interactions, Protein-Protein Interaction, Software, Proteins, Protein Function, Computer User Interface Design, CHEMICAL SCIENCES, Protein Complex Detection, Exploratory Data Analysis, Exploratory Analysis, Internet, and Interaction Network
Research Interests: Algorithms, Biological Sciences, Protein-Protein Interaction, Computer Simulation, Mathematical Sciences, Case Study, BMC Bioinformatics, Proteins, Protein Sequence Analysis, Protein Interaction, Biological Process, Amino Acid Sequence, Parameter Optimization, Sensitivity and Specificity, Literature Search, Protein Binding, Molecular Sequence Data, Prediction Method, and binding sites
Research Interests: Bioinformatics, Algorithms, Computer Graphics, Data Analysis, Systems Integration, Biological Sciences, Software, Visual Analysis, Mathematical Sciences, Data Exploration, Computer User Interface Design, Multi Resolution Transform, Proteome, Gene expression profiling, Differential expression, and Difference Operator
Research Interests: Bioinformatics, Computational Biology, Systems Biology, Visual Analytics, Data Management, Proteomics, System Biology, Database Management Systems, Open Source, Biological Sciences, Software, Mathematical Sciences, Information Storage and Retrieval, Very high throughput, and Functional Annotation
Biologists and bioinformaticists face the ever-increasing challenge of managing large datasets queried from diverse data sources. Genomics and proteomics databases such as the National Center for Biotechnology Information (NCBI), Kyoto Encyclopedia of Genes and Genomes (KEGG), and the European Molecular Biology Laboratory (EMBL) are becoming the standard biological data department stores that biologists visit on a regular basis to obtain the supplies necessary for conducting their research. However, much of the data that biologists retrieve from these databases needs to be further managed and organized in a meaningful way so that the researcher can focus on the problem that they are trying to investigate and share their data and findings with other researchers. We are working towards developing a problem-solving environment called the Computational Cell Environment (CCE) that provides connectivity to these diverse data stores and provides data retrieval, management, and analysis through all aspects of biological study. In this paper we discuss the system and database design of CCE. We also outline a few problems encountered at various stages of its development and the design decisions taken to resolve them.
The recent development of high-throughput proteomics techniques has resulted in the exponential growth of experimental proteomics data. At the same time, the amount of published biological information--which includes not only journal articles but also gene sequences, ...
The Energy Citations Database (ECD) provides access to historical and current research (1948 to the present) from the Department of Energy (DOE) and predecessor agencies.
Systems biology research demands the availability of tools and technologies that span a comprehensive range of computational capabilities, including data management, transfer, processing, integration, and interpretation. To address these needs, we have created the bioinformatics resource manager (BRM), a scalable, flexible, and easy to use tool for biologists to undertake complex analyses. This paper describes the underlying software architecture of the BRM that integrates multiple commodity platforms to provide a highly extensible and scalable software infrastructure for bioinformatics. The architecture integrates a J2EE 3-tier application with an archival experimental data management system, the GAGGLE framework for desktop tool integration, and the MeDICi integration framework for high-throughput data analysis workflows. This architecture facilitates a systems biology software solution that enables the entire spectrum of scientific activities, from experimental data access to hig...
Simulation and modeling is becoming one of the standard approaches to understand complex biochemical processes. Therefore, there is a big need for software tools that allow access to diverse simulation and modeling methods as well as support for the use of these methods. Here, we present a new software tool that is platform independent, user friendly and offers several unique features. In addition, we discuss numerical considerations and support for the switching between simulation methods.
Motivation: Simulation and modeling is becoming a standard approach to understand complex biochemical processes. Therefore, there is a big need for software tools that allow access to diverse simulation and modeling methods as well as support for the usage of these methods. Results: Here, we present COPASI, a platform-independent and user-friendly biochemical simulator that offers several unique features. We discuss numerical issues with these features; in particular, the criteria to switch between stochastic ...