ASM-Sys Microbio Text
Systems Microbiology:
Beyond Microbial Genomics
By Merry R. Buckley
Copyright 2004
American Academy of Microbiology
1752 N Street, NW
Washington, DC 20052
http://www.asmusa.org
BOARD OF GOVERNORS,
AMERICAN ACADEMY OF MICROBIOLOGY
Carol A. Colgan
Director, American Academy of Microbiology
COLLOQUIUM PARTICIPANTS
John F. Alderete, Ph.D.
University of Texas Health Sciences Center, San Antonio
EXECUTIVE SUMMARY
ity, computational limitations associated with data integration, the lack of sufficient functional gene annotations,
needs for quantitative proteomics, and the inapplicability of current high-throughput methods to all areas of
systems microbiology. Difficulties have also been
encountered in acquiring the necessary data, assuring
the quality of that data, and in making data available to
the community in a useful format.
Problems with data quality assurance and data availability could be partially offset by launching a dedicated
systems microbiology database. To be of greatest value
to the field, a database should include systems data
from all levels of analysis, including sequences, microarray data, proteomics data, metabolite measurements,
data on protein-protein or protein-nucleic acid interactions,
carbohydrate and small RNA profiles, information on
cell surface markers, and appropriate supporting data.
Regular updates of these databases and adherence to
agreed-upon data format standards are critical to the
success of these resources.
It was recommended that educational requirements
for undergraduate and graduate students in microbiology be amended to better prepare the next generation
of researchers for the quantitative requirements of
applying systems microbiology methods in their work.
INTRODUCTION
Since the inception of microbiology, the field has generally embraced reductionism, focusing on increasingly
smaller details of microorganisms over time. Systems
microbiology complements that trend, seeking to
explain the properties that arise from interactions of the
smaller parts of an organism or between members of a
microbial community. Identifying each of the genes and
applications, including the degradation of persistent toxic chemicals that would otherwise
poison soils and water supplies. Engineered bacterial strains have also been used as microbial
factories for generating ethanol (an important
biofuel), feed additives, and pharmaceuticals.
Microbial production of these materials can be
more cost-effective than production by traditional methods.
experiments under strictly controlled conditions. Systems microbiology is poised to capitalize on these
aspects of microbes in order to provide new insights
into their operation and to develop the platforms that
can be applied to other living systems.
With the exception of fungi and certain other eukaryotic microbes, microorganisms carry relatively small
genomes. The size of microbial genomes has enabled
researchers to sequence the entire genetic material of
thousands of viruses and hundreds of bacteria and
archaea, and many more are sequenced every year.
The ability to easily sequence major stretches of microbial genomes is particularly useful in studying the
molecular and genetic basis of evolution, a phenomenon that is amenable to an experimental approach in
microbial systems but is more difficult to explore in
multicellular organisms.
POTENTIAL APPLICATIONS OF SYSTEMS MICROBIOLOGY
Biocontrol. The use of helper microbes to eliminate or control undesired microbial populations
could be made possible through a systems
understanding of their interactions in soil and
aquatic communities.
Pollution and bioremediation. Water and soil
quality management systems could be optimized
with systems approaches.
require a consortium of scientists. Some of the professionals that are needed for these investigations include:
Microbiologists,
Biochemists,
Evolutionary biologists,
Mathematicians,
Computer scientists,
Physicists,
Chemists,
Control theorists,
Systems engineers,
Geochemists,
Atmospheric chemists,
Chemical and physical oceanographers,
Earth scientists, and
Biostatisticians.
While specialists in each of these fields can make
contributions to developing a systems approach to
microbiology, interdisciplinary researchers familiar with
microbiology will be in an excellent position to advance
the field.
REGULATION
RESEARCH IN SYSTEMS MICROBIOLOGY
A number of specific issues in systems microbiology
research merit discussion, including the scientific fields
that should be involved in projects of this type, how
systems microbiology can be applied to studying biological regulation and microbial communities, the role of
a systems approach in hypothesis-generating research,
and how the presence of noise in biological systems
will affect progress in systems microbiology.
TO
SYSTEMS
OF
BIOLOGICAL SYSTEMS
TO
species. Prior to the genome mining work, these organisms were thought to be non-motile, but following the
discovery of motility genes, experiments were implemented that confirmed their motility. Both tactics,
hypothesis-generating research and hypothesis-driven
research, are required for successful application of systems approaches.
MEASURING NOISE IN BIOLOGICAL SYSTEMS
Noise is a problematic factor in all branches of experimental science, biology included. In biology, some of
the difficulty lies in separating measurement noise, the
irreproducible quantitative variation due to observational
factors, from true biological noise, the irreproducible
variation exhibited by biological systems. Moreover, stochastic processes have been shown to drive some
biological phenomena, so biological noise cannot be
dismissed from data analysis lest a significant underlying process be ignored. Hence, noise is an inescapable
part of biological science and must be addressed in
investigations that employ systems approaches. The
ability to measure noise in a given biological system will
depend upon the ability to quantify its relevant features.
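One way to make the distinction between measurement noise and biological noise concrete is a nested replicate design, in which each biological sample is measured more than once. The sketch below is illustrative only: the function name, the balanced-design assumption, and the numbers are hypothetical and do not come from the colloquium. It uses a standard one-way variance-components estimate, which is one of several possible approaches.

```python
from statistics import mean, variance


def decompose_noise(replicates):
    """Estimate measurement and biological variance from nested replicates.

    `replicates` holds one inner list of repeated (technical) measurements
    per biological replicate. Assumption: the design is balanced, i.e. every
    biological replicate was measured the same number of times.
    """
    n_tech = len(replicates[0])
    if len(replicates) < 2 or n_tech < 2:
        raise ValueError("need at least 2 replicates at each level")
    if any(len(r) != n_tech for r in replicates):
        raise ValueError("design must be balanced")

    # Measurement noise: average variance among repeated readings
    # of the same biological sample.
    measurement_var = mean([variance(r) for r in replicates])

    # The variance of the per-sample means contains the biological variance
    # plus measurement_var / n_tech, so subtract that share (floor at zero).
    means_var = variance([mean(r) for r in replicates])
    biological_var = max(0.0, means_var - measurement_var / n_tech)
    return measurement_var, biological_var


# Hypothetical expression readings: 3 cultures, each measured twice.
readings = [[10.0, 12.0], [20.0, 22.0], [30.0, 32.0]]
measurement_var, biological_var = decompose_noise(readings)
```

In this toy data set the repeated readings of a culture differ only slightly while the cultures differ strongly from one another, so most of the variation is attributed to the biological component.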
TECHNICAL CHALLENGES IN SYSTEMS MICROBIOLOGY
Systems microbiology is an evolving field, and most of its successes lie in the future. However, to realize
this potential, the technical challenges that lie along the
path ahead must be acknowledged, managed, and overcome. Technical bottlenecks range from difficulties in
identifying and quantifying cellular constituents to limitations in the ability to cultivate diverse microbes or
monitor the activity of complex microbial communities.
Difficulties also arise in acquiring and cataloging the
necessary information and research data and in assuring data quality.
A database of systems microbiology information
would facilitate the process of overcoming these difficulties in data acquisition and quality assurance and
would most likely prove to be hugely advantageous for
the field.
TECHNICAL BOTTLENECKS
Since the field's inception, progress in microbiology has proceeded in lockstep with advances in technology. This
dependence on technology also extends to systems
microbiology, and a number of methodological difficulties must be resolved for the field to move forward.
Many of these bottlenecks are universal and apply to
multiple systems, but others are more system-specific.
Data accessibility
Enormous quantities of biological data have been accumulated over the years, including genome sequences,
annotations, biochemical information, microarray profiles, and other types of information. These data could
serve as an invaluable resource for developing a systems
understanding of microbes; however, many important
data sets are unavailable in public databases and others
are not amenable to storage or comparison in a database format (video images, etc.). Ideally, these data sets
should be stored in searchable, cross-referenced databases that allow researchers ready access to
information pertaining to their individual research
Cultivation
The inability to isolate and cultivate many types of
microbes has long limited the range of organisms that
are available for analysis. Although the vast majority of
microbes resist cultivation by traditional methods, it has
been proposed that many more strains would yield to
cultivation efforts if novel, imaginative approaches were
used. Systems microbiology would benefit greatly from
renewed efforts to cultivate diverse strains or consortia
of microbes from different environments, since the ability to study individual strains under lab conditions is
often a key to experimentation.
OTHER
MICROBIOLOGY INCLUDE:
Computational limitations
Even given the unprecedented advancements in computing power that have been achieved over the past
decade, certain computational limitations still impose
restrictions on the type and dimension of systems modeling that can be accomplished. Although a
bottom-up modeling approach, in which the activity
of a system is simulated from the known or suspected
activity of its components, may be too complicated to
tackle for a number of years to come, greater computing
power could enable a top-down microbial modeling
approach in the near future. In top-down models, measures of the end products of the system are used to
predict the activity of the system. In these models, it
may be most appropriate to focus on the role of those
proteins known as master regulators (which direct the
responses of the cell and the cell cycle) and the top-down regulatory architecture. The development of this
type of model could enable the subsequent modeling of
collections of interacting organisms in simple consortia
or complex communities.
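The contrast between the two modeling directions can be illustrated with a deliberately small toy example. Everything in it is an assumption made for illustration: the single-protein rate equation, the parameter values, and the function names are hypothetical, and real systems models are far larger. The bottom-up function simulates an output from assumed component kinetics, while the top-down function starts from measured outputs and infers an effective system-level gain without reference to the underlying mechanism.

```python
def simulate_bottom_up(regulator, k_syn, k_deg, dt=0.01, t_end=50.0):
    """Bottom-up: integrate dP/dt = k_syn * regulator - k_deg * P from P = 0.

    Uses simple forward-Euler steps; the rate constants are assumed known.
    """
    protein = 0.0
    for _ in range(int(t_end / dt)):
        protein += dt * (k_syn * regulator - k_deg * protein)
    return protein


def fit_top_down(regulator_levels, measured_outputs):
    """Top-down: least-squares slope through the origin.

    Fits output ~ gain * regulator from end-product measurements alone.
    """
    numerator = sum(r * y for r, y in zip(regulator_levels, measured_outputs))
    denominator = sum(r * r for r in regulator_levels)
    return numerator / denominator


# Stand-in "measurements": here they are generated by the bottom-up model,
# whereas in practice they would come from experiments.
levels = [1.0, 2.0, 4.0]
outputs = [simulate_bottom_up(r, k_syn=2.0, k_deg=0.5) for r in levels]

# The fitted gain should be close to k_syn / k_deg, i.e. about 4.0.
gain = fit_top_down(levels, outputs)
```

The top-down fit recovers only the ratio of the two rate constants, not each constant separately, which reflects the trade-off: less mechanistic detail in exchange for a much smaller modeling problem.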
Proteomics
In order to promote the success of systems microbiology, the development of new proteomics approaches
needs to continue. To date, proteomics (the study of
the full complement of proteins encoded by the genome of an
organism) has been slow to generate the types of quantitative data needed for use in systems applications.
High-throughput technologies
Although some high-throughput technologies are
relatively mature, such as the use of microarrays for
gene expression measurements, many of the data
needed for systems microbiology currently cannot
be obtained using high-throughput techniques, presenting a serious limitation on the rate at which the
field can proceed. Additional high-throughput technologies are needed. These include, but are not
limited to, technologies for producing and characterizing the
vast array of proteins encoded in microbial genomes,
identifying and characterizing the interactions among
the various proteins and other macromolecules,
determining the type and concentration of intracellular metabolites and extracellular signal molecules at a
given cell state, accurately monitoring the presence
of key regulatory RNA molecules, and determining
the suite of surface components that can often drive
microbe-microbe and microbe-surface interactions
in biofilms.
INFORMATION AND DATA GAPS
Needed data
A great deal of data that may be of use in systems
microbiology has already been generated. Genome
sequences from cultivated and non-cultivated organisms are a considerable resource for this effort, as are
the gene inventories that have characterized the genetic
diversity of a broad range of different environments.
Other information, including microarray data, exists but
is not readily available to the research community in the
form of an easily accessible public database. There is
also concern regarding the relative quality of these data.
Strategic decisions are needed in collecting the additional data necessary to pursue systems microbiology.
It may be advisable, for example, in the effort to define
microbial systems, to weigh the relative advantages of
studying one microbe under multiple different conditions rather than studying many different strains under
one condition.
Continued genome sequencing is crucial to the
efforts of systems biologists. In particular, more
sequences are needed from more organisms derived
from a greater range of habitats. Dense sequencing of
certain branches of the bacterial and archaeal phylogenetic trees can provide important information about the
mechanisms of cellular differentiation and evolution.
For example, many closely related Proteobacteria have
widely diverse morphologies, niches, and metabolic
systems. Although certain regulatory molecules are
highly conserved in this branch of the tree, they have
been found to control widely different functions in each
of the different species. Systems analysis of genome
sequences and organization could determine how these
control systems evolved.
Genome sequences can only get systems biology so
far, however. It is also critical to obtain more data on
the functions of the proteins or nucleic acids encoded
by these sequences. Scientists currently rely too heavily
on homology mapping in inferring the functions of proteins and protein domains. Homology mapping involves
comparing a given protein sequence to the sequences
of proteins that have been characterized previously.
Unfortunately, not all protein functions assigned in public databases are accurate, creating a situation in which
the functions of many proteins remain unknown or are
possibly misconstrued. Such cases of misassignment
can hamper advances in the field. The results of
sequence homology analyses can also be vague; often,
logical networks,
Data that expand the phenotypic characteriza-
tems, and
Data on cell cycle or spatial events.
ACQUIRING INFORMATION AND DATA
Difficulties have arisen in collecting the data necessary for systems microbiology investigations. Existing
disparities in formatting and data standardization
between databases, for example, can hinder the ability
of researchers trying to acquire information from prior
investigations. Often, the necessary information is
stored in a database or in the literature, but the representations or archive layout can prohibit easy access.
These data are seldom subjected to a quality assessment, and researchers access these resources at their
peril. Also, the representation of biological data in these
databases often reflects a cultural gap in understanding
between the computational scientists who design these
databases and the biologists who use the data. In general, there is not enough support for the databases and
other information sources critical to the advancement of
systems microbiology.
The documentation of metabolic pathways poses a
particular problem; there is a pressing need for standardization in these data sets. Scientific societies,
journals and/or funding agencies should move to set
standards for formatting of metabolic and regulatory
pathway information. These formatting guidelines
could apply to the type and organization of metabolic
pathway information provided in research publications
and submitted to public databases. It is also crucial
that these data are stored in a searchable way that
allows researchers to explore pathways through a database interface.
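As a rough illustration of what a standardized, searchable pathway record might look like, the sketch below defines a small record type and a lookup function. The field names and the `find_reactions` helper are hypothetical; they do not correspond to any existing database schema or to a format endorsed by the colloquium.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReactionRecord:
    """One pathway step in a hypothetical standardized format."""
    reaction_id: str
    pathway: str
    substrates: tuple
    products: tuple
    enzyme: str
    organism: str

    def __post_init__(self):
        # Reject incomplete submissions at creation time.
        for name in ("reaction_id", "pathway", "enzyme", "organism"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} must not be empty")
        if not self.substrates or not self.products:
            raise ValueError("need at least one substrate and one product")


def find_reactions(records, compound):
    """Return records in which `compound` is a substrate or a product."""
    wanted = compound.lower()
    return [
        r for r in records
        if wanted in (c.lower() for c in r.substrates + r.products)
    ]


# Illustrative entries: the first two steps of glycolysis.
records = [
    ReactionRecord("R0001", "glycolysis",
                   ("glucose", "ATP"), ("glucose 6-phosphate", "ADP"),
                   "hexokinase", "Saccharomyces cerevisiae"),
    ReactionRecord("R0002", "glycolysis",
                   ("glucose 6-phosphate",), ("fructose 6-phosphate",),
                   "glucose-6-phosphate isomerase", "Saccharomyces cerevisiae"),
]

hits = find_reactions(records, "Glucose 6-phosphate")
```

Because every record carries the same fields, a query by compound finds both the reaction that produces it and the reaction that consumes it, which is the kind of pathway exploration a common format makes possible.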
Due to the high costs of many of the technologies
necessary for carrying out the work of systems microbiology, it may be advisable to develop central locations
of excellence to serve as a resource for the scientific
community. High resolution microscopy, other imaging
or modeling technologies, and high-throughput data
production facilities, for example, which are too costly
for many labs or universities to provide, could be housed
ASSURING DATA QUALITY
The problem of how to ensure and validate the quality of data available in public databases has become
more and more challenging with the genomics data
explosion of recent years. Physics and other fields have
grappled with large data sets in the past, and the manner in which the dilemma was solved in those areas
may offer lessons to biology today. The contributions of
computer scientists and information specialists will be
crucial in resolving the issue of information management, but it is important that biologists work closely
with these professionals to ensure the utility of the
information systems that are developed.
Genomic data should not be treated as static; mechanisms need to be put into place to continuously
update the sequence and annotation data available in
public databases. One serious impediment to maintaining data timeliness is the inability or reluctance of most
scientists to commit to the long-term upkeep of their
publicly available data. When a student graduates, or
the funding for a project is exhausted, the impetus to
continue updating and actively curating small databases is often lost.
The manner in which genomics and other critical
data are generated should be standardized and
reported in both the archival databases and publications. For example, microarray data are useful
only if the details of the experimental conditions and
analyses are recorded and documented in an accessible format. This type of information must be included
with genomics data sets to ensure the results are interpreted properly.
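A database or journal could enforce this requirement mechanically by checking each submission for the experimental details it needs. The following is a minimal sketch under stated assumptions: the list of required fields and the function name are hypothetical and are not taken from any published reporting standard.

```python
# Hypothetical list of experimental details required with each data set.
REQUIRED_FIELDS = (
    "organism",
    "strain",
    "growth_conditions",
    "platform",
    "normalization_method",
)


def missing_metadata(submission):
    """Return the required fields that are absent or blank in `submission`."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = submission.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# Illustrative submission that omits how the data were produced and analyzed.
submission = {
    "organism": "Escherichia coli",
    "strain": "K-12",
    "growth_conditions": "minimal medium, 37 C",
    "platform": "",
}

problems = missing_metadata(submission)
```

A submission would be accepted only when the returned list is empty, which gives reviewers and curators a simple, uniform test.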
Quality assurance and reproducibility standards for
systems microbiology data that are consistent across all
the relevant journals and agencies need to be established. Once proper standards are developed they could
be implemented through peer review and incentives
from funding agencies.
In applying for funding, researchers are strongly
advised to allot money and personnel time to quality
assurance tasks.
AND
AVAILABLE
Efforts to standardize experiments, analyses, and biological materials across microbiological fields will aid in
A PROPOSAL FOR A SYSTEMS MICROBIOLOGY DATABASE
sometimes from day to day during the course of a sequencing project. The database will need to be updated with
respect to new species designations and gene numbering, but these updates should be carried out in a rational way that
minimizes confusion for database users.
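One way to update gene numbering without confusing users is to keep every retired identifier as an alias of its replacement, so that older publications and notes still resolve. The class below is a sketch of that idea only; the class name, methods, and identifiers are hypothetical and do not describe any existing database.

```python
class GeneIdRegistry:
    """Tracks renamed gene identifiers so that retired IDs keep resolving."""

    def __init__(self):
        self._current = set()
        self._alias = {}  # retired ID -> the ID that replaced it

    def add(self, gene_id):
        self._current.add(gene_id)

    def rename(self, old_id, new_id):
        if old_id not in self._current:
            raise KeyError(f"unknown or already retired ID: {old_id}")
        # Refusing reused IDs also guarantees alias chains never form a cycle.
        if new_id in self._current or new_id in self._alias:
            raise ValueError(f"ID already in use: {new_id}")
        self._current.remove(old_id)
        self._current.add(new_id)
        self._alias[old_id] = new_id

    def resolve(self, gene_id):
        """Follow the alias chain from any past ID to the current one."""
        while gene_id in self._alias:
            gene_id = self._alias[gene_id]
        if gene_id not in self._current:
            raise KeyError(f"unknown ID: {gene_id}")
        return gene_id


# Illustrative use with made-up identifiers.
registry = GeneIdRegistry()
registry.add("orf0012")
registry.rename("orf0012", "geneA_draft")
registry.rename("geneA_draft", "geneA")
current_id = registry.resolve("orf0012")
```

A user who recorded the earliest identifier still reaches the current entry, even though the numbering changed twice in the meantime.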
EDUCATION, TRAINING, AND COLLABORATION IN SYSTEMS MICROBIOLOGY
Systems microbiology represents a new and powerful way of thinking that was necessitated by the
ever-increasing volumes of biological data becoming
available and the comprehensive nature of genome
sequencing, which provides exhaustive information on
the genetic contents of cells and organisms. As high-throughput technologies and access to large data sets
accelerate the pace of discovery, the need for a systems
approach to interpret and exploit these volumes of data
becomes increasingly apparent. This new point of view
needs to be reflected in our approach to education and
training in microbiology at the undergraduate and graduate levels. Of critical concern is the encouragement and
training of professionals who are interested in learning
how to integrate complex biological information using a
systems approach.
TO
USE
INSTRUMENTATION TRAINING
Progress in biology is paced by instrumentation; steps
forward are enabled by new technologies, and the field
can be stymied if new tools are not made available
or are unused. Broader training in the high-throughput
techniques and new analytical, microscopic and modeling
instrumentation associated with systems microbiology
is highly recommended so that more microbiologists
can take advantage of the most current technological
advances. In their training, students should be exposed
to a number of techniques and approaches so that they
COLLABORATION IN SYSTEMS MICROBIOLOGY
Collaboration is a key ingredient for success in systems microbiology. However, the details of collaboration,
including communication between distant institutions,
questions of where to situate collaborating scientists,
student co-mentoring, co-authorship, and issues to do
with funding, tenure or promotion can hamper effective
cooperation and the career development of the researchers involved. These challenges must be dealt with to enable
effective interdisciplinary research and collaboration.
proposals may exceed the expertise of individual assessors. It may be necessary to develop special panels to
evaluate these proposals.
International collaborations
Researchers in systems microbiology should take full
advantage of opportunities to engage in international
collaboration. The scientific community involved in this
particular topic is extremely cosmopolitan, and researchers from different nations have their own traditional
strengths to lend to a given project. In Europe, for example, researchers in Germany and the Netherlands are
known for their work in physiological diversity, Poland
and Hungary have strong credentials in marrying biology
to math and physics, and Romanian scientists have
maintained expertise in histological techniques that can
be applied to solving systems microbiology questions.
COMMUNICATING SYSTEMS MICROBIOLOGY TO THE PUBLIC
MODES OF COMMUNICATION
into high school curricula; experimentation with microbial systems can offer relatively low-cost, high-impact
lessons for young people. A program that encourages
graduate students in microbiology to reach out to the
community could establish a long-term commitment to
conveying the importance and excitement of microbiology research to the public.
THE ROLE OF THE SCIENTIFIC COMMUNITY
RECOMMENDATIONS
Systems microbiology requires a balance between
breadth and depth of knowledge in the professionals engaged in collaborative research. It is
necessary for each of the specialists involved to be
knowledgeable in their own field, but they must also
have sufficient knowledge of the other areas of the
project to contribute fully. Academic and research
programs should be aware of these issues as they
prepare professionals to function as leaders in their
field. Mathematics and computational training for
students of microbiology must also be improved
and expanded to encourage the development of the
numerical skills necessary to meet the challenges of
systems approaches.
A centralized database for systems microbiology
data is critical. The ability to use archival data will be
necessary to realize the full potential of systems
microbiology research.
- The database should enable rapid submission and
retrieval of high quality data in standardized formats.
This database should include: annotated genome
sequence data, metabolic pathway data, microarray
data, proteomics data, data from metabolite studies, data on protein-protein interactions, data on
protein-nucleic acid complexes, carbohydrate profiles, and profiles of cell surface markers.