
Rhodes 2010


Environmental Microbiology (2010) 12(9), 2613–2623 doi:10.1111/j.1462-2920.2010.02232.

Amino acid signatures of salinity on an environmental scale with a focus on the Dead Sea

Matthew E. Rhodes,1,2* Sorel T. Fitz-Gibbon,1,3 Aharon Oren4 and Christopher H. House1,2

1 Penn State Astrobiology Research Center and 2 Department of Geosciences, Pennsylvania State University, University Park, PA 16802, USA.
3 Center for Astrobiology, Institute of Geophysics and Planetary Physics, University of California, Los Angeles, CA 90095, USA.
4 The Institute of Life Sciences, and the Moshe Shilo Minerva Center for Marine Biogeochemistry, The Hebrew University of Jerusalem, 91904 Jerusalem, Israel.

Received 7 July, 2009; accepted 2 March, 2010. *For correspondence. E-mail mer251@psu.edu; Tel. (+1) 617 519 4778; Fax (+1) 814 863 7823.
© 2010 Society for Applied Microbiology and Blackwell Publishing Ltd

Summary

The increase of the acidic nature of proteins as an adaptation to hypersalinity has been well documented within halophile isolates. Here we explore the effect of salinity on amino acid preference on an environmental scale. Via pyrosequencing, we have obtained two distinct metagenomic data sets from the Dead Sea, one from a 1992 archaeal bloom and one from the modern Dead Sea. Our data, along with metagenomes from environments representing a range of salinities, show a strong linear correlation (R² = 0.97) between the salinity of an environment and the ratio of acidic to basic amino acids encoded by its inhabitants. Using the amino acid composition of putative protein-encoding reads and the results of 16S rRNA amplicon sequencing, we differentiate recovered sequences representing microorganisms indigenous to the Dead Sea from lateral gene transfer events and foreign DNA. Our methods demonstrate lateral gene transfer events between a halophilic archaeon and relatives of the thermophilic bacterial genus Thermotoga and suggest the presence of indigenous Dead Sea representatives from 10 traditionally non-hyperhalophilic bacterial lineages. The work suggests the possibility that amino acid bias of hypersaline environments might be preservable in fossil DNA or fossil amino acids, serving as a proxy for the salinity of an ancient environment. Finally, both the amino acid profile of the 2007 Dead Sea metagenome and the V9 amplicon library support the conclusion that the dominant microorganism inhabiting the Dead Sea is most closely related to a thus far uncultured relative of an alkaliphilic haloarchaeon.

Introduction

Advances in DNA sequencing have made it possible to study the genetic make-up of entire environments with theoretically little bias. Thus, the GC content or gene content of an environment can be ascertained yielding valuable information as to the make-up and metabolic capabilities of microorganisms present (Tyson et al., 2004; Venter et al., 2004). Another parameter of interest is the distribution of amino acids coded within the genomes of an environment. It has been documented for individual species that various environmental stresses, such as extreme acidity or extreme salinity (Haney et al., 1999; Goodarzi et al., 2008; Paul et al., 2008), can bias their amino acid composition due to their desired chemical characteristics. Here we investigate whether the pattern of encoded amino acids can be indicative of environments as well.

Halophilic microorganisms have developed two strategies to deal with the multimolar salinities of their environments. All eukaryotic species, most halophilic Bacteria and the halophilic methanogenic Archaea build up concentrations of organic solutes (osmolytes), to balance the osmotic pressure. This ‘salt-out’ method allows the internal mechanisms of the cell to remain in their native states but requires a high energy cost to manufacture the organic molecules. In contrast, Archaea of the order Halobacteriales, as well as a limited number of halophilic Bacteria, accumulate high concentrations of salts, typically KCl, within their cytoplasm. This ‘salt-in’ method is energetically more efficient but requires the adaptation of intracellular proteins to high salt concentrations (Reistad, 1970; Lanyi, 1974; Oren, 1986; 1999; 2002). The presence of high quantities of K⁺ alters the intracellular environment, thereby interfering with protein interactions (Lanyi, 1974; Madern et al., 2000). This necessitates a number of changes in protein structure to maintain proper protein function. Lanyi in 1974 (Lanyi, 1974) summarized the adaptations of proteins to extreme salinity. Included in Lanyi’s summary is an overall increase in acidic amino

acids which is offset by an overall decrease in basic amino acids. This trend has been demonstrated in the genomes of hyperhalophilic salt-in Archaea such as Halobacterium NRC-1 (Ng et al., 2000). In salt-out halophiles, however, only proteins exposed to the hypersaline medium exhibit an excess of acidic amino acids (Oren et al., 2005).

At a salinity of 347 g l⁻¹, the modern Dead Sea represents an especially inhospitable environment at the extreme of hypersalinity. Due to changing weather patterns and increased water usage, the salinity of the Dead Sea surface water has risen steadily from 269 g l⁻¹ in the 1930s (Volcani, 1944) to its current value, with two notable exceptions. In 1980 and 1992 heavy winter rains created a net positive water budget and diluted the surface waters of the Dead Sea to 200 g l⁻¹ and 170 g l⁻¹ respectively (Gavrieli et al., 1999). The dilutions allowed the establishment of a bloom of the alga Dunaliella. The algal blooms released organic material, most likely including the organic osmolyte glycerol. These compounds in turn provided the energy for subsequent blooms of halophilic Archaea with cell counts exceeding 3 × 10⁷ per ml in the late spring of 1992 (Oren and Gurevich, 1995).

We have used metagenomic methods to analyse the encoded amino acid distribution of two disparate Dead Sea ecosystems, that of the modern Dead Sea of March 2007 and that of a properly prepared and frozen sample from the archaeal bloom of September 1992. We have compared these ecosystems with a number of environments for which multiple similar data sets are available: a Spanish saltern (Legault et al., 2006), the deep sea subsurface (Biddle et al., 2008), mammalian guts (Gill et al., 2006; Turnbaugh et al., 2006), whale falls (Tringe et al., 2005), and the moderately saline Guerrero Negro, Baja California microbial mats (Spear et al., 2003; Ley et al., 2006; Kunin et al., 2008; Robertson et al., 2009). We employed these results to demonstrate that a metagenomic amino acid profile is characteristic of an environment and we suggest the use of the ratio of acidic amino acids to basic amino acids encoded within a metagenome or preserved within fossilized peptides as a proxy for the salinity of highly hypersaline paleoenvironments. We also used the acidic nature of proteins in salt-in halophiles to differentiate, with sequence data alone, organisms that naturally inhabit the Dead Sea from lateral gene transfer (LGT) events, and/or the presence of foreign species.

Results

DNA was extracted from two water samples collected from the Dead Sea in September of 1992 and March of 2007 respectively. A portion of the 1992 sample was subjected to whole-genome amplification (Dean et al., 2001). Subsequently, a half plate of both the amplified 1992 sample and the unamplified 2007 sample were sequenced on a 454 Life Sciences/Roche FLX sequencer. The 2007 half plate yielded a total of 243 816 unique reads with an average length of 250.8 base pairs. The 1992 half plate yielded a total of 137 137 unique reads with an average read length of 240.1 base pairs. These data sets were compared with the non-redundant protein database using BLASTX (Altschul et al., 1997). With a cut-off e-value of 10⁻⁵, the 2007 data set returned 139 345 or 57% of the reads as having homology and the 1992 data set returned 15 301 or 11% of the reads as having homology. The discrepancy in the proportion of hits is likely caused by the amplification process producing chimeric DNA in the 1992 sample. For each top hit the portion of the read that matched to a homologous protein by BLASTX was extracted, and its amino acid composition was tallied. A similar analysis was performed on BLASTX output from three data sets from the Peru Margin subseafloor (Biddle et al., 2008) as well as a number of publicly available protein metagenomes for which there were multiple data sets. These consisted of metagenomes from a Spanish saltern (Legault et al., 2006), 10 layers of a moderately saline microbial mat from Guerrero Negro (Kunin et al., 2008), three metagenomes from whale falls (Tringe et al., 2005) and seven metagenomes from mammalian guts (Gill et al., 2006; Turnbaugh et al., 2006). Additionally the approximately 70-base-pair V9 hypervariable region of the 16S rRNA gene was amplified from the 2007 sample and the amplification product was sequenced on the 454 FLX sequencer. The resulting 29 673 quality-controlled 16S amplicons were assigned to taxa by a BLASTN comparison as described in Experimental procedures.

Cluster analysis

We standardized the raw counts data by first dividing through by the site (row) totals and then by the amino acid (column) maximums. Subsequently we performed a hierarchical cluster analysis on both the site data and the amino acid data using Ward’s method (Ward, 1963) and the Euclidean distance (Fig. 1). Both analogous environments and different samplings from the related environments cluster together. The two most hypersaline environments, the 2007 Dead Sea and the Spanish saltern, group closely together. These two in turn cluster with the 1992 Dead Sea, albeit rather distantly, and the three hypersaline environments are most closely related to all 10 of the moderately saline Guerrero Negro mat metagenomes. In the bottom half of the dendrogram we see the deep sea subsurface clustering together, the mammalian guts clustering together, and two of the three whale falls clustering together. Presumably the selective pressure on amino acid preference imparted by salinity


Fig. 1. Two-dimensional ‘heat plot’ showing the hierarchical clustering of environments (y-axis) and amino acids (x-axis). The two clusterings
are based on comparisons of the amino acid content of each metagenome. The scales adjacent to the dendrograms give the Euclidean
distances. Also shown along the right side, for reference, are the salinities and the lysine and aspartic acid contents of the environments.
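
The standardization described under Cluster analysis (divide each site row by its total, then each amino acid column by its maximum) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy counts, the three-column layout and the function names `standardize` and `euclidean` are hypothetical and do not come from the paper, and the paper does not say which software was used. Ward clustering itself is not implemented here; a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` with `method="ward"` could be applied to the standardized rows.

```python
from math import sqrt


def standardize(counts):
    """Divide each row by its total, then each column by its maximum.

    Assumes every row total and every column maximum is non-zero.
    """
    # Step 1: site (row) totals turn raw counts into proportions.
    proportions = [[value / sum(row) for value in row] for row in counts]
    # Step 2: amino acid (column) maxima rescale each column to peak at 1.
    col_max = [max(col) for col in zip(*proportions)]
    return [[value / m for value, m in zip(row, col_max)] for row in proportions]


def euclidean(a, b):
    """Euclidean distance between two standardized site profiles."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical toy counts: rows are sites, columns are three amino acids.
toy_counts = [
    [80, 20, 100],  # site 0
    [60, 60, 80],   # site 1
    [50, 70, 80],   # site 2
]

std = standardize(toy_counts)
d01 = euclidean(std[0], std[1])  # distance between sites 0 and 1
d12 = euclidean(std[1], std[2])  # distance between sites 1 and 2
```

In this toy table sites 1 and 2 have similar profiles, so `d12` comes out smaller than `d01`, which is the kind of relationship the dendrogram summarizes.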

does not become a significant factor until salinities greater than those of the marine subsurface (35‰). Thus, when taken as a whole, environments have characteristic signatures of amino acid usage throughout their entire metagenomes.

The rightmost portion of the upper dendrogram (Fig. 1) includes all the amino acids encoded primarily by As and Ts except for methionine. This clustering is likely created by the extremely high GC content of most hyperhalophilic genomes, often upwards of 60%. The 2007 Dead Sea metagenome and the 1992 Dead Sea metagenome encode 67% and 62% GCs respectively (see Appendix S1). Due to the dominance of the particularly GC-poor hyperhalophile, Haloquadratum walsbyi, the Spanish saltern has an overall GC content of only 54%.

As expected, the Asp values are especially high in the hypersaline environments and decrease with decreasing salinity. Also as expected, the Glu levels are extremely high in both the Spanish saltern and the Dead Sea, yet they do not display an increased prevalence in the moderately saline Guerrero Negro mats. Conversely, the Lys values are especially low in the Dead Sea and increase with decreasing salinity. Arg, however, displays a counterintuitive trend. As a basic amino acid, Arg would be expected to have a lower proportion in saline environments. Nevertheless, the proportion of encoded Arg increases with salinity.

A potential driver for these opposing tendencies is the GC content of the codons associated with each amino acid. The amino acid Lys is encoded by the two codons AAA and AAG, both of which have a bias towards AT nucleotides. In contrast, Arg is encoded by the six codons, CGT, CGC, CGA, CGG, AGA and AGG, for which five of the six codons have a bias towards GC nucleotides. A number of studies have suggested that high genomic GC content is an almost universal adaptation to hypersaline environments (Kennedy et al., 2001; Soppa, 2006) and, as mentioned above, the two Dead Sea metagenomes have GC contents of over 60%. In contrast, the Guerrero Negro mat metagenomes have GC contents of about 55%, and the non-saline metagenomes have GC contents of about 50%. Thus, based upon mutational bias alone we would expect to see a discrepancy between Lys and Arg in the GC-rich Dead Sea.

This might suggest that mutational bias alone is the driving force behind the decrease in Lys and the increase in Arg. However there is one known exception to the GC-rich salt-in hyperhalophile rule. The hyperhalophilic


archaeon Hqr. walsbyi has a genome with a GC content of only 47.9% (Bolhuis et al., 2006). As with other hyperhalophiles, Hqr. walsbyi also encodes a low proportion of Lys, 2.33% (see Appendix S1), suggesting that a decrease in Lys is universal and not merely a product of GC bias. On the other hand, the Arg content encoded by Hqr. walsbyi is at relatively normal levels, 5.80%, indicating that the increase in Arg in the Dead Sea is potentially caused by the GC bias of the environment. Other notable trends are the high Ala values in the saline environments, the high Cys values in the deep sea subsurface and the mammalian guts, and the high Tyr values in the Dead Sea. Tyr like Lys is encoded solely by AT-biased codons and would be expected to be relatively uncommon in the GC-rich Dead Sea.

Redundancy analysis

A redundancy analysis was performed with salinity and GC content encoded as the dependent variables (Fig. 2). Fifty-two per cent of the variance is explained by the first constrained axis (x-axis) and another 4% by the second constrained axis (y-axis). The two environmental variables both plot strongly negative along the first axis, while they differ along the second axis. GC content plots negatively along the second axis and salinity plots positively. As predicted, Asp and Glu plot tightly along the salinity trend. Meanwhile Arg, Ala and Trp, all amino acids encoded exclusively or primarily by GC-biased codons, plot tightly with GC content. In the opposite direction, Lys, Asp and Cys appear to be affected by both salinity and GC content, and Ile, an amino acid encoded by three AT-biased codons, plots negatively associated with GC content. Finally the amino acids His and Phe do not conform to expectation. Although Phe is encoded by two AT-biased codons, it plots negatively with salinity, not with GC content. This can potentially be explained by the low proportion of Phe in the GC-poor marine subsurface. His, a positively charged amino acid, plots positively with salinity. Unlike Arg this cannot be explained by a GC bias. His is encoded equally by both AT- and GC-biased codons. Nevertheless halophilic Archaea do appear to encode greater proportions of His than non-halophilic Archaea (see Appendix S1).

Environmental amino acid profiles versus salinity

We observe significant deviation in the percentage of encoded Glu, Asp and Lys between the hypersaline metagenomes and all other metagenomes (Fig. 1). Furthermore, the moderately saline, Guerrero Negro microbial mat metagenomes display a slight excess of Asp and a more moderate deficit of Lys relative to non-saline environments. The excess of Glu and Asp and the deficit of Lys in the saline metagenomes are products of the survival mechanisms used by salt-in halophiles to cope with the stresses of hypersaline environments. The adaptations generally include an increase of the acidic nature of intracellular proteins. The radical nature of the salt-in method makes it likely that the three distantly related halophilic lineages adopting this strategy derived it independently (Santos and da Costa, 2002). Thus, the salt-in

Fig. 2. Redundancy analysis of the amino acid content of the environments with salinity and GC content encoded as the dependent variables. The arrows show the directions of increasing GC content and salinity respectively. The bottom and left-hand axes provide scales for the loading of the amino acids onto the restricted axes and the top and right-hand axes provide scales for the loading of the environmental variables onto the restricted axes.
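
The codon argument behind the GC axis of this analysis (Lys has only AT-biased codons, whereas five of the six Arg codons are GC-biased) can be checked with a short Python sketch. It assumes that a codon counts as GC-biased when G or C bases make up more than half of its three positions; that working definition and the helper names `gc_fraction` and `gc_biased_count` are illustrative assumptions, not taken from the paper.

```python
def gc_fraction(codon):
    """Fraction of bases in a codon that are G or C."""
    return sum(base in "GC" for base in codon) / len(codon)


# Codon lists as given in the text for the two basic amino acids.
CODONS = {
    "Lys": ["AAA", "AAG"],
    "Arg": ["CGT", "CGC", "CGA", "CGG", "AGA", "AGG"],
}


def gc_biased_count(codons):
    """Number of codons in which G/C bases outnumber A/T bases (assumed definition)."""
    return sum(gc_fraction(codon) > 0.5 for codon in codons)


lys_gc_biased = gc_biased_count(CODONS["Lys"])  # neither Lys codon is GC-biased
arg_gc_biased = gc_biased_count(CODONS["Arg"])  # all Arg codons except AGA
```

Under this definition AGA is the single Arg codon that is not GC-biased, which matches the five-of-six count stated in the text.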


method potentially marks an example of convergent evolution. However, Kunin and colleagues (2008) observed the acidic nature of proteins in a moderately saline environment to be far more widespread than previously thought, indicating that the salt-in method of halophilicity may in fact be relatively common in halophiles.

Consequently, the degree of excess acidic amino acids and dearth of basic amino acids reflects the prevalence of the ‘salt-in’ strategy in an environment and the amount of adaptation necessary to cope with the environmental stress. This can be quantified by looking at the ratio of the acidic amino acids Glu and Asp to the basic amino acids Lys, His and Arg which we term the AB ratio (see Appendix S2). Here the difference between the 1992 Dead Sea and the 2007 Dead Sea becomes apparent. The 2007 Dead Sea and the Spanish saltern both have AB ratios of approximately 1.46. Meanwhile the 1992 dilution of the surface waters of the Dead Sea leading to the haloarchaeal bloom lowered the AB ratio to 1.24. Since both the 1992 Dead Sea and 2007 Dead Sea are archaeal dominated environments (MEGAN assigned > 90% of reads to Archaea for both Dead Sea metagenomes), this shift is presumably a reflection of the change in the haloarchaeal community concurrent with salinity. The Guerrero Negro mats fall on a range between 1.00 and 1.03 with an average of 1.01, and the marine environments range between 0.86 and 0.96 with an average of 0.90. We therefore see a distinct correlation as we increase from marine to hypersaline salinity levels between the amino acid proportions and the salinity of an environment (Fig. 3). At lower salinities, however, it is unclear whether this linear relationship holds. The mammalian gut metagenomes show a dichotomy and are unexpectedly high. Both human guts have an AB ratio of 1.02 while the mouse guts range from 0.95 to 0.97. The cause for this discrepancy is unclear and more work is required to ascertain the total spread of AB ratios for non-saline and slightly saline environments.

Fig. 3. Correlation between the ratio of acidic to basic amino acids encoded in an environment and the salinity of that environment. The salinity values range from marine (~28 g l⁻¹) to the extreme of hypersalinity (~370 g l⁻¹).

Archaea in the modern Dead Sea

The MEGAN program (Huson et al., 2007) assigned 77% of the 2007 Dead Sea reads to the order Halobacteriales and a number of halobacterial species have been isolated from its waters. We compared the 2007 Dead Sea metagenomic amino acid values with those of the fully sequenced Dead Sea isolates and other sequenced halophiles. We chose to focus on the five charged amino acids, Asp, Glu, Lys, His and Arg, found in the 2007 Dead Sea metagenome as it is the more robust data set. The metagenomic Asp value falls comfortably within the range of values for the fully sequenced hyperhalophiles (see Appendix S1). In contrast the Glu value exceeds those of all known hyperhalophiles except the alkaliphile, Natronomonas pharaonis. The Dead Sea itself is slightly acidic, with a pH of approximately 6. For the basic amino acids, the Arg value exceeds those of all but the hyperhalophilic bacterium Salinibacter ruber, which has not been observed in the Dead Sea. The Lys value exceeds those of all but Hqr. walsbyi, and the high GC content of the Dead Sea metagenome precludes Hqr. walsbyi from composing the majority of the ecosystem. Finally, the His value exceeds those of all hyperhalophiles. Therefore, based upon the amino acid profile data, we can conclude that the hyperhalophilic communities within the 2007 Dead Sea are predominantly composed of as yet genomically unsequenced organisms. The 16S rRNA amplicon library of the 2007 Dead Sea supports this assertion. The majority of the classifiable 16S tags, 66%, are most closely related to uncultured haloarchaea in the RDP database (Cole et al., 2007). This includes the most frequently observed amplicon sequence which alone composes greater than one-fourth of the archaeal community and matches identically to uncultured sequences extracted from the archaeal community of the alkaline-saline soil of the former lake, Texcoco (Valenzuela-Encinas et al., 2008). When taken together the V9 amplicon library and the amino acid profile of the 2007 Dead Sea suggest that the dominant organism inhabiting the modern Dead Sea is an uncultured and unisolated relative of an alkaliphilic haloarchaeon.

Screening of bacterial taxa

A total of 6.7% of the 2007 metagenomic reads were assigned by MEGAN to bacterial taxa. These bacterial reads can be broken down into three categories:
(i) reads from bacteria indigenous to the ecosystem;
(ii) reads of DNA foreign to the ecosystem; and


(iii) reads showing LGT events involving indigenous members of the ecosystem.

Here we have used the metagenomic data, a collection of 16S rRNA tags from the V9 hypervariable region, and the amino acid bias employed by salt-in halophiles in a novel method to differentiate between the three categories of bacterial reads.
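
The screen built up over the following paragraphs rests on two quantities that the text defines: the salinity adaptation quotient (SAQ), which is the acidic-to-basic (AB) ratio of the pooled reads divided by the AB ratio of their best BLAST hits, and the scaled ratio of 16S amplicons to metagenomic reads. As a rough preview, the decision logic might be sketched in Python as below. The cut-offs 1.2 and 0.15 are the ones stated in the text; the function names, the toy counts and the "unresolved" label for taxa that pass neither threshold are assumptions made for illustration, and the sketch ignores the caveat that extraction or PCR bias can hide a taxon from the 16S library.

```python
ACIDIC = ("Asp", "Glu")
BASIC = ("His", "Arg", "Lys")


def ab_ratio(counts):
    """Ratio of acidic (Asp + Glu) to basic (His + Arg + Lys) residues."""
    acidic = sum(counts[aa] for aa in ACIDIC)
    basic = sum(counts[aa] for aa in BASIC)
    return acidic / basic


def saq(read_counts, hit_counts):
    """AB ratio of pooled reads divided by AB ratio of their best BLAST hits."""
    return ab_ratio(read_counts) / ab_ratio(hit_counts)


def classify(saq_value, ratio_16s, saq_cutoff=1.2, ratio_cutoff=0.15):
    """Assign a taxonomic group to one of the categories described in the text."""
    adapted = saq_value > saq_cutoff
    has_16s = ratio_16s > ratio_cutoff
    if adapted and has_16s:
        return "category I (indigenous salt-in)"
    if adapted:
        return "category III (LGT involving salt-in halophiles)"
    if has_16s:
        return "category II (foreign) or salt-out"
    return "unresolved"  # case not covered by the text; label is an assumption


# Hypothetical pooled residue counts for one taxonomic group.
reads = {"Asp": 90, "Glu": 80, "His": 20, "Arg": 60, "Lys": 20}
hits = {"Asp": 60, "Glu": 65, "His": 25, "Arg": 50, "Lys": 50}

group_saq = saq(reads, hits)
label_without_16s = classify(group_saq, ratio_16s=0.0)
label_with_16s = classify(group_saq, ratio_16s=0.4)
```

Pooling residue counts per taxonomic group before taking ratios mirrors the "artificial genome" idea in the text and keeps short individual reads from dominating the quotient.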
Representatives of salt-in taxa from category I and LGT events involving salt-in taxa from category III should demonstrate an overall signal of protein adaptation to salinity. All representatives of category II taxa and salt-out representatives of the other two categories should not. We quantified the salinity adaptation quotient (SAQ) by first pooling all hits to a given taxonomic group, in essence creating an artificial genome. We then compared the amino acid ratios of the metagenomic reads with those of the homologous portions of the respective best BLAST hits:

\[
\mathrm{SAQ} = \frac{AB_{\mathrm{Reads}}}{AB_{\mathrm{BLAST}}} = \frac{[\mathrm{Asp} + \mathrm{Glu}]_{\mathrm{Reads}} \,/\, [\mathrm{His} + \mathrm{Arg} + \mathrm{Lys}]_{\mathrm{Reads}}}{[\mathrm{Asp} + \mathrm{Glu}]_{\mathrm{BLAST}} \,/\, [\mathrm{His} + \mathrm{Arg} + \mathrm{Lys}]_{\mathrm{BLAST}}}
\]

Essentially, an SAQ of greater than one indicates that the taxonomic group as a whole has undergone some adaptation to salinity. We chose an SAQ of greater than 1.2 to be indicative of significant adaptation to salinity (see Appendix S2). Furthermore, it is accepted that informational genes including 16S rRNA genes exhibit a lower propensity for LGT than operational genes (Rivera et al., 1998). Therefore, the presence of an organism’s 16S rRNA gene in an environment indicates the presence of genomic DNA from that organism.

Thus, while all three categories of bacterial taxa should be present in the metagenome, only categories I and II should be represented by 16S rRNA amplicons. However, while the presence of particular 16S rRNA amplicons presents convincing evidence of the presence of specific genomic DNA in a sample, the absence of 16S rRNA amplicons does not necessarily eliminate the possibility of a taxon occurring in a metagenome. Extraction biases and/or PCR biases could potentially prevent a taxon from showing up in a 16S rRNA library.

After scaling the number of reads for the 16S amplicons to the number of reads in the metagenome, we defined a ratio of greater than 0.15 of 16S amplicons to metagenomic reads to indicate the presence of the taxa in the environment. This value allows for the potential occurrence of sequencing errors and/or misidentifications. Therefore:
All salt-in category I taxa should be represented in both the metagenome and the 16S tags and should demonstrate adaptation to hypersalinity.
All category II taxa and all salt-out associated taxa should be represented in both the metagenome and the 16S tags but should not demonstrate adaptation to hypersalinity.
All category III taxa participating in LGT with salt-in halophiles should only be present in the metagenome, not in the 16S tags and should demonstrate evidence of adaptation to hypersalinity.

Figure 4 depicts a plot of the ratio of bacterial 16S reads versus bacterial metagenomic reads against the SAQ values. The analysis separates the chart into three regions. The putative category II taxa and salt-out taxa plot vertically along an SAQ of approximately one. The putative category III salt-in taxa plot horizontally along the x-axis and the putative category I salt-in taxa plot in a cloud between them. Among the category III taxa we observe significant LGT between the haloarchaea and both the halophilic Bacteria of the genus Salinibacter and the thermophilic Bacteria of the genus Thermotoga. Meanwhile included within the category I taxa are 10 bacterial lineages that are either not commonly associated with extreme hypersalinity and/or found in the Dead Sea (see Appendix S2).

Fig. 4. Plot of the observed ratio of 16S rRNA amplicon reads to metagenomic reads versus the calculated salinity adaptation quotient (SAQ) for bacterial taxonomic groups found in the 2007 Dead Sea data sets. Taxonomic groups distributed along the x-axis are those found only in the metagenomic data set indicating that they are represented by reads of protein-encoding genes that have been transferred into indigenous Dead Sea taxa. Taxonomic groups plotting vertically along an SAQ of about 1, indicating a lack of adaptation to salinity, are interpreted to be reads from either taxa foreign to the Dead Sea or indigenous salt-out taxa. Taxonomic groups represented by 16S rRNA amplicons and showing an elevated SAQ are interpreted to be microorganisms indigenous in the modern Dead Sea.

As a case sample we chose to investigate the purported occurrences of LGT involving Thermotoga-related species. In total there were 105 reads that were assigned by MEGAN to Thermotoga-related species. Of these 105 reads, 26 were assigned to an AAA family ATPase (YP_001245352.1, Tpet_1776) and 57 were assigned to an adenine-specific DNA methylase (YP_001245348.1, Tpet_1772), both from the species Thermotoga petrophila. The very fact that these two genes are so overrepresented in the metagenomic data set is another indication that these reads did not originate from T. petrophila. It is more likely that these reads belong to a species composing a significant proportion of the Dead Sea microbiome that received or donated these genes through LGT events.

Consequently it is not surprising that in the year since the initial metagenomic analysis, additional halophiles have been added to the databases and the majority of the reads originally assigned to T. petrophila are now assigned to a haloarchaeal species, Halorubrum lacusprofundi. Thermotoga petrophila now generates the second strongest BLAST hits. It thus appears that a LGT event has taken place between the thermophilic bacterium T. petrophila and the halophilic archaeon Hrr. lacusprofundi. This also indicates that as more genomes are added to the databases, including the recipients of LGT events, phylogenetic based lateral gene-finding methods, such as the one described above, will need to be adapted accordingly.

To further test our hypothesis that a LGT event has occurred between Hrr. lacusprofundi and T. petrophila we generated phylogenetic trees for both the AAA family ATPase and the adenine-specific DNA methylase from Hrr. lacusprofundi. We included the 10 most homologous orthologues to these genes found in the KEGG database, from both the archaeal and bacterial domains. We also included the sequences from the haloarchaea Natrialba magadii, as they appeared highly homologous to Hrr. lacusprofundi, but were not included in the KEGG database (Fig. 5). In both trees T. petrophila and Hrr. lacusprofundi are tightly linked. Additionally in both instances the genes appear to fall in clades included in the bacterial domain, indicating that genetic material was transferred from T. petrophila to Hrr. lacusprofundi.

Finally, we investigated the location of both genes in the genomes of Hrr. lacusprofundi and T. petrophila. In both species the genes are located almost adjacent to each other (Fig. 5). In Hrr. lacusprofundi, there is one small intermediate gene (YP_002567463, Hlac_3346) encoding a PglZ domain protein. In T. petrophila, there are three small intermediate genes, one of which (YP_001245351.1, Tpet_1775) is most homologous to Hlac_3346. Downstream of Hlac_3347 there is another small hypothetical gene (YP_002567465, Hlac_3348) which is also most homologous to the corresponding gene in T. petrophila. Thus, in total there exists a cassette of four genes in Hrr. lacusprofundi that are both adjacent to one another and are most homologous to four genes in T. petrophila confirming the occurrence of an inter-domain LGT event.

Discussion

While the acidic enrichment of proteins has been well documented within individual halophilic species, this phenomenon was first demonstrated on an environmental scale by Kunin and colleagues. In their work, the environmental pressures imparted by a salinity of approximately 90 g l⁻¹ appeared to have caused widespread interspecific convergent evolution towards proteins with increased proportions of Asp. Here we have investigated the amino acid-coding bias within a number of environments. Included within our sample sites are environments displaying a wide range of salinities, from non-saline to over 300 g l⁻¹. Our analysis demonstrates that the amino acid-coding pattern within an environment is sufficient to distinguish between saline and non-saline environments and can potentially offer finer levels of differentiation. We therefore propose the use of the amino acid-coding profile as a summary statistic of an environment.

The distinction between saline and non-saline environments is primarily caused by Asp and Lys with other amino acids such as Glu and Cys contributing as well. At the same time, there is a strong association of increased GC content and hypersalinity. It is therefore important to differentiate between amino acids actually coevolving with salinity and amino acids such as Arg and Trp which appear to be influenced largely by nucleotide mutational bias. Furthermore, we identified a strong linear relationship between the salinity of an environment and the ratio of acidic to basic amino acids encoded within its metagenome. This relationship suggests the use of the acidic to basic amino acid-coding ratio as a potential salinity proxy for environments with moderate to high salinities (> 100 ppt). Assuming the acid residue bias of genes is preserved in expression, the salinity of a paleoenvironment could potentially be preserved in fossil DNA or fossil amino acids.

The amino acid profiles of the metagenome should mirror the amino acid profiles of the dominant organisms. The encoded amino acid profile of the modern Dead Sea, however, does not match the profiles of the hyperhalophilic organisms sequenced thus far. The implication is that the halophilic isolates currently available are not the dominant microorganisms inhabiting the Dead Sea. This assertion is corroborated by our 16S Dead Sea amplicons.

Finally, with only two distinct types of sequence data, we can differentiate members of the indigenous population from both probable foreign organisms and LGT events. While both the indigenous salt-in population and LGT events to salt-in halophiles should display a molecular adaptation to salinity, only the indigenous population and the foreign organisms should be represented in 16S clone libraries. We utilized this technique to identify the exchange of a roughly 7-kilobase region of DNA between the bacterial species T. petrophila and the haloarchaeal species Hrr. lacusprofundi. The occurrence of such an inter-domain transfer of genetic material involving

© 2010 Society for Applied Microbiology and Blackwell Publishing Ltd, Environmental Microbiology, 12, 2613–2623
2620 M. E. Rhodes, S. T. Fitz-Gibbon, A. Oren and C. H. House

Fig. 5. Trees depicting the phylogeny of an AAA family ATPase and an adenine-specific DNA methylase with relevant bootstrap values
included. Bacterial species [Alcanivorax borkumensis, ‘Anaerocellum thermophilum’ (Caldicellulosiruptor bescii), Clostridium botulinum,
‘Desulfococcus oleovorans’, ‘Methylacidiphilum infernorum’, Methylobacterium populi, Methylococcus capsulatus, Moorella thermoacetica,
Nitrosococcus oceani, Pelobacter propionicus, Pelodictyon phaeoclathratiforme (Chlorobium clathratiforme), Pelotomaculum
thermopropionicum, Photorhabdus luminescens, Thermotoga petrophila, Thermus thermophilus and Verminephrobacter eiseniae] are given in
blue, haloarchaeal species (Haloarcula marismortui, Halorubrum lacusprofundi and Natrialba magadii) in orange, other archaeal species
(Archaeoglobus fulgidus, Candidatus Korarchaeum cryptofilum, Methanocaldococcus vulcanius, Methanopyrus kandleri, Methanospirillum
hungatei, Pyrococcus abyssi, Pyrococcus furiosus, Pyrococcus horikoshii, Staphylothermus marinus, Sulfolobus islandicus L.S.2.15,
Sulfolobus islandicus Y.G.57.14 and Thermococcus gammatolerans) in red, and a representative metagenomic read is provided in green.
Along the bottom is the region encoding both genes in the thermophilic bacterium Thermotoga petrophila and the halophilic archaeon
Halorubrum lacusprofundi. Orthologous genes are connected by arrows.
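
The acidic-to-basic amino acid-coding ratio proposed in the Discussion as a potential salinity proxy can be illustrated with a short sketch. It is not the authors' script. The function name `acidic_basic_ratio` is hypothetical, and the residue sets are an assumption: Asp and Glu are counted as acidic and Lys and Arg as basic, with His left out, since the text here does not list the residues used.

```python
from collections import Counter

# Assumption: acidic = Asp (D) + Glu (E); basic = Lys (K) + Arg (R).
# His is excluded here; the article text does not specify the residue sets.
ACIDIC = frozenset("DE")
BASIC = frozenset("KR")


def acidic_basic_ratio(proteins):
    """Return pooled (D + E) / (K + R) over an iterable of protein strings."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq.upper())
    acidic = sum(counts[a] for a in ACIDIC)
    basic = sum(counts[b] for b in BASIC)
    if basic == 0:
        raise ValueError("no basic residues found; ratio is undefined")
    return acidic / basic


if __name__ == "__main__":
    # Toy translated fragments, not real metagenomic data.
    print(acidic_basic_ratio(["MDDEEKR", "ADEVK"]))
```

Pooling counts across all translated fragments, rather than averaging per-read ratios, keeps short reads from dominating the estimate; that choice is also an assumption of this sketch.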


Thermotoga-related species is not without precedent. Nelson and colleagues (1999) demonstrated numerous such events between Thermotoga maritima and thermophilic Archaea, including the transfer of 15 regions greater than 4 kilobases in size. The example of LGT presented here is different in that aside from being an inter-domain transfer of genetic material, it also represents the transfer of genetic material between distinct extreme environments.

We can also use the method presented above to make educated guesses concerning the identities of rare but functioning organisms within a hypersaline ecosystem. We can then use the information provided as leads for molecular approaches that may confirm the existence of these species. Molecular convergence on an environmental scale has, to date, only been documented in hypersaline environments. However, with the indications of similar amino acid biases in acidophiles and thermophiles (Haney et al., 1999; Goodarzi et al., 2008), these approaches may be applicable to other extreme environments as well.

Experimental procedures

DNA extraction

The 1992 sample consisted of roughly 10 ml of a bright red solution obtained via centrifugation (15 min, 8000 g) of approximately 5 l of brine. It remained frozen at −20°C for 5 years, and then further at −80°C until being shipped on dry ice in the spring of 2007. The sample was then stored at −80°C until analysis. DNA was extracted from 3 ml of sample using the MoBio UltraClean Microbial DNA kit (MoBio Laboratories). The 2007 sample was collected and processed by the lab of Oded Béjà (the Technion, Haifa, Israel) according to the protocol described in Bodaker and colleagues (2009). A sufficient quantity of DNA for pyrosequencing was shipped in an agarose plug which was subsequently digested using the Gelase enzyme (Epicentre).

Whole-genome amplification and preparation for pyrosequencing

Approximately 40 ng of the 1992 DNA was subjected to whole-genome amplification using the REPLI-g Mini kit (Qiagen). The V9 hypervariable region of the 16S rRNA gene was amplified using the primer set 5′-gcctccctcgcgccatcag-TGYACACACCGCCCGTC-3′ and 5′-gccttgccagcccgctcag-ACGGNWACCTTGTTACGACTT-3′ adapted from primers 1407f and 1492r (Lane, 1991) and augmented with 454 Life Sciences' A or B sequencing adapters. The amplification mix consisted of 1.25 units of PfuUltra Hotstart DNA polymerase (Stratagene), 1× PfuUltra HF reaction buffer, a 200 µM concentration of dNTPs, 0.2 µM of each primer, and approximately 5 ng of genomic DNA. The PCR was run according to the conditions of Sogin and colleagues (2006) with 27 cycles. The resulting product was gel purified using a 1% low-melting-point agarose gel. Samples were then sequenced on a GS FLX sequencer (454 Life Sciences).

Sequencing and sequence analysis

Both genomic samples were sequenced on a half pico-titre plate. The V9 amplicons were sequenced with a number of additional amplicon sets on a separate half pico-titre plate. All amplicons with errors in the primer sequence, bad calls within the sequence, or of highly unusual length were discarded. BLASTX analyses against the nr database were performed on both genomic data sets. For each top hit with an e-value of 10⁻⁵ or less, the portion of the read matching a homologous protein in the database was extracted and subjected to the analysis described in this article. Additionally, approximately 6.5% of the hits showed evidence of a frameshift within the read. For these sequences as well, the portions of the reads matching a homologous protein in the database were extracted independently and the intervening amino acids were discarded. A BLASTN analysis was performed on the V9 amplicons against both the RDP 16S database and a collection of the genomes of all fully sequenced microbes augmented with the 16S genes of members of taxa hit in the genomic analysis.

Bioinformatic analysis

The cluster analysis and redundancy analysis were performed using the statistical package R (Ihaka and Gentleman, 1996). The genomic BLAST hits were assigned to taxa using the MEGAN program with a top percentage of 10%. The amplicons were assigned to taxa according to their respective best hit as matched by BLAST. Sequences were aligned using the MUSCLE program (Edgar, 2004) with default parameters. The alignments were then filtered to only include locations for which the majority of species contained codons and for which at least three species encoded the same amino acid. Trees were then constructed using RAxML (Stamatakis et al., 2005), and 100 bootstrap iterations were performed. Finally the metagenomic reads were inserted using the maximum parsimony feature of the ARB program (Ludwig et al., 2004). All other analyses of the 2007 Dead Sea data set were performed using home-written scripts in Perl and/or Python. The scripts are available upon request. The 1992 Dead Sea data set was not similarly analysed because the reduced quantity of hits did not enable individual non-halophilic taxa to be assessed.

Acknowledgements

We thank Lynn Tomsho, Stephen Schuster, Tom Canich, Mark Patzkowsky, Oded Béjà, Idan Bodaker and the late Moti Gonen for assistance. This work was supported by the National Aeronautics and Space Administration (NASA) Astrobiology Institute (NAI) under NASA–Ames Cooperative Agreement NNA04CC06A (C.H.H. and S.T.F.-G.) and in particular through a NAI DDF award. The GS20 facility at the Pennsylvania State University Center for Genome Analysis is funded, in part, by a grant from the Pennsylvania Department of Health using Tobacco Settlement Funds appropriated by the legislature.

References

Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. (1997) Gapped BLAST and

PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402.
Biddle, J.F., Fitz-Gibbon, S., Schuster, S.C., Brenchley, J.E., and House, C.H. (2008) Metagenomic signatures of the Peru Margin subseafloor biosphere show a genetically distinct environment. Proc Natl Acad Sci USA 105: 10583–10588.
Bodaker, I., Béjà, O., Sharon, I., Feingersch, R., Rosenberg, M., Oren, A., et al. (2009) Archaeal diversity in the Dead Sea: Microbial survival under increasingly harsh conditions. In Saline Lakes around the World: Unique Systems with Unique Values. Oren, A., Naftz, D., Palacios, P., and Wurtsbaugh, W.A. (eds) Logan, UT, USA: The S.J. and Jessie E. Quinney Natural Resources Research Library, College of Natural Resources, Utah State University, pp. 137–143.
Bolhuis, H., Palm, P., Wende, A., Falb, M., Rampp, M., Rodriguez-Valera, F., et al. (2006) The genome of the square archaeon Haloquadratum walsbyi: life at the limits of water activity. BMC Genomics 7: 169.
Cole, J.R., Chai, B., Farris, R.J., Wang, Q., Kulam-Syed-Mohideen, A.S., McGarrell, D.M., et al. (2007) The ribosomal database project (RDP-II): introducing myRDP space and quality controlled public data. Nucleic Acids Res 35: D169–D172.
Dean, F.B., Nelson, J.R., Giesler, T.L., and Lasken, R.S. (2001) Rapid amplification of plasmid and phage DNA using Phi29 DNA polymerase and multiply-primed rolling circle amplification. Genome Res 11: 1095–1099.
Edgar, R.C. (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32: 1792–1797.
Gavrieli, I., Beyth, M., and Yechieli, Y. (1999) The Dead Sea - A terminal lake in the Dead Sea rift: a short overview. In Microbiology and Biogeochemistry of Hypersaline Environments. Oren, A. (ed.) Boca Raton, USA: CRC Press, pp. 121–127.
Gill, S.R., Pop, M., Deboy, R.T., Eckburg, P.B., Turnbaugh, P.J., Samuel, B.S., et al. (2006) Metagenomic analysis of the human distal gut microbiome. Science 312: 1355–1359.
Goodarzi, H., Torabi, N., Najafabadi, H.S., and Archetti, M. (2008) Amino acid and codon usage profiles: adaptive changes in the frequency of amino acids and codons. Gene 407: 30–41.
Haney, P.J., Badger, J.H., Buldak, G.L., Reich, C.I., Woese, C.R., and Olsen, G.J. (1999) Thermal adaptation analyzed by comparison of protein sequences from mesophilic and extremely thermophilic Methanococcus species. Proc Natl Acad Sci USA 96: 3578–3583.
Huson, D.H., Auch, A.F., Qi, J., and Schuster, S.C. (2007) MEGAN analysis of metagenomic data. Genome Res 17: 377–386.
Ihaka, R., and Gentleman, R. (1996) R: a language for data analysis and graphics. J Comput Graph Stat 5: 299–314.
Kennedy, S.P., Ng, W.V., Salzberg, S.L., Hood, L., and DasSarma, S. (2001) Understanding the adaptation of Halobacterium species NRC-1 to its extreme environment through computational analysis of its genome sequence. Genome Res 11: 1641–1650.
Kunin, V., Raes, J., Harris, J.K., Spear, J.R., Walker, J.J., Ivanova, N., et al. (2008) Millimeter-scale genetic gradients and community-level molecular convergence in a hypersaline microbial mat. Mol Syst Biol 4: 198.
Lane, D.J. (1991) 16S/23S rRNA sequencing. In Nucleic Acid Techniques in Bacterial Systematics. Stackebrandt, E., and Goodfellow, M. (eds) Chichester, UK: Wiley, 115–175.
Lanyi, J.K. (1974) Salt-dependent properties of proteins from extremely halophilic bacteria. Microbiol Rev 38: 272–290.
Legault, B., Lopez-Lopez, A., Alba-Casado, J., Doolittle, W.F., Bolhuis, H., Rodriguez-Valera, F., and Papke, R.T. (2006) Environmental genomics of ‘Haloquadratum walsbyi’ in a saltern crystallizer indicates a large pool of accessory genes in an otherwise coherent species. BMC Genomics 7: 171.
Ley, R.E., Harris, J.K., Wilcox, J., Spear, J.R., Miller, S.R., Bebout, B.M., et al. (2006) Unexpected diversity and complexity of the Guerrero Negro hypersaline microbial mat. Appl Environ Microbiol 72: 3685–3695.
Ludwig, W., Strunk, O., Westram, R., Richter, L., Meier, H., Yadhukumar, et al. (2004) ARB: a software environment for sequence data. Nucleic Acids Res 32: 1363–1371.
Madern, D., Ebel, C., and Zaccai, G. (2000) Halophilic adaptation of enzymes. Extremophiles 4: 91–98.
Nelson, K.E., Clayton, R.A., Gill, S.R., Gwinn, M.L., Dodson, R.J., Haft, D.H., et al. (1999) Evidence for lateral gene transfer between Archaea and Bacteria from genome sequence of Thermotoga maritima. Nature 399: 323–329.
Ng, W.V., Kennedy, S.P., Mahairas, G.G., Berquist, B., Pan, M., Shukla, H.D., et al. (2000) Genome sequence of Halobacterium species NRC-1. Proc Natl Acad Sci USA 97: 12176–12181.
Oren, A. (1986) Intracellular salt concentrations of the anaerobic halophilic eubacteria Haloanaerobium praevalens and Halobacteroides halobius. Can J Microbiol 32: 4–9.
Oren, A. (1999) Bioenergetic aspects of halophilism. Microbiol Mol Biol Rev 63: 334–348.
Oren, A. (2002) Halophilic Microorganisms and Their Environments. Dordrecht, the Netherlands: Kluwer.
Oren, A., and Gurevich, P. (1995) Dynamics of a bloom of halophilic archaea in the Dead Sea. Hydrobiologia 315: 149–158.
Oren, A., Larimer, F., Richardson, P., Lapidus, A., and Csonka, L.N. (2005) How to be moderately halophilic with broad salt tolerance: clues from the genome of Chromohalobacter salexigens. Extremophiles 9: 275–279.
Paul, S., Bag, S., Das, S., Harvill, E., and Dutta, C. (2008) Molecular signature of hypersaline adaptation: insights from genome and proteome composition of halophilic prokaryotes. Genome Biol 9: R70.
Reistad, R. (1970) On the composition and nature of the bulk protein of extremely halophilic bacteria. Arch Microbiol 71: 353–360.
Rivera, M.C., Jain, R., Moore, J.E., and Lake, J.A. (1998) Genomic evidence for two functionally distinct gene classes. Proc Natl Acad Sci USA 95: 6239–6244.
Robertson, C.E., Spear, J.R., Harris, J.K., and Pace, N.R. (2009) Diversity and stratification of Archaea in a hypersaline microbial mat. Appl Environ Microbiol 75: 1801–1810.
Santos, H., and da Costa, M.S. (2002) Compatible solutes of organisms that live in hot saline environments. Environ Microbiol 4: 501–509.

Sogin, M.L., Morrison, H.G., Huber, J.A., Welch, D.M., Huse, S.M., Neal, P.R., et al. (2006) Microbial diversity in the deep sea and the underexplored ‘rare biosphere’. Proc Natl Acad Sci USA 103: 12115–12120.
Soppa, J. (2006) From genomes to function: haloarchaea as model organisms. Microbiology 152: 585–590.
Spear, J.R., Ley, R.E., Berger, A.B., and Pace, N.R. (2003) Complexity in natural microbial ecosystems: the Guerrero Negro experience. Biol Bull 204: 168–173.
Stamatakis, A., Ludwig, T., and Meier, H. (2005) RAxML-III: a fast program for maximum likelihood-based inference of large phylogenetic trees. Bioinformatics 21: 456–463.
Tringe, S.G., von Mering, C., Kobayashi, A., Salamov, A.A., Chen, K., Chang, H.W., et al. (2005) Comparative metagenomics of microbial communities. Science 308: 554–557.
Turnbaugh, P.J., Ley, R.E., Mahowald, M.A., Magrini, V., Mardis, E.R., and Gordon, J.I. (2006) An obesity-associated gut microbiome with increased capacity for energy harvest. Nature 444: 1027–1031.
Tyson, G.W., Chapman, J., Hugenholtz, P., Allen, E.E., Ram, R.J., Richardson, P.M., et al. (2004) Community structure and metabolism through reconstruction of microbial genomes from the environment. Nature 428: 37–43.
Valenzuela-Encinas, C., Neria-González, I., Alcántara-Hernández, R., Enríquez-Aragón, J., Estrada-Alvarado, I., Hernández-Rodríguez, C., et al. (2008) Phylogenetic analysis of the archaeal community in an alkaline-saline soil of the former lake Texcoco (Mexico). Extremophiles 12: 247–254.
Venter, J.C., Remington, K., Heidelberg, J.F., Halpern, A.L., Rusch, D., Eisen, J.A., et al. (2004) Environmental genome shotgun sequencing of the Sargasso Sea. Science 304: 66–74.
Volcani, B.E. (1944) The microorganisms of the Dead Sea. In Papers Collected to Commemorate the 70th Anniversary of Dr Chaim Weizmann. Collective Volume. Rehovoth, Israel: Daniel Sieff Research Institute, pp. 77–85.
Ward, J.H. (1963) Hierarchical grouping to optimize an objective function. J Am Stat Assoc 58: 236–244.

Supporting information

Additional Supporting Information may be found in the online version of this article:

Table S1. Amino acid and GC content of microorganisms and the Dead Sea metagenomes.
Table S2. Foreign taxa and salt-out halophilic taxonomic groups.
Table S3. Bacterial taxonomic groups represented through lateral gene transfer events into Dead Sea salt-in halophiles.
Table S4. Bacterial taxonomic groups displaying evidence of being indigenous to the Dead Sea.
Appendix S1. Comparison of the GC content and amino acid proportions of the Dead Sea metagenomes and representative halophiles.
Appendix S2. Analysis of bacterial taxa observed in the Dead Sea for halophilicity.

Please note: Wiley-Blackwell are not responsible for the content or functionality of any supporting materials supplied by the authors. Any queries (other than missing material) should be directed to the corresponding author for the article.
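
As an illustration of the alignment filter described under Bioinformatic analysis (keep positions where the majority of species are represented and at least three species encode the same amino acid), the following is a minimal sketch rather than the authors' script. It assumes an amino-acid-level alignment held as a dict of equal-length strings with `-` for gaps, and it reads "majority" as strictly more than half of the sequences. The function name `filter_alignment` and its parameters are hypothetical.

```python
from collections import Counter


def filter_alignment(aligned, min_shared=3, gap="-"):
    """Keep alignment columns that pass both filters.

    A column is kept when (1) more than half of the sequences have a
    residue (not a gap) there, and (2) the most common residue occurs in
    at least `min_shared` sequences. Both readings are assumptions.
    """
    if not aligned:
        return {}
    names = list(aligned)
    seqs = [aligned[n] for n in names]
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have equal length")

    keep = []
    for i in range(length):
        residues = [s[i] for s in seqs if s[i] != gap]
        if len(residues) * 2 <= len(seqs):
            continue  # not a strict majority of species present
        if Counter(residues).most_common(1)[0][1] < min_shared:
            continue  # fewer than min_shared species agree
        keep.append(i)
    return {n: "".join(aligned[n][i] for i in keep) for n in names}


if __name__ == "__main__":
    toy = {"a": "MK-A", "b": "MR-A", "c": "MKTA", "d": "-K-G"}
    print(filter_alignment(toy))
```

Gapped positions are retained in the kept columns so that all filtered sequences stay the same length, which is what downstream tree-building programs generally expect.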

