A Phylogenomic Investigation Into The Origin of Metazoa-2008-Ruiz-Trillo-664-72
Iñaki Ruiz-Trillo,* Andrew J. Roger,* Gertraud Burger,‡ Michael W. Gray,* and B. Franz Lang‡
*Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Canada; ICREA Researcher at Departament de
Genètica, Universitat de Barcelona, Barcelona, Spain; and ‡Département de Biochimie, Robert Cedergren Center for Bioinformatics
and Genomics, Université de Montréal, Program in Evolutionary Biology, Canadian Institute for Advanced Research, Boulevard
Edouard-Montpetit, Montréal, Québec, Canada
The evolution of multicellular animals (Metazoa) from their unicellular ancestors was a key transition that was accompanied
by the emergence and diversification of gene families associated with multicellularity. To clarify the timing and order of
specific events in this transition, we conducted expressed sequence tag surveys on 4 putative protistan relatives of Metazoa
including the choanoflagellate Monosiga ovata, the ichthyosporeans Sphaeroforma arctica and Amoebidium parasiticum,
and the amoeba Capsaspora owczarzaki, and 2 members of Amoebozoa, Acanthamoeba castellanii and Mastigamoeba
balamuthi. We find that homologs of genes involved in metazoan multicellularity exist in several of these unicellular
organisms, including 1 encoding a membrane-associated guanylate kinase with an inverted arrangement of protein-protein
interaction domains (MAGI) in Capsaspora. In Metazoa, MAGI regulates tight junctions involved in cell-cell
communication. By phylogenomic analyses of genes encoded in nuclear and mitochondrial genomes, we show that the
choanoflagellates are the closest relatives of the Metazoa, followed by the Capsaspora and Ichthyosporea lineages, although
the branching order between the latter 2 groups remains unclear. Understanding the function of “metazoan-specific”
proteins we have identified in these protists will clarify the evolutionary steps that led to the emergence of the Metazoa.
and Bilateria (Eumetazoa), shows that within the ANTP class, A. queenslandica possesses several NK-like genes
but no Hox, ParaHox, or EHGbox genes (Larroux et al. 2007). Similarly, the placozoan Trichoplax adhaerens, a
basal metazoan of unclear phylogenetic position (Dellaporta et al. 2006), seems also to have a low diversity of
ANTP class homeobox genes (Monteiro et al. 2006). Genes coding for proteins involved in cell adhesion and the
extracellular matrix are also of key importance to the origin of metazoan multicellularity. Although some of
these proteins, such as the cadherins, have already been detected in choanoflagellates (King et al. 2003;
King 2004), others have been suggested to be specific to Metazoa, including the membrane-associated guanylate
kinases with inverted arrangements (MAGIs) that participate in the assembly of multiprotein complexes at
regions of cell–cell contact (Dobrosotskaya et al. 1997; Dobrosotskaya and James 2000). All these observations
seem to suggest that the last common ancestor of metazoans was not as genetically complex as the last common
ancestor of Eumetazoa (Metazoa excluding Porifera). On the other hand, it is

M. ovata (a lambda ZAPII library, kindly provided by Dr P. Holland). The number of ESTs passing quality control
and submitted to further analysis was 13,770 (A. castellanii), 3,623 (A. parasiticum), 8,870 (C. owczarzaki),
19,182 (M. balamuthi), 6,433 (M. ovata), and 8,006 (S. arctica). EST data were automatically clustered by tools
implemented in Taxonomically Broad EST Database (TBestDB)
(http://amoebidia.bcm.umontreal.ca/pepdb/searches/login.php?bye=true) (O’Brien et al. 2007) and AnaBench
(http://anabench.bcm.umontreal.ca/anabench/) (Badidi et al. 2003). Capsaspora owczarzaki and S. arctica data
were also manually clustered using Phred and Phrap and CAP3 (Ewing and Green 1998; Huang and Madan 1999). All
EST data generated for this article are publicly available from the GenBank EST data set, and clusters are
available at TBestDB.

Purification and Sequencing of Mitochondrial DNA

Cells were lysed in the presence of 1% sodium dodecyl
Table 1
Deep Phylogenetic Relationships between Metazoa and Their Unicellular Relatives Obtained with Different Data Sets and
Using Various Methodologies

Topologies (% bootstrap support)

                                          Nuclear data set                            Mitochondrial data set
Methods                                   Ecdysozoa     Cnidaria      Capsaspora +    Trichoplax +  Capsaspora-independent
                                          monophyletic  monophyletic  Ichthyosporea   Bilateria     lineage
ML                                        (51)          (65)          + (100)         (85)          +
ML (FSR)                                  (<50)         (83)          + (100)         NA            NA
ML excluding taxa with >50% missing data  + (57)        NA            + (100)         NA            NA
ML (recoded data)                         + (95)        + (52)        + (97)          (39)          + (57/49)
Bayesian (CAT model)                      + (99)        + (51)        + (97)          —             + (93/72)
Bayesian (CAT model + recoded)            +             +             +               NA            NA

NOTE.—Statistical support is indicated where available. See text and Supplementary Material online for full details and discussion of methods. NA, not available; ML,
maximum likelihood (IQPNNI program); and FSR, fast-evolving sites removed.
To minimize potential systematic error, we used several methods. First, we reconstructed the tree with the fast-

ESTs. We also searched in all available ESTs and genomic databases from Eukaryota. We used the cnidarian homolog
Table 2
Phylogenetic Distribution of Genes Associated with Metazoan Multicellularity

Function             Gene Product                             Organisms                      GenBank Accession Number      Other Nonmetazoa Taxa Where Gene Is Present
Cell adhesion and    Tetraspanin                              Capsaspora                     EF693744                      Fungi and Amoebozoa
adhesion related     Laminin A                                Capsaspora and Monosiga ovata  EC736556, EC165586, EC164818  Amoebozoa and Trypanosoma
                     Beta-catenin–interacting protein (ICAT)  Capsaspora, Acanthamoeba       EC740811, EF693748            Dictyostelium
                     MAGI                                     Capsaspora                     EF611870                      —
                     Ankyrin                                  Capsaspora, Mastigamoeba       EC737721, EC705671            Plants and Fungi
                     Fascin                                   Capsaspora, Monosiga ovata     EF693745, EF693746, EF693747  Dictyostelium
Transcription factor Forkhead                                 Amoebidium                     EC627545, EC629343            Fungi
Cell signaling       Hedgehog                                 Monosiga ovata                 ABA55664                      —

NOTE.—See main text and Supplementary Material online for details of the methods used.
Results and Discussion
Molecular Phylogeny

proteins (2,619 amino acid positions), including homologs from the complete mitochondrial genome of Capsaspora
that we have determined. Phylogenetic trees were inferred
[Figure 1: phylogenetic tree image. Node support values are not recoverable from the extracted text. Scale bar: 0.1.
Taxa by labeled clade:
Metazoa: Danio rerio, Mus musculus, Ciona intestinalis, Strongylocentrotus purpuratus, Branchiostoma sp., Drosophila melanogaster, Anopheles sp., C. elegans, Schistosoma mansoni, Hydra magnipapillata, Porites porites, Oscarella carmela
Choanoflagellata: Proterospongia sp., Monosiga brevicollis, Monosiga ovata
Ichthyosporea: Sphaeroforma arctica, Amoebidium parasiticum
Capsaspora: Capsaspora owczarzaki
Fungi: Neurospora sp., M. grisea, Gibberella sp., S. cerevisiae, C. albicans, Cryptococcus sp., Phanerochaete chrysosporium, Ustilago maydis, Neocallimastix sp.
Amoebozoa: M. balamuthi, D. discoideum, Acanthamoeba castellanii]
FIG. 1.—Phylogeny of the opisthokonts based on concatenation of 110 nucleus-encoded proteins. The topology and branch lengths were obtained
using Bayesian analyses (PhyloBayes) with the amino acids recoded into functional categories. All branches with posterior probability values of 1.0 are
marked with a filled dot (black). The dot is colored red if maximum likelihood (ML) bootstrap analysis (IQPNNI) and Bayesian (PhyloBayes) bootstrap
also yields 100% support. For other relevant nodes, ML (with amino acids recoded into functional categories) bootstrap and Bayesian bootstrap support
values are indicated (in %). Taxa from which new data have been obtained from an EST project are depicted in bold. See Materials and Methods for
further details and supplementary table S1 (Supplementary Material online) for complete names of taxa.
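
The recoding step named in the Fig. 1 caption can be illustrated with a short sketch. The text here does not list the functional categories that were used, so the six Dayhoff groups below are an assumption chosen for illustration, and the function name `recode` is hypothetical. The idea is that each amino acid is replaced by the label of its category, which reduces the alphabet from 20 states to 6 and is intended to dampen compositional bias and saturation.

```python
# Illustrative sketch only: recode an aligned protein sequence into six
# Dayhoff categories. The choice of these categories is an assumption;
# the source text only says "functional categories".
DAYHOFF_GROUPS = {
    "1": "AGPST",  # small residues
    "2": "DENQ",   # acidic residues and their amides
    "3": "HKR",    # basic residues
    "4": "ILMV",   # aliphatic hydrophobic residues
    "5": "FWY",    # aromatic residues
    "6": "C",      # cysteine
}

# Invert the table so each residue maps to its category label.
AA_TO_CLASS = {
    aa: label for label, members in DAYHOFF_GROUPS.items() for aa in members
}


def recode(sequence):
    """Return the sequence with every residue replaced by its category label.

    Gap ('-') and missing ('?') characters are kept unchanged; any other
    unrecognized symbol (for example 'X') becomes '?'.
    """
    recoded = []
    for residue in sequence.upper():
        if residue in AA_TO_CLASS:
            recoded.append(AA_TO_CLASS[residue])
        elif residue in "-?":
            recoded.append(residue)
        else:
            recoded.append("?")
    return "".join(recoded)


example = recode("MKV-AC")
```

Applied to every row of a concatenated alignment, a mapping of this kind yields a six-state data set that can then be analyzed with a model defined on the reduced alphabet.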
FIG. 2.—Phylogenomic analysis based on mitochondrial proteins. The alignment (2,619 amino acid positions, after trimming with Gblocks) was
built from 13 protein sequences that are encoded in most mtDNAs. The topology and branch lengths were obtained in a PhyloBayes analysis. All
branches with posterior support values of 1.0 are marked with a filled dot (black). The dot is colored red when, in addition, maximum likelihood (ML)
bootstrap analysis with IQPNNI yields 100% support. Other ML (IQPNNI) bootstrap support values of interest are indicated. Genus abbreviations are:
L. polyphemus, Limulus polyphemus and K. tunicata, Katharina tunicata.
process across sites (Lartillot and Philippe 2004; Lartillot et al. 2007; Rodriguez-Ezpeleta et al. 2007). As expected,
the use of recoding or of complex models such as CAT improved the results. For example, some widely accepted
metazoan clades such as Ecdysozoa or Cnidaria were recovered only when using more complex models with the
nuclear data set (see table 1 and fig. 1) (Ruiz-Trillo et al. 2002; Lavrov and Lang 2005; Philippe et al. 2005;
Philippe and Telford 2006; but see Wolf et al. 2004; Philip et al. 2005; Rogozin et al. 2007). Curiously, the
monophyly of Metazoa has a very low bootstrap support (fig. 1). This is probably due to the effect of the missing
data for both Oscarella carmela (70.12% missing data) and Porites porites (56.60% missing data; see supplementary
table S1, Supplementary Material online). Consistent with this hypothesis, a tree excluding those taxa with more
than 50% missing data shows a maximum likelihood bootstrap support of 100% for Metazoa (supplementary fig. S1,
Supplementary Material online). An important point is that the position of Capsaspora, ichthyosporeans, and
choanoflagellates remained identical regardless of the method (table 1). Curiously, the nuclear tree shows
Capsaspora as the sister group to ichthyosporeans (fig. 1), whereas the mitochondrial tree shows Capsaspora
in an intermediate position between ichthyosporeans and choanoflagellates (fig. 2). Using the SH and the ELWs tests,
we found that we could statistically reject the mitochondrial topology using the nuclear data set (P values = 0.04
and 0, respectively). The reciprocal test, the nuclear tree imposed upon the mitochondrial data set, was also
rejected (P values = 0.04 and 0.02). The incongruity between these data sets most likely results from a
phylogenetic artifact, and it is difficult to assess which topology is correct. One possible contributing factor
could be the different taxonomic sampling in the mitochondrial versus the nuclear data set (the mitochondrial data
set includes just one representative of each of the 3 unicellular opisthokont lineages, but a wider sampling of
metazoans). However, the nuclear tree excluding taxa with more than 50% missing data not only has a similar
sampling of unicellular taxa as the mitochondrial tree but also recovers ichthyosporeans and Capsaspora as
sister groups (supplementary fig. S1, Supplementary Material online). The source of the apparent strong incongruity
between these data sets remains unclear, but because the mitochondrial analysis is based on fewer aligned positions
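
The missing-data filter discussed in this passage (excluding taxa with more than 50% missing data) can be sketched as follows. This is a minimal illustration, not the procedure used for the published analyses: the function names, the toy alignment, and the decision to count '-', '?', and 'X' as missing are all assumptions.

```python
# Illustrative sketch only: compute the fraction of missing positions per
# taxon in a concatenated alignment and drop taxa above a threshold.
# Treating '-', '?', and 'X' as missing is an assumption.
MISSING_SYMBOLS = set("-?X")


def missing_fraction(sequence):
    """Fraction of alignment positions that carry no residue for a taxon."""
    if not sequence:
        return 1.0  # an empty row holds no data at all
    missing = sum(symbol in MISSING_SYMBOLS for symbol in sequence.upper())
    return missing / len(sequence)


def filter_taxa(alignment, max_missing=0.5):
    """Keep taxa whose missing fraction is not greater than max_missing."""
    return {
        name: sequence
        for name, sequence in alignment.items()
        if missing_fraction(sequence) <= max_missing
    }


# Hypothetical toy alignment (taxon name -> aligned sequence).
toy_alignment = {
    "taxon_A": "MKVLAAGT",  # no missing positions
    "taxon_B": "MK--A??T",  # exactly 50% missing, so it is kept
    "taxon_C": "------GT",  # 75% missing, so it is dropped
}
kept = filter_taxa(toy_alignment)
```

With the percentages reported in the text, a filter of this kind would remove both Oscarella carmela (70.12%) and Porites porites (56.60%), since each exceeds the 50% threshold.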
exhibit the simplest architecture (a single fascin domain), whereas in Metazoa, most fascin homologs are organized
into 2, 3, 4, or 6 contiguous fascin domains (supplementary fig. S4, Supplementary Material online). Fascin proteins
in both Capsaspora and M. ovata possess 4 contiguous fascin domains, probably representing an intermediate form
between fascin protein architecture in Dictyostelium and that found in Metazoa. Elucidation of the function of the
fascin domain proteins in the unicellular opisthokonts will likely be key to understanding the evolution of cell
adhesion and migration in Metazoa.

Conclusions

In summary, our analyses show definitively that Ichthyosporea and Capsaspora diverged prior to choanoflagellates
and that the latter organisms are the closest unicellular relatives of Metazoa. More importantly, our comparisons of
EST and genomic data indicate that unicellular opisthokont

has been supported by European Molecular Biology Organization and CIHR postdoctoral fellowships and by an
Institució Catalana de Recerca i Estudis Avançats contract.

Literature Cited

Adamska M, Degnan SM, Green KM, Adamski M, Craigie A, Larroux C, Degnan BM. 2007. Wnt and TGF-beta expression in the sponge Amphimedon queenslandica and the origin of metazoan embryonic patterning. PLoS ONE. 2:e1031.
Adell T, Gamulin V, Perovic-Ottstadt S, Wiens M, Korzhev M, Muller IM, Muller WE. 2004. Evolution of metazoan cell junction proteins: the scaffold protein MAGI and the transmembrane receptor tetraspanin in the demosponge Suberites domuncula. J Mol Evol. 59:41–50.
Badidi E, De Sousa C, Lang BF, Burger G. 2003. AnaBench: a Web/CORBA-based workbench for biomolecular sequence analysis. BMC Bioinformatics. 4:63.
Baldauf SL, Palmer JD. 1993. Animals and fungi are each other’s closest relatives: congruent evidence from multiple proteins. Proc Natl Acad Sci USA. 90:11558–11562.
King N. 2004. The unicellular ancestry of animal development. Dev Cell. 7:313–325.
King N, Carroll SB. 2001. A receptor tyrosine kinase from choanoflagellates: molecular insights into early animal evolution. Proc Natl Acad Sci USA. 98:15032–15037.
King N, Hittinger CT, Carroll SB. 2003. Evolution of key cell signaling and adhesion protein families predates animal origins. Science. 301:361–363.
Kureishy N, Sapountzi V, Prag S, Anilkumar N, Adams JC. 2002. Fascins, and their roles in cell structure and function. BioEssays. 24:350–361.
Kusserow A, Pang K, Sturm C, et al. 2005. Unexpected complexity of the Wnt gene family in a sea anemone. Nature. 433:156–160.
Lang BF, Burger G. 2007. Purification of mitochondrial and plastid DNA. Nat Protoc. 2:652–660.
Lang BF, O’Kelly C, Nerad T, Gray MW, Burger G. 2002. The closest unicellular relatives of animals. Curr Biol. 12:1773–1778.
Larroux C, Fahey B, Degnan SM, Adamski M, Rokhsar DS, Degnan BM. 2007. The NK homeobox gene cluster predates the origin of Hox genes. Curr Biol. 17:706–710.
Lartillot N, Brinkmann H, Philippe H. 2007. Suppression of long-branch attraction artefacts in the animal phylogeny using [...]
Philippe H, Telford MJ. 2006. Large-scale sequencing and the new animal phylogeny. Trends Ecol Evol. 21:614–620.
Putnam NH, Srivastava M, Hellsten U, et al. 2007. Sea anemone genome reveals ancestral eumetazoan gene repertoire and genomic organization. Science. 317:86–94.
Ragan MA, Goggin CL, Cawthorn RJ, Cerenius L, Jamieson AV, Plourde SM, Rand TG, Soderhall K, Gutell RR. 1996. A novel clade of protistan parasites near the animal-fungal divergence. Proc Natl Acad Sci USA. 93:11907–11912.
Ragan MA, Murphy CA, Rand TG. 2003. Are Ichthyosporea animals or fungi? Bayesian phylogenetic analysis of elongation factor 1alpha of Ichthyophonus irregularis. Mol Phylogenet Evol. 29:550–562.
Rodriguez-Ezpeleta N, Brinkmann H, Roure B, Lartillot N, Lang BF, Philippe H. 2007. Detecting and overcoming systematic errors in genome-scale phylogenies. Syst Biol. 56:389–399.
Rogozin IB, Wolf YI, Carmel L, Koonin EV. 2007. Ecdysozoan clade rejected by genome-wide analysis of rare amino acid replacements. Mol Biol Evol. 24:1080–1090.
Ruiz-Trillo I, Burger G, Holland PW, King N, Lang BF, Roger AJ, [...]
Stamatakis A. 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics. 22:2688–2690.
Stamatakis A, Ludwig T, Meier H. 2005. RAxML-III: a fast program for maximum likelihood-based inference of large phylogenetic trees. Bioinformatics. 21:456–463.
Steenkamp ET, Baldauf SL. 2004. Origin and evolution of animals, fungi and their unicellular allies (Opisthokonta). In: Hirt RP, Homer DS, editors. Boca Raton (FL): CRC Press. p. 109–129.
Steenkamp ET, Wright J, Baldauf SL. 2006. The protistan origins of animals and fungi. Mol Biol Evol. 23:93–106.
Strimmer K, Rambaut A. 2002. Inferring confidence sets of possibly misspecified gene trees. Proc R Soc Lond B Biol Sci. 269:137–142.
Sullivan JC, Ryan JF, Mullikin JC, Finnerty JR. 2007. Conserved and novel Wnt clusters in the basal eumetazoan Nematostella vectensis. Dev Genes Evol. 217:235–239.
Technau U, Rudd S, Maxwell P, et al. 2005. Maintenance of ancestral complexity and non-metazoan genes in two basal cnidarians. Trends Genet. 21:633–639.
Valentine JW. 2004. On the origin of phyla. Chicago: The University of Chicago Press.
Wolf YI, Rogozin IB, Koonin EV. 2004. Coelomata and not Ecdysozoa: evidence from genome-wide phylogenetic analysis. Genome Res. 14:29–36.

Laura Katz, Associate Editor

Accepted December 29, 2007