The New Tree of Eukaryotes
Burki et al., The New Tree of Eukaryotes, Trends in Ecology & Evolution (2019), https://doi.org/10.1016/j.tree.2019.08.008
Review
Fabien Burki,1,2,*,@ Andrew J. Roger,3,4 Matthew W. Brown,5,6 and Alastair G.B. Simpson4,7,*
For 15 years, the eukaryote Tree of Life (eToL) has been divided into five to eight major groupings, known as ‘supergroups’. However, the tree has been profoundly rearranged during this time. The new eToL results from the widespread application of phylogenomics and numerous discoveries of major lineages of eukaryotes, mostly free-living heterotrophic protists. The evidence that supports the tree has transitioned from a synthesis of molecular phylogenetics and biological characters to purely molecular phylogenetics. Most current supergroups lack defining morphological or cell-biological characteristics, making the supergroup label even more arbitrary than before. Going forward, the combination of traditional culturing with maturing culture-free approaches and phylogenomics should accelerate the process of completing and resolving the eToL at its deepest levels.
Highlights
The eukaryote Tree of Life (eToL) represents the phylogeny of all eukaryotic lineages, with the vast bulk of this diversity comprising microbial ‘protists’. Since the early 2000s, the eToL has been summarized in a few (five to eight) ‘supergroups’. Recently, this tree has been deeply remodeled due mainly to the maturation of phylogenomics and the addition of numerous new ‘kingdom-level’ lineages of heterotrophic protists.
The current eToL is derived almost exclusively from molecular phylogenies, in contrast to earlier models that were syntheses of molecular and other biological data.
The supergroup model for the eToL has become increasingly abstract due to the absence of known shared derived characteristics for the new supergroups.
Culture-based studies, not higher-throughput methods, have been responsible for most of the new major lineages recently added to the eToL.
The Eukaryote Tree of Life
Resolving the evolutionary tree for all eukaryotes has been a long-standing goal in biology. Inferring an eToL that is both accurate and comprehensive is a worthwhile objective in itself, but the eToL is also the framework on which we understand the origins and history of eukaryote biology and the evolutionary processes underpinning it. It is therefore a fundamental tool for studying many aspects of eukaryote evolution, such as cell biology, genome organization, sex, and multicellularity. In the molecular era, the eToL has also become a vital resource to interpret environmental sequence data and thus reveal the diversity and composition of ecological communities.
Although most of the described species of eukaryotes belong to the multicellular groups of animals (Metazoa), land plants, and fungi, it has long been clear that these three ‘kingdoms’ represent only a small proportion of high-level eukaryote diversity. The vast bulk of this diversity – including dozens of extant ‘kingdom-level’ taxa – is found within the ‘protists’, the eukaryotes that are not animals, plants, or fungi [1–6]. To a first approximation, inferring the eToL is to resolve the relationships among the major protist lineages. However, this task is complicated by the fact that protists are much less studied overall than animals, plants, or fungi [7]. Molecular sequence data have accumulated slowly for many known protist taxa and numerous important lineages were completely unknown (or were not cultivated, hence challenging to study) when the molecular era began. Thus, resolving the eToL has been a process where large-scale discovery of major lineages has occurred simultaneously with deep-level phylogenetic inference. This makes the task at hand analogous to a jigsaw puzzle, but one where a large and unknown number of pieces are missing from the box and instead are hidden under various pieces of the furniture.
1Department of Organismal Biology, Program in Systematic Biology, Uppsala University, Uppsala, Sweden
2Science for Life Laboratory, Uppsala University, Uppsala, Sweden
3Department of Biochemistry and [...] Evolutionary Bioinformatics, Dalhousie University, Halifax, NS, Canada
5Department of Biological Sciences, Mississippi State University, Mississippi State, MS, USA
6Institute for Genomics, Biocomputing, and Biotechnology, Mississippi State University, Mississippi State, MS, USA
7Department of Biology, Dalhousie University, Halifax, NS, Canada
@Twitter: @fburki (F. Burki).
*Correspondence: fabien.burki@ebc.uu.se, alastair.simpson@dal.ca
© 2019 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
[...] diversity among five to eight major taxa usually referred to as ‘supergroups’ [8–12]. The category of supergroup was a purely informal one, denoting extremely broad assemblages that contain, for example, the traditional ‘kingdoms’ like Metazoa and Fungi as subclades. Thus, the original supergroups generally represented the most inclusive collections of organisms within eukaryotes for which there was reasonable evidence that they formed a monophyletic group. A typical list of these groups included (with some differences in capitalization and endings): Archaeplastida (also known as Plantae), Chromalveolata, Rhizaria (or Cercozoa), Opisthokonta, Amoebozoa, and Excavata (see Box 1 for short descriptions). The main variations between accounts from that time were that some united Opisthokonta and Amoebozoa as ‘unikonts’ [12] (much later renamed ‘Amorphea’ [13]) or did not show Excavata and/or Chromalveolata confidently resolved as clades [10,11]. For half of the groups (i.e., Opisthokonta, Amoebozoa, and Rhizaria), the principal evidence supporting their unity was the phylogenies of one or a few genes [14–16]. For the others, it was a combination of
weaker molecular phylogenetic evidence and shared derived cell-biological features. Archaeplastida and Chromalveolata were each identified by the presence of similar plastids [17,18], with sequences from plastid genomes supporting an ancestral endosymbiotic origin of plastids in each group [19,20]. Excavata, meanwhile, was distinguished by the inference that taxa shared a derived, complex flagellar apparatus cytoskeleton [21]. Consequently, the original supergroup-based eToLs were syntheses of different information rather than straightforward summaries of molecular phylogenies.
The supergroup model for the eToL became widely popular in both the primary literature and textbooks, for several reasons. First, the model made for convenient and efficient summaries of eukaryotes, since almost all species fell into these few relatively diverse major groups. Second, all of the original supergroups, except Rhizaria, had at least one distinctive biological characteristic that seemed to ancestrally define them (see above and Box 1). Third, the groupings seemed to coincide with the limits of phylogenetic resolution. In fact, the overarching supergroup model has remained the standard description of the eToL for 15 years, despite major changes in our knowledge of eukaryotic phylogeny and diversity over that time.
Opisthokonta includes animals, fungi, and several protist lineages that are most closely related to either animals or fungi. Opisthokonta remains a robust clade in modern phylogenies; however, it is nested within at least two larger taxa, Amorphea and Obazoa, that are frequently treated as supergroups instead.
Amoebozoa is also still a robust group, but now is often regarded as a member of the supergroup Amorphea. Amoebozoa includes free-living amoeboid forms with lobose pseudopodia (e.g., Amoeba) but also more filose amoebae, some flagellates, and various slime molds.
Excavata was originally proposed based on a distinctive morphology, namely a particular feeding groove form and associated cytoskeleton system, found in many enigmatic flagellated protists. Phylogenetics and phylogenomics defined three monophyletic subgroups – Discoba, Metamonada, and malawimonads – but have not consistently placed them together as a single clade. The name is now usually restricted to a Discoba–Metamonada clade (quite possibly artificial; see main text) or regarded as referring to a paraphyletic group.
Archaeplastida are distinguished by the presence of primary plastids – the photosynthetic organelles deriving directly from cyanobacteria by endosymbiosis. The three main groups with primary plastids are the green algae and land plants, red algae (and likely their recently discovered relative Rhodelphis), and glaucophyte algae. Today, Archaeplastida is generally still considered a supergroup, although most phylogenomic analyses do not strongly support its monophyly (i.e., all three host lineages forming a single clade to the exclusion of other supergroups).
Chromalveolata contained groups with red alga-derived secondary plastids (i.e., Alveolata, Stramenopila, Haptophyta, and Cryptophyta). This group was based on the assumption that these plastids were acquired once in a common ancestor, which was supported by plastid evidence but never strongly from the host perspective. Chromalveolata has been shown to be polyphyletic, with Alveolata and Stramenopila belonging to Sar (in TSAR), Haptophyta in Haptista, and Cryptophyta in Cryptista.
Rhizaria was the latest addition at the time the supergroup model was proposed. It includes a wide diversity of amoebae (e.g., foraminiferans, radiolarians, filose testate amoebae), flagellates, various parasites, and the chlorarachniophyte algae. In contrast to all other original supergroups, which were at least partly distinguished by morphological characters, Rhizaria was inferred more or less exclusively using molecular phylogenetics. It is now part of Sar (in TSAR) along with Alveolata and Stramenopila.
Phylogenomics
The term ‘phylogenomics’ covers various approaches combining genomic-scale data with phylogenetic methods. In the context of the eToL, it usually refers to the estimation of organismal phylogeny from datasets containing dozens to hundreds of gene alignments, most often nucleus-encoded genes analyzed as inferred amino acid sequences [22]. The data are sourced from a mixture of genome and, frequently, transcriptome sequencing projects. The introduction of phylogenomics offered the promise of overcoming the limited information afforded by single genes, which were mostly inadequate to resolve deep divergences within the eToL [23]. However, voices warned early on that most of the analysis artefacts known to afflict single-gene phylogenies can also apply to phylogenomics [24]. Phenomena that cause unrelated taxa to cluster together in phylogenies, such as compositional bias and high rates of sequence divergence, often also affect the whole genome. Therefore, merely adding genes can amplify artefacts rather than overriding them [25]. Accuracy might be improved by using more realistic evolutionary models, and especially by careful choice of taxa, where this is possible (see below). Examining multiple genes also raises the specter of combining different gene histories together artificially, making careful quality controls essential to eliminate incorrect paralog assignments, contaminating sequences, etc. (see Box 2 for a typical ‘phylogenomic pipeline’).
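As a rough illustration of what ‘compositional bias’ means in practice, the sketch below compares the amino acid frequencies of two sequences. It is only a toy screen under stated assumptions: the function names are hypothetical, the measure (half the summed absolute frequency differences, i.e., total variation distance) is chosen here for simplicity, and it is not the test used in any of the cited studies.

```python
from collections import Counter

# The 20 standard amino acids; other symbols (gaps, X) are ignored.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Return the relative frequency of each standard amino acid in seq."""
    counts = Counter(c for c in seq.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def composition_distance(seq_a, seq_b):
    """Total variation distance between two compositions (0 = identical, 1 = disjoint)."""
    freq_a = composition(seq_a)
    freq_b = composition(seq_b)
    return sum(abs(freq_a[aa] - freq_b[aa]) for aa in AMINO_ACIDS) / 2


# Hypothetical toy sequences, not real data.
same = composition_distance("MKVLA", "AKLVM")      # same residues, different order
disjoint = composition_distance("AAAA", "CCCC")    # no residues in common
```

Two unrelated taxa with similarly skewed compositions would score as close to each other by this measure, which is the kind of non-phylogenetic similarity the paragraph above warns can mislead tree inference.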
Dataset assembly is followed by the actual phylogenomic analyses, in which hundreds of genes are concatenated into a phylogenomic ‘supermatrix’. Usually, both maximum likelihood (ML) and Bayesian analyses are conducted. Various evolutionary models are employed, with choice often constrained by computational logistics. Site-heterogeneous models, in which the profile of substitution propensities can differ among sites in the alignment, appear to be particularly important for improved phylogenetic accuracy. These models were first implemented in the Bayesian inference platform PhyloBayes [102], but the analyses are computationally intensive and problems with mixing and convergence are common. Recently, practical ML implementations of site-heterogeneous models have become available in IQ-Tree [103,104]. Frequently, subsidiary analyses are conducted to test whether initial results are robust to perturbations of the data, especially excluding data most likely to foster incorrect phylogenetic inference (e.g., the fastest-evolving species, sites, or genes).
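The concatenation step described above can be sketched in a few lines. This is a minimal illustration, assuming each gene alignment is already available as a mapping from taxon name to aligned sequence; the function names and toy sequences are hypothetical, and real pipelines add orthology checks, trimming, and format handling that are omitted here.

```python
from typing import Dict, Iterable, List


def concatenate(gene_alignments: List[Dict[str, str]], gap: str = "-") -> Dict[str, str]:
    """Join per-gene alignments into one supermatrix (taxon -> concatenated row).

    Taxa missing from a gene are padded with gap characters so that
    every row of the supermatrix has the same length.
    """
    taxa = sorted({taxon for aln in gene_alignments for taxon in aln})
    rows = {taxon: [] for taxon in taxa}
    for aln in gene_alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within one gene alignment must share a length")
        width = lengths.pop()
        for taxon in taxa:
            rows[taxon].append(aln.get(taxon, gap * width))
    return {taxon: "".join(parts) for taxon, parts in rows.items()}


def drop_taxa(matrix: Dict[str, str], excluded: Iterable[str]) -> Dict[str, str]:
    """Return a copy of the supermatrix without the listed taxa."""
    excluded = set(excluded)
    return {taxon: seq for taxon, seq in matrix.items() if taxon not in excluded}


# Hypothetical toy alignments (two genes, three taxa; one taxon lacks gene_b).
gene_a = {"Giardia": "MKV", "Amoeba": "MRV", "Emiliania": "MKI"}
gene_b = {"Amoeba": "LLSA", "Emiliania": "LMSA"}

supermatrix = concatenate([gene_a, gene_b])
reduced = drop_taxa(supermatrix, ["Giardia"])
```

The `drop_taxa` helper mirrors the subsidiary analyses mentioned above in the simplest possible way: a taxon suspected of being fast-evolving is removed and the tree would then be re-estimated from `reduced` with an external program.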
Although pioneering phylogenomic studies were instrumental in showing what could be done, they contributed only marginally to the original supergroup model, mostly because the sampling of protist taxa was extremely limited (e.g., missing entire supergroups, especially Rhizaria) [20,26,27]. This situation gradually improved, however, and by the late 2000s some genome/transcriptome data were available for most well-known major groups [28–33]. Since then, the widespread use of next-generation sequencing, especially multiplexed transcriptomics, has greatly accelerated improvements in taxon sampling within the most familiar protist taxa [34–44]. As a result, paneukaryote phylogenomic analyses of datasets of 120–350+ nucleus-encoded genes have become the dominant tool for inferring the eToL at the level of major lineages. Overwhelmingly, recent depictions of the eToL at its broadest scale are summaries of such phylogenomic analyses. Thus, unlike for the original supergroup trees, there is now little to no integration of other information (e.g., cell-biological evidence). The most important exceptions concern: (i) the placement of the root of the eukaryote tree; and (ii) the inclusion of lineages known only as environmental rRNA sequences. The root is not directly examined by most phylogenomic analyses since they do not include outgroups to eukaryotes; the root must therefore be inferred using quite different data (Box 3).
TSAR
The acronym TSAR stands for the group’s constituent members: telonemids, stramenopiles, alveolates, and Rhizaria. The latter three groups form a clade, ‘SAR’ or ‘Sar’, that emerged relatively early in the phylogenomic era [29,30,33] and has been routinely considered a ‘supergroup’ (partly replacing chromalveolates; Box 1). Sar has been estimated to comprise up to half of all eukaryote species diversity [4]. It includes several major groups of microbial algae (e.g., diatoms, dinoflagellates), large seaweeds (e.g., kelps), ecologically important free-living protozoa (e.g., ciliates, foraminiferans, radiolarians), and many well-studied protozoan parasites (e.g., apicomplexans, oomycetes) [64]. The sister group to Sar had been unclear, but there is now good evidence that this is the enigmatic free-living flagellate taxon Telonemia, which has just two described species [65]. The TSAR clade was robustly supported in recent phylogenomic analyses with improved sequence quality and quantity [62] and was also recovered earlier with some smaller datasets [52,53,61].
Table 1. Candidate New Major Lineages of Eukaryotes Identified since 2004 Using Molecular Phylogenetics(a)
Picozoa(e)         2007 [89]   2007   Heterotrophic flagellates(f)   Environmental PCR + FISH(g)   2012 [53](g)
Hemimastigophora   2018 [61]   1893   Heterotrophic flagellates      Single-cell isolation(n)      2018 [61]
Rhodelphis         2019 [63]   2019   Heterotrophic flagellates(o)   Cultivation                   2019 [63]
(a) Defined as taxa that do not fall inside any robust clade within eukaryotes that was widely recognized in 2004.
(b) Report of cultivation and first molecular data from [97], but misidentified as an archamoeba (Amoebozoa); arguably, first identification as a likely major lineage, albeit nominally within Amoebozoa, by [98].
(c) Here and elsewhere, ‘cultivation’ indicates that strains have been grown indefinitely under laboratory conditions with no other eukaryotes, except prey for organisms that consume other eukaryotic cells.
(d) First phylogenomic investigation placed breviates incorrectly with(in) Amoebozoa [50]. Current placement in Obazoa robustly established later [55].
(e) Confirmed as distinct in [51], but robust inference as sister of Sar reported much later [62].
(f) Named ‘picobiliphytes’ and identified as algae when first reported [89]. Later studies, including transient cultivation, show that they are heterotrophic flagellates [99,100].
(g) Subsequently, genome amplification performed on isolated single cells [79]; these data used for seven-gene phylogenies [79] and later in phylogenomic analyses [53].
(h) Previously studied as ‘Micronucleariida’ [90]; current name ‘Rigifilida’ introduced in [101].
(i) Confirmed as distinct in [69]; robust inference of current position in CRuMs established later [56].
(j) Recognized as distinct, and sister to haptophytes, on the basis of plastid rDNA data only [94]. Examinations of nuclear data awaited.
(k) Falls outside current supergroups in phylogenomic analyses, but position is highly unstable [57]. Reanalysis awaited.
(l) First studied Ancoracysta was misidentified as a Colponema (Alveolata) and recognized after the fact [70]. Name introduced in [60].
(m) No published phylogenomic analysis, although a possible affinity with metamonads based on unpublished analyses is noted in the description [96].
(n) Initial small subunit (SSU) rDNA and transcriptomic data generated using single-cell methods; cultivated subsequently [61].
(o) Heterotrophic, but inferred to possess a nonphotosynthetic plastid based on gene sequence information [63].
[Figure 1: tree diagram of the eToL; caption not recovered. Legible labels: Amorphea, Obazoa, Archaeplastida, Cryptista, CRuMs, Haptista (Centrohelida, Haptophyta, + Rappemonads), Discoba, Metamonada, Malawimonadida, ‘Excavates’, Ancyromonadida, Hemimastigophora, TSAR (Telonemia; Sar: Rhizaria, Alveolata, Stramenopila). Legend entries: Original ‘Supergroup’; No molecular data in 2004.]
Haptista
Haptista comprises the haptophyte algae (previously assigned to chromalveolates; Box 1) and centrohelids. Haptophytes, especially the calcifying coccolithophorids (e.g., Emiliania huxleyi), play crucial roles in marine ecosystems and global biogeochemical cycles. Centrohelids, by contrast, are free-living protozoa with ray-like pseudopodia supported by microtubules (axopodia), which radiate from a spherical cell body. Haptista is generally well supported in recent phylogenomic studies [54,57].
Cryptista
Cryptista contains the cryptomonads (also former chromalveolates; Box 1), a lineage that has been central to the study of plastid origin and spread across eukaryotes (e.g., Guillardia theta). Cryptista also includes the katablepharids and the more recently discovered Palpitomonas, both enigmatic heterotrophic flagellates (Table 1). Phylogenomic studies robustly support the monophyly of Cryptista [37,53,58].
Archaeplastida
The three taxa that comprise Archaeplastida are the Chloroplastida (green algae + land plants), Rhodophyta (red algae), and Glaucophyta. All three lineages have primary plastids, which are photosynthetic organelles that originated directly from cyanobacteria. Recently, a new group – Rhodelphis – was discovered and shown to branch as sister to red algae in phylogenomic analyses [63]. Rhodelphis cells are heterotrophic flagellates, but gene sequence data suggest that they have a nonphotosynthetic primary plastid. The hypothesis of a common origin of the primary plastids unifies Archaeplastida and, uniquely among the current supergroups, implies a strong morphological/cell-biological synapomorphy. However, most recent phylogenomic analyses of nuclear genes do not recover Archaeplastida as a strict clade or do so with poor support (e.g., [37,62]). The most common alternative topologies place Cryptista and sometimes Picozoa (see below) within the minimal green + red + glaucophyte clade.
More recently, the root position has been addressed using molecular phylogenies of concatenated proteins of mitochondrial or bacterial origin in which eukaryotes appear particularly closely related to outgroup prokaryotic sequences [112–114]. Derelle and Lang [112] analyzed 42 genes of mitochondrial origin and found that their analyses supported a ‘unikont’/‘bikont’ root. Then, He et al. analyzed a distinct, but overlapping, set of 37 genes, including some transferred to the eukaryotic stem lineage from bacteria prior to the Last Eukaryote Common Ancestor (LECA) [114]. Their analyses placed the eukaryote root on the branch between Discoba and other eukaryotes. Derelle and colleagues subsequently contested this result, recovering a root between two large groupings: ‘Opimoda’, comprising Amorphea, collodictyonids, and malawimonads, and ‘Diphoda’, including Discoba, Archaeplastida, cryptomonads, and Sar [113]. They argued that Excavata cannot be a natural group because both ‘sides’ of this root include excavates (malawimonads and Discoba, respectively). If correct, this root implies that cell-structure features proposed as synapomorphies for Excavata could be ancestral properties of the LECA. Regardless of which, if any, of these results are correct, many of the novel protist taxa recently placed in the eToL (Figure 1 and Table 1) were not represented in these analyses. Therefore, the precise position of the root of the eToL remains uncertain.
Amorphea
This taxon groups opisthokonts (animals, fungi, and their respective unicellular relatives) with the amoeboid protists of Amoebozoa (e.g., Amoeba and most ‘slime molds’, among many others). Amorphea now also includes two small lineages of heterotrophic flagellates, the breviates and the apusomonads, that cluster with the opisthokonts to form the Obazoa [34,55]. Amorphea is robustly supported in most phylogenomic analyses, with the caveat that the position of the root remains uncertain (Box 3), and a placement within Amorphea has been inferred in some cases [66], which would make Amorphea paraphyletic.
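The dependence of monophyly on root position can be made concrete with a toy example. The sketch below is illustrative only: trees are written as nested tuples, the four-taxon topology is a deliberately simplified assumption rather than a published result, and the function names are hypothetical.

```python
def leaves(tree):
    """Return the set of leaf names under a nested-tuple tree (a leaf is a string)."""
    if isinstance(tree, str):
        return {tree}
    found = set()
    for child in tree:
        found |= leaves(child)
    return found


def is_monophyletic(tree, group):
    """True if some subtree of the rooted tree contains exactly the taxa in group."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)


# Toy stand-in for Amorphea (assumption: reduced to two member lineages).
amorphea = {"Opisthokonta", "Amoebozoa"}

# The same unrooted topology, rooted in two different places.
rooted_outside = (("Opisthokonta", "Amoebozoa"), ("Discoba", "Sar"))
rooted_inside = ("Amoebozoa", ("Opisthokonta", ("Discoba", "Sar")))

clade_when_root_outside = is_monophyletic(rooted_outside, amorphea)
clade_when_root_inside = is_monophyletic(rooted_inside, amorphea)
```

With the root placed outside the group, the two lineages form a clade; with the root placed on the branch leading to one of them, the same unrooted tree no longer contains that clade, which is the situation described above as making Amorphea paraphyletic.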
CRuMs
As with TSAR, CRuMs represents a novel proposed supergroup named as an acronym of its constituent members: collodictyonids (syn. diphylleids) + Rigifilida + Mantamonas. These three free-living protozoan taxa have very different basic morphologies (swimming flagellates, filose amoeboid cells, and tiny gliding cells, respectively) and were previously ‘orphan taxa’ (see below), but robustly coalesced in recent phylogenomic analyses [56,61].
Discoba
Discoba includes Euglenozoa and Heterolobosea (collectively ‘Discicristata’), plus the heterotrophic
flagellate groups Jakobida and Tsukubamonas (Table 1). Euglenozoa includes the euglenophyte
Metamonada
Metamonada entirely comprises anaerobic protists, including various free-living protozoa, intestinal symbionts (especially of wood-eating insects), and many parasites (e.g., Giardia, Trichomonas). The monophyly of Metamonada is well supported by contemporary phylogenomic analyses [38,67]. However, placing metamonads relative to other taxa has proved very challenging, because most species exhibit very high rates of sequence evolution. Phylogenomic analyses often infer a Metamonada plus Discoba clade (see above) [34,68,69] largely corresponding to the original ‘Excavata’ supergroup (Box 1); however, this topology could represent an analysis artefact. Some phylogenomic analyses, usually those that include shorter-branching metamonads, recover instead a specific relationship with the ‘orphan’ excavate group malawimonads (see below) [55,59,68,70].
Hemimastigophora
The ‘hemimastigotes’ are free-living protozoa with two rows of flagella. They had been known since the 19th century and given a high taxonomic rank based on electron microscopy observations [71] but were never cultivated, and genetic data were lacking. Recent phylogenomic analyses, based on transcriptomes from hand-picked cells of two genera, showed hemimastigotes as one of the deepest branches within eukaryotes [61]. They could not be placed as sister to any one of the ‘established’ supergroups (or any ‘orphan’); consequently, it was proposed to consider them a new supergroup.
Orphan Taxa
In addition to the groups listed above, there are several seemingly species-poor taxa for which phylogenomic analyses have thus far failed to provide a convincing phylogenetic placement. These so-called ‘orphan taxa’ include Ancoracysta, Picozoa, malawimonads, and ancyromonads (= planomonads), all of which are free-living protozoa. Some or all of these may branch with an established group; for example, Ancoracysta may be sister to Haptista [60,62] and malawimonads may be sisters to Metamonada (see above; [68]). It is possible, however, that some represent even deeper-diverging lineages, following the recent example of Hemimastigophora.
Given this new framework for eukaryote evolution (Figure 1), an obvious question is: can these supergroups be reliably grouped further? Most recent phylogenomic analyses show Cryptista branching with (or within) Archaeplastida and many show Haptista as a close relative of Sar, and now TSAR [37,53,54,61,62]. ‘Diaphoretickes’ is an even larger assemblage that is proposed to unite these four supergroups to the exclusion of Amorphea, Discoba, and Metamonada [13,56,61,72], while CRuMs is inferred to be sister to Amorphea [56]. It is too early to tell whether these larger groupings will hold, and even if they are reliable, the inferences depend on assumptions about the position of the root of eukaryotes (Box 3), which becomes ever more problematic as larger groups are inferred from unrooted phylogenetic trees.
Decisions about which major groupings are considered supergroups have always been arbitrary, but the increasing absence of distinguishing biological features makes this more apparent. Paradoxically, the improved resolution of the tree makes the problem worse, not better. To illustrate this issue, take the newly identified supergroup CRuMs [56], which was inferred to branch together with Amorphea, itself containing two taxa often recognized as supergroups, Amoebozoa and Obazoa. The opinion that this collection of taxa represents two supergroups (or three), rather than one, reflects the lack of distinguishing characters for the CRuMs–Amorphea grouping. This leaves the decision driven by subjective judgments concerning: (i) which phylogenetic results are sufficiently robust to be accepted without further confirmation; and (ii) the uncertainty about the location of the ‘root’ of the eToL (Box 3). Moreover, there is a blurry line between orphan lineages, which often have just a few known species, and the least speciose supergroups. If a diversity-poor orphan is shown to be evolutionarily unrelated to all supergroups, does that make it a new supergroup? To be most useful, the notion of ‘supergroup’ should not be distinguished from ‘orphan’ by the level of diversity it contains but instead should reflect the degree of confidence that a lineage is not encompassed phylogenetically by an existing clade.
We expect that many researchers and educators will continue to find it useful to divide eukaryote diversity into a small number of major clades, and this ultimately is what a catalog of supergroups aims to provide. Future comparative genomics research may identify robust apomorphies for deep clades within eukaryotes, which in turn could help to more naturally delineate supergroups. Until then, however, it seems that the bulk of major subdivisions of eukaryotes will continue to be only clades derived from molecular phylogenetic trees. Accordingly, we should expect the list of supergroups to be increasingly volatile as the understanding of eukaryote diversity and resolution of the tree improve further (see below), and more author-dependent, since there will be no conspicuous criteria for deciding which clades are to be distinguished as supergroups.
never have been observed under a microscope. Eukaryotic metagenomics is still in its infancy, but the signs are there that it might be a workable approach for placing novel genetic diversity in a phylogenomic framework [84,85]. Another method recently applied to obtain genomic information from important taxa combines metabarcoding and fluorescence in situ hybridization (FISH) to go from sequences back to the cells [86]. No matter which technique proves to be the most useful, the release from the burden of culturing means that the taxonomic breadth, and importantly the taxon density, of phylogenomic datasets may improve rapidly in the near future. Thus, new groups that are especially challenging to culture may be identified and added for the first time, in turn greatly accelerating the achievement of robust taxon sampling for all groups on the tree. The availability of these data should ultimately improve the overall reliability of phylogenetic estimation, although with the caveat that using culture-free approaches generally means that some important aspects of the biology are missed (e.g., details of life cycles and morphology).
With these anticipated improvements in taxon sampling for eukaryotes, it is more important than ever to develop rigorous phylogenomic pipelines. This involves best practice when assembling the datasets as well as models of sequence evolution complex enough to adequately describe the processes at play, with software implementations that allow these models to be used on large datasets (Box 2). So far, broad-scale phylogenomics of eukaryotes has almost exclusively used the concatenation approach, but exploring, in depth, the influence of individual genes can help to pinpoint more specifically where and how the phylogenetic signal is distributed [55,87]. It will also be informative to assess the origins of the different signals between different datasets so that the influence of taxon and gene sampling can be disentangled. Better understanding of eukaryote-wide phylogenomic datasets, combined with improvements in state-of-the-art phylogenetic methods, will enable the recovery of even more ancient and difficult-to-discern phylogenetic signals.
Outstanding Questions
How many more extant ‘kingdom-level’ eukaryotic lineages exist and can we find them? Most recent discoveries of major groups used culture-based approaches, whilst much higher-throughput environmental sequencing has mostly increased the diversity within known supergroups. Interestingly, the major lineages discovered via cultivation are often not well represented in environmental surveys; this suggests that cultivation and culture-independent methods will preferentially access different subsets of the diversity remaining to be characterized.
Can we refine the relationships among the major lineages to obtain a fully resolved eukaryotic Tree of Life (eToL)? Will phylogenomics using more deep-branching taxa and better evolutionary models (e.g., site-heterogeneous models) be enough to stabilize all major nodes in the eToL?
6. Pawlowski, J. et al. (2012) CBOL Protist Working Group: barcoding eukaryotic richness beyond the animal, plant, and fungal kingdoms. PLoS Biol. 10, e1001419
7. Sibbald, S.J. and Archibald, J.M. (2017) More protist genomes needed. Nat. Ecol. Evol. 1, 145
8. Simpson, A.G.B. and Roger, A.J. (2002) Eukaryotic evolution: getting to the root of the problem. Curr. Biol. 12, R691–R693
9. Simpson, A.G.B. and Roger, A.J. (2004) The real “kingdoms” of eukaryotes. Curr. Biol. 14, R693–R696
10. Baldauf, S.L. (2003) The deep roots of eukaryotes. Science 300, 1703–1706
11. Adl, S.M. et al. (2005) The new higher-level classification of eukaryotes with emphasis on the taxonomy of protists. J. Eukaryot. Microbiol. 52, 399–451
12. Keeling, P.J. et al. (2005) The tree of eukaryotes. Trends Ecol. Evol. 20, 670–676
13. Adl, S.M. et al. (2012) The revised classification of eukaryotes. J. Eukaryot. Microbiol. 59, 429–493
14. Nikolaev, S.I. et al. (2004) The twilight of Heliozoa and rise of Rhizaria, an emerging supergroup of amoeboid eukaryotes. Proc. Natl. Acad. Sci. U. S. A. 101, 8066–8071
15. Baldauf, S.L. and Palmer, J.D. (1993) Animals and fungi are each other’s closest relatives: congruent evidence from multiple proteins. Proc. Natl. Acad. Sci. U. S. A. 90, 11558–11562
16. Smirnov, A. et al. (2005) Molecular phylogeny and classification of the lobose amoebae. Protist 156, 129–142
17. Cavalier-Smith, T. (1999) Principles of protein and lipid targeting in secondary symbiogenesis: euglenoid, dinoflagellate, and sporozoan plastid origins and the eukaryote family tree. J. Eukaryot. Microbiol. 46, 347–366
18. Cavalier-Smith, T. (1981) Eukaryote kingdoms: seven or nine? Biosystems 14, 461–481
19. Yoon, H.S. et al. (2002) The single, ancient origin of chromist plastids. Proc. Natl. Acad. Sci. U. S. A. 99, 15507–15512
20. Rodriguez-Ezpeleta, N. et al. (2005) Monophyly of primary photosynthetic eukaryotes: green plants, red algae, and glaucophytes. Curr. Biol. 15, 1325–1330
21. Simpson, A.G.B. (2003) Cytoskeletal organization, phylogenetic affinities and systematics in the contentious taxon Excavata (Eukaryota). Int. J. Syst. Evol. Microbiol. 53, 1759–1777
22. Delsuc, F. et al. (2005) Phylogenomics and the reconstruction of the tree of life. Nat. Rev. Genet. 6, 361–375
23. Philippe, H. et al. (2004) Phylogenomics of eukaryotes: impact of missing data on large alignments. Mol. Biol. Evol. 21, 1740–1752
24. Jeffroy, O. et al. (2006) Phylogenomics: the beginning of incongruence? Trends Genet. 22, 225–231
25. Philippe, H. et al. (2011) Resolving difficult phylogenetic questions: why more sequences are not enough. PLoS Biol. 9, e1000602
26. Bapteste, E. et al. (2002) The analysis of 100 genes supports the grouping of three highly divergent amoebae: Dictyostelium, Entamoeba, and Mastigamoeba. Proc. Natl. Acad. Sci. U. S. A. 99, 1414–1419
27. Lang, B.F. et al. (2002) The closest unicellular relatives of animals. Curr. Biol. 12, 1773–1778
28. Burki, F. and Pawlowski, J. (2006) Monophyly of Rhizaria and multigene phylogeny of unicellular bikonts. Mol. Biol. Evol. 23, 1922–1930
29. Burki, F. et al. (2007) Phylogenomics reshuffles the eukaryotic supergroups. PLoS One 2, e790
30. Rodriguez-Ezpeleta, N. et al. (2007) Toward resolving the eukaryotic tree: the phylogenetic positions of jakobids and cercozoans. Curr. Biol. 17, 1420–1425
31. Hampl, V. et al. (2009) Phylogenomic analyses support the monophyly of Excavata and resolve relationships among eukaryotic “supergroups”. Proc. Natl. Acad. Sci. U. S. A. 106, 3859–3864
32. Patron, N.J. et al. (2007) Multiple gene phylogenies support the monophyly of cryptomonad and haptophyte host lineages. Curr. Biol. 17, 887–891
33. Hackett, J.D. et al. (2007) Phylogenomic analysis supports the monophyly of cryptophytes and haptophytes and the association of Rhizaria with chromalveolates. Mol. Biol. Evol. 24, 1702–1713
34. Katz, L.A. and Grant, J.R. (2015) Taxon-rich phylogenomic analyses resolve the eukaryotic tree of life and reveal the power of subsampling by sites. Syst. Biol. 64, 406–415
35. Keeling, P.J. et al. (2014) The Marine Microbial Eukaryote Transcriptome Sequencing Project (MMETSP): illuminating the functional diversity of eukaryotic life in the oceans through transcriptome sequencing. PLoS Biol. 12, e1001889
36. Burki, F. et al. (2010) Evolution of Rhizaria: new insights from phylogenomic analysis of uncultivated protists. BMC Evol. Biol. 10, 377
37. Cenci, U. et al. (2018) Nuclear genome sequence of the plastid-lacking cryptomonad Goniomonas avonlea provides insights into the evolution of secondary plastids. BMC Biol. 16, 137
38. Leger, M.M. et al. (2017) Organelles that illuminate the origins of Trichomonas hydrogenosomes and Giardia mitosomes. Nat. Ecol. Evol. 1, 0092
39. Brown, M.W. et al. (2012) Aggregative multicellularity evolved independently in the eukaryotic supergroup Rhizaria. Curr. Biol. 22, 1123–1127
40. Kang, S. et al. (2017) Between a pod and a hard test: the deep evolution of amoebae. Mol. Biol. Evol. 34, 2258–2270
41. Lahr, D.J.G. et al. (2019) Phylogenomics and morphological reconstruction of Arcellinida testate amoebae highlight diversity of microbial eukaryotes in the neoproterozoic. Curr. Biol. 29, 991–1001.e3
42. Gentekaki, E. et al. (2014) Large-scale phylogenomic analysis reveals the phylogenetic position of the problematic taxon Protocruzia and unravels the deep phylogenetic affinities of the ciliate lineages. Mol. Phylogenet. Evol. 78, 36–42
43. Sheng, Y. et al. (2018) Phylogenetic relationship analyses of complicated class Spirotrichea based on transcriptomes from three diverse microbial eukaryotes: Uroleptopsis citrina, Euplotes vannus and Protocruzia tuzeti. Mol. Phylogenet. Evol. 129, 338–345
44. Derelle, R. et al. (2016) A phylogenomic framework to study the diversity and evolution of stramenopiles (=heterokonts). Mol. Biol. Evol. 33, 2890–2898
45. Cavalier-Smith, T. (2004) Only six kingdoms of life. Proc. Biol. Sci. 271, 1251–1262
46. Berney, C. et al. (2004) How many novel eukaryotic “kingdoms?” Pitfalls and limitations of environmental DNA surveys. BMC Biol. 2, 13
47. de Vargas, C. et al. (2015) Eukaryotic plankton diversity in the sunlit ocean. Science 348, 1261605
48. Massana, R. et al. (2015) Marine protist diversity in European coastal waters and sediments as revealed by high-throughput sequencing. Environ. Microbiol. 17, 4035–4049
49. Mahé, F. et al. (2017) Parasites dominate hyperdiverse soil protist communities in neotropical rainforests. Nat. Ecol. Evol. 1, 91
50. Minge, M.A. et al. (2009) Evolutionary position of breviate amoebae and the primary eukaryote divergence. Proc. Biol. Sci. 276, 597–604
51. Burki, F. et al. (2009) Large-scale phylogenomic analyses reveal that two enigmatic protist lineages, Telonemia and Centroheliozoa, are related to photosynthetic chromalveolates. Genome Biol. Evol. 1, 231–238
52. Zhao, S. et al. (2012) Collodictyon – an ancient lineage in the tree of eukaryotes. Mol. Biol. Evol. 29, 1557–1568
53. Burki, F. et al. (2012) The evolutionary history of haptophytes and cryptophytes: phylogenomic evidence for separate origins. Proc. Biol. Sci. 279, 2246–2254
54. Burki, F. et al. (2016) Untangling the early diversification of eukaryotes: a phylogenomic study of the evolutionary origins of Centrohelida, Haptophyta and Cryptista. Proc. Biol. Sci. 283, 20152802
55. Brown, M.W. et al. (2013) Phylogenomics demonstrates that breviate flagellates are related to opisthokonts and apusomonads. Proc. Biol. Sci. 280, 20131755
56. Brown, M.W. et al. (2018) Phylogenomics places orphan protistan lineages in a novel eukaryotic super-group. Genome Biol. Evol. 10, 427–433
57. Cavalier-Smith, T. et al. (2015) Multiple origins of Heliozoa from flagellate ancestors: new cryptist subphylum Corbihelia, superclass Corbistoma, and monophyly of Haptista, Cryptista, Hacrobia and Chromista. Mol. Phylogenet. Evol. 93, 331–362
58. Yabuki, A. et al. (2014) Palpitomonas bilix represents a basal cryptist lineage: insight into the character evolution in Cryptista. Sci. Rep. 4, 4641
59. Kamikawa, R. et al. (2014) Gene content evolution in discobid mitochondria deduced from the phylogenetic position and complete mitochondrial genome of Tsukubamonas globosa. Genome Biol. Evol. 6, 306–315
60. Janouskovec, J. et al. (2017) A new lineage of eukaryotes illuminates early mitochondrial genome reduction. Curr. Biol. 27, 3717–3724.e5
61. Lax, G. et al. (2018) Hemimastigophora is a novel supra-kingdom-level lineage of eukaryotes. Nature 564, 410–414
62. Strassert, J.F.H. et al. (2019) New phylogenomic analysis of the enigmatic phylum Telonemia further resolves the eukaryote tree of life. Mol. Biol. Evol. 36, 757–765
63. Gawryluk, R.M.R. et al. (2019) Non-photosynthetic predators are sister to red algae. Nature 572, 240–243
64. Grattepanche, J.-D. et al. (2018) Microbial diversity in the eukaryotic SAR clade: illuminating the darkness between morphology and molecular data. Bioessays 40, e1700198
65. Shalchian-Tabrizi, K. et al. (2006) Telonemia, a new protist phylum with affinity to chromist lineages. Proc. Biol. Sci. 273, 1833–1842
66. Katz, L.A. et al. (2012) Turning the crown upside down: gene tree parsimony roots the eukaryotic tree of life. Syst. Biol. 61, 653–660
67. Karnkowska, A. et al. (2016) A eukaryote without a mitochondrial organelle. Curr. Biol. 26, 1274–1284
68. Heiss, A.A. et al. (2018) Combined morphological and phylogenomic re-examination of malawimonads, a critical taxon for inferring the evolutionary history of eukaryotes. R. Soc. Open Sci. 5, 171707
69. Cavalier-Smith, T. et al. (2014) Multigene eukaryote phylogeny reveals the likely protozoan ancestors of opisthokonts (animals, fungi, choanozoans) and Amoebozoa. Mol. Phylogenet. Evol. 81, 71–85
70. Cavalier-Smith, T. et al. (2018) Multigene phylogeny and cell evolution of chromist infrakingdom Rhizaria: contrasting cell organisation of sister phyla Cercozoa and Retaria. Protoplasma 255, 1517–1574
71. Foissner, W. et al. (1988) The Hemimastigophora (Hemimastix amphikineta nov. gen., nov. spec.), a new protistan phylum from Gondwanian soils. Eur. J. Protistol. 23, 361–383
72. Burki, F. et al. (2008) Phylogenomics reveals a new “megagroup” including most photosynthetic eukaryotes. Biol. Lett. 4, 366–369
73. Keeling, P.J. (2009) Chromalveolates and the evolution of plastids by secondary endosymbiosis. J. Eukaryot. Microbiol. 56, 1–8
74. Burki, F. (2017) The convoluted evolution of eukaryotes with complex plastids. In Advances in Botanical Research, Vol. 84, Y. Hirakawa, ed (Academic Press), pp. 1–30
75. Krabberød, A.K. et al. (2017) Single cell transcriptomics, mega-phylogeny, and the genetic basis of morphological innovations in Rhizaria. Mol. Biol. Evol. 34, 1557–1573
76. Kolisko, M. et al. (2014) Single-cell transcriptomics for microbial eukaryotes. Curr. Biol. 24, R1081–R1082
77. Heywood, J.L. et al. (2010) Capturing diversity of marine heterotrophic protists: one cell at a time. ISME J. 5, 674–684
78. Gawryluk, R.M.R. et al. (2016) Morphological identification and single-cell genomics of marine diplonemids. Curr. Biol. 26, 3053–3059
79. Yoon, H.S. et al. (2011) Single-cell genomics reveals organismal interactions in uncultivated marine protists. Science 332, 714–717
80. Roy, R.S. et al. (2014) Single cell genome analysis of an uncultured heterotrophic stramenopile. Sci. Rep. 4, 4780
81. Seeleuthner, Y. et al. (2018) Single-cell genomics of multiple uncultured stramenopiles reveals underestimated functional diversity across oceans. Nat. Commun. 9, 310
82. Sieracki, M.E. et al. (2019) Single cell genomics yields a wide diversity of small planktonic protists across major ocean ecosystems. Sci. Rep. 9, 6025
83. Wideman, J.G. et al. A single-cell genome reveals diplonemid-like ancestry of kinetoplastid mitochondrial gene structure. Philos. Trans. R. Soc. Lond. Ser. B Biol. Sci. (in press)
84. West, P.T. et al. (2018) Genome-reconstruction for eukaryotes from complex natural microbial communities. Genome Res. 28, 569–580
85. Steinegger, M. et al. (2019) Protein-level assembly increases protein sequence recovery from metagenomic samples manyfold. Nat. Methods 32, 834
86. Kwong, W.K. et al. (2019) A widespread coral-infecting apicomplexan with chlorophyll biosynthesis genes. Nature 568, 103–107
87. Shen, X.-X. et al. (2017) Contentious relationships in phylogenomic studies can be driven by a handful of genes. Nat. Ecol. Evol. 1, 126
88. Okamoto, N. and Inouye, I. (2005) The katablepharids are a distant sister group of the Cryptophyta: a proposal for Katablepharidophyta divisio nova/ Kathablepharida phylum novum based on SSU rDNA and beta-tubulin phylogeny. Protist 156, 163–179
89. Not, F. et al. (2007) Picobiliphytes: a marine picoplanktonic algal group with unknown affinities to other eukaryotes. Science 315, 253–255
90. Cavalier-Smith, T. et al. (2008) Planomonadida ord. nov. (Apusozoa): ultrastructural affinity with Micronuclearia podoventralis and deep divergences within Planomonas gen. nov. Protist 159, 535–562
91. Yabuki, A. et al. (2010) Palpitomonas bilix gen. et sp. nov.: a novel deep-branching heterotroph possibly related to Archaeplastida or Hacrobia. Protist 161, 523–538
92. Yabuki, A. et al. (2011) Tsukubamonas globosa n. gen., n. sp., a novel excavate flagellate possibly holding a key for the early evolution in “Discoba”. J. Eukaryot. Microbiol. 58, 319–331
93. Glücksman, E. et al. (2011) The novel marine gliding zooflagellate genus Mantamonas (Mantamonadida ord. n.: Apusozoa). Protist 162, 207–221
94. Kim, E. et al. (2011) Newly identified and diverse plastid-bearing branch on the eukaryotic tree of life. Proc. Natl. Acad. Sci. U. S. A. 108, 1496–1500
95. Yabuki, A. et al. (2012) Microheliella maris (Microhelida ord. n.), an ultrastructurally highly distinctive new axopodial protist species and genus, and the unity of phylum Heliozoa. Protist 163, 356–388
96. Táborský, P. et al. (2017) Anaeramoebidae fam. nov., a novel lineage of anaerobic amoebae and amoeboflagellates of uncertain phylogenetic position. Protist 168, 495–526
97. Stiller, J.W. et al. (1998) Amitochondriate amoebae and the evolution of DNA-dependent RNA polymerase II. Proc. Natl. Acad. Sci. U. S. A. 95, 11769–11774
98. Cavalier-Smith, T. et al. (2004) Molecular phylogeny of Amoebozoa and the evolutionary significance of the unikont Phalansterium. Eur. J. Protistol. 40, 21–48
99. Seenivasan, R. et al. (2013) Picomonas judraskeda gen. et sp. nov.: the first identified member of the Picozoa phylum nov., a widespread group of picoeukaryotes, formerly known as “picobiliphytes.” PLoS One 8, e59565
100. Moreira, D. and López-García, P. (2014) The rise and fall of picobiliphytes: how assumed autotrophs turned out to be heterotrophs. Bioessays 36, 468–474
101. Yabuki, A. et al. (2013) Rigifila ramosa n. gen., n. sp., a filose apusozoan with a distinctive pellicle, is related to Micronuclearia. Protist 164, 75–88
102. Lartillot, N. and Philippe, H. (2004) A Bayesian mixture model for across-site heterogeneities in the amino-acid replacement process. Mol. Biol. Evol. 21, 1095–1109
103. Nguyen, L.-T. et al. (2015) IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 32, 268–274
104. Wang, H.-C. et al. (2018) Modeling site heterogeneity with posterior mean site frequency profiles accelerates accurate phylogenomic estimation. Syst. Biol. 67, 216–235
105. Stechmann, A. and Cavalier-Smith, T. (2003) The root of the eukaryote tree pinpointed. Curr. Biol. 13, R665–R666
106. Richards, T.A. and Cavalier-Smith, T. (2005) Myosin domain evolution and the primary divergence of eukaryotes. Nature 436, 1113–1118
107. Kim, E.E. et al. (2006) Evolutionary relationships of apusomonads inferred from taxon-rich analyses of 6 nuclear encoded genes. Mol. Biol. Evol. 23, 2455–2466
108. Roger, A.J. and Simpson, A.G.B. (2009) Evolution: revisiting the root of the eukaryote tree. Curr. Biol. 19, R165–R167
109. Sebé-Pedrós, A. et al. (2014) Evolution and classification of myosins, a paneukaryotic whole-genome approach. Genome Biol. Evol. 6, 290–305
110. Rogozin, I.B. et al. (2009) Analysis of rare genomic changes does not support the unikont–bikont phylogeny and suggests cyanobacterial symbiosis as the point of primary radiation of eukaryotes. Genome Biol. Evol. 1, 99–113
111. Cavalier-Smith, T. (2010) Kingdoms Protozoa and Chromista and the eozoan root of the eukaryotic tree. Biol. Lett. 6, 342–345
112. Derelle, R. and Lang, B.F. (2012) Rooting the eukaryotic tree with mitochondrial and bacterial proteins. Mol. Biol. Evol. 29, 1277–1289
113. Derelle, R. et al. (2015) Bacterial proteins pinpoint a single eukaryotic root. Proc. Natl. Acad. Sci. U. S. A. 112, E693–E699
114. He, D. et al. (2014) An alternative root for the eukaryote tree of life. Curr. Biol. 24, 465–470