The Black Truffle Genome Uncovers Evolutionary Origins and Mechanisms of Symbiosis
Francis Martin1*, Raffaella Balestrini2+, Olivier Jaillon3–5+, Annegret Kohler1+, Barbara Montanini6+, Emmanuelle Morin1+,
Claude Murat1+, Benjamin Noel3–5+, Riccardo Percudani6+, Bettina Porcel3–5, Andrea Rubini7+, Antonella Amicucci8, Joelle
Amselem9, Véronique Anthouard3–5, Sergio Arcioni7, François Artiguenave3–5, Jean-Marc Aury3–5, Paola Ballario10, Angelo
Bolchi6, Andrea Brenna10, Annick Brun1, Marc Buée1, Brandi Cantarel11, Gérard Chevalier12, Arnaud Couloux3–5, Pedro
Coutinho11, Corinne Da Silva3–5, France Denoeud3–5, Sébastien Duplessis1, Stefano Ghignone2, Bernard Henrissat11,
Benoît Hilselberger1,9, Mirco Iotti13, Antonietta Mello2, Michele Miranda14, Giovanni Pacioni15, Hadi Quesneville9, Claudia
Riccioni7, Roberta Ruotolo6, Richard Splivallo16, Vilberto Stocchi8, Alessandra Zambonelli13, Elisa Zampieri2, Arturo
Roberto Viscomi6, Francesco Paolocci7++, Paola Bonfante2++, Simone Ottonello6++ & Patrick Wincker3–5++
1 INRA, UMR 1136, INRA-Nancy Université, Interactions Arbres/Microorganismes, 54280 Champenoux, France. 2 Istituto per la Protezione delle Piante
del CNR, sez. di Torino c/o Dipartimento Biologia Vegetale, Università degli Studi di Torino, Viale Mattioli, 25, 10125 Torino, Italy. 3 CEA, IG,
Genoscope, 2 rue Gaston Crémieux CP5702, 91057 Evry. 4 CNRS, UMR 8030, 2 rue Gaston Crémieux, CP5706, F-91057 Evry, France. 5 Université
d’Evry, F-91057 Evry, France. 6 Dipartimento di Biochimica e Biologia Molecolare, Università degli Studi di Parma, Viale G.P. Usberti 23/A, 43100
Parma, Italy. 7 CNR-IGV Istituto di Genetica Vegetale, Unità Organizzativa di Supporto di Perugia, via Madonna Alta, 130, 06128 Perugia, Italy. 8
Dipartimento di Scienze Biomolecolari, Università degli Studi di Urbino, Via Saffi 2 - 61029 Urbino (PU), Italy. 9 INRA, URGI, Route de Saint-Cyr 78026
Versailles cedex. 10 Dipartimento di Genetica e Biologia Molecolare & IBPM (CNR), Università La Sapienza, Roma, Piazzale A. Moro 5, 00185 Roma,
Italy. 11 Architecture et Fonction des Macromolécules Biologiques, UMR 6098 CNRS-Universités Aix-Marseille I & II, Marseille, France. 12 INRA, UMR
Amélioration et Santé des Plantes, INRA-Université Blaise Pascal, centre INRA de Clermont-Ferrand-Theix France. 13 Dipartimento di Protezione e
Valorizzazione Agroalimentare, Università degli Studi di Bologna, Bologna, Italy. 14 Dipartimento di Biologia di Base ed Applicata, Università degli Studi
dell’Aquila, Via Vetoio Coppito 1 - 67100 L’Aquila, Italy. 15 Dipartimento di Scienze Ambientali, Università degli Studi dell’Aquila, Via Vetoio Coppito 1 - 67100 L’Aquila, Italy.
16 University of Goettingen, Molecular Phytopathology and Mycotoxin Research, Grisebachstrasse 6, D-37077 Goettingen, Germany.
* to whom correspondence should be addressed. E-mail: fmartin@nancy.inra.fr
+ These authors contributed equally to this work as second authors.
++ These authors contributed equally to this work as senior authors.
One-sentence summary. In addition to unraveling specific features of a ‘cult food’, the analysis of the Black Truffle
genome shows that Ascomycete and Basidiomycete fungi have acquired different combinations of molecular adaptations
and genetic predispositions to evolve the ectomycorrhizal symbiosis.
Abstract. The Perigord black truffle is a delicious gourmet food and an ectomycorrhizal symbiont. To understand the
biology and evolution of this fungus, its haploid genome was sequenced. Proliferation of transposable elements explains
the larger size of this genome compared with other fungi. The truffle genome contains only ~7,500 genes with very few
multigene families. It lacks large sets of carbohydrate degrading enzymes, but endoglucanases and pectinases involved
in degradation of cell walls are expressed in ectomycorrhiza. Identification of two different mating-type loci demonstrated
that T. melanosporum is a heterothallic, obligate outcrossing species. Consistent with a role in flavour formation, the
expression of several sulfur metabolism genes is upregulated in fruiting body. Our study suggests that genetic
predispositions for symbiosis evolved along different paths in ascomycetes and basidiomycetes. The truffle genome
sequence is thus a key resource for understanding the evolution of symbiosis and accelerating genetic improvement for
truffle production.
Ectomycorrhizas (ECM) are the most frequent mycorrhizal type in forests of temperate and boreal latitudes.
ECM fungi establish a mutualistic symbiosis with their host trees and as such are essential contributors to
carbon and nitrogen cycles in soils (1). In the basidiomycete Laccaria bicolor, the expansion of gene families
may have acted as a putative ‘symbiosis toolkit’ and might thus be a landmark of symbiosis evolution (2).
However, this contention must be tempered by the caveat that the above features may reflect evolution of this
particular mycorrhizal taxon and not a general trait shared by all ECM species (3). We therefore sequenced
the nuclear genome of the Perigord black truffle (Tuber melanosporum Vittad.) (table S1) (4) [supporting online
material (SOM) text S1]. This tree ECM symbiont is endemic to calcareous soils in southern Europe and
produces hypogeous fruit bodies, so-called truffles (5) (Fig. S1), highly prized in European gastronomy
for their organoleptic properties (i.e., taste and flavors) (5). For understanding the evolution of fungal
genetic and developmental complexity, the T. melanosporum genome is likely to be critically important, as the
Pezizomycetes are regarded as the basal clade of Ascomycota (Fig. S2) and have so far lacked any genomic coverage.
The present genome analysis highlights processes that may underlie the symbiotic lifestyle as well as fruiting
body formation.
Transposable elements and genome defense. The 125 megabase genome of T. melanosporum is the
largest sequenced fungal genome to date (6) (table S1), but no evidence for large scale duplications was
observed. The ~4-fold larger size of the truffle genome compared with other sequenced ascomycetes is
accounted for by multi-copy transposable elements (TE) (Fig. S3) which constitute about 66% of the
assembled genome (Fig. S4) (SOM text S4). Estimated insertion times suggest a major wave of
retrotransposition <5 million years ago (Fig. S5). Most TEs are not uniformly distributed across the genome
(Fig. 1). Relics of TEs riddled with stop codons have been found in a number of ascomycetes as a result of
repeat-induced point mutation (RIP) (7). No RIP footprint (8) was detected in the T. melanosporum genome
(SOM text S3), suggesting that RIP was not an active defense mechanism when TEs invaded this genome, or
else that TEs were adapted to tolerate and escape RIP or related methylation mechanisms, such as
methylation-induced premeiotically (MIP). The proliferation of TEs within the truffle genome may result from its
low effective population size (SOM text S2.5) (9) during postglaciation migrations (10).
In filamentous fungi, global genome defense mainly relies on RNAi and on DNA methylation. Two DNA methyltransferases (DMT) are present in the T. melanosporum genome: TmelDMT2 and TmelDMT1 (Fig. S6).
Distinct functional roles for these two DMTs are supported by the presence of a single DNA methylase domain
in both TmelDMT2 and Neurospora crassa DIM-2, rather than two separate domains as in MIP-related Masc1
and TmelDMT1, and by the preferential expression of the latter in fruiting body. T. melanosporum thus likely
uses general, rather than specialized RNAi and DNA methylation processes for genome defense (SOM text
S6.1).
The gene complement of Tuber. The predicted proteome is in the lower range of sequenced
filamentous fungi as only 7496 protein-coding genes were identified (6) (SOM text S4). They are mainly
located in TE-poor regions (Fig. 1) and the gene density is heterogeneous when compared with that of other
ascomycetes (Fig. S7). Amongst the predicted proteins, only 3970, 5596 and 5644 showed significant
sequence similarity to proteins from Saccharomyces cerevisiae, Neurospora crassa and Aspergillus niger,
respectively (Fig. S8). This agrees with the predicted ancient separation (>450 Myr ago) of the Pezizomycetes
from the other ancestral fungal lineages (Fig. S2) (11). Of the ~5600 T. melanosporum genes that have an
ortholog, very few show conservation of neighboring orthologs (synteny) in at least one of the other species
(Fig. S9, SOM text S5.2). The T. melanosporum genome shows a structural organization strikingly different from that of
other sequenced ascomycetes; the largest syntenic region (with Coccidioides immitis) only contains 99 genes
with 39 orthologs (Fig. S10). TE proliferation likely facilitated genome rearrangements. Some regions of mesosynteny were however detected (Fig. S9), suggesting that T. melanosporum could be used for assessing the
genome organization of ancestral ascomycete clades.
Expression of 98% of the predicted genes was detected in free-living mycelia, ECM root tips and/or
fruiting bodies by custom-oligoarrays (SOM text S8, table S2, Fig. S23) and EST sequencing (SOM text S2.4).
Only a low proportion (6%) is differentially expressed (fold-ratio >4.0, P<0.05) in either ectomycorrhiza (table
S3) or fruiting body (table S4). Only 61 ectomycorrhiza-, fruiting body- or free-living mycelium specific
transcripts were detected (table S5). Transcripts coding for lectins, MFS transporters, redox proteins and
polysaccharide degrading enzymes are strikingly enriched in symbiotic tissues. They may play a role in
adhesion to host cells and colonization of root apoplast.
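The selection criterion quoted above (fold-ratio >4.0, P<0.05) can be illustrated with a minimal sketch. This is not the array analysis pipeline described in SOM text S8; the record layout, the function name, and the choice to treat the fold-ratio symmetrically (up- or down-regulation) are assumptions made for illustration only.

```python
from typing import NamedTuple


class Transcript(NamedTuple):
    gene_id: str            # hypothetical identifier
    tissue_signal: float    # normalized signal in ECM or fruiting body
    mycelium_signal: float  # normalized signal in free-living mycelium
    p_value: float


def is_differential(t, min_fold=4.0, max_p=0.05):
    """True if the fold-ratio exceeds min_fold in either direction and P < max_p."""
    if t.tissue_signal <= 0 or t.mycelium_signal <= 0:
        return False  # ratio undefined; assumption: such records are skipped
    ratio = t.tissue_signal / t.mycelium_signal
    fold = max(ratio, 1.0 / ratio)  # symmetric fold change (assumption)
    return fold > min_fold and t.p_value < max_p


# Invented example values, not data from the study.
records = [
    Transcript("geneA", 800.0, 100.0, 0.01),   # 8-fold up, significant
    Transcript("geneB", 300.0, 100.0, 0.001),  # only 3-fold
    Transcript("geneC", 50.0, 400.0, 0.04),    # 8-fold down, significant
    Transcript("geneD", 900.0, 100.0, 0.20),   # 9-fold, not significant
]
selected = [t.gene_id for t in records if is_differential(t)]
print(selected)
```

Both thresholds must hold at once, which is why a large fold change with a weak P-value (geneD) is not retained.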
Gene and domain family expansion patterns. One of the most striking characteristics of the
T. melanosporum genome is the almost complete absence of highly similar gene pairs. Of the predicted 7496
protein-coding genes, only seven pairs share >90% amino-acid identity in their coding sequence, whereas 30
pairs share >80% identity (Fig. S11A). The latter value is significant as RIP mutates duplicated sequences that
share greater than ~80% nucleotide similarity (10). An ancestral RIP, or a similarly acting mechanism, has
likely prevented the emergence of novel genes through duplication. Multigene families in T. melanosporum
are few in number and comprise only 19% of the predicted proteome; most families have only two diverging
members (Fig. S12). The rate of gene family gain is much lower than the rate of gene loss (Fig. S11B) and
amongst the 11234 gene families found in ascomycetes, 5695 appear to be missing in T. melanosporum
(Fig. S11) (15). This feature may reflect the genome organization of the ascomycete common ancestor as
T. melanosporum is the earliest diverging lineage within the Pezizomycotina clade. By comparison to other
ascomycetes, gene families predicted to encode metabolite transporters (e.g., amino acid & sugar
permeases), secondary metabolism enzymes (e.g., polyketide synthases and cytochrome P450s) and
carbohydrate-active enzymes (table S6) are lacking. Besides its low rate of gene family gain, T.
melanosporum is characterized by a small-sized tRNA gene repertoire, a strikingly uniform codon usage, and
an extremely weak translational selection (12) compared to other sequenced filamentous ascomycetes (SOM
text S7.1).
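The counts of highly similar gene pairs reported in this section rest on pairwise amino-acid identity. The sketch below shows one way such a tally could be made from pre-aligned sequence pairs. The identity definition (matches over gap-free aligned columns), the function names, and the toy sequences are assumptions; the procedure actually used is described in the SOM and may differ.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over columns where neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # gap columns are ignored (assumption)
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def count_pairs_above(aligned_pairs, threshold):
    """Number of aligned pairs whose identity is strictly above threshold."""
    return sum(1 for a, b in aligned_pairs if percent_identity(a, b) > threshold)


# Toy alignments, not truffle proteins.
pairs = [
    ("MKTAYIAKQR", "MKTAYIAKQR"),  # 100% identical
    ("MKTAYIAKQR", "MKTAYLAKQK"),  # 8 of 10 positions identical
    ("MKTAYIAKQR", "MRSGYLVKNR"),  # 4 of 10 positions identical
]
print(count_pairs_above(pairs, 90), count_pairs_above(pairs, 80))
```

Because gap columns are skipped, this definition can overstate similarity for pairs with long insertions; a coverage requirement would normally be added alongside it.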
Truffles: a hypogeous fruiting body delicacy. T. melanosporum is the first sequenced fungus producing
highly flavoured hypogeous fruiting bodies (SOM text S6.4). Genomic signatures of the long-standing (>2000
year-old) reputation of the black truffle as a gastronomic delicacy are its extremely low allergenic potential
(Fig. S15), coupled with the lack of key mycotoxin biosynthetic enzymes (SOM text S6.2, table S14), and the
preferential overexpression of various flavour-related enzymes in fruiting body (Figs. S16-S18). Among the
latter are specific subsets of sulfate assimilation and S-amino acid interconversion enzymes – especially,
cystathionine lyases known to promote the side-formation of methyl sulfide volatiles abundant in truffles (13) –
as well as various enzymes involved in amino acid degradation through the Ehrlich pathway and giving rise to
known truffle volatiles and flavors, e.g. 2-methyl-1-butanal (SOM text S7.4) (Figs. S17 & 18). Also notable,
given the subterranean habitat of this fungus, is the presence of various putative light-sensing components
(SOM text S6.6), which might be involved in light avoidance mechanisms and/or in the control of seasonal
developmental variations, especially those related to fruiting body formation and sexual reproduction.
The analysis of genes implicated in the mating process, including pheromone response, meiosis and
fruiting body development showed that most sex-related components identified in other ascomycetes are also
present in T. melanosporum (table S11). Sexual reproduction in ascomycete filamentous fungi is partly
controlled by two different mating-type (MAT) genes that establish sexual compatibility (14): one MAT gene
codes for a protein with an alpha box domain, whereas the other encodes a high mobility group (HMG) protein
(SOM text S6.5). It was widely believed that T. melanosporum was a homothallic or even an exclusive
selfing species (15). The sequenced Mel28 strain contains the HMG locus and the opposite, linked alpha MAT
locus was identified in another natural isolate (Fig. S19), confirming recent hints that T. melanosporum is
heterothallic and thus an obligate outcrossing species (16). This result has major implications for truffle
cultivation which will be improved by the use of host plants harboring truffle strains of opposite mating types.
In most ascomycetes, the genomic region flanking the MAT locus shows an extended conservation (14), but
there is no synteny of the MAT loci between T. melanosporum and other sequenced fungi (Fig. S20).
Saprotrophism. We observed an extreme reduction in the number of enzymes involved in the
degradation of plant cell wall (PCW) oligo- and polysaccharides (table S11). A comparison of
T. melanosporum candidate CAZymes (17) (SOM text S6.11) with those of ascomycetous phytopathogens
points to an adaptation to symbiosis (tables S23 & S24). Reduction in PCW CAZymes affects almost all
glycosyl hydrolase (GH) families, some of which are completely absent. For instance, there is no GH5
cellulase appended to a cellulose-binding module (CBM1) and no cellulases from families GH6 and GH7 were
found in the genome (table S24). However, a GH5 endoglucanase, together with a secreted GH12 xyloglucan-specific endoglucanase, a pectin methylesterase, a secreted GH28 polygalacturonase and a
rhamnogalacturonan acetylesterase, were amongst the most highly upregulated transcripts in ECM root tips,
suggesting a role for these enzymes in PCW degradation and remodeling during host colonization (table S3 &
fig. S21). This repertoire of mycorrhiza-induced cell wall degrading enzymes is likely to be of profound
importance for the symbiotic interaction. At variance with L. bicolor (5), it appears that T. melanosporum
mycelium penetrates between colonized roots by degrading apoplastic pectin polymers.
The ability to establish ECM symbioses is a widespread characteristic of various ascomycetes and
basidiomycetes (1). The truffle genome reveals features of an ancestral symbiotic lineage that diverged from
other fungal lineages >450 Myr ago (11). Despite their similar symbiotic structures and similar beneficial
effects on plant growth, the ascomycete T. melanosporum and the basidiomycete L. bicolor encode strikingly
different proteomes – compact with very few multigene families vs. large with many expanded multigene
families – and symbiosis-regulated genes. Effector-like proteins, such as the L. bicolor ECM-induced SSP
MiSSP7 (2), are not expressed in T. melanosporum ectomycorrhizas. Based on our results, the ECM appears
as an ancient innovation which developed several times during the course of Mycota evolution using different
‘toolkits’ (18). Sequencing of the T. melanosporum genome has provided unprecedented insights into the
molecular bases of symbiosis, sex and fruiting in a most popular representative of the only lifestyle not yet
addressed by Ascomycota genomics (19). It will be a major step in moving truffle research into the realm of
ecosystem science, to say nothing of the exceptional social and cultural impact of a deeper understanding
of the genome of one of the most widely recognized icons of European gastronomy and culture.
References and Notes
1. S. E. Smith, D. J. Read, Mycorrhizal Symbiosis (2nd edition, Academic Press, London) (1996).
2. F. Martin et al., Nature 452, 88 (2008).
3. F. Martin, M. A. Selosse, New Phytol. 180, 296 (2008).
4. Material and methods are available as supporting material on Science Online.
5. A. Mello, C. Murat, P. Bonfante, FEMS Microbiol. Lett. 260, 1 (2006).
6. J. E. Galagan, M. R. Henn, L. J. Ma, C. A. Cuomo, B. Birren, Genome Res. 15, 1620 (2005).
7. J. E. Galagan, E. U. Selker, Trends Genet. 20, 417 (2004).
8. J. K. Hane, R. P. Oliver, BMC Bioinformatics 12, 478 (2008).
9. M. Lynch, J. S. Conery, Science 302, 1401 (2003).
10. C. Murat, J. Diez, P. Luis, C. Delaruelle, C. Dupré, G. Chevalier, P. Bonfante, F. Martin, New Phytol. 164, 401 (2004).
11. J. W. Taylor, M. L. Berbee, Mycologia 98, 838 (2006).
12. P. G. Higgs, W. Ran, Mol. Biol. Evol. 25, 2279 (2008).
13. R. Splivallo, S. Bossi, M. Maffei, P. Bonfante, Phytochemistry 68, 2584 (2007).
14. J. A. Fraser, J. Heitman, Mol. Microbiol. 51, 299 (2004).
15. G. Bertault, M. Raymond, A. Berthomieu, G. Callot, D. Fernandez, Nature 394, 734 (1998).
16. C. Riccioni, B. Belfiori, A. Rubini, V. Passeri, S. Arcioni, F. Paolocci, New Phytol. 180, 466 (2008).
17. B. L. Cantarel, P. M. Coutinho, C. Rancurel, T. Bernard, V. Lombard, B. Henrissat, Nucleic Acids Res. 37, D233 (2009).
18. D. S. Hibbett, P. B. Matheny, BMC Biology 7, 13 (2009).
19. D. M. Soanes, I. Alam, M. Cornell, H. M. Wong, C. Hedeler, N. W. Paton, M. Rattray, S. J. Hubbard, S. G. Oliver, N. J. Talbot, PLoS ONE 3, e2300 (2008).
20. We thank the late L Riousset and C Dupré for providing the Mel28 isolate. The authors acknowledge Jean Weissenbach and
Marc-Henri Lebrun for continuous support. The genome sequencing of Tuber melanosporum was funded by the Genoscope,
Institut de Génomique, CEA and Agence Nationale de la Recherche (ANR). Annotation and transcriptome analysis were
supported by INRA, the European FP6 Network of Excellence EVOLTREE, Région Lorraine, the ANR FungEffector project,
Fondazione Cariparma, Compagnia di San Paolo and the Italian Ministry of Education, University and Research (MIUR), Regione
Umbria and Istituto Pasteur Fondazione Cenci Bolognetti. F.M. coordinated the project, annotation and transcriptome analysis;
P.W. coordinated the sequencing and automated annotation at Genoscope. F.M. and S.O. wrote the manuscript with input from
P.B. R.B., A.K., O.J., B.M., E.M., C.M., B.N., R.P., B.P., A.R. and P.W. also made substantial contributions (listed in alphabetical
order). All others contributed as members of the Tuber genome consortium or Genoscope sequencing and are listed in
alphabetical order. Susanne von Pall di Tolna assisted in EST analysis. We would like to thank Antonella Bonfigli, Michele
Buffalini, Sabrina Colafarina, Timothé Flutre, Shwet Kamal, Paola Ceccaroli, Christophe Roux, Roberta Saltarelli, and Osvaldo
Zarivi for their assistance in annotation, and David Hibbett and John Heitman for useful comments on an early draft of the
manuscript. Assemblies and annotations are available at INRA (http://mycor.nancy.inra.fr/IMGC/TuberGenome/) and Genoscope
(https://www.genoscope.cns.fr/secure-nda/Tuber/html/entry_ggb.html). Genome assemblies together with predicted gene models
and annotations were deposited at DNA Data Bank of Japan/European Molecular Biology Laboratory/GenBank under the project
accession numbers CABJ01000001-CABJ01004455 (WGS data) and FN429986-FN430383 (scaffolds and annotations).
Supporting Online Material
Supplementary Methods, Results & Discussion
References for Supplementary Materials
Supplementary Tables
Supplementary Figures
Figure Legends
Figure 1. Genomic landscape of Tuber melanosporum. Area charts quantify transposable elements (66%)
and genes (18%) on supercontig 5 of the Arachne assembly. Heat map tracks detail the distribution of
selected elements. SSR, simple sequence repeats; TE, transposable elements; LTR, long terminal repeat
retrotransposons; LINE, long interspersed elements; TIR, terminal inverted repeats; no cat, unknown class of
TE. Figures for the 15 largest supercontigs are available at INRA TuberDB.
SUPPLEMENTARY INFORMATION
The Black Truffle Genome Uncovers Evolutionary Origins and Mechanisms of Symbiosis
Supplementary Methods, Results & Discussion
1. Background information
2. Genome sequencing and assembly
3. Transposable elements
4. Gene prediction and annotation
5. Orthology, synteny, tandem repeats and multigene families
6. Targeted annotation of specific gene categories
7. Non-coding RNAs
8. Whole-genome exon oligoarray analyses
9. References for Supplementary Materials
Supplementary Tables
Supplementary Figures
1. Background information
1.1. The life cycle of Tuber melanosporum
The Perigord truffle (Tuber melanosporum Vittad.) is endemic to calcareous soils in southern Europe and
found in symbiotic association with roots of deciduous trees, mostly oaks (Quercus spp.) and hazelnut trees
(Corylus avellana). The fungus requires a host tree to complete its life cycle and produce hypogeous fruit
bodies, so-called truffles (Fig. S1) (1). The meiotic spores germinate in the Spring, producing a vegetative
mycelium growing in the soil and the rhizosphere, which results in colonisation of root tips and further
development of ectomycorrhizas. Extramatrical hyphae then aggregate to form fruit body initials, which
develop into fruit bodies during the fall and early winter, completing the truffle life cycle. In the mature truffle,
the ascogenous heterokaryotic pseudotissues, resulting from an unknown fertilization process after
outcrossing, are surrounded by homokaryotic maternal pseudotissues. After meiosis, the ascospores are
dispersed, mainly by mycophagous animals, including wild boars. The spores pass through the digestive tract
and are dispersed in the faeces over short distances (several kilometres). This short-distance land dispersal
contrasts with the longer-distance dispersal of epigeous fungal species through airborne spores. Tuber
species are found in temperate, mediterranean, and continental climates. They are excluded from tropical, dry
(annual rainfall less than 350 mm) and very cold climates.
The fruiting body of T. melanosporum is an edible truffle (= hypogeous ascocarp or ascoma), which is a highly
appreciated delicacy for its organoleptic properties (i.e., taste and aroma). The high price of the
Perigord truffle (from 300 to 1000 €/kg) has prompted the development of its culture through man-made
inoculation of seedlings. It can be assumed that the natural distribution and genetic structure of populations of
this black truffle species have been shaped by at least five major factors: (i) the distribution of its host plant
species (i.e. ectomycorrhizal deciduous trees), (ii) the spore dispersal by mycophagous animals; (iii) limiting
ecological factors (calcareous soils and a temperate climate), (iv) geographical barriers (i.e. Mediterranean
Sea, which limits its expansion towards North Africa) and (v) historical events (i.e. northward recolonization
routes from glacial refugia in southern Europe). Phylogeographic analysis of T. melanosporum populations
(2,3) suggested that host post-glacial expansion was one of the major factors that shaped the truffle population
structure.
The fruiting of ectomycorrhizal Tuber spp. depends on a complex set of variables, including metabolites and
signals produced by the host plant, the nutritional status of the substrate, and unknown environmental cues
(e.g., humidity and temperature) (4,5). The different types of cells and (pseudo)tissues of fruit bodies of
ascomycetes (ascomata) are the result of a differentiation process leading to the production of asci containing
meiospores. Morphological descriptions of ascoma development in truffles are scarce and mainly illustrate
advanced developmental stages (6). This situation is due to the hypogeous habitat of truffles, which leads to
erratic sampling. In addition, symbiotic relationships are required for the development of the truffle fruit body,
and fruit bodies cannot be produced in vitro (1).
1.2. Phylogeny of T. melanosporum
According to molecular clock analyses, the genus Tuber would have arisen between 270 and 140 million years
(Myr) ago (7). Modern plant phylogenies show that the ectomycorrhizal lifestyle has arisen independently over
the course of evolution in Pinaceae and several disparate lineages of angiosperms (8,9). The oldest known
fossil ectomycorrhizas date from 50 Myr ago (11), but it is now thought that the symbiosis predates this period
by some time (~135 Myr), because the Pinaceae and many of the angiosperm families, including the
Dipterocarpaceae, whose current members establish ectomycorrhizal symbiosis, were extant well before 50
Myr ago, along with the major fungal lineages with modern ectomycorrhizal representatives (12,13,14).
The Perigord truffle (Tuber melanosporum Vittad.) belongs to the Ascomycota (Discomycetes, Pezizomycota,
Pezizomycetes, Pezizales, Tuberaceae). It is the first member of the Pezizomycetes sequenced to date. This class is
considered the earliest diverging lineage within the Pezizomycotina clade. The plant pathogens, Fusarium
graminearum and Magnaporthe grisea, and the saprotrophs, Podospora anserina and Neurospora crassa,
belong to the Sordariomycetes. The human pathogens, Aspergillus fumigatus, Neosartorya fischeri and
Coccidioides immitis, and the saprotrophs, Aspergillus nidulans and A. niger, belong to the Eurotiomycetes.
The plant pathogens Phaeosphaeria nodorum and Pyrenophora tritici-repentis belong to the Dothideomycetes
(Fig. S2). The analysis of the T. melanosporum genome therefore has the potential to illuminate features of
their last common ancestor (the ancestral Ascomycotina), which lived approximately 400 to 800 Myr ago (15).
Phylogenetic analysis based on well-conserved protein-coding genes showed that T. melanosporum is the
earliest diverging clade within the current sequenced fungi belonging to the Pezizomycotina (Fig. S2).
1.3. Tuber melanosporum Mel28: origin of the sequenced strain and culture conditions
A fruiting body of T. melanosporum was collected by Louis Riousset at St Rémy de Provence (Bouches-du-Rhône, France) in February 1988 and deposited at the INRA-Clermont-Ferrand Tuber Collection. Free-living
vegetative mycelium from this fruiting body was subcultured (= strain Mel28) by Chantal Dupré and Gérard
Chevalier (INRA-Clermont-Ferrand). For purification of the high molecular weight DNA used for genomic
library construction, the haploid homokaryotic strain Mel28 was grown on liquid medium and incubated at
25°C. This strain is available upon request to Francis Martin (INRA-Nancy).
2. Genome sequencing and assembly
2.1. Shotgun sequencing strategy and results
A whole-genome shotgun strategy (WGS) was adopted for sequencing and assembling the T. melanosporum
draft genome. All genomic DNA was obtained from the vegetative mycelium of the homokaryotic haploid strain
Mel28. Template DNA (200 µg) was extracted from mycelium using the Qiagen Genomic Tip DNA extraction
kit. This high molecular weight DNA (~40 kb average size) was randomly sheared and size-fractionated to
create plasmid libraries with roughly 3 kb and 10 kb inserts. From these two libraries, 1284900 reads were
obtained by Sanger sequencing at the Genoscope facilities. The reads were screened for vector using
cross_match, then trimmed for vector and quality. Reads shorter than 100 bases after trimming were then
excluded. After trimming, the removal of multiple read attempts and the exclusion of overly-short reads, the
pool of data available for the assembly consisted of 1262177 reads, with ~1250 Mb of sequence. Nearly 10X
total sequence redundancy of the predicted 125-megabase genome was thus obtained from the 3 kb and 10 kb
plasmids.
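The redundancy figure follows directly from the numbers quoted above, and the short sketch below reproduces the arithmetic. The function names are illustrative assumptions, and the length filter only mirrors the stated 100-base cut-off; it is not the actual Genoscope trimming pipeline.

```python
def filter_reads(read_lengths, min_length=100):
    """Keep reads with at least min_length bases left after trimming."""
    return [n for n in read_lengths if n >= min_length]


def fold_coverage(total_bases, genome_size):
    """Average sequence redundancy: sequenced bases per genome base."""
    return total_bases / genome_size


# Figures quoted in the text.
total_bases = 1250e6   # ~1250 Mb of sequence after trimming
genome_size = 125e6    # predicted 125 Mb genome
n_reads = 1_262_177    # reads retained for assembly

coverage = fold_coverage(total_bases, genome_size)
mean_read_length = total_bases / n_reads

print(f"coverage: {coverage:.1f}X, mean read length: {mean_read_length:.0f} bases")
```

The implied mean trimmed read length of roughly 990 bases is derived here from the two quoted totals; it is not a figure stated in the text.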
The data was assembled using the ARACHNE assembler (16). The 4484 contigs (N50 = 62 kb) were
assembled (%ID >98%) in 398 supercontigs (N50= 637 kb) corresponding to 124.946 Mb of sequence (%GC=
52.02). Based on the number of alignments per read, the main genome scaffolds were at a depth of 10. The
largest supercontig has a size of 2.785 Mb. The T. melanosporum genome is the largest fungal genome
published so far [see the Broad Institute’s Fungal Genome Initiative (FGI),
http://www.broad.mit.edu/annotation/fungi/fgi/ and JGI (http://www.jgi.doe.gov/) web sites]. No evidence of
whole genome duplication or large-scale dispersed segmental duplication was detected.
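For readers unfamiliar with the N50 statistic used for the contigs and supercontigs above, the following minimal sketch shows the usual definition: the length L such that sequences of length L or longer contain at least half of all assembled bases. The function name and the toy lengths are illustrative; ARACHNE reports this value itself.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("n50() requires at least one sequence length")


# Toy example (kb), not the truffle assembly.
toy_lengths = [80, 70, 50, 40, 30, 20]
print(n50(toy_lengths))
```

In the toy example the two longest sequences (80 + 70 = 150) already cover half of the 290 total, so the N50 is the length of the second one.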
The 20 largest supercontigs (between 1 and 2.7 Mb) presumably correspond to chromosome arms or even
entire chromosomes.
Assemblies and annotations are available at INRA (http://mycor.nancy.inra.fr/IMGC/TuberGenome/) and
Genoscope (https://www.genoscope.cns.fr/secure-nda/Tuber/html/entry_ggb.html). Genome assemblies
together with predicted gene models and annotations were deposited at DNA Data Bank of Japan/European
Molecular Biology Laboratory/GenBank under the project accession numbers CABJ01000001-CABJ01004455
(WGS data) and FN429986-FN430383 (scaffolds and annotations).
2.2. Telomeric repeats
Putative telomeric repeats were identified from repetitive sequences found at the ends of ARACHNE
supercontigs. The consensus telomeric repeat, T2AC3, was used as a query sequence in a BLASTN search
(with dust filter disabled, E-value < 1e-10) against the assembled genome; 17 supercontigs have a telomere on
one end; most of them are associated with a LINE transposable element. This suggests that there are at least
eight chromosomes in the haploid T. melanosporum genome, in agreement with the karyotyping (17).
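As a rough illustration of how terminal telomeric arrays can be flagged, the sketch below scans both ends of a sequence for tandem copies of the T2AC3 motif (written out as TTACCC) or its reverse complement. It is a simple regular-expression stand-in, not the BLASTN search described above. The minimum copy number, the window size, and the function names are assumptions.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def telomeric_ends(seq, motif="TTACCC", min_copies=3, window=200):
    """Report whether each end of seq carries a tandem array of the motif.

    Either strand orientation is accepted at either end (assumption).
    """
    rc = reverse_complement(motif)
    pattern = re.compile(
        "(?:%s){%d,}|(?:%s){%d,}" % (motif, min_copies, rc, min_copies)
    )
    seq = seq.upper()
    return {
        "start": bool(pattern.search(seq[:window])),
        "end": bool(pattern.search(seq[-window:])),
    }


# Synthetic sequences, not truffle supercontigs.
filler = "ACGTTGCA" * 50
print(telomeric_ends("TTACCC" * 5 + filler))
print(telomeric_ends(filler + "GGGTAA" * 4))
```

An exact-match pattern like this misses degenerate repeats, which is one reason an alignment-based search such as BLASTN is the more robust choice on real assemblies.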
2.3. Completeness of the assembly
A low fraction of shotgun reads (i.e., 19214) was not assembled in the ARACHNE assembly suggesting that
the assembled regions did capture the vast majority of protein-coding genes in T. melanosporum. This was
checked by aligning 88829 Sanger and 113855 ‘454’ ESTs to the assembly using a two-step strategy. As a
first step, BLAST served to generate the alignments between the repeat-masked EST sequences and the
genomic sequence using the following settings: W = 20, X = 8, match score = 5, mismatch score = -4. The
sum of scores of the high-scoring pairs was then calculated for each possible location, then the location with
the highest score was retained if the sum of scores was more than 1,000. Once the location of the transcript
sequence was determined, the corresponding genomic region was extended by 5 kb on either side. Transcript
sequences were then realigned on the extended region using EST_GENOME (mismatch 2, gap penalty 3)
(http://www.well.ox.ac.uk/~rmott/ESTGENOME/est_genome.shtml) to define transcript exons. These transcript
models were fused by a single linkage clustering approach, in which transcripts from the same genomic region
sharing at least 100 bp are merged. The assembled genome sequence provides near-complete coverage of
genes, since 98.5% of T. melanosporum cDNAs (section 2.4) could be aligned to the assembly, covering 14.8
Mb (12%) of it.
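Two steps of the mapping strategy described above lend themselves to a short sketch: choosing the genomic location with the highest summed HSP score (retained only above 1,000), and fusing transcript models by single-linkage clustering when they share at least 100 bp. The code is an illustration under stated assumptions, not the Genoscope implementation. Locations are opaque labels, each transcript model is reduced to a single (scaffold, start, end) span rather than its exon set, and all names are hypothetical.

```python
from collections import defaultdict


def best_location(hsps, min_total=1000):
    """hsps: iterable of (location, score). Return the location whose summed
    score is highest, or None if that sum does not exceed min_total."""
    totals = defaultdict(int)
    for location, score in hsps:
        totals[location] += score
    if not totals:
        return None
    location, total = max(totals.items(), key=lambda kv: kv[1])
    return location if total > min_total else None


def shared_bp(a, b):
    """Overlap in bp between two half-open intervals (start, end)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def single_linkage(models, min_shared=100):
    """Cluster (scaffold, start, end) models; two models are linked when they
    lie on the same scaffold and share >= min_shared bp. Returns index lists."""
    parent = list(range(len(models)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(models)):
        for j in range(i + 1, len(models)):
            same_scaffold = models[i][0] == models[j][0]
            if same_scaffold and shared_bp(models[i][1:], models[j][1:]) >= min_shared:
                parent[find(i)] = find(j)

    clusters = defaultdict(list)
    for i in range(len(models)):
        clusters[find(i)].append(i)
    return sorted(clusters.values())


# Invented coordinates for illustration.
models = [("sc1", 0, 500), ("sc1", 350, 900), ("sc1", 780, 1200),
          ("sc2", 0, 500), ("sc1", 1150, 1300)]
print(single_linkage(models))
print(best_location([("sc1:1000", 600), ("sc1:1000", 500), ("sc2:5000", 900)]))
```

Single linkage is transitive: the first and third models share no bases, yet both overlap the second by at least 100 bp and therefore end up in the same cluster.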
2.4. cDNA libraries and EST clustering
The cDNA libraries were constructed from free-living mycelium (FLM) and fruiting body (FB) at the Genoscope. The
library FLM was obtained from the homokaryotic strain Mel28 used for genome sequencing. Gleba (sterile
mycelium) of a fruit body harvested in Auvergne was used for the library FB. Harvested mycelia were frozen in
liquid nitrogen and stored at -80° C prior to RNA extraction. Poly-A+ RNA was used to make cDNA using the
CloneMiner cDNA library construction Kit (pDONR™222 vector, Invitrogen Life Technologies) following the
supplier’s instructions. Paired-end sequencing of cDNA clones was performed at the Genoscope using
conventional Sanger sequencing technology (ABI3730xl DNA analyzers, Applied Biosystems). Base calling of
the 104549 5’- and 3’-cDNA sequences was carried out using Phred. Leading and trailing vector, and
polylinker sequences were removed by Seqclean filters. Groups of sequences were assembled into clusters
using Cap3 and parsed using dedicated Perl scripts. The 92371 edited Sanger ESTs will be available at the
National Center for Biotechnology Information (NCBI) dbEST (accession numbers FP383504-FP458874, FP458876-FP475875).
Pyrosequencing was carried out on cDNAs from 200 µg of total RNA extracted from a fruiting body (sample
#20044802) collected in October 2007 (by Henri Dessolas in St Front d’Alemps, Dordogne) under an oak tree.
PolyA+ RNA were purified on Oligotex (Qiagen) according to the manufacturer’s instructions. cDNAs were
synthesized using the SMART cDNA synthesis kit (Clontech) according to the manufacturer’s instructions and
purified on a QIAquick PCR purification column (Qiagen). Adapter ligation, nebulization and DNA sequencing
were performed by COGENICS (Meylan, France). A half-plate pyrosequencing run on the Genome Sequencer
FLX 454 System (454 Life Sciences/Roche Applied Biosystems, Nutley, New Jersey, USA) resulted in 164904
reads; 136640 reads satisfied the length and sequence quality criteria and were assembled using
Newbler; 5641 TCs mapped to the genome assembly or the gene models.
2.5. Detection of single nucleotide polymorphisms in ESTs
To detect SNPs in the cDNA pools, we used the gene models as a reference sequence to which individual
ESTs were aligned. We used three EST datasets:
– 44361 Sanger ESTs from the free-living homokaryotic mycelium of the isolate Mel28 from Provence
(used for genome sequencing) (= Tm_FLM_AB).
– 44468 Sanger ESTs from the heterokaryotic fruiting body, so-called ‘Auvergne_Chevalier’
(= Tm_FB_CD).
– 136640 ‘454’ ESTs from the heterokaryotic fruiting body, so-called ‘Dordogne_Dessolas’
(= Tm_FB_454).
Sequence alignments were carried out with BLASTN and parsed with custom-made scripts based on Bioperl.
Each read was aligned to only a single best homologous site in the reference genomic sequence. Reads
aligning equally well in more than one location in the genome were discarded. For the analysis reported here,
we considered only single nucleotide polymorphisms (SNPs), excluding all indels and variants involving more
than one nucleotide. We also imposed the constraint that a nucleotide position must be covered by at least
two ESTs in a given isolate. The number of sites meeting these criteria for the different isolates was as
follows: Tm_FLM_AB, 1164268 sites; Tm_FB_CD, 1221914 sites and Tm_FB_454, 526759 sites. A site was
considered to be polymorphic if (i) the substitution was confirmed by at least two different ESTs in a given
isolate; and (ii) the frequency of the mutation in the isolate was ≥40%.
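The site-level filters can be summarized in a short sketch. It is illustrative only and is not the Bioperl code used for the analysis: the function name and arguments are hypothetical, and the mutation frequency is assumed to be the fraction of all ESTs covering the site that carry the variant base.

```python
from collections import Counter


def call_snp(ref_base, est_bases, min_cov=2, min_support=2, min_freq=0.40):
    """Return the variant base if a site passes the filters, else None.

    est_bases: base observed at this position in each EST of one isolate.
    Assumption: frequency = variant-supporting ESTs / all covering ESTs.
    """
    if len(est_bases) < min_cov:  # position must be covered by >= 2 ESTs
        return None
    variants = Counter(b for b in est_bases if b != ref_base)
    if not variants:
        return None
    variant, support = variants.most_common(1)[0]
    frequency = support / len(est_bases)
    if support >= min_support and frequency >= min_freq:
        return variant
    return None
```

For example, a site covered by three ESTs reading A, G and G against a reference A would be called (two supporting ESTs, frequency 0.67), whereas two G reads among six ESTs (frequency 0.33) would not.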
Protein-coding sequences for each consensus were delimited using the T. melanosporum peptides translated
from gene models. Next, we determined whether SNPs positioned in consensus coding sequences introduced
synonymous or nonsynonymous mutations by comparing the translated amino acids from the reference and
variant sequences. By considering the proportion of silent (~25%) and replacement (~75%) sites, we
calculated the rate of synonymous substitution per silent sites (S), and the rate of nonsynonymous
substitutions per replacement sites (R). For the ‘Tm_FLM_AB’ ESTs, 292803 synonymous and 878410
nonsynonymous sites on a total of 1171213 sites were found; 7 synonymous and 14 nonsynonymous
substitutions were identified (S = 2.4e-05, R = 1.6e-05; R/S = 0.67). For the ‘Tm_FB_454’ ESTs, 133634
synonymous and 400902 nonsynonymous sites on a total of 534536 sites were found; 95 synonymous and
153 nonsynonymous substitutions were identified (S = 7.1e-04, R = 3.8e-04; R/S = 0.54). For the ‘Tm_FB_CD’
ESTs, 307272 synonymous and 921818 nonsynonymous sites on a total of 1229090 sites were found; 66
synonymous and 85 nonsynonymous substitutions were identified (S = 2.1e-04, R = 9.2e-05; R/S = 0.43).
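The rates reported above follow directly from the site and substitution counts. The snippet below simply recomputes S, R and R/S from the numbers given in the text; the dictionary keys are names chosen for this illustration.

```python
# Counts taken from the text; key names are chosen for this illustration.
datasets = {
    "Tm_FLM_AB": dict(syn_sites=292803, nonsyn_sites=878410, syn_subs=7, nonsyn_subs=14),
    "Tm_FB_454": dict(syn_sites=133634, nonsyn_sites=400902, syn_subs=95, nonsyn_subs=153),
    "Tm_FB_CD": dict(syn_sites=307272, nonsyn_sites=921818, syn_subs=66, nonsyn_subs=85),
}


def rates(d):
    """Return (S, R, R/S) for one dataset."""
    s = d["syn_subs"] / d["syn_sites"]        # synonymous substitutions per silent site
    r = d["nonsyn_subs"] / d["nonsyn_sites"]  # nonsynonymous substitutions per replacement site
    return s, r, r / s


for name, d in datasets.items():
    s, r, ratio = rates(d)
    print(f"{name}: S = {s:.1e}, R = {r:.1e}, R/S = {ratio:.2f}")
```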
It follows from the above calculations that the level of neutral polymorphism (S) in T. melanosporum is in the
1e-4 range, two orders of magnitude lower than that observed in N. crassa (S = ~2e-2) and S. cerevisiae (S =
~4e-2) (18). The parameter S can be used as an estimator of the composite parameter Nu (where N is the
long-term effective population size and u is the mutation rate per nucleotide per generation). Given the relation
S = 2Nu and assuming a mutation rate of the order of 1e-8 to 1e-9, the long-term effective population size for T.
melanosporum would be between 10000 and 350000 individuals.
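The population-size range can be recovered by solving S = 2Nu for N. The sketch below assumes that the bounds were obtained from the two fruiting-body estimates of S (2.1e-4 and 7.1e-4) combined with the two ends of the mutation-rate range; the text does not state which S values were used, so this pairing is an assumption.

```python
def effective_population_size(s, u):
    """Solve S = 2 * N * u for N."""
    return s / (2.0 * u)


# Assumed pairing of S values and mutation rates (not stated in the text).
low = effective_population_size(2.1e-4, 1e-8)   # about 10,500
high = effective_population_size(7.1e-4, 1e-9)  # about 355,000
```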
3. Transposable elements
Overall, the T. melanosporum genome contains an unusual, strikingly rich and diverse population of
transposable elements (TEs) (Fig. S3). These TEs were predicted de novo using the REPET pipeline
(19). The TEdenovo pipeline was used to detect TEs, group them into families and classify the consensus
of each family. The consensus sequences were obtained with PILER (20), RECON (21) and BLASTER (19)
clustering methods. The TEannot pipeline annotated TEs in the genome using the consensus library obtained
as output of TEdenovo. Using the 2515 consensus sequences coming from the TEdenovo pipeline, TEannot
masked 71.32 Mb corresponding to 57.72 % of the T. melanosporum genome (Fig. S3).
Although previously identified major TE superfamilies found in other fungi were found in T. melanosporum, 728
out of the 2515 TE consensus sequences (12.97 % of the genome) were specific to T. melanosporum (Fig.
S3). The most abundant TEs are Class 1 Gypsy/Ty3-like elements, which represent 29.51% of the genome
(Fig. S4). To identify full-length LTR retrotransposons, a de novo search was also performed with
LTR_STRUC (22). The program yielded 304 full-length candidate LTR retrotransposon sequences, which
were checked for homology using BLASTN against the consensus sequences from the
REPET pipeline. Amongst the 304 putative full-length LTR retrotransposons, 271 were attributed to Gypsy/Ty3-like
elements and 13 to Copia/Ty1-like elements. The other 20 elements displayed sequence identity with several
different TE families, indicating that they did not correspond to a single repeat element.
Comparison of the T. melanosporum genome with other fungal genomes reveals an unusual genome
organization, comprised of blocks of protein-coding genes in which gene density is relatively high and repeat
content (e.g. LTR-Rs) is relatively low, separated by regions in which gene density is low and repeat content is
high (Fig. 1). Recent proliferation of Gypsy elements underlies the genome expansion. The insertion age of
full-length LTR retrotransposons (Fig. S5) was determined from the evolutionary distance between the 5’- and 3’-solo LTRs, derived
from a ClustalW alignment of the two LTR sequences using the Kimura correction in ClustalW. For the
conversion of the sequence distance to putative insertion age, a substitution rate of 1.3e-8 mutations per site
per year was used (23). Most full-length Gypsy/Ty3-like elements were inserted in the T. melanosporum
genome 2 to 3 million years ago (Fig. S5). LINE I elements are the second most frequent TE family, corresponding to
5.68 % of T. melanosporum genome. Amongst the class II elements, the Tc1/Mariner are the most abundant
with 258 consensus sequences corresponding to 4.20 % of T. melanosporum genome (Fig. S3 & S4).
Consistent with a model of repeat-driven expansion of the T. melanosporum genome, the majority of TEs in
the genome are highly similar to their consensus sequences, indicating a high rate of recent transposon
activity (Fig. S5). In addition, we have observed and experimentally confirmed examples of active elements
(>1000) by transcript profiling using the NimbleGen oligoarrays (section 9).
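The conversion of LTR divergence into an insertion age, described above, can be illustrated as follows. This is a sketch under stated assumptions and does not reproduce the ClustalW computation: the distance is taken to be the Kimura two-parameter distance, and the age is obtained with the conventional relation T = K / (2r), which the text does not state explicitly.

```python
import math

PURINES = {"A", "G"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned sequences.
    Columns containing gaps or ambiguous bases are skipped."""
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            transitions += 1    # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable positions in the alignment")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def insertion_age(distance, rate=1.3e-8):
    """Insertion age in years, assuming T = K / (2 * r)."""
    return distance / (2 * rate)
```

Under these assumptions, an insertion age of 2 to 3 million years corresponds to a divergence between the two LTRs of roughly 0.05 to 0.08 substitutions per site.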
4. Gene prediction and annotation
Most of the genome comparisons were performed with repeat masked sequences. For this purpose, we
searched and masked sequentially several kinds of repeats: known T. melanosporum TEs (see SOM section
3), repeats and transposons available in Repbase (http://www.girinst.org/repbase/update/index.html) with the
RepeatMasker program (24) (http://www.repeatmasker.org/), and tandem repeats with the TRF program (25).
The UniProt (26) database was used to detect well conserved genes between T. melanosporum and other
species. As GeneWise (27) is computationally expensive, the UniProt database was first aligned with the T. melanosporum
genome assembly using BLAT (with parameters minIdentity = 0). High-scoring segment pairs (HSPs)
were then filtered on their score and length. HSPs from the same protein were clustered by genomic
position to assign one (or several) loci to each peptide. For a given locus, the five best matches were chosen
for a GeneWise alignment. The ab initio gene prediction programs GeneID (28) and SNAP (Semi-HMM-based
Nucleic Acid Parser) (29) were trained on 250 protein-coding genes that had been manually annotated using
cDNA sequences and reviewed by the Consortium experts.
All the resources described here were used to automatically build T. melanosporum gene models using GAZE
(30) (http://www.sanger.ac.uk/Software/analysis/GAZE). Individual predictions from each of the programs
(GeneID, SNAP, GeneWise and est2genome) were broken down into segments (coding, intron, intergenic) and
signals (start codon, stop codon, splice acceptor, splice donor, transcript start, transcript stop). Exons
predicted by the ab initio programs, GeneWise and est2genome were used as coding segments. Introns
predicted by GeneWise and est2genome were used as intron segments. Intergenic segments were created from
the span of each mRNA, with a negative score (coercing GAZE not to split genes). Predicted repeats were
used as intron segments, and non-coding RNAs as intergenic segments, to avoid the prediction of protein-coding
genes in such regions. The whole genome was scanned to find signals (splice sites, start and stop codons),
and two signals, transcript START and STOP, were extracted from the ends of mRNAs. Each segment
extracted from the output of a program that predicts exon boundaries (such as GeneWise, est2genome or the ab initio
predictors) was used by GAZE only if GAZE chose the same boundaries. Each segment or signal from a
given program was given a value reflecting our confidence in the data, and these values were used as scores
for the arcs of the GAZE automaton. All signals were given a fixed score, but segment scores were context
sensitive: coding segment scores were linked to the percentage identity (%ID) of the alignment; intronic
segment scores were linked to the %ID of the flanking exons. The impact of each data source (GeneID, etc.)
was evaluated on a reference sequence, and a weight was assigned to each resource to further reflect its
reliability and accuracy in predicting gene models. This weight acts as a multiplier for the score of each
information source, before processing by GAZE. When applied to the entire assembled sequence, GAZE
predicted 7496 gene models; 1309 of these were manually curated and, where needed, revised.
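The way a per-resource weight multiplies a context-sensitive segment score can be illustrated with a toy function. Everything numerical here is hypothetical: the weights actually assigned are not given in the text, the linear scaling by percentage identity is only one possible reading of "linked to", and the function is not part of GAZE.

```python
# Hypothetical per-resource weights; the values actually used are not given in the text.
WEIGHTS = {"GeneWise": 1.0, "est2genome": 1.5, "GeneID": 0.5, "SNAP": 0.5}


def segment_score(source, base_score, percent_identity=None):
    """Weighted score for one segment (illustrative only).

    base_score: confidence value assigned to this segment type for this source.
    percent_identity: %ID of the alignment (coding segment) or of the flanking
        exons (intron segment); None for predictions without an alignment.
    Assumption: the score scales linearly with %ID.
    """
    context = 1.0 if percent_identity is None else percent_identity / 100.0
    return WEIGHTS[source] * base_score * context
```

With these made-up values, an est2genome coding segment with a base score of 10 and 98% identity would enter the automaton with a score of 14.7, and an ab initio exon with the same base score would enter with 5.0.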
Protein domains were predicted using InterProScan (31) against various domain libraries (Prints, Prosite,
Pfam, ProDom & SMART) (http://www.ebi.ac.uk/interpro/). Gene models were also assigned Gene Ontology
(GO) (32) (http://www.geneontology.org/) terms, eukaryotic clusters of orthologous groups (KOG) (33) and Kyoto
Encyclopedia of Genes and Genomes (KEGG) (34) (http://www.genome.jp/kegg/) entries by homology
search against the corresponding databases, and EC numbers using PRIAM
(http://priam.prabi.fr/REL_JUL06/index_jul06.html).
The reference metabolic pathways included in the KEGG database were deduced from the EC numbers.
The 7496 nuclear gene models that were automatically predicted, annotated and promoted to a ‘Reference’
set, including 1309 models manually annotated at community “jamborees” are available at the Genoscope
Genome Portal for T. melanosporum (https://www.genoscope.cns.fr/secure-nda/Tuber/html/entry_ggb.html)
and at the INRA TuberDB (http://mycor.nancy.inra.fr/IMGC/TuberGenome/). As in other fungi, the
majority (88%) of genes showed a multi-exon structure, with an average of 4.5 exons per gene. The average
gene length was 2073 bp and the average predicted protein length was 439 amino acids (table S1). A broad
distribution of all exon lengths peaked at around 340 nucleotides, whereas for introns the peak occurred at
around 107 nucleotides. This average intron length was consistent with the trend observed in other fungi. We
assigned GO terms to 3646 (48.6%) T. melanosporum proteins including 3157, 2507 and 1499 genes with
molecular function, cellular component and biological process, respectively. We also assigned 5146 (68%)
proteins to KOG clusters. Analysis of gene density (Fig. S7) was performed by counting
genes in a sliding window (100 kb for T. melanosporum and 30 kb for other fungi) and plotting the binned counts. Up to
241 fragments (<100 AA) of protein-coding exons, showing a significant similarity (BLASTX, cut-off e-value <
1e-5) to known proteins in the NCBI non-redundant database, are located in TE-rich regions, suggesting that TE
activity played a key role in pseudogene generation.
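The gene-density computation can be sketched as a simple window count. This is an illustration only: genes are represented by hypothetical start coordinates, and the step size between windows (not given in the text) defaults to the window size, which yields non-overlapping bins.

```python
def gene_density(gene_starts, seq_length, window=100_000, step=None):
    """Count genes per window along one sequence.

    gene_starts: start coordinate of each gene (assumption: one point per gene).
    step: distance between window starts; defaults to `window` (no overlap).
    Returns a list of (window_start, window_end, gene_count).
    """
    step = step or window
    counts = []
    for start in range(0, seq_length, step):
        end = min(start + window, seq_length)
        n = sum(start <= g < end for g in gene_starts)
        counts.append((start, end, n))
    return counts


# Hypothetical gene starts on a 300 kb supercontig.
density = gene_density([10, 50_000, 150_000, 250_000, 260_000], 300_000)
```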
5. Orthology, synteny, tandem repeats and multigene families
5.1. Orthology
Up to 5990 of the predicted T. melanosporum proteins (80%) showed a significant similarity (BLASTX, cut-off e-value < 1e-5) (35) to known proteins in the NCBI non-redundant database (May 2009). Most putative homologs
(best reciprocal hits, BRH) were found in the Pezizomycotina. Amongst the predicted proteins, 3970,
5596 and 5644 showed significant sequence similarity to proteins from Saccharomyces cerevisiae,
Neurospora crassa and Aspergillus niger, respectively (42 to 48% mean sequence similarity) (Fig. S8). The
number of T. melanosporum genes showing a sequence similarity with N. crassa proteins ranged from 4667 to
5124 using a BLASTP cut-off e-value of 1e-10 or 1e-1, respectively. This agrees with the predicted
ancient separation (>450 Myr ago) of the Pezizomycetes from the other ancestral fungal lineages (15). The
time of divergence between the modern major families of Ascomycetes occurred approximately 450 Mya (Fig.
S2). A substantial fraction (20%) of predicted genes in T. melanosporum are found to lack sequence similarity
to any of the genes in public databases. The origin of these species-specific genes or orphan genes is poorly
understood. An enrichment of orphan genes has been found at subtelomeric regions in Aspergillus species
(36) and N. crassa (37). We therefore mapped the localization of T. melanosporum orphan genes on the
largest supercontigs of the assembly (>0.5 Mbp) (Fig. S14). The orphan genes were evenly distributed along
the supercontigs in most protein-rich regions of the genome.
5.2. Computation of blocks of conserved gene order amongst Ascomycete genomes
Regions of conserved collinear gene order (syntenic regions) between T. melanosporum and Aspergillus
nidulans, A. fumigatus, A. oryzae, A. terreus, Chaetomium globosum, Coccidioides immitis, Fusarium
graminearum, F. verticillioides, Magnaporthe grisea, Neurospora crassa, Podospora anserina, Saccharomyces
cerevisiae, S. kluyveri, Trichoderma reesei, and Yarrowia lipolytica were computed using FISH (38) based on
BLASTP matches with an E-value cutoff of 1e-5. Input files for FISH were produced using custom
Perl code, and FISH was run with default parameters, requiring each block to contain at least four
anchors. Of the ~5600 T. melanosporum genes that have an ortholog in other fungi, very few show
conservation of neighboring orthologs (synteny) in at least one of the other species (Fig. S9).
The T. melanosporum genome therefore shows a structural organization strikingly different from that of other sequenced
ascomycetes; the largest syntenic region (with Coccidioides immitis) only contains 99 genes with 39 orthologs
(Fig. S10).
5.3. Identification of tandem duplications, segmental duplications and duplications located anywhere in the
genome
Three different categories of gene duplicates were identified in the ascomycete genomes: tandem duplicates, when gene
duplicates sit next to each other; segmental duplicates, when gene duplicates are part of a longer fragment
of the genome duplicated as a whole; and the rest, where duplicates occur anywhere in the genome. Tandem
duplicates (>80% nucleotide identity over their whole length) were identified by BLASTP (35)
comparison of adjacent genes using an E-value cutoff of 1e-5 and a sliding window of 10 kbp. Segmentally duplicated genes were
identified through the synteny analysis as described in Section 5.2. Duplicated gene-pairs located anywhere in
the genome were identified by the BLASTCLUST program (35) with bidirectional length coverage of 0.9. Two
sequence identity cutoffs were used; 80% and 90% identity. The two datasets were subsequently divided into
gene pairs having <80% identity (low similarity) and >80% identity (high similarity). One of the most
striking characteristics of the T. melanosporum genome sequence is the almost complete absence of highly
similar gene pairs (Fig. S11A & S12). Of the predicted 7496 protein-coding genes, only 11 pairs share >80%
amino-acid identities in their coding sequence. This includes unlinked pairs of adenylosuccinate lyase C
(GSTUMT00000346001/ GSTUMT00000611001), amino acid permease (GSTUMT00001333001/
GSTUMT00008162001), ubiquitin (GSTUMT00001535001/ GSTUMT00005111001), Hsp70
(GSTUMT00011349001/ GSTUMT00002222001), Major Facilitator Superfamily protein
(GSTUMT00005575001/ GSTUMT00004129001), acyl-CoA dehydrogenase (GSTUMT00006563001/
GSTUMT00008938001) and cytochrome P450 (GSTUMT00000763001/ GSTUMT00009894001) genes. At
the protein level, seven pairs share >90% amino-acid identities in their coding sequence, whereas 30 pairs (60
genes) share >80% amino-acid identities (Fig. S11A). Only three tandem duplications were found coding for
an aldolase, a DEAD domain-containing protein and an orphan protein.
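The tandem-duplicate criterion can be sketched as a filter over neighbouring genes. This is a simplified illustration and not the procedure actually run: pairwise identities are assumed to come from a prior BLAST search and are supplied as a dictionary, the gene identifiers are hypothetical, and the 10 kbp window is interpreted as the maximum gap between two adjacent genes.

```python
def tandem_duplicates(genes, identity, window=10_000, min_identity=80.0):
    """Return pairs of adjacent genes that look like tandem duplicates.

    genes: list of (gene_id, start, end) on one supercontig.
    identity: dict mapping frozenset({id_a, id_b}) -> percent identity,
        assumed to be precomputed by a BLAST search.
    Assumption: 'sliding window of 10 kbp' = gap between neighbours <= window.
    """
    ordered = sorted(genes, key=lambda g: g[1])
    pairs = []
    for (id_a, _, end_a), (id_b, start_b, _) in zip(ordered, ordered[1:]):
        close = start_b - end_a <= window
        similar = identity.get(frozenset((id_a, id_b)), 0.0) > min_identity
        if close and similar:
            pairs.append((id_a, id_b))
    return pairs


# Hypothetical genes and identities, for illustration only.
toy_genes = [("g1", 0, 1500), ("g2", 3000, 4500), ("g3", 50_000, 51_000)]
toy_identity = {frozenset(("g1", "g2")): 92.0, frozenset(("g2", "g3")): 95.0}
toy_pairs = tandem_duplicates(toy_genes, toy_identity)
```

Here only the g1/g2 pair is retained: g2 and g3 are highly similar but lie more than 10 kbp apart.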
5.4. Multigene families and evolutionary analysis of multigene families (CAFE analysis)
The species choice aimed to maximize the phylogenetic coverage with similar lineage radiation times in
Sordariomycetes, Dothideomycetes, Leotiomycetes, and Eurotiomycetes. Protein sets from Stagonospora
nodorum (Phaeosphaeria nodorum) (16597 models), Botrytis cinerea (Botryotinia fuckeliana) (16389 models),
Sclerotinia sclerotiorum (14522 models), Fusarium graminearum (Gibberella zeae) (15707 models),
Neurospora crassa (9823 models), Magnaporthe grisea (12832 models), Aspergillus fumigatus (9631 models),
and Laccaria bicolor (20614 models) were retrieved from the NCBI Genome Database, the Joint Genome
Institute Portal or the Broad Institute Portal. Multigene families were generated from proteins in
T. melanosporum, representative Pezizomycotina phyla and L. bicolor (128941 predicted proteins) using Tribe-MCL tools (39) with default settings (BLASTP, cut-off e-value < 1e-5, inflation parameter = 3). In total, 11234
protein families (containing at least two sequences) were identified within the Ascomycetes (Fig. S11B).
Within these 11234 ascomycetous (Tribe-MCL) protein families, 4056 were found in T. melanosporum.
Amongst the 7496 T. melanosporum genes, 1441 (19%) were found in (Tribe-MCL) gene families (≥ 2
members); a value very similar to N. crassa (18%) and much lower than the ectomycorrhizal basidiomycete
Laccaria bicolor (55%). The percentage of proteins found in protein families was not related to genome size
and was lowest in T. melanosporum (Fig. S12). This was mainly due to the lower size of protein families (Fig.
S11), but also to the lower number of protein families in T. melanosporum as compared to the other
ascomycetes. The most abundant gene families encode proteins with NB-ARC, protein kinase,
helicase, AAA ATPase or WD40 domains (table S6).
Multigene families were analysed for evolutionary changes in protein family size using the CAFE program (40)
(Fig. S11). The program makes phylogenetic inferences on changes in family size based on the topology and
divergence times of a user-defined linearised tree using a maximum likelihood estimation to model the
random birth and death process of genes in each family. CAFE calculates for each branch in the tree whether
a protein family has not changed, is expanded, is contracted or has even gone extinct. Gene families that are
lineage specific (i.e. unique to one species) were analysed separately since CAFE makes the assumption that
at least one member of each family existed at the root. In addition to the classification, a p-value (0.001) for
each branch is reported for each family to assess the significance compared to a random dataset using a
Monte Carlo resampling procedure of gene gains and losses. This gives an indication whether the changes in
the family size are indications of adaptive expansions or contractions. A linearized phylogenetic tree was
constructed with estimates of the divergence times between T. melanosporum and other ascomycetes
(Magnaporthe grisea 70-15 (GenBank accession # AACU00000000), Nectria haematococca/Fusarium solani
(http://genome.jgi-psf.org/Necha2/Necha2.home.html), N. crassa OR74A (#AABX00000000), Botrytis cinerea/
Botryotinia fuckeliana B05.10 (#AAID00000000), Sclerotinia sclerotiorum 1980 (#AAGT00000000),
Stagonospora nodorum/Phaeosphaeria nodorum SN15 (#AAGI00000000) and Aspergillus nidulans
(#AACD00000000)). The unique protein families were excluded from the analysis.
In T. melanosporum, 269 families were expanded, 5270 showed no change and 5695 families had undergone
contraction by comparison with a putative common Pezizomycotina ancestor having 11234 gene families.
Comparing the counts of protein families on all branches of the Pezizomycotina tree, the largest average
contraction in protein family size had occurred along the T. melanosporum lineage (Fig. S11B) with an average
expansion rate of -0.518. Tables S7 and S8 present the most contracted and most expanded gene families in the
T. melanosporum genome, respectively. The putative functions of these families were revealed by homology
searches using the PFAM tools and database (41). The relative abundance of the various protein domains was
then visualized by hierarchical clustering of PFAM domain abundances after
transformation of the frequency values into z-scores (Fig. S13). Dramatic gene family contraction occurred in
those genes predicted to have roles in metabolite transport (e.g., MFS, amino acid & sugar permeases),
secondary metabolism (e.g., polyketide synthases, cytochrome P450), and carbohydrate-active enzymes (see
below) (table S7, Fig. S13). On the other hand, several gene families showed a significant expansion, such as
those coding for proteins containing the TPR, NB-ARC and protein kinase domains (table S8, Fig. S13).
Overall, T. melanosporum contains only 58 species-specific gene families (most of which comprise two
members) (table S9), but a fairly large number of single-copy orphan genes (ca. 1356) scattered in gene-rich
regions (Fig. S14). Only 14 gene families are unique to L. bicolor and T. melanosporum (table S10), but
none of their members is specifically overexpressed in symbiosis.
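The z-score transformation applied to PFAM domain frequencies before hierarchical clustering can be illustrated as follows. This is a minimal sketch: the counts are invented, and the use of the population standard deviation is an assumption, since the text does not specify which estimator was used.

```python
import statistics


def z_scores(counts):
    """Transform the counts of one domain across genomes into z-scores.
    Assumption: population standard deviation (statistics.pstdev)."""
    mean = statistics.mean(counts)
    sd = statistics.pstdev(counts)
    if sd == 0:
        return [0.0 for _ in counts]  # domain equally abundant in all genomes
    return [(c - mean) / sd for c in counts]


# Invented counts of one domain in eight genomes (mean 5, standard deviation 2).
example_z = z_scores([2, 4, 4, 4, 5, 5, 7, 9])
```

Negative values mark genomes in which the domain is under-represented relative to the other species, which is how a contracted family would appear in such a heat map.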
6. Targeted annotation of specific gene categories
Gene categories corresponding to proteins playing a role in fungal development, symbiosis and fruiting
body formation, such as genome defense-related proteins, sex genes, carbohydrate degrading enzymes, and
transporters, were targeted for manual review, phylogenetic analysis and gene model editing using ARTEMIS
(http://www.sanger.ac.uk/Software/Artemis/). Evidence for identifying genes and editing exon boundaries was
derived from protein and EST alignments, with focus provided by related proteins of known function.
6.1. RNA silencing and DNA methylation
In filamentous fungi, global genome defense relies on RNA interference (RNAi), also known as RNA-mediated
gene silencing, and on DNA methylation. Homologs of genes involved in both processes were identified in the
T. melanosporum genome (table S12). No orthologs of the annotated genes, except for one putative helicase
and the PP1 phosphatase, are present in S. cerevisiae. Some deviations in gene numbers, mainly regarding
RNA silencing components (e.g., the two paralogous Argonaute proteins TmelAGO2 and TmelAGO3), were
observed by comparison with an expanded set of reference ascomycetes (table S13). All but two siRNA/DNA
methyltransferase (DMT)-related genes were covered by ESTs and all of them gave above background
hybridization signals in at least one of the three life-cycle stages that were subjected to transcriptome analysis.
Gene numbers for core RNA silencing components in T. melanosporum are very close to those of N. crassa
and identical to those of M. grisea (table S13). Most of them are basal to groups of proteins containing
functionally identified RNAi components from Neurospora and other fungi (e.g., M. grisea) in which siRNA-mediated post-transcriptional gene silencing (PTGS) has been documented. However, except for the RNA-dependent RNA polymerase TmelRRPc, no putative component of the RNAi machinery of T. melanosporum
co-clusters with known PTGS (e.g., “quelling”; 42) components from N. crassa. Similar considerations hold for
meiotic silencing by unpaired DNA (MSUD; 43), a component of which – the SAD-1 RRP chaperone SAD-2 –
is missing in T. melanosporum (Fig. S6). Thus, if there are quelling- or MSUD-like processes in T.
melanosporum they are likely to be quite different from those operating in N. crassa. Interestingly, one of the
two Argonaute paralogs (TmelAGO3) clusters with the AGO protein of Schizosaccharomyces pombe (Fig. S6),
a fungus in which siRNA-mediated transcriptional gene silencing (RITS) has been described (44). The latter
process involves siRNA- and AGO-directed deposition of histone H3 K9 methylation marks on centromeric
DNA repeats by a specialized histone methyl transferase (CLR4), an ortholog of which is also present in T.
melanosporum (TmelCLR4; 53% identity; E-value=2e-27). Similar siRNA-mediated gene silencing processes
have been described in plants, while their existence and mode of action in animals are still controversial. The
lack of specialized miRNA biogenesis components such as a Drosha-like nuclease or a HEN1-like
methyltransferase (45) suggests the absence in T. melanosporum (as in all fungi sequenced so far) of these
particular non-coding RNAs and associated silencing mechanisms.
Two putative DNA methyltransferases are present in the T. melanosporum genome as in the genomes of most
filamentous ascomycetes sequenced so far (table S12). One of them (TmelDMT2) is homologous to, and
clusters with, the DNA methyltransferase DIM-2 from N. crassa (Fig. S6) and is accompanied by the orthologs
of all the proteins required for DIM-2 recruitment on hemimethylated DNA in this organism: the histone H3
Ser10-P phosphatase TmelPP1 (and its associated regulatory subunit TmelSDS22), the H3 K9
methyltransferase TmelCLR4 (DIM-5 in N. crassa), and the H3 3MeK9 binding (and heterochromatin forming)
protein TmelHP1 (46) (table S12). The other DNA methyltransferase (TmelDMT1) is orthologous to, and
clusters with the “methylation-induced premeiotically” (MIP; 47) enzyme MASC1 from Ascobolus immersus – a
fungus that is basal to the Pezizales, the same order to which T. melanosporum belongs – rather than with the
“Repeat-Induced Point Mutation” (RIP; 48) enzyme RID from N. crassa (Fig. S6). Potentially distinct functional
roles of the two Tuber DMTs are also supported by the presence of a single C-5 cytosine-specific DNA
methylase domain (PF00145) in both TmelDMT2 and DIM-2, rather than two separate domains as in MASC1,
RID and TmelDMT1, and by the preferential expression of the latter gene in fruiting bodies (six fruiting body-derived ESTs vs 0 free-living mycelium-derived ESTs and a FB/FLM expression ratio >3; as opposed to the
lack of any fruiting body expression bias for TmelDMT2). Finally, putative homologs of the Arabidopsis
SWI/SNF chromatin remodeling proteins DDM1 and DRM1 and of the histone deacetylase HDA6 – involved in
RNA-independent and siRNA-mediated DNA methylation in plants (49-51) – were identified
(GSTUMT00001029001, GSTUMT00000296001 and GSTUMT00010406001), suggesting that siRNA-mediated DNA methylation may also contribute to gene silencing in T. melanosporum.
6.2. Allergome and mycotoxin profiling
Many fungi (e.g., Aspergillus fumigatus, Penicillium oxalicum, Penicillium citrinum, Cladosporium herbarum
and Alternaria alternata) are strongly allergenic (59). Fungal allergens are usually (glyco) proteins or
polysaccharides that elicit IgE antibody production upon interaction with the immune system of atopic
individuals. An even more serious threat is posed by mycotoxins: chemically diverse small molecules that can
cause a variety of diseases, including cancer. Mycotoxins are produced by various filamentous Ascomycetes,
especially members of the Aspergillus and Fusarium genera, through well defined biosynthetic pathways (60).
Both issues are relevant to T. melanosporum because of its GRAS (Generally Recognized as Safe) status and
production of edible, highly prized fruiting bodies with a long-standing reputation as a “gourmet food” and
delicacy. To gain insight into the allergenic potential of truffles we searched the predicted proteome of T.
melanosporum using as reference the fungal allergen database, an inventory of the 92 allergenic proteins
presently identified in fungi (http://www.allergen.org). Since immunoreactions are determined by specific (often
surface-exposed) epitopes, the mere presence of a putative allergen ortholog does not represent conclusive
evidence for allergenicity. Also, recent estimates indicate that cross-reactivity (and thus prospective
allergenicity) becomes significant at amino acid sequence identity values ≥ 50% (61). A sequence identity
criterion was thus used to analyze the allergenic potential of T. melanosporum and to compare it with that of
GRAS (S. cerevisiae and N. crassa) and strongly allergenic (A. fumigatus) ascomycetes as well as with that of
other fungi, including the basidiomycete Laccaria bicolor, not explicitly associated with allergenicity. As
revealed by the heat map reported in Fig. S15, T. melanosporum, amongst the examined fungi, presents the
lowest allergenic potential after S. cerevisiae, with only four predicted proteins exhibiting >80% identity with
known allergens. Only one of these proteins, the ribosomal protein encoded by GSTUMT00008226001, is
more similar to its parent allergen (Alt a 12) than the corresponding protein from N. crassa.
A similar analysis was conducted on genes involved in mycotoxin biosynthesis in Aspergillus, Fusarium,
Alternaria, Penicillium and Trichothecium spp. Four mycotoxins (aflatoxin, trichothecene, fumonisin and
gliotoxin) and the corresponding biosynthetic genes and pathways were interrogated (62-65). As shown in
table S14, 51 potential homologs (E-value < 1e-10) of the 81 analyzed candidate mycotoxin biosynthetic genes
were found in the T. melanosporum genome. It should be noted, however, that in all the investigated cases a
minimum of six genes coding for key mycotoxin biosynthetic enzymes appear to be missing in
T. melanosporum (table S14). Therefore, to the best of our present knowledge we conclude that none of the
above mycotoxins is produced by T. melanosporum.
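The reasoning used here (homologs are counted with an E-value cutoff, then each pathway is checked for missing key genes) can be sketched as follows. This is only an illustration under stated assumptions: the gene names, the E-values and the set of "key" genes are hypothetical placeholders, not entries from table S14.

```python
from typing import Dict, Set

E_VALUE_CUTOFF = 1e-10  # threshold quoted in the text


def homologs_found(best_evalue: Dict[str, float],
                   cutoff: float = E_VALUE_CUTOFF) -> Set[str]:
    """Return the reference genes whose best hit passes the E-value cutoff."""
    return {gene for gene, evalue in best_evalue.items() if evalue < cutoff}


def missing_key_genes(key_genes: Set[str], found: Set[str]) -> Set[str]:
    """Return the key pathway genes that have no homolog passing the cutoff."""
    return key_genes - found


# Hypothetical best-hit E-values for one pathway (illustration only).
best_evalue = {"geneA": 1e-50, "geneB": 1e-3, "geneC": 1e-12, "geneD": 0.5}
key_genes = {"geneA", "geneB", "geneD"}

found = homologs_found(best_evalue)
missing = missing_key_genes(key_genes, found)
```

A pathway with a non-empty `missing` set would be treated as incomplete, which is the logic behind the conclusion drawn in the text.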
6.3. Sulfur assimilation and metabolism
Ten sulfur metabolism-related sub-pathways, outlined in Fig. S16A, were interrogated and a total of 126 genes
coding for most of the corresponding enzymes, permeases and regulators were identified (table S15). When
compared with a reference set of five different Ascomycetes (S. cerevisiae, N. crassa, M. grisea, B. cinerea
and A. nidulans), 21 genes appear to be the Tuber-specific paralogs of genes that are also present in S.
cerevisiae, four genes are the Tuber-specific paralogs of filamentous ascomycete specific genes (a total of
25), while five genes were only found in T. melanosporum (table S15). The latter include phytochelatin
synthase, the first enzyme of this kind to be described in filamentous ascomycetes; three putative dipeptidyl
aminopeptidases possibly involved in Cys-Gly (and other dipeptides) hydrolysis; and a putative chromate efflux
transporter similar to bacterial proteins conferring resistance to chromate (a toxic metal anion that is
internalized by sulfate permeases). The most significant deviations in gene number with respect to N. crassa
and other filamentous ascomycetes (Fig. S16) are the duplication of PAPS reductase, the presence of an
extra copy of a putative sulfate transporter and the presence of 19 copies of putative cysteine desulfurase
genes (table S15). Amongst the genes that appear to be missing in T. melanosporum are those coding
for arylsulfatase, alkanesulfonate monooxygenase and sulfide dehydrogenase.
More than 75% of the manually annotated genes were covered by ESTs and 93% of them gave above
background hybridization signals in at least one of the three life-cycle stages that were subjected to transcript
profiling [free-living mycelium (FLM), fruiting body (FB), and ectomycorrhiza (ECM)]. Normalized EST
redundancy for core S-metabolism genes (pathways 1, 2, 3, 4 and 7 in Fig. S16) was on average 6.8-fold
higher in T. melanosporum than in N. crassa, with a 3.5-fold higher redundancy for FB-derived vs. FLM-derived
ESTs. Similar results with regard to the fruiting body preferential expression of S-metabolism genes
were obtained from transcriptome analysis, which also included ectomycorrhiza (Fig. S16B, table S15). Sulfur
metabolism thus appears to be strikingly active in T. melanosporum, and particularly in fruiting bodies.
The three pathways exhibiting the strongest fruiting body bias were “sulfate internalization & reduction”,
“Cys/Met biosynthesis & interconversion” and “methionine uptake & utilization”. Amongst the genes with the
highest expression prevalence in fruiting bodies there are those encoding two sulfate permeases (ST1 and
ST2) and various enzymes involved in S-amino acid biosynthesis/interconversion, especially methionine/SAM
formation and recycling. Also worthy of note is a case of markedly disproportionate transcriptional output (and
EST redundancy) for otherwise convergent or coupled reactions involved in cysteine/homocysteine
interconversion in fruiting bodies (Fig. S17, table S15). Here, the two most represented enzymes are
cystathionine γ-lyase (CGL; TmelCYS3) and cystathionine β-lyase (CBL; TmelSTR3). The standard substrate
of these enzymes is cystathionine, which is produced by two cystathionine synthases (TmelSTR2 and
TmelCYS4) that are both expressed at exceedingly low levels in fruiting bodies (Fig. S17). What this suggests
is that the entire pathway is strongly polarized toward Met (and to a lesser extent Cys) to start with, and that
these two overexpressed lyases may actually be involved in different (cystathionine metabolism-unrelated)
reactions leading to the formation of S-volatiles (or S-VOC precursors). Consistent with this hypothesis, CGLs
and CBLs from several microorganisms (especially bacteria involved in cheese ripening) are known to act on
various S-containing substrates besides cystathionine (66). As shown in Fig. S17, these alternative, nonstandard reactions include cysteine/homocysteine desulfhydrylation and H2S production, and the
dethiomethylation of methionine to methanethiol, which can then spontaneously decompose to H2S,
dimethyldisulfide and other methyl sulfides, many of which are known constituents (or precursors) of truffle
flavour compounds (52-54).
Additional flavour-related pathways originating from methionine are centered on 4-methylthio-2-oxobutanoic
acid (also called α-keto-γ-(methylthio) butyric acid or KMBA), which can be degraded chemically or
enzymatically to methanethiol and sulfides, or be converted enzymatically into 3-(methylthio)propanal and 3-(methylthio)propanol (or the corresponding acids; the so-called “fusel alcohols/acids”), through the Ehrlich
pathway (Fig. S18) (see also section 6.4). The latter pathway as well as other “Methionine uptake & utilization”
components (including TmelMsrA, a gene coding for a methionine sulfoxide reductase enzyme implicated in
dimethyl sulfoxide reduction and flavor formation in yeast; 63) are also well represented in the
T. melanosporum genome and transcriptome. Despite the presence of a Cys-degrading taurine dioxygenase
(TmelTDI1c) that is highly expressed in fruiting bodies and a potentially flavour-related, but as yet unidentified
role of the enzymes encoded by the multigene cysteine desulfurase family, the data suggest a more prominent
role of methionine as a key sulfur metabolite and S-VOC precursor in T. melanosporum fruiting bodies.
Many S-metabolism genes, including one sulfate permease, various sulfate reducing enzymes, a single
methionine transporter and a methionine synthase, are expressed at fairly high levels in mycorrhiza
(Fig. S16B). This suggests potential symbiosis-related roles for these proteins, such as an improved sulfur
nutrition and metal tolerance of the host plant and an enhanced redox capacity of (pre)symbiotic hyphae to
counteract the oxidative burst mounted by the plant upon infection.
6.4. Truffle aroma and volatile organic compounds (VOC)
The truffle aroma is made of hundreds of volatiles that vary in proportion and composition depending on the
species, maturity and origin of the isolates (52-54). Several isoprenoids (also known as terpenoids) have been
identified amongst VOCs produced by ripe T. borchii fruiting bodies, along with a sustained expression of
genes involved in the isoprenoid pathway (55). Isoprenoids belong to a vast group of secondary metabolites
synthesized from isopentenyl diphosphate (IPP), which includes flavor enhancers and fragrances. A complete
set of genes involved in the biosynthesis of isoprenoid units through the mevalonate intermediate, plus a group
of putative polyisoprenoid/terpenoid biosynthetic genes, were identified in the T. melanosporum genome (table
S16). Other major constituents of truffle aroma are fusel alcohols, which are also produced by yeasts via
amino acid catabolism through the Ehrlich pathway (56). Earlier studies suggested that truffle aroma as a
whole could be a mixture of compounds produced by both the mycelium (so-called “gleba”) and fruiting body-associated microorganisms (53, 57). However, a large set of genes homologous to the Ehrlich pathway genes
operating in yeast is present in the T. melanosporum genome (Fig. S18) and many of them are preferentially
expressed in fruiting bodies, thus suggesting that truffles can produce most (if not all) of these compounds on
their own. Similar considerations hold for sulfur-containing volatiles (S-VOCs), which are also major and
characteristic constituents of truffle aroma found in all Tuber species (58). Indeed, S-metabolism genes, especially
those coding for sulfur assimilation and Cys/Met metabolism components, are amongst the most highly
expressed genes in T. melanosporum, with a strong expression bias for fruiting bodies (see section 6.3).
Surprisingly, we found no gene coding for lipoxygenase, the enzyme that in most edible fungi is responsible for
the biosynthesis of 1-octen-3-ol, a key component of ‘mushroom aroma’.
6.5. Sex and mating type genes
The analysis of genes implicated in the mating process, including pheromone response, meiosis and fruiting
body development showed that most sex-related components identified in other ascomycetes are also present
in T. melanosporum (table S11).
6.5.1. Identification of mating type genes
The T. melanosporum genome sequence, strain Mel28, was analyzed for the presence of mating type
homologous genes using BLASTN and the T. melanosporum EST sequences showing a high similarity with
the MAT1-2-1 gene from Diaporthe sp. (BAE93753, BAE93759), Cryphonectria parasitica (AAK83343) and
Fusarium sacchari (BAE94382) as query; this was confirmed using other Ascomycete MAT genes as queries.
The BLAST search identified the corresponding gene model (GSTUMT00001090001) in scaffold 247. On the
other hand, BLAST search using the α-box containing genes from different Ascomycetes as a query did not
allow the identification of any region with significant similarity in the Mel28 genome. The MAT1-2-1 gene
consists of 4 exons and contains the HMG-box sequence typical of the MAT1-2-1 genes of Ascomycetes (Fig.
S19). The deduced amino acid sequence is 297 residues long.
6.5.2. Identification of mat1-1 and mat1-2 T. melanosporum strains
To assess the presence of the HMG-box containing region in different T. melanosporum strains, the
gene-specific primers GMmat121intf: 5’- TTTCTTTGATGGGTCGGATGGAG - 3’ and GMmat121intR: 5’- GCCCTTGCCTATTAATGTGTTAGTG - 3’ were designed and used to perform a PCR screening on 15 T.
melanosporum ascocarps. Thirteen out of the 15 samples screened produced the expected amplicon (673 bp).
The two samples (mel206 and mel151) that did not give rise to any amplification product were therefore presumed
to harbor the opposite mating type (MAT1-1). Conversely, all samples yielded the expected amplicons when
amplified with primers for β-tubulin as control.
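As a quick sanity check on primers such as those listed above, the length, GC fraction and reverse complement can be computed with a few lines of Python. This is a minimal sketch, not part of the original protocol; the helper names `reverse_complement` and `gc_fraction` are introduced here for illustration, and only the two primer sequences are taken from the text.

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a non-empty sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# Primer sequences as given in the text (5' to 3').
primers = {
    "GMmat121intf": "TTTCTTTGATGGGTCGGATGGAG",
    "GMmat121intR": "GCCCTTGCCTATTAATGTGTTAGTG",
}

for name, seq in primers.items():
    # Report length, GC fraction and the reverse complement of each primer.
    print(name, len(seq), round(gc_fraction(seq), 2), reverse_complement(seq))
```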
6.5.3. Isolation of the mat1-1 idiomorph
To identify conserved regions flanking the two putative idiomorphs a set of primers was designed in order to
amplify the 5’ and 3’ genomic regions surrounding the MAT1-2-1 gene. These primers were used both on
samples harboring the putative MAT1-2 (mel271 and mel459) and MAT1-1 (mel206 and mel151) idiomorphs.
On the 5’ flanking region, the primers GMmatext2F: 5’- CAATCTCTTCCATCGCCCGTCCAG -3’ and
GMmatextr6: 5’- TGGTATATGTGGATGTATTGATAACTATAAT -3’ yielded an amplicon of about 370 bp on
both MAT1-2 and MAT1-1 strains. On the 3’ flanking region, the primers GMextF2: 5’- AGAGATAGAGAAATAGCATGGCTCGG -3’ and GMmatEXT2r: 5’- AAGTAACCTTTGTGCCATTGCTCCA -3’
produced an amplicon of about 1200 bp on MAT1-2 strains and a fragment of about 1700 bp on MAT1-1
strains. The PCR fragments obtained from sample mel206 with primer pairs GMmatext2F/GMextr6 and
GMextf2/GMmatext2r were cloned and sequenced. Sequence alignment showed that these two fragments are
highly similar (~ 96% sequence identity) to the corresponding regions on the sequenced genome, although the
fragment generated by primers GMextf2/GMmatext2r showed an insertion of 495 bp with respect to the 3’
region downstream of the MAT1-2 idiomorph. On this insertion a primer specific to MAT1-1 strains (mat111R1:
5’- GCCAACCTCTAGTTGGGATATTTGTTCAGGAC -3’) was designed and used in combination with
GMmatext2f to amplify the entire MAT1-1 idiomorph on mel206. This amplification produced a fragment of
10326 bp whose sequencing confirmed the identification of the MAT1-1 idiomorph. Within this amplicon an
idiomorphic region of 7470 bp containing the MAT1-1-1 gene was identified (Fig. S19). The deduced amino
acid sequence of the MAT1-1-1 gene is 319 residues long and alignment with MAT1-1-1 sequences from other
ascomycetes revealed a conserved α-box domain that typifies the MAT1-1-1 gene of filamentous ascomycetes
(data not shown). More specifically, BLASTX analysis against the NCBI nr database revealed a sequence
similarity with the mat1-1-1 protein of Alternaria brassicae (AAK85543.1, score = 35.4, E-value = 2.7),
Penicillium marneffei and Ajellomyces capsulatus.
The mat1-1-1 sequence was deposited in GenBank under accession number 000.
6.5.4. Pheromone and mating signal transduction-related genes
Complementary alpha-factor and a-factor pheromones are required for sex in heterothallic species. Binding of
the pheromone to a cognate G-protein coupled receptor triggers the activation of a mitogen-activated protein
(MAP) kinase signaling pathway ultimately targeting a homeodomain transcription factor. Most of the key
genes for pheromone and MAP-kinase cascades were identified in the T. melanosporum genome (table S18). A
putative pheromone precursor gene similar to the alpha-factor precursor gene of Saccharomyces cerevisiae
and a series of genes involved in enzymatic processing, maturation (i.e. prenylation) and efflux of both alpha-factor and a-factor pheromones were identified in the sequenced Mel28 genome (table S11). Conversely, the
a-factor like pheromone precursor gene was not identified in this strain. Two G-protein coupled receptors (with
a characteristic seven transmembrane domain) for the a- and alpha-factor like pheromones were also
identified along with genes coding for the alpha, beta and gamma G-protein subunits. The MAP-kinase
cascade and the homeodomain transcription factor STE12 are conserved in T. melanosporum. Furthermore, a
gene containing a HMG-box domain similar to the transcription factor STE11 of S. pombe and a MADS-box
gene similar to MCM1 of S. cerevisiae were identified in the Mel28 genome. In S. pombe, STE11 is involved
in the induction of mating type genes in response to nutritional starvation and in the transcriptional activation of
meiotic genes. In S. cerevisiae, MCM1 interacts with STE12 and with mating type transcription factors to
trigger mating type-specific gene expression in a and α cells.
6.6. Light perception and potential photoresponses
Light from UV-C to far red is perceived (and transduced) by fungi, in which it modulates growth and
morphogenesis and other processes such as pigmentation, sexual/asexual development and secondary
metabolism. Eight genes coding for putative “photoreceptors and light-dependent regulators” plus five
“accessory components and modulators” were identified in the T. melanosporum genome (table S17). The
former group includes the homologs of the blue light receptor WC-1 and its partner protein WC-2 from
N. crassa (68-69), a putative phytochrome, and an opsin-like protein. All but one of the annotated genes
were covered by ESTs, and all of them gave above background hybridization signals in at least one of the
three life-cycle stages that were subjected to transcriptome analysis.
The results of genome sequence analysis are in line with previous data pointing to the occurrence of a blue
light-induced phenotype (i.e., apical growth inhibition), likely mediated by TmelWc-1, in Tuber borchii (70). At
variance with its N. crassa homolog, TmelWc-1 lacks the polyQ region involved in transcriptional activation,
suggesting a repressive, rather than an activating role for this protein. The putative phytochrome identified in
T. melanosporum is homologous to one of the two phytochromes present in Neurospora (Phy-1; 71), and
similarly bears two histidine kinase domains and single PAS, GAF and “response regulator” domains. It is
accompanied by a suite of putative transducers and regulators (TmelVeA, TmelVelB, TmelLaeA and
TmelVosA) resembling the components of the blue/red-light sensing complex of A. nidulans (72). The other
putative light-sensing component is a bacteriopsin-like protein (TmelOrp1) homologous to the opsin-related
protein-1 from N. crassa. Despite the presence of a canonical seven transmembrane-helix domain, TmelOrp1
lacks a conserved Schiff-base forming lysine as well as 10 of the 22 amino acid residues that bind all-trans
retinaldehyde in the N. crassa bacteriopsin Nop-1 (73). In keeping with this observation, two putative
polyisoprenoid synthetases (data not shown), but no true β-carotene (nor retinaldehyde) biosynthetic enzyme,
were found in T. melanosporum. Also, comparison with opsin-like proteins from other organisms suggests that
oxygen-, rather than light-sensing might be the main function of TmelOrp1, which is preferentially expressed in
fruiting bodies.
Amongst the light-sensing components that appear to be missing in T. melanosporum there is FREQUENCY,
a regulator of circadian rhythmicity, and VIVID, a PAS/LOV protein that mediates adaptation to irradiation
intensity in Neurospora (69). The lack of FREQUENCY is in line with previous observations pointing to the
absence of circadian rhythmicity in T. borchii (70). Perhaps long-term seasonal variations, rather than finely
tuned daily rhythms, are more suited to a hypogeous fungus, whose life-cycle is likely influenced by
light/temperature variations rather than by circadian rhythmicity. The lack of VIVID, a sensor specialized in the
adaptation to changes in day-light irradiation intensity, may similarly reflect adaptation to a soil-screened
subterranean habitat. Therefore, although the actual function of the above components remains to be defined,
it is tempting to speculate that a photosystem such as the one revealed by genome analysis may be
instrumental to: (i) enforce subterranean growth and development by acting as a sort of light escape/avoidance
mechanism; and (ii) control seasonal developmental variations, especially those related to sexual
differentiation and secondary metabolism reprogramming. Light sensing in truffles is also supported by
molecular phylogenetic data attesting to repeated episodes of epigeous to hypogeous lifestyle transitions, with
no instance of character reversal, in the evolutionary history of the Pezizales (74).
6.7. Gene families involved in transduction pathways
We have carried out a genome-wide analysis of gene families encoding components of T. melanosporum
signaling pathways, including monomeric G-proteins of the ras family, subunits of heterotrimeric G-proteins, G
protein-coupled receptors (GPCR), and kinases (table S18). T. melanosporum signaling genes were compared
to signaling genes in yeast and ascomycete genomes (N. crassa, A. nidulans, B. cinerea and M. grisea) using
the best reciprocal blast hit (BRH) method to identify orthologous genes. Gene expression in mycelium or in
fruiting bodies was estimated based on EST abundance. Transcript profiling was carried out using the
NimbleGen exon oligonucleotide arrays (section S8).
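The best reciprocal blast hit (BRH) criterion mentioned above can be sketched in Python as follows. The sketch assumes that the best hit of each gene in the other genome has already been extracted from the BLAST output; the dictionaries and gene identifiers are hypothetical.

```python
from typing import Dict


def reciprocal_best_hits(a_to_b: Dict[str, str],
                         b_to_a: Dict[str, str]) -> Dict[str, str]:
    """Pair gene a with gene b only if each is the other's best hit."""
    return {a: b for a, b in a_to_b.items() if b_to_a.get(b) == a}


# Hypothetical best-hit tables (illustration only).
tuber_to_ref = {"Tmel_1": "Ref_1", "Tmel_2": "Ref_2", "Tmel_3": "Ref_2"}
ref_to_tuber = {"Ref_1": "Tmel_1", "Ref_2": "Tmel_3"}

orthologs = reciprocal_best_hits(tuber_to_ref, ref_to_tuber)
```

In this toy case `Tmel_2` is not retained, because its best hit `Ref_2` points back to `Tmel_3`; such genes would be candidate paralogs rather than orthologs under the BRH criterion.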
The T. melanosporum genome encodes the signalling genes documented in other filamentous ascomycetes (table
S18). All proteins involved in pathways controlling key cellular processes such as stress response, filamentous
growth, virulence and mating were identified and curated. Except for the GPCR Pth11 family (see below), no
significant expansion or contraction of signaling gene families was found. Up to 80% of the gene models
were supported by ESTs. During the interaction with the host plant, most signaling transcripts were strongly
expressed and only a few were undetected or barely detected. Only a few transcripts showed differential expression
in ectomycorrhizal root tips (table S18).
We identified 22 genes encoding the pathogenesis-related GPCR Pth11 in T. melanosporum. The number of
paralogs in this multigene GPCR family was lower in T. melanosporum than in M. grisea (86
members), A. nidulans (71 members) or B. cinerea (55 members) and more comparable to the reduced set
present in N. crassa (30 members); no homologs are present in the yeast genome. Interestingly, a
phylogenetic analysis of this GPCR family clustered five T. melanosporum Pth11-related genes in a subgroup
that may represent functionally distinct Pth11-related GPCRs specific to T. melanosporum (data not shown).
In this large gene family, more than half of the genes showed low expression levels while some exhibited a
very strong expression in different biological situations. Interestingly, a few Pth11-like GPCR transcripts
strongly accumulated in ectomycorrhizal tissues (e.g., Tmel_Pth11_rel4) or in fruiting bodies
(Tmel_Pth11_rel15, Tmel_Pth11_rel17, Tmel_Pth11_rel20) compared to the free-living mycelium. The three
most highly expressed Pth11 transcripts in fruiting bodies all belong to the Tuber-specific Pth11 subgroup. The function
of Pth11-related proteins is not yet known, but since GPCRs are key components of external stimulus sensing,
one can speculate that they might be involved in the complex cross-talk between the mycobiont and its host plant.
6.8. Secretome
Secreted proteins were identified using a custom pipeline including the TargetP
(http://www.cbs.dtu.dk/services/TargetP/) and SignalP (http://www.cbs.dtu.dk/services/SignalP/) algorithms
(75). The 1449 proteins predicted to carry a signal peptide were then screened for transmembrane proteins
(using TMHMM, http://www.cbs.dtu.dk/services/TMHMM/), T. melanosporum TE fragments, and coding
sequences with matches only in the secretory leader sequence. The pipeline identified 125 genes coding for
cysteine-rich small secreted proteins (SSP) (≥4 Cys, 80 < AA < 300). Of these, 70 SSPs were lineage-specific.
A single SSP cluster (defined as a group of at least three SSPs in 10 kbp), corresponding to hydrophobin
genes, was found in T. melanosporum. Amongst the most highly upregulated transcripts in T.
melanosporum/Corylus avellana ectomycorrhizal root tips were several encoding predicted secreted proteins
[e.g., LysM domain (GSTUMT00012780001)] (table S3). Several transcripts encoding Tuber-specific SSPs
were also differentially expressed in fruiting bodies, e.g. GSTUMT00006097001, GSTUMT00012378001, and
GSTUMT00007269001, suggesting that they play a role in the differentiation of the fruiting body
(pseudo)tissues.
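The two numerical criteria given in this section (the SSP size and cysteine-count filter, and the definition of an SSP cluster) can be illustrated with a short Python sketch. It is not the custom pipeline itself: the function names are hypothetical, and using gene start coordinates on a single scaffold to measure the 10 kbp window is an assumption made here for simplicity.

```python
from typing import List


def is_cysteine_rich_ssp(protein: str, min_cys: int = 4,
                         min_len: int = 80, max_len: int = 300) -> bool:
    """Apply the criteria quoted in the text: >=4 Cys and 80 < length < 300."""
    return min_len < len(protein) < max_len and protein.count("C") >= min_cys


def find_clusters(starts: List[int], window: int = 10_000,
                  min_genes: int = 3) -> List[List[int]]:
    """Group gene starts (one scaffold) into runs of >= min_genes within `window` bp."""
    starts = sorted(starts)
    clusters = []
    i = 0
    while i < len(starts):
        j = i
        # Extend the run while the next gene still starts inside the window.
        while j + 1 < len(starts) and starts[j + 1] - starts[i] <= window:
            j += 1
        if j - i + 1 >= min_genes:
            clusters.append(starts[i:j + 1])
            i = j + 1  # continue after the cluster just recorded
        else:
            i += 1
    return clusters


# Hypothetical SSP gene start coordinates on one scaffold (illustration only).
example_clusters = find_clusters([1000, 4000, 9000, 50000, 120000])
```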
6.9. Environmental stress response genes
T. melanosporum experiences various stress conditions at different stages of its lifecycle. In particular,
production of fruiting bodies follows a fairly strict seasonal timing, which is likely determined, and influenced,
by various abiotic factors and potential stressors, such as high or low temperatures and drought. Proper
response to these stressful conditions likely influences the T. melanosporum lifecycle, including its symbiotic
interactions with its host plants, nutrient exchange and thus fruiting body production. We have curated genes
coding for heat shock proteins (Hsp), as well as other chaperones and proteins binding to Hsps, i.e., co-chaperones. Seventy-four genes were manually annotated; 15 gene models were edited. Sixty-four of them
were assigned to 16 gene families identified through PFAM search (table S19), while the remaining 10 genes
did not match any of the known families. Especially noteworthy amongst the latter was the dehydrin (DHN1)
gene. Initially classified as LEA (late embryogenesis abundant), dehydrins (DHNs) are now believed to play a
protective role during plant dehydration. The first fungal DHN-like gene (TbDHN1) was identified in the whitish
truffle Tuber borchii. A homolog of TbDHN1, designated as TmelDHN1, was identified in the T. melanosporum
genome. Despite a significant overall identity of 47%, the two amino acid sequences differ in length and in the
number of repeats of the ‘DPRVDS’ motif. As expected, TmelDHN1 has no ortholog in S. cerevisiae, whereas
potential orthologs are present in other filamentous fungi. The other stress-response related gene families that
were annotated in the T. melanosporum genome are: CPN60-TCP1, Hsp70, DNAJ, cyclophilin, Hsp90 and
associated proteins, Bag, Fes, Pam16, Hsp20, Hsp9/12, Grpe, Clp, Cpn10, ClpA/B, FKBP and Usp (table
S19).
The CPN60-TCP1 family includes molecular chaperones belonging to the octameric complex TCP1 (CCT).
Genes coding for eight of the known TCP1 subunits were identified in T. melanosporum (Tmelcct1, Tmelcct2,
Tmelcct3, Tmelcct4, Tmelcct5, Tmelcct6, Tmelcct7, and Tmelcct8). According to the PFAM analysis,
Tmelhsp60 also belongs to this family. All the above gene products were subjected to phylogenetic analysis
(data not shown), which confirmed their identity and tentative designation as CPN60-TCP1 components.
Twelve Hsp70-like genes were identified in the T. melanosporum genome. As revealed by phylogenetic
analysis, Tmelhsp70 and Tmelhsp88 clustered with homologous proteins from A. nidulans, while TmelSSB is
closely related to a homologous Hsp from N. crassa (bootstrap value > 50). Within this family, Tmelhspa12b
and Tmelhspa12a_1 are likely to be specific to filamentous fungi. Members of the DNAJ family are involved in
protein folding, protein transport, and response to stress (http://ghr.nlm.nih.gov/geneFamily=dnaj). Fourteen
putative members of this family were identified in the T. melanosporum genome; these, along with a set of 90
homologs, were subjected to phylogenetic analysis (data not shown). Seven genes, members of the
cyclophilin superfamily, were identified in the T. melanosporum genome. Hsp90 is a conserved heat shock
protein, which can bind other proteins, defined as co-chaperones. In the T. melanosporum genome we found
a single Tmelhsp90 and five co-chaperones (Tmelcdc37, Tmelsti1, Tmelcns1, Tmelwos2 and a not yet
assigned gene). Members of the other families (Pam16, Bag, Fes, Hsp9/12, Hsp20, Grpe, Clp, Cpn10,
ClpA/B, FKBP, Usp) were also identified in the T. melanosporum genome (table S19).
6.10. Aminoacyl-tRNA synthetases and translation factors
The main components of the translation machinery are the 80S ribosome, the activated amino acid (aminoacyl
tRNA) forming enzymes “aminoacyl-tRNA synthetases” and the translational factors. A conserved set of 77
ribosomal proteins (RPs) – 32 RPs plus the 18S rRNA associated with the small subunit; 45 RPs plus the 25S,
5.8S and 5S rRNA associated with the large subunit – is encoded by the genomes of all the ascomycetes
(unicellular and multicellular) sequenced so far. Special attention was thus given to the annotation of the other
two main components of the translation machinery, the “aminoacyl-tRNA synthetases” and the “translation
factors”. The latter belong to a large and heterogeneous set of proteins assisting and orchestrating the three
functional phases of translation: initiation, elongation and termination.
6.10.1. Aminoacyl-tRNA synthetases
Twenty cytosolic and eleven nuclear-encoded mitochondrial aminoacyl-tRNA synthetase (ARS) genes were
identified by similarity with orthologous genes of known function from other fungi. T. melanosporum ARSs,
named according to the standard nomenclature (http://www.genenames.org), are listed in table S20.
Paralogous genes with no assigned function or potential pseudogenes were found for leucyl-tRNA synthetase
(GSTUMT00003081001, GSTUMT00000386001, GSTUMT00005548001, GSTUMT00009004001), lysyl-tRNA
synthetase (GSTUMT00005082001), histidyl-tRNA synthetase (GSTUMT00009282001) and valyl-tRNA
synthetase (GSTUMT00003370001). Ninety-seven percent of the annotated genes were covered by ESTs and
90% of them gave above background hybridization signals in at least one of the three life-cycle stages that
were subjected to transcriptome analysis (free-living mycelium, FLM; fruiting body, FB; and ectomycorrhiza,
ECM) using the NimbleGen oligoarrays (section 8). The three ARS genes that appear not to be expressed
under any of the presently examined life-cycle stages/growth conditions are the mitochondrial leucyl-tRNA
synthetase, methionyl-tRNA synthetase, and lysyl-tRNA synthetase.
6.10.2. Translation factors
Forty-nine translation factor genes (37 initiation factors plus one putative polyA binding protein; eight
elongation factors; and three termination factors) were identified in the T. melanosporum genome (table S21).
Ninety-eight percent of the annotated genes were covered by ESTs and all of them gave above background
hybridization signals in at least one of the three life-cycle stages that were subjected to transcriptome analysis.
When compared with a reference set of five sequenced ascomycetes (S. cerevisiae, N. crassa, M. grisea, B.
cinerea and A. nidulans), seven of the annotated genes have no homolog in S. cerevisiae, whereas one of
them, the paralog of the release factor 1 gene eRF1, was only present in T. melanosporum. Amongst the
filamentous fungus-specific translation factors identified in the T. melanosporum genome there are three
components of initiation factor eIF3 (eIF3e, eIF3j, eIF3k) (table S22). This is the largest eukaryotic initiation
factor that serves as a scaffold for the interaction with other initiation factors as well as with other complexes
and processes such as the COP9 signalosome, the 26S proteasome and nonsense-mediated mRNA decay.
The most significant deviations in gene number are the presence in T. melanosporum of three copies of
elongation factor eEF1a and the apparent absence of eEF1Bβ. As revealed by an extended sequence
comparison (table S22), most T. melanosporum translation factors (28 out of 49) share the highest similarity
with homologous components from Schizosaccharomyces pombe, while some of them are most similar to the
corresponding animal (8 gene models) or plant (3 gene models) factors.
6.11. Carbohydrate Active enZymes (CAZymes)
Enzymes that cleave, build and rearrange oligo- and polysaccharides play a central role in the biology of
saprotrophic, pathogenic and symbiotic fungi and are key to optimizing biomass degradation by these species.
Given the relative importance of these protein families to the ecology of ectomycorrhizal fungi, we performed a
detailed examination of the genes coding for carbohydrate active enzymes (CAZymes) in the T. melanosporum
genome and compared them with the corresponding gene subsets from saprotrophic, pathogenic, and symbiotic
fungi (tables S23 & S24). The search for catalytic modules specific to CAZymes, glycoside hydrolases (GH),
glycosyltransferases (GT), polysaccharide lyases (PL), carbohydrate esterases (CE), and their ancillary
carbohydrate-binding modules (CBMs) in T. melanosporum was performed exactly as for the daily updates of
the Carbohydrate-Active enZymes (CAZy) database (76) (http://www.cazy.org). Each protein model was
compared with a library of over 100000 constitutive modules (catalytic modules, CBMs and other non-catalytic
modules or domains of unknown function) using BLASTP. Models that returned an e-value passing the 0.1
threshold were automatically sorted and manually analyzed. The presence of the catalytic machinery was
verified for distant relatives whenever known in the family. The models that displayed significant similarities
were retained for functional annotation and classified in the appropriate classes and families. A strong
similarity to an enzyme with a characterized activity allows annotation as 'candidate activity', but often for a
safe prediction of substrate specificity, annotation such as 'candidate α- or β-glycosidase' may be provided, as
the stereochemistry of the α- or β-glycosidic bond is more conserved than the nature of the sugar itself. Each
protein model was compared to the manually curated CAZy database, and a functional annotation was
assigned according to the relevance. All uncharacterized protein models were thus annotated as 'candidates'
or 'related to' or 'distantly related to' their characterized match as a function of their similarity. The overall
results of the annotation of the set of CAZymes from T. melanosporum were compared to the content and
distribution of CAZymes in several fungal species (table S23) in order to identify singularities in the families'
distributions. This allowed the identification of significant reductions of specific CAZyme families in
T. melanosporum.
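The e-value screening step described above can be illustrated with a short sketch. It assumes, purely for illustration, that the BLASTP hits were written in the standard 12-column tabular format (`-outfmt 6`) and that "passing" means an e-value at or below 0.1; neither detail is stated in the text, and the function name and sample rows are hypothetical.

```python
# Minimal sketch: keep BLASTP hits whose e-value passes the 0.1 threshold.
# Assumption: hits are in BLAST tabular format (-outfmt 6), where column 1
# is the query, column 2 the subject and column 11 the e-value.

EVALUE_THRESHOLD = 0.1  # threshold quoted in the text


def filter_hits(lines, threshold=EVALUE_THRESHOLD):
    """Return (query, subject, evalue) tuples for hits with e-value <= threshold."""
    kept = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        evalue = float(fields[10])
        if evalue <= threshold:
            kept.append((fields[0], fields[1], evalue))
    return kept


# Hypothetical example rows (identifiers and scores are made up).
example = [
    "model_A\tGH5_module\t41.2\t250\t140\t3\t1\t250\t5\t260\t1e-40\t160",
    "model_B\tCBM1_module\t28.0\t36\t25\t1\t10\t45\t1\t36\t0.5\t30.1",
]
hits = filter_hits(example)
print(hits)
```

Hits retained by such a filter would still need the manual inspection described in the text, for example checking that the catalytic machinery is present.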
As expected for a symbiotic fungus living in the root apoplast, T. melanosporum has few genes encoding
glycoside hydrolases (GHs). With a total of 91 GH-encoding genes (table S23), it has far fewer GHs than
the phytopathogens (e.g., M. grisea and F. graminearum) and the saprotrophs (e.g., N. crassa and P.
anserina). This repertoire is even smaller than that of the symbiotic basidiomycete L. bicolor. Based on its CAZome
(table S24), T. melanosporum has a limited ability to hydrolyze plant cell wall polysaccharides (PCW). For
instance, there is no GH5 cellulase appended to a cellulose-binding module (CBM1) and no cellulases from
families GH6 and GH7 were found in the genome. However, we have detected a few genes encoding
enzymes acting on PCW:
– Cellulose degradation: Endoglucanases [GH5 (1 gene model); GH61 (1 to 4); CBM1 (1)]
– Hemicellulose degradation: Xylanase [GH10 (1 gene model)]; Xyloglucanase [GH12 (1)];
Arabinanase [GH43 (1)]; but no Galactanase [GH53 (0)];
– Pectin degradation: Pectinase [GH28 (2 gene models); GH78 (2 gene models); PL1 (2 gene models);
PL4 (1 gene model); CE8 (1 gene model); CE12 (1 gene model)].
The single GH5 endoglucanase, together with the single secreted GH12 xyloglucan-specific endoglucanase, a
pectin methylesterase, a secreted GH28 polygalacturonase and a rhamnogalacturonan acetylesterase, were
amongst the most highly upregulated transcripts in ECM root tips, suggesting a role for these enzymes in PCW
degradation and remodeling during host colonization (table S3 and fig. S21).
With 103 glycosyltransferases (GT), T. melanosporum is close to the average amongst Sordariomycetes and
Eurotiomycetes, suggesting that glycosyltransferases possess basal intracellular activities and that variations
in composition may reflect species divergence rather than ecological niche pressure.
The enzymes involved in plant polysaccharide depolymerization often carry a carbohydrate-binding module (CBM)
appended to their catalytic domain. As expected, the T. melanosporum genome has the
smallest number of CBM-containing proteins amongst the sequenced filamentous fungi, even lower than the
ectomycorrhizal L. bicolor. The polysaccharide lyase gene set is also very small. Overall, the T. melanosporum
genome encodes a paucity of enzymes involved in PCW depolymerization, but still encodes and expresses
several degrading enzymes able to facilitate the progression of the hyphae in the pectin-rich middle lamella
during the formation of the intraradicular Hartig net.
6.12. Secreted peptidases
The total number of secreted peptidases (49 members) identified in the T. melanosporum genome using the
MEROPS database (http://merops.sanger.ac.uk) is similar to that of other sequenced fungi (Fig. S22).
However, the number of aspartyl proteases is much lower in T. melanosporum than in the other
sequenced ascomycetes and the ectomycorrhizal L. bicolor. Several of these proteases may play a role in
developmental processes as they are either up- or down-regulated in fruiting bodies and ectomycorrhizal root
tips (data not shown). Interestingly, two putative aminopeptidases (M28A family) showing strong amino acid
identity with leupeptin-inactivating enzyme (LIE) were strongly up-regulated in fruiting bodies and
ectomycorrhizal root tips.
6.13. Membrane transporters
A process that is pivotal to the success of ectomycorrhizal associations is the exchange of nutrients between
the symbiont and its host plant. The gene models coding for membrane transporters were identified and
curated by using the Transport Classification Database (http://www.tcdb.org/). A comparison with other
ascomycetes and basidiomycetes (table S25) revealed that the total number of predicted transporters in most
T. melanosporum families is in the lower range of the values reported for Sordariomycetes and
Eurotiomycetes. This is in contrast to the ectomycorrhizal L. bicolor which displays an expansion of several
transporter gene families. Several of the identified transporters, however, likely play an important role in
symbiosis metabolism, as their transcripts are strikingly upregulated in the ectomycorrhizal root tips (table S3).
6.14. Other gene categories
A series of papers describing detailed analyses of the T. melanosporum gene categories and their expression
will be published elsewhere.
7. Non-coding RNAs
7.1. Transfer RNAs (tRNA) gene abundance, anticodon/codon usage and translational selection
tRNA coding genes were searched with tRNAscan (77) and Pol3scan (78). Their identity was further verified
by homology searches conducted against a reference set of tRNA sequences in order to eliminate organellar
and mispredicted genes. A total of 143 tRNA genes was thus identified, 65 of which contain introns. They
correspond to 45 different anticodons (table S26), the maximum number expected for a non-redundant
decoding system capable of decoding all the standard amino acids (79, 80); neither a selenocysteine tRNA
nor any suppressor tRNA gene was found in the T. melanosporum genome. The anticodon repertoire in this
genome is consistent with a ‘restricted’ use of wobbling (i.e., allowed anticodon:codon pairings:
I/ANN:NNU,NNC; GNN:NNU,NNC; UNN:NNA; CNN:NNG). Many other filamentous ascomycetes share the
same assortment of anticodons, suggesting that it was already present in the stem Pezizomycota ancestors.
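The figure of 45 anticodons can be checked by enumeration. The sketch below is an illustration rather than the authors' procedure: it applies the restricted wobble rules quoted above to the standard genetic code and counts how many distinct anticodons are needed to read every sense codon. All variable and function names are hypothetical.

```python
# Count the anticodons needed under the restricted wobble rules quoted above:
# one anticodon (I/A or G at the wobble position) reads NNU and NNC,
# a U-starting anticodon reads NNA, and a C-starting anticodon reads NNG.

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")

codons = [a + b + c for a in BASES for b in BASES for c in BASES]
code = dict(zip(codons, AMINO_ACIDS))


def decoding_class(codon):
    """Label the anticodon class reading this codon (NNU and NNC share one)."""
    third = "Y" if codon[2] in "UC" else codon[2]
    return codon[:2] + third


sense_codons = [c for c in codons if code[c] != "*"]
anticodon_classes = {decoding_class(c) for c in sense_codons}
print(len(sense_codons), len(anticodon_classes))  # 61 45
```

Each of the 16 codon boxes contributes three classes (NNU/NNC, NNA, NNG), giving 48, and the three classes that would read only stop codons (UAA, UAG, UGA) are dropped, which leaves 45.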
The number of tRNA genes in T. melanosporum is however at the lower end of the range found in
Pezizomycota. It is about one third of the number found in N. crassa, and substantially less than that found in
A. nidulans, M. grisea, and B. cinerea. The disparity between the tRNA gene repertoire of T. melanosporum
and that of other fungi is even more pronounced when genome size is taken into account: for every Mb of DNA
sequence there are on average 22 tRNA genes in S. cerevisiae, 10.3 in N. crassa, 6.2 in A. nidulans, 5.4 in M.
grisea, 4.5 in B. cinerea, but only 1.1 in T. melanosporum.
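The T. melanosporum density quoted above follows directly from the tRNA gene count and the genome size listed in table S1. The short calculation below reproduces it and, as an illustrative extra, expresses the densities quoted for the other fungi as multiples of the truffle value; the variable names are hypothetical.

```python
# tRNA gene density per Mb, using figures given in this document.
trna_genes = 143          # tRNA genes identified in T. melanosporum
genome_size_kb = 124946   # genome size from table S1

density_per_mb = trna_genes / (genome_size_kb / 1000)
print(f"{density_per_mb:.1f} tRNA genes per Mb")  # 1.1 tRNA genes per Mb

# Densities quoted in the text for other fungi (tRNA genes per Mb).
quoted = {
    "S. cerevisiae": 22,
    "N. crassa": 10.3,
    "A. nidulans": 6.2,
    "M. grisea": 5.4,
    "B. cinerea": 4.5,
}
for species, density in quoted.items():
    # Fold difference relative to T. melanosporum.
    print(species, round(density / density_per_mb, 1))
```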
Also peculiar is the codon usage in T. melanosporum which shows a strikingly uniform use of codons, as
revealed, for example, by relative synonymous codon usage (RSCU) values close to unity (table S27). When
codon usage for ribosomal proteins, which are usually highly expressed and encoded by genes with a strong
codon bias, is taken into account, a slight preference for some synonymous codons becomes apparent (table
S28). These ‘preferred codons’ correspond to tRNA genes with high copy numbers (cf. tables S26 & S27),
thus suggesting a selection for optimal translation (78). However, translational selection appears to be
extremely weak in T. melanosporum compared with other fungi.
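For readers unfamiliar with RSCU, the sketch below shows the usual definition: the observed count of a codon divided by the mean count of all synonymous codons for the same amino acid, so that a value of 1 indicates no bias. This is a generic illustration on toy sequences, not the pipeline used to produce table S27; the function name and the input sequences are hypothetical.

```python
from collections import Counter, defaultdict

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODE = dict(zip((a + b + c for a in BASES for b in BASES for c in BASES),
                AMINO_ACIDS))


def rscu(cds_sequences):
    """Return {codon: RSCU} for amino acids observed in the coding sequences."""
    counts = Counter()
    for seq in cds_sequences:
        seq = seq.upper()
        # Read in-frame codons, ignoring any trailing partial codon.
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            if CODE.get(codon, "*") != "*":
                counts[codon] += 1

    # Group sense codons by the amino acid they encode.
    families = defaultdict(list)
    for codon, aa in CODE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for synonymous in families.values():
        total = sum(counts[c] for c in synonymous)
        if total == 0:
            continue  # amino acid not observed
        mean = total / len(synonymous)
        for c in synonymous:
            values[c] = counts[c] / mean
    return values


# Toy input: two made-up in-frame coding fragments.
values = rscu(["GCTGCCGCAGCG", "GCTGCTAAAAAG"])
print(values["GCT"], values["AAA"])  # 2.0 1.0
```

In the toy input, GCT is used three times out of six alanine codons spread over four synonyms, hence an RSCU of 2.0, whereas the two lysine codons are used equally and both score 1.0.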
7.2. Spliceosomal RNAs (snRNA)
Spliceosomal RNA gene prediction was performed with cmsearch of the INFERNAL package (81) using the
relevant covariance model from Rfam (82). For each covariance model, the window size and trusted cut-off
score indicated in the RFAM database were used. The T. melanosporum genome contains nine copies of
U1snRNA, 14 copies of U2snRNA, eight copies of U4snRNA and two copies of U6snRNA. No U5snRNA gene
was found. Similarly, no U11, U12, U4at and U6atsnRNA candidate genes were identified, in keeping with the
inability to identify U12-type introns in this genome as in the genomes of all the other fungi analyzed so far.
7.3. Ribosomal RNAs (rRNA)
Due to their high repeat content, ribosomal DNA (rDNA) repeat regions typically do not get assembled into
supercontigs in fungal genomes. Partial 18S, 5.8S and 25S sequences from fungi, retrieved from the NCBI,
were used as initial queries with BLASTN (35) against the T. melanosporum
genomic sequence. Sequences of the rDNA tandem repeat were found in several scaffolds, including 23, 24,
297, 298, 354 and 355. The size of the T. melanosporum 18S-5.8S-26S ribosomal RNA (rRNA) repeat was estimated to
be ~13.6 kbp.
8. Whole-genome exon oligoarray analyses
The T. melanosporum custom-exon expression array (4 x 72K) manufactured by Roche NimbleGen Systems
Limited (Madison, WI) (http://www.nimblegen.com/products/exp/index.html) contained five independent,
non-identical, 60-mer probes per gene model coding sequence. Included in the oligoarray were 12232 annotated
gene models, 3913 random 60-mer control probes and labelling controls. Sequences used for the
oligonucleotide design were from an early draft of the gene catalog containing several TE families. For 1876
gene models, technical duplicates were included on the array. Free-living mycelium of T. melanosporum
Mel28 was grown on 1% malt agar (Cristomalt-D, Difal, Villefranche-sur-Saône, France) for either five weeks
or four months before harvesting. Ectomycorrhizal root tips were sampled from five-month-old Common Hazel
(Corylus avellana L.) plantlets inoculated with a mycelium slurry produced from a fruiting body harvested in
Meuse (France) by Gérard Chevalier. Inoculated plants were grown in the AGRI-TRUFFE (Saint-Maixant,
France) greenhouse. Fruiting bodies of T. melanosporum were collected below Common Hazel trees or oak
trees at different locations [Auvergne, Meuse, and Dordogne (France), and Piceno (Italy)].
Tissues were snap frozen in liquid nitrogen and RNA extraction was carried out using the RNeasy Plant Mini
Kit including a DNase treatment (Qiagen, Cat No. 74904). RNA quality and integrity were checked prior to
cDNA synthesis using the Bio-Rad Experion analyzer. Total RNA preparations (four biological replicates for
ectomycorrhizas, five for fruiting bodies and seven for free-living mycelium) were amplified using the SMART
PCR cDNA Synthesis Kit (Clontech) according to the manufacturer’s instructions. Single dye labeling of
samples, hybridization procedures, data acquisition, background correction and normalization were performed
at the NimbleGen facilities (NimbleGen Systems, Reykjavik, Iceland) following their standard protocol.
Microarray probe intensities were quantile normalized across all chips. Average expression levels were
calculated for each gene from the independent probes on the array and were used for further analysis. Raw
array data were filtered for non-specific probes (a probe was considered as non-specific if it shared more than
90% homology with a gene model other than the gene model it was made for) and renormalized using the
ARRAYSTAR software (DNASTAR, Inc. Madison, WI, USA). For 1015 gene models no reliable probe was
left. A transcript was deemed expressed when its signal intensity was three-fold higher than the mean
signal-to-noise threshold (cut-off value) of 3913 random oligonucleotide probes present on the array (50 to 100
arbitrary units). Gene models with an expression value higher than three-fold the cut-off level were considered
as transcribed (table S2, Fig. S23). A Student t-test with false discovery rate (FDR) (Benjamini-Hochberg)
multiple testing correction was applied to the data using the ARRAYSTAR software (DNASTAR). Transcripts
with a significant p-value (<0.05) and more than a five-fold change in transcript level were considered as
differentially expressed in ectomycorrhizal root tips or fruiting body. The complete expression dataset is
available as series (accession number # 000) at the Gene Expression Omnibus at NCBI
(http://www.ncbi.nlm.nih.gov/geo/).
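The differential-expression call described above combines a multiple-testing correction with a fold-change filter. The sketch below implements the Benjamini-Hochberg adjustment in plain Python and applies both filters; it is not the ARRAYSTAR implementation. It assumes that the 0.05 threshold is applied to the corrected p-values and that fold change is taken in either direction, and all function names and example values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def differentially_expressed(levels_a, levels_b, pvalues,
                             alpha=0.05, min_fold=5.0):
    """Flag genes with adjusted p < alpha and more than min_fold change.

    Assumption: expression levels are positive (e.g. floored at 1).
    """
    adjusted = benjamini_hochberg(pvalues)
    calls = []
    for a, b, q in zip(levels_a, levels_b, adjusted):
        fold = max(a, b) / min(a, b)
        calls.append(q < alpha and fold > min_fold)
    return calls


# Hypothetical example: four genes, tissue A vs tissue B.
p = [0.001, 0.02, 0.04, 0.5]
tissue_a = [13262, 50, 600, 10]
tissue_b = [1, 40, 100, 10]
adjusted_p = benjamini_hochberg(p)
calls = differentially_expressed(tissue_a, tissue_b, p)
print(calls)  # [True, False, False, False]
```

Only the first gene passes both filters: the second fails the fold-change cut-off, the third has an adjusted p-value just above 0.05, and the fourth fails both.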
9. References for Supplementary Materials
1. Mello A, Murat C, Bonfante P, FEMS Microbiol. Lett. 260, 1 (2006).
2. Murat C, Diez J, Luis P, Delaruelle C, Dupre C, Chevalier G, Bonfante P, Martin F, New Phytologist 164, 401 (2004).
3. Riccioni C, Belfiori B, Rubini A, Passeri V, Arcioni S, Paolocci F, New Phytol. 180, 466 (2008).
4. Poma A, Limongi T, Pacioni G, Appl. Microbiol. Biotech., 72, 437 (2006).
5. Hall IR, Yun W, Amicucci A, Tr. Biotech., 21, 433 (2003).
6. Pargney JC, Leduc JP, Bull. Soc. bot. Fr., 137, 21-34 (1990).
7. S Jeandroz, C Murat, W Yongjin, P Bonfante, F Le Tacon, J. Biogeography 35, 815 (2008).
8. Hibbett DS, Gilbert LB, Donoghue MJ, Nature 407, 506 (2000).
9. Lutzoni F, Kauff F, Cox CJ, McLaughlin D, Celio G, Dentinger B, Padamsee M, Hibbett D, James TY, Baloch E, Grube M, Reeb V, Hofstetter V, Schoch
C, Arnold AE, Miadlikowska J, Spatafora J, Johnson D, Hambleton S, Crockett M, Shoemaker R, Sung GH, Lücking R, Lumbsch T, O'Donnell K,
Binder M, Diederich P, Ertz D, Gueidan C, Hansen K, Harris RC, Hosaka K, Lim YW, Matheny B, Nishida H, Pfister D, Rogers J, Rossman A, Schmitt
I, Sipman H, Stone J, Sugiyama J, Yahr R, Vilgalys R, Am. J. Bot. 91, 1446 (2004).
11. B. A. LePage, R. S. Currah, R. A. Stockey, G. W. Rothwell, Am. J. Bot. 84, 410 (1997).
12. I. J. Alexander, New Phytol. 172, 589 (2006).
13. B. Moyersoen, New Phytol. 172, 753 (2006).
14. D. S. Hibbett, P. B. Matheny, BMC Biol. 7, 13 (2009).
15. JW Taylor, ML Berbee, Mycologia 98, 838 (2006).
16. D. B. Jaffe, J. Butler, S. Gnerre, E. Mauceli, K. Lindblad-Toh, J. P. Mesirov, M. C. Zody, & E. S. Lander, Genome Res. 13, 91 (2003).
17. A. Poma, G. Venora, M. Miranda, G. Pacioni, Caryologia 55, 307 (2002).
18. M. Lynch, J.S. Conery, Science 302, 1401 (2003).
19. H. Quesneville, C. M. Bergman, O. Andrieu, D. Autard, D. Nouaud, M. Ashburner, D. Anxolabehere, PLoS Comput. Biol., 1, e22 (2005).
20. Z. Bao, S. R. Eddy, Genome Res. 12, 1269 (2002).
21. R. C. Edgar, E. W. Myers, Bioinformatics 21 Suppl 1, i152-158.
22. Mc Carthy E & Mc Donald JF, Bioinformatics 19, 362-367 (2003).
23. Ma J, Bennetzen J.L, Proc. Natl. Acad. Sci. USA 101: 12404 (2004).
24. Smit, AFA, Hubley, R & Green, P. RepeatMasker Open-3.0. 1996-2004 <http://www.repeatmasker.org>.
25. G. Benson, Nucl Ac. Res. 27(2), 573 (1999).
26. The UniProt Consortium, Nucl Ac. Res. 36:D190-D195(2008).
27. E. Birney, M. Clamp, R. Durbin, Genome Res. 14(5), 988 (2004).
28. R. Guigó, S. Knudsen, N. Drake & T. F. Smith, J. Mol. Biol. 226, 141 (1992).
29. I. Korf, BMC Bioinformatics 5, 59 (2004).
30. K. L. Howe, T. Chothia & R. Durbin, Genome Res. 12, 1418 (2002).
31. E.M. Zdobnov & R. Apweiler, Bioinformatics 17(9), 847 (2001).
32. The Gene Ontology Consortium, Nat. Genet. 25(1), 25 (2000).
33. R. L. Tatusov, N. D. Fedorova, J. D. Jackson, A. R. Jacobs, B. Kiryutin, E. V. Koonin, D. M. Krylov, R. Mazumder, S. L. Mekhedov, A. N. Nikolskaya, B.
S. Rao, S. Smirnov, A. V. Sverdlov, S. Vasudevan, Y. I. Wolf, J. J. Yin, D. A. Natale, BMC Bioinformatics 4, 41 (2003)
34. M. Kanehisa, & S. Goto, Nucl Ac. Res. 28, 27 (2000)
35. S. F. Altschul, W. Gish, W. Miller, E. W. Myers & D. J. Lipman, J. Mol. Biol. 215, 403 (1990)
36. Wortman JR, Fedorova N, Crabtree J, Joardar V, Maiti R, et al., Med Mycol 44: S3 (2006).
37. Kasuga T, Mannhaupt G, Glass L, PLoS ONE 4, e5286 (2009).
38. P. P. Calabrese, S. Chakravarty, T. J. Vision, Bioinformatics 19, i74 (2003).
39. A. J. Enright, S. Van Dongen & C. A. Ouzounis, Nucl Ac. Res. 30(7), 1575 (2002)
40. T. De Bie, N. Cristianini, J. P. Demuth & M. W. Hahn, Bioinformatics 22(10), 1269 (2006)
41. R. D. Finn, J. Tate, J. Mistry, P. C. Coggill, S. J. Sammut, HR Hotz, G. Ceric, K. Forslund, S. R. Eddy, E. L. L. Sonnhammer & A. Bateman, Nucl Ac.
Res. 36, D281 (2008)
42. C. Catalanotto, G. Azzalin, G. Macino, C. Cogoni, Genes Dev. 16, 790 (2002).
43. P. K. Shiu, N. B. Raju, D. Zickler, R. L. Metzenberg, Cell 107, 905 (2001).
44. R. A. Martienssen, M. Zaratiegui, D. B. Goto, Trends Genet. 21, 450 (2005).
45. C. Matranga, P. D. Zamore, Curr. Biol. 17, R790 (2007).
46. K. K. Adhvaryu, E. U. Selker, Genes Dev. 22, 3391 (2008).
47. F. Malagnac, B. Wendel, C. Goyon, G. Faugeron, D. Zickler, J.L. Rossignol, M. Noyer-Weidner, P. Vollmayr, T.A. Trautner, J. Walter, Cell 91, 281
(1997).
48. J. E. Galagan, E. U. Selker, Trends Genet. 20, 417 (2004).
49. J. A. Jeddeloh, T. L. Stokes, E. J. Richards, Nature Genet. 22, 94 (1999).
50. T. Kanno, M. F. Mette, D. P. Kreil, W. Aufsatz, M. Matzke, Curr Biol. 14, 801 (2004).
51. A.V. Gendrel, Z. Lippman, C. Yordan, V. Colot, R. A. Martienssen, Science 297, 1871 (2002).
52. S. Zeppa, A. M. Gioacchini, C. Guidi, M. Guescini, R. Pierleoni, A. Zambonelli,V. Stocchi, Rapid Commun. Mass Spectrom. 18, 199 (2004).
53. R. Splivallo, S. Bossi, M. Maffei, P. Bonfante, Phytochemistry 68, 2584 (2007).
54. A. M. Gioacchini, M. Menotta, M. Guescini, R. Saltarelli, P. Ceccaroli, A. Amicucci, E. Barbieri, G. Giomaro, V. Stocchi, Rapid Commun Mass
Spectrom. 22, 3147 (2008).
55. S. Gabella, S. Abbà, S. Duplessis, B. Montanini, F. Martin, P. Bonfante, Eukaryot Cell 4, 1599 (2005).
56. L. A. Hazelwood, J. M. Daran, A. J. A. van Maris, J. T. Pronk, R. Dickinson, Appl. Env. Microbiol. 74, 2259 (2008).
57. P. Buzzini, C. Gasparetti, B. Turchetti, M. R. Cramarossa, A. Vaughan-Martini, A. Martini, U. M. Pagnoni, L. Forti, Arch. Microbiol. 184, 187 (2005).
58. F. Pelusio, T. Nillsson, L. Montanarella, R. Tilio, B. Larsen, S. Facchetti, J. Ø. Madsen, J. Agric. Food Chem. 34, 2138 (1995).
59. V. P. Kurup, H. D. Shen, H. Vijay, Int. Arch. Allergy Immunol. 129, 181 (2002).
60. N. P. Keller, G. Turner, J. W. Bennett, Nat. Rev. Microbiol. 3, 937 (2005).
61. P. Bowyer, M. Fraczek, W. Denning, BMC Genomics 7, 251 (2006).
62. D. Bhatnagar, K. C. Ehrlich, T. E. Cleveland, Appl. Microbiol. Biotechnol. 61, 83 (2003).
63. D. M. Gardiner, B. J. Howlett, FEMS Microbiol. Lett. 248, 241 (2005).
64. M. Kimura, T. Tokai, N. Takahashi-Ando, S. Ohsato, M. Fujimura, Biosci. Biotechnol. Biochem. 71, 2105 (2007).
65. R.H. Proctor, M. Busman, J. A. Seo, Y. W. Lee, R. D. Plattner, Fungal Genet. Biol. 45, 1016 (2008).
66. M. Liu, A. Nauta, C. Francke, R. J. Siezen, Appl. Environ. Microbiol. 74, 4590 (2008).
67. J. Hansen, Appl. Environ. Microbiol. 65, 3915 (1999).
68. C. Talora, L. Franchi, H. Linden, P. Ballario, G. Macino, EMBO J. 18, 4961 (1999).
69. J. C. Dunlap, J. J. Loros, Curr. Opin. Microbiol. 9, 579 (2006).
70. R. Ambra, B. Grimaldi, S. Zamboni, P. Filetici, G. Macino, P. Ballario, Fungal Genet. Biol. 41, 688 (2004).
71. A. C. Froehlich, B. Noh, R. D. Vierstra, J. Loros, J. C Dunlap, Eukaryot. Cell 4, 2140 (2005).
72. J. Purschwitz, S. Müller, C. Kastner, M. Schöser, H. Haas, E. A. Espeso, A. Atoui, A. M. Calvo, R. Fischer, Curr. Biol. 18, 255 (2008).
73. J. A. Bieszke, E. L. Braun, L. E. Bean, S. Kang, D. O. Natvig, K. A. Borkovich, Proc. Natl. Acad. Sci. (U.S.A.) 96, 8034 (1999).
74. R. Percudani, A. Trevisi, A. Zambonelli, S. Ottonello, Mol. Phylogenet. Evol. 13, 169 (1999).
75. J. D. Bendtsen, H. Nielsen, G. von Heijne, S. Brunak, J Mol Biol. 340, 783 (2004)
76. B. L. Cantarel, P. M. Coutinho, C. Rancurel, T. Bernard, V. Lombard, B. Henrissat, Nucleic Acids Res. 37, D233 (2009).
77. T. M. Lowe, S. R. Eddy, Nucleic Acids Res. 25, 955 (1997).
78. R. Percudani, A. Pavesi, S. Ottonello, J. Mol. Biol. 268, 322 (1997).
79. C. Marck, H. Grosjean, RNA 8, 1189 (2002).
80. R. Percudani, Tr Genet. 17, 133 (2002).
81. Eddy S. R., BMC Bioinformatics, 3:18 (2002).
82. Griffiths-Jones S., Annu. Rev. Genom. Hum. Genet. 8:279–298 (2007).
83. Aguileta G., Marthey S., Chiapello H., Lebrun M.-H., Rodolphe F., Fournier E., Gendrault-Jacquemard A., Giraud T., Syst. Biol., 57:1 (2008).
84. Ronquist F., Huelsenbeck J. P., Bioinformatics 19:1572 (2003).
85. Wicker T., Sabot F., Hua-Van A., Bennetzen J. L., Capy P., Chalhoub B., Flavell A., Leroy L., Morgante M., Panaud O., Paux E., SanMiguel Ph.,
Schulman A.H., Nature Rev. Genet., 8, 973 (2007).
The Black Truffle Genome Uncovers Evolutionary Origins and Mechanisms of Symbiosis
Supplementary Tables
Table S1. The main features of the T. melanosporum nuclear genome.
Genome features                              Values
Size                                         124946 kb
Chromosomes                                  ≥8
GC percentage (total genome)                 52.02
GC percentage in coding sequences            55.87
GC percentage in non-coding regions          48.82
tRNA genes                                   143
rDNA repeat number                           5 in the assembly
Consensus rDNA repeat size                   ~13.6 kb
5S rRNAs                                     >7 in the assembly
snRNA genes                                  33
Percent transposable elements                58
Protein coding genes (CDSs)                  7496
Percent coding                               18
Average CDS size (bp)                        2073 bp
Average codon exon size (bp)                 340
Average number of coding exons per gene      4.51
Average codon intron size (bp)               107
Average codon peptide size (aa)              439
Table S2. Validation of predicted gene models of T. melanosporum based on the NimbleGen exon oligoarray.
Samples                  Number of gene models with expression value > cut-off (%)
Free-living mycelium     8204 (96.9 %)
Ectomycorrhiza           8137 (96.1 %)
Fruiting body            8191 (96.8 %)
Total                    8329 (98.4 %)
Values represent the proportion of genes expressed above the background control threshold. A transcript was deemed expressed
when its signal intensity was three-fold higher than the mean background expression value (cut-off value) of 3913 random oligos
present on the array (see section 8 for details).
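The expression call can be sketched as follows. The reading adopted here, that the cut-off equals three times the mean intensity of the random control probes, follows the footnotes of tables S3 and S4 and is an interpretation, not a confirmed implementation detail; the function names and probe intensities are hypothetical.

```python
from statistics import mean


def expression_cutoff(random_probe_intensities, factor=3.0):
    """Cut-off = factor x mean intensity of the random control probes."""
    return factor * mean(random_probe_intensities)


def is_expressed(signal, cutoff):
    """A gene model is called expressed when its signal exceeds the cut-off."""
    return signal > cutoff


# Hypothetical intensities for three random control probes (arbitrary units).
random_probes = [20.0, 25.0, 30.0]
cutoff = expression_cutoff(random_probes)
print(cutoff)                        # 75.0
print(is_expressed(8000.0, cutoff))  # True
print(is_expressed(40.0, cutoff))    # False
```

With a mean background of 25 arbitrary units the cut-off is 75, which falls within the 50 to 100 range reported for these arrays.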
Table S3. The most highly upregulated transcripts in T. melanosporum/Corylus avellana ectomycorrhizal root tips
compared to free-living mycelium and fruiting body.
SEQ_ID             ECM    FB     FLM    ECM/FLM  FB/FLM  Definition                                        Size  Location  TMD
                   level  level  level  ratio    ratio
GSTUMT00012772001  13262  731    1      13262    731     H-type lectin                                     279   _         0
GSTUMT00012792001  11423  250    1      11423    250     Fasciclin-like arabinogalactan protein            414   S         0
GSTUMT00012437001  19542  180    2      9796     90      Lipase/esterase                                   346   S         0
GSTUMT00009894001  18205  2      2      7721     1       Cytochrome P450                                   396   M         0
GSTUMT00008973001  10866  2      2      6213     1       Endoglucanase GH5                                 342   _         0
GSTUMT00008992001  19305  9      4      4538     2       Laccase                                           586   S         0
GSTUMT00006890001  5436   3787   2      2636     1836    Sporulation-induced protein                       481   _         0
GSTUMT00010076001  2063   1      1      2063     1       Tyrosinase                                        604   _         0
GSTUMT00003538001  3588   2818   2      2050     1610    FAD oxidoreductase                                564   _         0
GSTUMT00012780001  6542   44     4      1745     12      LysM domain protein                               87    S         0
GSTUMT00009016001  6737   3257   5      1432     692     DUF1479-domain protein                            379   _         0
GSTUMT00005760001  12966  989    9      1401     107     Major facilitator superfamily (MFS) permease*     142   S         2
GSTUMT00007927001  1073   593    1      1073     593     Hypothetical protein                              306   _         0
GSTUMT00008954001  19493  35     18     1066     2       Major facilitator superfamily (MFS) permease      496   _         10
GSTUMT00000763001  902    2      1      902      2       Cytochrome P450                                   397   M         0
GSTUMT00002130001  1534   1      2      833      1       Beta-glucan synthesis-associated protein (SKN1)   575   M         1
GSTUMT00006579001  14241  8701   22     662      404     DUF1479-domain protein                            426   M         0
GSTUMT00010279001  7848   1926   12     651      160     DUF2235-domain protein                            403   _         0
GSTUMT00000499001  49524  634    77     644      8       Phosphatidylserine decarboxylase                  316   _         0
GSTUMT00012667001  1151   7      2      575      3       Hypothetical protein                              883   _         2
GSTUMT00009500001  2529   2846   6      403      453     Tuber-specific protein                            86    M         0
Transcript profiling was performed on free-living mycelium, fruiting bodies and ectomycorrhizal root tips. Values are the means of seven, five and
four biological replicates, respectively. Based on the statistical analysis, a gene was considered significantly upregulated if it met both criteria: (i)
t-test P-value < 0.05 (ArrayStar, DNASTAR); (ii) mycorrhiza vs. free-living mycelium fold change ≥ 4; 571 genes (7.6% of the total gene repertoire)
showed an upregulated expression. Before a transcript was declared present, the signal-to-noise threshold (signal background) was
calculated based on the mean intensity of 3,913 random probes present on the microarray. Cut-off values for signal intensity (50 to 100 arbitrary
units), corresponding to three times the background values estimated from random 60-mer probes on the NimbleGen oligoarrays, were then
subtracted from the normalized intensity values. The highest signal intensity values observed on these arrays were ~65,189 arbitrary units. Signals
below the cut-off values were assigned a signal intensity value of 1. Abbreviations: FB, fruiting body; FLM, free-living mycelium; ECM,
ectomycorrhizal root tips; S, secreted; M, mitochondrial; TMD, transmembrane domain. * truncated sequence.
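The background handling described in this footnote can be sketched as below: the cut-off is subtracted from each normalized intensity, values that fall below the cut-off are set to 1, and ratios are then taken between tissues. The simple floor at 1 is an assumption about how the rule was applied, and the function name and raw intensities are hypothetical (they were chosen only so that the result matches the magnitude of the first row of the table).

```python
def background_corrected(normalized_intensity, cutoff):
    """Subtract the cut-off; signals at or below it are assigned a value of 1."""
    return max(1.0, normalized_intensity - cutoff)


cutoff = 75.0  # hypothetical cut-off, within the 50 to 100 range quoted above

# Hypothetical normalized intensities for one gene in two tissues.
ecm_level = background_corrected(13337.0, cutoff)  # 13262.0
flm_level = background_corrected(60.0, cutoff)     # below cut-off, so 1.0

fold = ecm_level / flm_level
print(ecm_level, flm_level, fold)  # 13262.0 1.0 13262.0
```

Because undetected transcripts are set to 1, a ratio such as ECM/FLM equals the ECM level itself whenever the transcript is not detected in free-living mycelium.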
Table S4. The most highly upregulated transcripts in T. melanosporum fruiting body compared to free-living mycelium
and T. melanosporum/C. avellana ectomycorrhizal root tips.
SEQ_ID             ECM    FB     FLM    FB/FLM  Definition                               Size  Location  TMD
                   level  level  level  ratio
GSTUMT00001784001  4      41047  3      12097   Tuber-specific protein                   332   _         0
GSTUMT00009814001  1      4478   1      4478    GAL4-like DNA-binding domain protein     585   _         0
GSTUMT00009616001  11     10491  4      2377    integral membrane protein Pth11-like     253   S         3
GSTUMT00001879001  39     3692   2      2303    Tuber-specific protein                   71    _         0
GSTUMT00006890001  5436   3787   2      1836    DUF2235 Sporulation associated protein   481   _         0
GSTUMT00003538001  3588   2818   2      1610    FAD binding domain                       564   _         0
GSTUMT00010017001  17     1769   2      1091    Atrophin-1 family                        767   _         0
GSTUMT00001878001  1      14686  14     1026    Hypothetical protein                     389   S         1
GSTUMT00006097001  105    13736  17     820     Tuber-specific small secreted protein    131   S         0
GSTUMT00012772001  13262  731    1      731     H-type lectin                            279   _         0
GSTUMT00002786001  5      978    1      708     O-glycosylated cell wall protein         161   S         0
GSTUMT00009016001  6737   3257   5      692     DUF1479-domain protein                   379   _         0
GSTUMT00009465001  18     6646   10     666     WW domain containing protein             259   _         0
GSTUMT00003182001  17     1260   2      619     Flavin containing amine oxidoreductase   439   _         1
GSTUMT00007927001  1073   593    1      593     Hypothetical protein                     306   _         0
GSTUMT00002874001  1443   15022  30     500     Lipase                                   336   S         0
GSTUMT00002703001  24     11228  23     481     Hypothetical protein                     62    _         0
GSTUMT00008314001  154    626    1      454     Hypothetical protein                     133   _         0
GSTUMT00009500001  2529   2846   6      453     Tuber-specific mitochondrial protein     86    M         0
GSTUMT00012516001  29     787    2      443     O-methyltransferase                      333   _         0
GSTUMT00004141001  2669   13898  33     427     GMC oxidoreductase                       607   _         0
GSTUMT00005006001  11706  18751  44     423     Tuber-specific secreted protein          180   S         3
GSTUMT00006579001  14241  8701   22     404     DUF1479-domain protein                   426   M         0
GSTUMT00010468001  10     1363   4      388     Hypothetical protein                     890   _         0
GSTUMT00012378001  10     4272   12     368     Tuber-specific small secreted protein    216   S         0
GSTUMT00003188001  282    4551   12     366     Hypothetical protein                     336   S         7
GSTUMT00012817001  64     3666   11     345     FAD binding domain                       510   S         1
GSTUMT00009177001  1      342    1      342     Tuber-specific mitochondrial protein     598   M         0
GSTUMT00005385001  860    1964   6      315     Tuber-specific protein                   206   _         0
Transcript profiling was performed on free-living mycelium, fruiting bodies and ectomycorrhizal root tips. Values are the means of seven, five and
four biological replicates, respectively. Based on the statistical analysis, a gene was considered significantly upregulated if it met both criteria: (i)
t-test P-value < 0.05 (ArrayStar, DNASTAR); and (ii) fruiting bodies vs. free-living mycelium fold change ≥ 4. Before a transcript
was declared present, the signal-to-noise threshold (signal background) was calculated based on the mean intensity of 3,913 random probes present
on the microarray. Cut-off values for signal intensity (50 to 100 arbitrary units), corresponding to three times the background values estimated from
random 60-mer probes on the NimbleGen oligoarrays, were then subtracted from the normalized intensity values. The highest signal intensity
values observed on these arrays were ~65,189 arbitrary units. Signals below the cut-off values were assigned a signal intensity value of 1.
Abbreviations: FB, fruiting body; FLM, free-living mycelium; ECM, ectomycorrhizal root tips; S, secreted; M, mitochondrial; TMD, transmembrane
domain.
Table S5. Tissue-specific transcripts in T. melanosporum. A transcript was considered as tissue specific when it was not
detectable in the two other tissues or if the transcript level in this tissue was at least 100-fold higher than in the two other tissues.
Ratios between 100-1000 are coloured in light pink, ratios higher than 1000 in dark pink.
Tissue  SEQ_ID             ECM    FB     FLM    Definition                                   Size  Location  TMD
                           level  level  level
ECM     GSTUMT00010076001  2063   nd     nd     Tyrosinase                                   604   _         0
ECM     GSTUMT00004343001  375    nd     nd     Hypothetical protein                         104   S         0
ECM     GSTUMT00012437001  19542  180    2.0    Ab hydrolase_3                               346   S         0
ECM     GSTUMT00009894001  18205  2.1    2.4    Cytochrome P450                              396   M         0
ECM     GSTUMT00008973001  10866  1.8    1.7    Cellulase                                    342   _         0
ECM     GSTUMT00008992001  19305  9      4      Multicopper oxidase                          586   S         0
ECM     GSTUMT00012780001  6542   44     4      Hypothetical protein                         87    S         0
ECM     GSTUMT00008954001  19493  35     18     Major Facilitator Superfamily                496   _         10
ECM     GSTUMT00000763001  902    1.6    nd     Cytochrome P450                              397   M         0
ECM     GSTUMT00002130001  1534   nd     1.8    Beta-glucan synthesis-associated protein     575   M         1
ECM     GSTUMT00001976001  659    2      nd     Endonuclease/Exonuclease/phosphatase family  461   S         0
ECM     GSTUMT00012667001  1151   7      2      Hypothetical protein                         883   _         2
ECM     GSTUMT00008934001  385    2      nd     Tuber-specific protein                       120   _         0
ECM     GSTUMT00012588001  6518   4      29     Hypothetical protein                         353   S         6
ECM     GSTUMT00009298001  6905   11     31     Glycosyl hydrolase family 12                 241   S         0
ECM     GSTUMT00001432001  13664  nd     63     Major Facilitator Superfamily                435   _         10
ECM     GSTUMT00006630001  18170  3      110    Class II Aldolase                            283   _         0
ECM     GSTUMT00003580001  5014   nd     33     Phospholipase A2                             216   S         0
ECM     GSTUMT00008388001  3047   nd     28     Hypothetical protein                         667   _         3
ECM     GSTUMT00011889001  834    6      8      Hypothetical protein                         810   S         0
ECM     GSTUMT00010195001  6221   nd     63     Sugar (and other) transporter                452   _         9
FB      GSTUMT00009177001  nd     342    nd     Tuber-specific protein                       598   M         0
FB      GSTUMT00008309001  nd     148    nd     DEAD/DEAH box helicase                       476   _         0
FB      GSTUMT00001784001  4      41047  3.4    Tuber-specific protein                       332   _         0
FB      GSTUMT00009814001  1.3    4478   nd     GAL4-like DNA-binding domain protein         585   _         0
FB      GSTUMT00009616001  11     10491  4.4    Hypothetical protein                         253   S         3
FB      GSTUMT00010017001  17     1769   1.6    Atrophin-1 family                            767   _         0
FB      GSTUMT00001878001  nd     14686  14     Hypothetical protein                         389   S         1
FB      GSTUMT00006097001  105    13736  17     Tuber-specific protein                       131   S         0
FB      GSTUMT00002786001  4.6    978    1.4    O-glycosylated cell wall protein             161   S         0
FB      GSTUMT00009465001  18     6646   10     WW domain containing protein                 259   _         0
FB      GSTUMT00002703001  24     11228  23     Hypothetical protein                         62    _         0
FB      GSTUMT00010468001  10     1363   3.5    Hypothetical protein                         890   _         0
FB      GSTUMT00012378001  9.8    4272   12     Hypothetical protein                         216   S         0
FB      GSTUMT00011444001  9.1    4907   17     Tuber-specific protein                       229   _         0
FB      GSTUMT00003743001  1.9    3776   16     Tuber-specific protein                       339   S         1
FB      GSTUMT00003678001  nd     524    2.4    Hypothetical protein                         302   _         0
FB      GSTUMT00002553001  7.6    2005   9      Glycolipid anchored surface protein          446   S         1
FB      GSTUMT00008040001  1      404    2      Tuber-specific protein                       91    _         0
FB      GSTUMT00003621001  3.2    1962   11     DEAD/DEAH box helicase                       1535  _         0
FB      GSTUMT00007269001  169    34625  219    Tuber-specific protein                       193   S         0
GSTUMT00011091001
GSTUMT00003226001
GSTUMT00006318001
nd
nd
3.2
15694
524
3569
106
3.7
26
GSTUMT00005042001
GSTUMT00010990001
GSTUMT00002278001
GSTUMT00012523001
GSTUMT00002331001
GSTUMT00008156001
GSTUMT00011724001
GSTUMT00007237001
GSTUMT00008657001
GSTUMT00005026001
GSTUMT00006197001
GSTUMT00012511001
GSTUMT00002274001
GSTUMT00011356001
GSTUMT00008966001
GSTUMT00006468001
GSTUMT00005001001
GSTUMT00002323001
GSTUMT00009209001
GSTUMT00009688001
nd
nd
nd
nd
4.4
14.8
3
nd
154
16
6,0
4
64,5
1,8
2.8
4,9
134
3
2.4
4.6
nd
nd
2.8
4,0
nd
13.6
10
1.8
58
nd
10.6
3.4
60,5
nd
5.3
1,4
169
2.2
nd
5.4
189
105
1521
891
2469
7595
1294
271
39669
3733
1366
874
13377
356
542
837
18248
386
323
540
Tuber-specific protein
Tuber-specific protein
Tuber-specific protein
68
78
200
M
_
_
1
0
0
Tuber-specific protein
Tuber-specific protein
Tuber-specific protein
Hypothetical protein
Hypothetical protein
Tuber-specific protein
Tuber-specific protein
Tuber-specific protein
peptidyl-prolyl cis-trans isomerase
Tuber-specific protein
Hypothetical protein
Zn(2)-Cys(6) cluster domain
Tuber-specific protein
RNase3 domain
Major Facilitator Superfamily
Sugar transporter
Major Facilitator Superfamily
Tuber-specific protein
Hypothetical protein
Tuber-specific protein
63
64
217
103
494
91
77
197
168
623
480
685
74
1402
514
407
565
225
331
140
M
S
M
S
_
_
_
M
_
M
_
_
M
_
M
_
_
_
_
M
0
0
1
1
0
0
0
1
0
0
1
0
0
0
13
7
14
0
0
0
FLM
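The tissue-specificity rule stated in the caption of table S5 can be expressed as a small predicate. This is an illustrative reading of the caption, not the authors' code: a level of `None` stands for "nd" (not detected), and a transcript counts as tissue-specific when every other tissue is either undetected or at least 100-fold lower. The function name is hypothetical; the first two examples use values listed in the table, the third is made up.

```python
def tissue_specific(levels, tissue, min_fold=100.0):
    """Return True if the transcript is specific to `tissue`.

    levels: dict mapping tissue name to expression level, None = not detected.
    """
    target = levels[tissue]
    if target is None:
        return False  # not detected in the tissue of interest
    others = [value for name, value in levels.items() if name != tissue]
    # Each other tissue must be undetected or at least min_fold lower.
    return all(value is None or target >= min_fold * value for value in others)


# GSTUMT00012437001 (values from the table): ECM 19542, FB 180, FLM 2.0.
print(tissue_specific({"ECM": 19542, "FB": 180, "FLM": 2.0}, "ECM"))  # True
# GSTUMT00010076001 (values from the table): detected in ECM only.
print(tissue_specific({"ECM": 2063, "FB": None, "FLM": None}, "ECM"))  # True
# Made-up counter-example: only 25-fold higher than in fruiting body.
print(tissue_specific({"ECM": 500, "FB": 20, "FLM": None}, "ECM"))     # False
```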
Table S6. The top 10 most abundant (Tribe-MCL) protein families (excluding TE-related families) in T. melanosporum genome.

Family #  PFAM description                     Tuber  Neurospora  Botrytis  Nectria  Magnaporthe  Stagonospora  Sclerotinia  Aspergillus  Laccaria  Total
5         NB-ARC domain                        46     3           9         11       4            5             5            5            205       293
7         Protein kinase domain                34     26          30        31       30           32            25           30           48        286
15        Helicase domain                      22     23          26        23       23           23            23           22           28        213
22        AAA-ATPase family                    22     16          19        18       18           16            16           16           16        157
2         WD40, WD domain                      21     19          25        31       18           21            35           26           143       339
10        Short chain dehydrogenase            17     17          31        64       24           33            26           30           12        254
17        Methyltransferase domain             17     25          12        89       10           15            10           7            2         187
0         MFS1, Major Facilitator Superfamily  15     26          36        158      52           82            29           51           18        467
27        SNF2 family N-terminal domain        14     13          15        13       13           18            15           15           15        131
31        Ubiquitin-conjugating enzyme         14     12          13        14       13           13            13           14           12        118
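As a quick consistency check on Table S6, the Total column should equal the sum of the nine per-species counts in each row. The sketch below tests this for three rows copied from the table; the variable names are illustrative only and are not part of the original analysis.

```python
# Per-species counts (Tuber, Neurospora, Botrytis, Nectria, Magnaporthe,
# Stagonospora, Sclerotinia, Aspergillus, Laccaria) and the reported Total,
# copied from three rows of Table S6.
rows = {
    "NB-ARC domain": ([46, 3, 9, 11, 4, 5, 5, 5, 205], 293),
    "Protein kinase domain": ([34, 26, 30, 31, 30, 32, 25, 30, 48], 286),
    "MFS1, Major Facilitator Superfamily": ([15, 26, 36, 158, 52, 82, 29, 51, 18], 467),
}

# Collect the families whose reported total differs from the row sum.
mismatches = [
    name for name, (counts, total) in rows.items() if sum(counts) != total
]
print(mismatches)
```

An empty list means the reported totals agree with the per-species counts for the rows checked.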
Table S7. The protein families showing the highest rate of contraction in T. melanosporum genome.
N° members
Family
ID
TUBME
NECHA
NEUCR
MAGGR
BOTCI
SCLSC
STANO
ASPFU
PFAM accession
PFAM description
0
15
158
26
52
36
29
82
51
PF07690, PF00083
1
12
79
15
49
55
34
63
27
PF00067,
Major Facilitator Superfamily (MFS), Sugar (and other)
transporter
Cytochrome P450
2
21
31
19
18
25
35
21
26
PF04047, PF08625,
3
9
72
22
27
45
34
48
49
4
7
74
16
51
28
25
76
33
PF07690, PF00083,
PF00854
PF05730
6
0
150
29
10
28
16
56
1
8
12
51
15
16
47
35
43
44
10
17
64
17
24
31
26
33
30
12
7
46
15
24
43
23
37
27
14
0
90
12
13
17
9
71
18
3
44
8
19
35
14
19
8
58
12
16
12
20
3
14
7
38
31
21
6
46
10
17
23
0
10
0
6
Periodic tryptophan protein 2 WD repeat, WD40 associated
domain, PGAP1-like protein
MFS, Sugar (and other) transporter, POT family
CFEM domain
PF06985, PF00106,
PF00596
PF00083, PF07690
Heterokaryon incompatibility protein (HET), short chain
dehydrogenase,
Sugar (and other) transporter, MFS
NAD dependent epimerase/ dehydratase family, 3-beta
hydroxysteroid dehydrogenase/isomerase family
MFS, Sugar (and other) transporter, Beta-ketoacyl synthase
3
PF08659, PF01370,
PF01073
PF07690, PF00083,
PF02801
PF06985, PF00023
30
14
PF00135, PF02734
Carboxylesterase, DAK2 domain
12
21
22
PF00171, PF05893
Aldehyde dehydrogenase family, Acyl-CoA reductase
22
33
20
19
17
15
17
PF00698, PF08659,
PF00975, PF08242
PF00324
Acyl transferase domain, KR domain, Thioesterase domain,
Methyltransferase domain
Amino acid permease
11
4
14
4
PF00069, PF00023
Protein kinase domain, Ankyrin repeat
HET, Ankyrin repeat
The table lists 20 TRIBE-MCL families that are in contraction in the T. melanosporum lineage (CAFE analysis, P<0.001) (Fig. S11B). Annotations are based on searches of T. melanosporum protein sequences against the PFAM
database. Abbreviations: TUBME, T. melanosporum; NECHA, Nectria haematococca; NEUCR, N. crassa; MAGGR, M. grisea; BOTCI, B. cinerea; SCLSC, Sclerotinia sclerotiorum; STANO, Stagonospora nodorum; and ASPFU, A. fumigatus. Further
information is found in SOM text S5.3.
Table S8. Protein families showing the highest rate of expansion in T. melanosporum genome.
N° members
Pfam accession
Pfam description
Tetratricopeptide repeat (TPR), NB-ARC domain, PGAP1-like protein
Family ID
TUBME
NECHA
NEUCR
MAGGR
BOTCI
SCLSC
STANO
ASPFU
5
46
11
3
4
9
5
5
5
7
34
31
26
30
30
25
32
30
17
17
89
25
10
12
10
15
7
PF07721, PF00931,
PF07819
PF00069, PF00023,
PF08587, PF07714
PF08242, PF08241
22
22
18
16
18
19
16
16
16
PF01078, PF07726,
Magnesium chelatase, subunit ChlI, ATPase family (AAA)
42
13
10
10
11
11
11
10
10
PF08477, PF00071
Miro-like protein, Ras family
57
10
8
8
8
8
8
8
8
PF00118
TCP-1/cpn60 chaperonin family
72
9
6
6
7
7
6
8
6
PF00012, PF00096
Hsp70 protein, Zinc finger, C2H2 type
96
7
7
5
4
4
4
4
4
Ubiquitin family, Ribosomal protein S27a, Ribosomal L40e family
116
5
4
4
4
5
4
6
5
135
5
9
3
6
2
2
7
3
PF00240, PF01599,
PF01020
PF00160, PF00515,
PF07719
PF00082
149
8
3
4
2
2
2
3
2
PF00069, PF07714
Protein kinase domain, Protein tyrosine kinase
164
9
5
4
5
2
2
3
3
NO PFAM
-
169
10
6
2
2
0
0
0
0
PF01926
GTPase of unknown function
Protein kinase domain, Ankyrin repeat, Ubiquitin associated, domain (UBA),
Protein tyrosine kinase
Methyltransferase domain
Cyclophilin type peptidyl-prolyl cis-trans isomerase/CLD, TPR
Subtilase family
The table lists 20 TRIBE-MCL families that are in expansion in the T. melanosporum lineage (CAFE analysis, P<0.001) (Fig. S11B). Annotations are based on searches of T. melanosporum protein sequences against the PFAM
database. Abbreviations: TUBME, T. melanosporum; NECHA, Nectria haematococca; NEUCR, N. crassa; MAGGR, M. grisea; BOTCI, B. cinerea; SCLSC, Sclerotinia sclerotiorum; STANO, Stagonospora nodorum; and ASPFU,
A. fumigatus. Further information is found in SOM text S5.3.
Table S9. Gene families unique to T. melanosporum
Tuber N°
family
ID
PFAM #
PFAM
T. melanosporum gene model ID
description
4601
2
-
GSTUMT00010765001,GSTUMT00010766001
5042
2
-
5046
6
-
GSTUMT00001741001,GSTUMT00007092001
GSTUMT00005968001,GSTUMT00005970001,GSTUMT00005973001,GSTUMT00005978001,
GSTUMT00005985001,GSTUMT00005986001
5932
4
-
5936
4
PF0785
5938
4
-
GSTUMT00003636001,GSTUMT00009890001,GSTUMT00009892001,GSTUMT00012733001
6709
3
-
GSTUMT00005969001,GSTUMT00005976001,GSTUMT00005979001
6712
3
PF01185 Hydrophobin
GSTUMT00006864001,GSTUMT00012443001,GSTUMT00012444001
6713
3
PF0572
GSTUMT00007355001,GSTUMT00012494001,GSTUMT00012778001
6717
3
-
6726
3
PF0771
6728
3
PF00096 zf-C2H2
9370
2
-
9373
2
PF0598
9374
2
-
GSTUMT00000524001,GSTUMT00002591001
9375
2
-
GSTUMT00000531001,GSTUMT00012382001
9377
2
PF01636 APH
GSTUMT00000735001,GSTUMT00012643001
9384
2
PF1064
GSTUMT00001850001,GSTUMT00008931001
9385
2
-
GSTUMT00001883001,GSTUMT00002072001
9386
2
-
GSTUMT00001953001,GSTUMT00011182001
9388
2
PF00226 DnaJ
GSTUMT00001985001,GSTUMT00006727001
9389
2
PF0857
GSTUMT00002245001,GSTUMT00006509001
9390
2
-
GSTUMT00002313001,GSTUMT00005467001
9391
2
-
GSTUMT00002606001,GSTUMT00002607001
9394
2
-
9395
2
PF0407
9396
2
-
9397
2
-
9398
2
PF0950
9401
2
-
GSTUMT00003419001,GSTUMT00010002001
9402
2
-
GSTUMT00003421001,GSTUMT00010000001
9403
2
-
GSTUMT00003540001,GSTUMT00009531001
9406
2
-
GSTUMT00003946001,GSTUMT00006860001
9409
2
-
9411
2
PF0056
ZZ
GSTUMT00004719001,GSTUMT00006855001
9413
2
PF0347
MOSC_N
GSTUMT00004993001,GSTUMT00004995001
9414
2
-
GSTUMT00005128001,GSTUMT00005129001
9416
2
-
GSTUMT00005332001,GSTUMT00007336001
9417
2
-
GSTUMT00005622001,GSTUMT00012378001
9418
2
-
GSTUMT00005917001,GSTUMT00012308001
9419
2
-
GSTUMT00006042001,GSTUMT00012706001
9421
2
-
GSTUMT00006905001,GSTUMT00010024001
9422
2
-
GSTUMT00006917001,GSTUMT00007178001
9423
2
-
GSTUMT00007121001,GSTUMT00008465001
9424
2
-
GSTUMT00007558001,GSTUMT00007584001
9425
2
-
GSTUMT00007581001,GSTUMT00007582001
9431
2
-
GSTUMT00008393001,GSTUMT00012669001
9432
2
-
GSTUMT00008439001,GSTUMT00008440001
9433
2
-
GSTUMT00008916001,GSTUMT00009351001
9435
2
-
GSTUMT00010064001,GSTUMT00012770001
9439
2
-
GSTUMT00010824001,GSTUMT00012767001
9443
2
-
GSTUMT00012279001,GSTUMT00012788001
GSTUMT00000178001,GSTUMT00000180001,GSTUMT00002847001,GSTUMT00012366001
Abhydrolase_3 GSTUMT00003552001,GSTUMT00012317001,GSTUMT00012437001,GSTUMT00012479001
TPMT
GSTUMT00008303001,GSTUMT00008304001,GSTUMT00012650001
Pkinase_Tyr
GSTUMT00012343001,GSTUMT00012419001,GSTUMT00012651001
GSTUMT00012427001,GSTUMT00012671001,GSTUMT00012761001
GSTUMT00000201001,GSTUMT00012769001
MED7
Carb_bind
SAE2
GSTUMT00000345001,GSTUMT00000613001
GSTUMT00002818001,GSTUMT00009789001
YbaK
GSTUMT00002819001,GSTUMT00009788001
GSTUMT00002854001,GSTUMT00002861001
GSTUMT00002932001,GSTUMT00009503001
CDC27
GSTUMT00002952001,GSTUMT00009250001
GSTUMT00004570001,GSTUMT00005977001
9445
2
-
GSTUMT00012417001,GSTUMT00012654001
9446
2
-
GSTUMT00012421001,GSTUMT00012653001
9447
2
PF01636 APH
GSTUMT00012446001,GSTUMT00012447001
9448
2
PF00400 WD40
GSTUMT00012471001,GSTUMT00012763001
9449
2
-
GSTUMT00012517001,GSTUMT00012678001
9450
2
-
GSTUMT00012693001,GSTUMT00012694001
The table lists families that are unique to the T. melanosporum lineage (CAFE analysis, P<0.001). Annotations are based on searches of T. melanosporum
protein sequences against the PFAM database. Further information is found in SOM text S5.3.
Table S10. Genes unique to the symbiotic fungi T. melanosporum and Laccaria bicolor
N°
N°
PFAM
Tuber Laccaria ID
PFAM
description
T. melanosporum
Genoscope gene model ID
1
12
-
-
GSTUMT00012275001
2
5
-
-
GSTUMT00010765001,
GSTUMT00010766001
1
2
6
4
PF01026 TatD_DNase
-
GSTUMT00012621001
GSTUMT00001741001,
GSTUMT00007092001
1
1
1
1
1
1
1
1
1
1
5
2
1
1
1
1
1
1
1
1
PF01031
PF0218
PF0419
PF0179
PF1020
PF0384
PF0046
Dynamin_M
YDG_SRA
PQ-loop
DeoC
GSTUMT00003569001
GSTUMT00003278001
GSTUMT00000062001
GSTUMT00000719001
GSTUMT00001031001
GSTUMT00001736001
DUF2340
GSTUMT00001979001
GSTUMT00002691001
TFIID_20kDa
GSTUMT00011058001
Ribosomal_L34 GSTUMT00011976001
L. bicolor
JGI gene model ID
144541, 293566, 295948, 304863,
316954, 317895, 317896, 317897,
318946, 325570, 325617, 326160
295937, 320976, 321501, 321502,
333496
295047, 299432, 30237, 303664,
328304, 328338
313830, 317358, 325447, 326780
298267, 316255, 316256, 317888,
334853
153937, 30226
161810
294524
296435
149239
244798
305749
165152
150208
The table lists gene models that are unique to the T. melanosporum and L. bicolor lineages. Annotations are based on searches of protein sequences from
T. melanosporum and L. bicolor (http://genome.jgi-psf.org/Lacbi1/Lacbi1.home.html) against the PFAM database.
Table S11. Genes implicated in sexual reproduction in T. melanosporum genome.
Gene
Mating processes
MAT1 (matB)
MAT2 (matA)
Sc MFAL1, MFAL2 (An ppgA )
Sc MFA1, MFA2 (An ppgB)
Sc KEX1
Sc KEX2 (An KexB)
Sc STE13
Sc STE23
Sc RCE1
Sc STE24
Sc RAM1/STE16
Sc RAM2
Sc STE14/Sp mam4
Sc STE6/Sp mam1(An atrD)
Sp Ste11
Sc MCM1
Mating signalling
Sc STE2/Sp mam2 (An preB)
Sc STE3/Sp map3 (An preA)
Sc GPA1 (An fadA)
Sc STE4 (An sfaD)
Sc STE5
Sc STE18
Sc STE20
Sc STE11/Sp byr2 (An steC)
Sc STE7/Sp byr1
Sc FUS3/Sp spk1 (An mpkB)
Sc STE12 (An steA)
Sc FAR1
Sc STE50
Sc DIG1/RST1
Sc DIG2/RST2
EST
#
Function
Gene model
Gene name
MAT1-1-1 mating-type protein (alpha-box domain transcriptional activator)
MAT1-2-1 mating-type protein (HMG-box domain transcriptional activator)
pheromone precursor (Hypothetical mating factor alpha)
pheromone precursor (a-factor)
pheromone processing carboxypeptidase KexA (Kex1) putative
pheromone processing endoprotease KexB (Kex2)
pheromone maturation dipeptidyl aminopeptidase A
a-pheromone processing metallopeptidase Ste23
CAAX prenyl protease 2
CAAX prenyl protease 1
protein farnesyltransferase subunit beta
protein farnesyltransferase/geranylgeranyltransferase type-1 subunit alpha
prenyl cysteine carboxylmethyltransferase Ste14
mating factor a secretion protein STE6
STE11 like HMG-box protein
DNA-binding protein Mcm1 (MADS box family transcription factor)
**
GSTUMT00001090001
GSTUMT00004099001
no
GSTUMT00004029001
GSTUMT00008707001
GSTUMT00010539001
GSTUMT00003807001
GSTUMT00005185001
GSTUMT00011980001
GSTUMT00007324001
GSTUMT00001514001
GSTUMT00010209001
GSTUMT00003208001
GSTUMT00000587001*
GSTUMT00012198001
TmelMAT1-1-1
TmelMAT1-2-1
TmelMFAL1
1
19
TmelKexA
TmelkexB
TmelDap1
TmelSte23
TmelRCE1
TmelSTE24
TmelRAM1
TmelRAM2
TmelSTE14
TmelSTE6
TmelSte11like
TmelMcm1like
9
7
12
33
4
13
7
6
2
no
no
27
pheromone alpha-factor receptor PreB/Ste2
pheromone a factor receptor PreA/Ste3
heterotrimeric G-protein alpha subunit (type 1 G-alpha, GPA1)
heterotrimeric G-protein beta subunit
scaffold protein
heterotrimeric G-protein gamma subunit
serine/threonine-protein kinase Ste20
mitogen activated protein kinase kinase kinase Ste11
mitogen activated protein kinase kinase, Dual specificity protein kinase,
STE7-like
mitogen-activated protein kinase, Fus3
sexual development transcription factor Ste12 (Homeodomain DNA binding)
cell cycle arrest in G1/various other roles
protein kinase regulator Ste50
Transcription factor, interacts ste12 pheromone response
Transcription factor, interacts ste12 pheromone response
GSTUMT00009053001
GSTUMT00012510001
GSTUMT00011064001
GSTUMT00010108001
no
GSTUMT00012095001
GSTUMT00006969001
GSTUMT00005262001
TmelPreB
TmelPreA
TmelGPA1
TmelGPB
1
no
49
11
TmelGPG
TmelSte20
TmelMAPKKK_Ste11
26
4
7
GSTUMT00011865001
GSTUMT00000551001
GSTUMT00006450001
no
GSTUMT00010173001
no
no
TmelMAPKK_Ste7like
TmelMAPK_Fus3
Tmelste12
4
12
10
TmelSte50
7
Aspergillus nidulans
ID
AN2755.2
AN4734.2
AN5791.2
Ambiguous
AN1384.2
AN3583.2
AN2946.2
AN8044.2*
AN6528.2
Yes**
AN2002.2
AN3867.2
AN6162.2
AN2300.2
AN3667.2*
AN8676.2*
AN2520.2
AN7743.2
AN0651.2
AN0081.2
AN2742.2 *
AN2067.2*
AN2269.2
AN3422.2*
AN3719.2
AN2290.2
AN7252.2 *
-
Core meiotic genes in Budding Yeast
DSB generation
Sc MEI4
Sc MEK1/MRE4
Sc MER1
Sc MER2/REC107
Sc MER3
Sc NAM8/MRE2
Sc REC102
Sc REC103/SKI8
Sc REC104
Sc RED1
Sc SPO11/Sp rec12
Removal of Spo11 protein from DNA
Sc MRE11
Sc RAD50
Sc SAE2/COM1
Resection of ends
Sc XRS2
Strand invasion
Sc RAD51
Sc RAD52
Sc RAD54
Sc RAD55
Sc RAD57
Sc RDH54/TID1
Sc RFA1
Sc RFA2
Sc RFA3
Sc SAE3
Synapsis and synaptonemal complex formation
Sc HOP1
Sc MND1
Sc ZIP1
Sc ZIP2
Regulation of Crossover frequency
Sc MEI5
Sc MLH1
Sc MLH3
Sc MSH4
meiosis-specific protein MEI4
meiosis-specific serine/threonine-protein kinase MEK1
meiotic recombination 1 protein
meiotic recombination 2 protein
ATP-dependent DNA helicase MER3
RNA binding protein required for meiotic recombination
meiotic recombination protein REC102
superkiller protein 8
meiotic recombination protein REC104
protein RED1
required for synaptonemal complex formation
NO
GSTUMT00012509001
no
no
GSTUMT00003621001
GSTUMT00001670001
no
GSTUMT00010494001
no
no
GSTUMT00012793001
double-strand break repair protein MRE11
DNA repair protein RAD50
protein SAE2
TmelMEK1
no
TmelMER3
TmelNAM8_like
5
32
TmelSKI8
5
TmelSPO11
no
GSTUMT00004321001
GSTUMT00001634001
no
Tmelmus-23
TmelRAD50
1
11
AN0556.3
AN3619.3
no
DNA repair protein XRS2
GSTUMT00002619001
TmelRca
2
no
DNA repair protein RAD51
DNA repair and recombination protein RAD52
DNA repair and recombination protein RAD54
DNA repair protein RAD55
DNA repair protein RAD57
DNA repair and recombination protein RDH54
replication factor A protein 1
replication factor A protein 2
replication factor A protein 3
pachytene arrest protein SAE3
GSTUMT00002392001
GSTUMT00010491001
GSTUMT00000296001
no
GSTUMT00009344001
GSTUMT00003008001
GSTUMT00005528001
GSTUMT00003154001
no
no
TmelRad51
Tmelrad22
TmelRAD54
4
7
3
TmelRAD57
TmelRad54b
TmelRFA1
TmelRFA2_like
5
1
12
5
uvsC
radC
AN10677.3
AN6728.3
AN10145.3
AN0855.3
AN7423.3
AN0582.3
no
no
meiosis-specific protein HOP1
meiotic nuclear division protein 1
synaptonemal complex protein ZIP1
protein ZIP2
GSTUMT00003611001* TmelHOP1
no
no
no
meiosis protein 5
DNA mismatch repair protein MLH1
no
GSTUMT00003139001
GSTUMT00008427001
GSTUMT00003206001
GSTUMT00009562001
DNA mismatch repair protein MLH3
MutS protein homolog 4
TmelMLH1
TmelMLH1_like
TmelMLH3
TmelMSH4
4
3
1
6
12
AN4279.3
no
no
AN5514.3
AN9090.2
No
AN1387.3
no
no
AN8259.2
AN5516.3
AN1843.3
AN3062.3
no
no
AN0126.3
AN4365.3
Sc MSH5
Sc TAM1/NDJ1
Mismatch repair
Sc MLH2
Sc MSH2
Sc MSH3
Sc MSH6
Sc PMS1
Resolution of recombination intermediates
Sc MMS4/SLX2
Sc SLX1
Sc SLX3/MUS81
Sc SLX4
Sc SLX8
Sc HEX3/SLX5
Sc TOP1/MAK1/MAK17
Sc TOP2/TOR3/TRF3
Sc TOP3/EDR1
Nonhomologous end joining
Sc LIF1
Sc LIG4
Sc YKU70/HDF1/NES24
Sc YKU80/HDF2
Core Meiotic Transcriptome Conserved in S. cerevisiae and S.
pombe
Anaphase-promoting complex
Sc CDC27/ Sp nuc2
Sc APC4/ Sp cut20
Sc CDC16/ Sp cut9
Sc APC1/ Sp cut4
Sc APC5/ Sp apc5
Sc CDC23/ Sp cut23
Sc CDC26/ Sp hcn1
Sc HCT1/ Sp ste9
Sc CDC20/ Sp mfr1
Sc AMA1/ Sp slp1
Septins
Sc CDC10/ Sp spn2
Sc CDC3/ Sp spn5
Sc SPR3/ Sp spn6
MutS protein homolog 5
non-disjunction protein 1
GSTUMT00011879001
no
DNA mismatch repair protein MLH2
DNA mismatch repair protein MSH2
DNA mismatch repair protein MSH3
DNA mismatch repair protein MSH6
DNA mismatch repair protein PMS1
no
GSTUMT00006266001
GSTUMT00011039001
GSTUMT00001828001
GSTUMT00002148001
crossover junction endonuclease MMS4
structure-specific endonuclease subunit slx1
crossover junction endonuclease MUS81
structure-specific endonuclease subunit SLX4
E3 ubiquitin-protein ligase complex SLX5-SLX8 subunit SLX8
E3 ubiquitin-protein ligase complex SLX5-SLX8 subunit SLX5
DNA topoisomerase 1
DNA topoisomerase 2
DNA topoisomerase 3
no
GSTUMT00009143001
GSTUMT00006347001
no
GSTUMT00007668001
no
GSTUMT00008425001
GSTUMT00006716001
GSTUMT00005673001
ligase-interacting factor 1
DNA ligase 4
protein Ku70
protein Ku80
no
GSTUMT00007703001
GSTUMT00005220001
GSTUMT00001928001
APC component
APC component
APC component
APC component
APC component
APC component
APC component
APC regulator
APC regulator
APC regulator
septin
septin
sporulation regulated septin
TmelMSH5
no
AN8531.3
AN1564.3
AN10621.3
AN3749.3
AN1708.3
AN6316.3
TmelMSH2
TmelMSH3
TmelMSH6
TmelPMS1
1
1
10
7
TmelSLX1
TmelMUS81
14
5
TmelSLX8
7
TmelTOP1
TmelTOP2
TmelTOP3
12
5
2
AN6878.3
AN8212.3
AN3118.2
no
AN10006.3
no
AN0253.3
AN5406.3
AN4555.3
TmelLIG4
TmelKu70
TmelKu80
5
2
8
no
AN0097.3
AN7753.3
AN4552.3
GSTUMT00009492001
no
GSTUMT00003803001
GSTUMT00007432001
GSTUMT00010617001
GSTUMT00003819001
no
GSTUMT00009718001
GSTUMT00005056001
GSTUMT00000167001
MANTUM00009492001
1
TmelCdc16
Tmelapc1
Tmelapc5
MANTUM00003819001
5
5
11
5
Tmelcdh1
MANTUM00005056001
TmelAMA1
1
1
no
bimA
AN0905.2
AN8002.2
AN2772.2
AN4735.2
AN8013.2
No
AN2965.2
AN2965.2
AN0814.2
GSTUMT00010495001
GSTUMT00001755001
GSTUMT00003373001
TmelCdc10
TmelCdc3
TmelCdc12
9
34
12
AN1394.2
sepB
AN8182.2
Sc SPR28/ Sp spn7
Cell cycle regulators
Sc CDC14/ Sp clp1
Sc CDC5/ Sp plo1
Sc CLB1/ Sp cig2
Sc CLB3/ Sp cdc13
Sc CLB4/ Sp cig1
Sc CLB5
Sc CLB6
Recombination/chromosome cohesion
Sc REC114/ Sp rec7
Sc DMC1/ Sp dmc1
Sc MND1/ Sp mcp7
Sc HOP2/ Sp meu13
Sc SMC3/ Sp smc3
Sc REC8/ Sp rec8
Chromosome segregation
Sc STU1/ Sp dis1
Sc TID3/ Sp ncd10
Sc UBC11/ Sp ubc11
DNA repair
Sc RAD23/ Sp rhp23
Sc EXO1/ Sp exo1
Sc HRR25/ Sp hhp1
sporulation regulated septin
GSTUMT00000090001
TmelCdc11
8
AN4667.2
protein phosphatase
polo kinase
B-type cyclin
B-type cyclin
B-type cyclin
B-type cyclin
B-type cyclin
GSTUMT00011110001
GSTUMT00008436001
GSTUMT00007556001
GSTUMT00005720001
GSTUMT00011271001
no
no
TmelCdc14
MANTUM00008436001
TmelNimE
TmelCLB3
TmelCLB4
3
no
no
1
6
AN5057.2
AN1560.2
nimE
AN2137.2
meiotic recombination protein
DNA-binding helix-hairpin-helix protein, DNA strand exchange
recombination and meiotic nuclear division, interacts with Hop2
prevents synapsis between nonhomologous chromosomes
cohesin
cohesin complex (meiotic)
no
GSTUMT00009804001
no
no
GSTUMT00000412001
GSTUMT00012822001
TmelDMC1
14
Tmelsmc3
Tmelrec8
2
no
No
AN9092.2
AN1843.2
No
AN6364.2
No
spindle pole body component
chromosome segregation, kinetochore-associated Ndc80 complex
ubiquitin-conjugating enzyme, chromosome segregation in Sp
GSTUMT00003608001
GSTUMT00000570001
GSTUMT00001653001
MANTUM00003608001
TmelTID3_like
MANTUM00001653001
7
1
1
AN5521.2
AN4969.2
AN5495.2
DNA excision-repair, NEF2 subunit
DNA repair, recombination
casein kinase involved in DNA repair and chromosome segregation
GSTUMT00006514001
GSTUMT00009088001
GSTUMT00002552001
TmelRAD23
TmelEXO1
TmelHRR25
30
1
25
AN2304.2
AN3035.2
AN4563.2
Other genes in the conserved meiotic core program
Sc HUL4/ SPBP87.27
Sc LEE1/ Sp scp3
Sc ENA2/ Sp cta3
hect domain E3 ubiquitin-protein ligase
Zn finger transcription factor, unknown function
P-type ATPase sodium pump
GSTUMT00002325001
GSTUMT00003772001
GSTUMT00000681001
TmelHUL4
TmelScp3
TmelENA2_like
10
6
4
Sc PMC1/ SPBC839.06
Sc CHS1/ Sp chs1
Sc ISA1/ SPCC645.03C
Sc HTZ1/ Sp pht1
Sc AUT7/ SPBP8B7.24C
Sc BAG7/ SPBC557.01
Sc ROM2/ SPAC1006.06
Sc RAS2/ Sp ras1
Sc GNA1/ Sp gna1
Sc SGA1/ Sp meu17
Sc CLG1/ SPBC1D7.03
vacuolar ATPase Ca2+ pump
chitin synthase, pheromone inducible
mitochondrial matrix protein, iron metabolism
histone H2AZ variant
required for autophagic vesicle delivery to vacuole in Sc
Rho-GAP
Rho-GEF
Ras
glucosamine acetyl transferase involved in cell cycle progression
sporulation-specific glucoamylase, Sp Mei4 target gene
cyclin-like protein interacts with Sc Pho85
GSTUMT00003738001
GSTUMT00011849001
GSTUMT00009441001
GSTUMT00009338001
GSTUMT00002234001
GSTUMT00005421001
GSTUMT00000084001
GSTUMT00002293001
GSTUMT00004721001
GSTUMT00004366001
GSTUMT00003449001
TmelPmc1
TmelCHS1
TmelISA1
TmelHTZ1
TmelAUT7
TmelBag7
TmelROM2
TmelRAS2
TmelGNA1
TmelSGA1_like
TmelCLG1
15
25
8
12
122
9
9
2
no
359
29
AN0444.2
AN3447.2
AN6642.2,
AN1628.2
AN1189.2
AN4566.2
AN1974.2
AN8039.2
AN5131.2
AN7650.2
AN4719.2
AN0182.2
AN8706.2
AN8904.2
AN4984.2
Sc CYB2/ SPAB1A11.03
Sc ECM4/ SPCC1281.07C
cytochrome-c oxidoreductase
glutathione-S-transferase domain, unknown function
Sc TOS7/ SPCC1739.10
Sc ARN2/ SPCC61.01C
Sc GTT1/ SPAC688.04C
Sc RIB5/ SPCC1450.13C
Sc CHO1/ SPCC1442.12
Sc XKS1/ SPCPJ732.02C
Sc PCT1/ SPCC1827.02C
Sc ELC1/ SPBC1861.07
Sc SYF2/ SPBC3E7.13C
Sc PGM2/ SPBC32F12.10
Sc RKI1/ Sp ppi
Sc SUR4/ SPAC1B2.03C
Sc PIB1/ SPBC36B7.05C
Sc PIN3/ Sp csh3
Sc FBP1/ Sp fbp1
Sc GLG1/ SPBC4C3.08
Sc ARE2/ SPAC13G7.05
Sc GDI1/ Sp gdi1
Sc PDC1/ SPAC3H8.01
Sc OXR1/ SPAC8C9.16C
Sc KGD1/ SPBC3H7.03C
unknown function in yeasts, pH regulation in A. nidulans
ARN family of transporters for siderophore-iron chelates
ER associated glutathione S-transferase
riboflavin synthase, alpha subunit
phosphatidylserine synthase
xylulose kinase
CTP:phosphocholine cytidylyltransferase
transcription elongation factor
spliceosome component
phosphoglucomutase
ribose-5-phosphate isomerase
long chain fatty acid elongation enzyme
RING-type ubiquitin ligase, FYVE finger domain
SH3-domain protein
fructose-1,6-bisphosphatase
self-glucosylating initiator of glycogen synthesis
acyl-CoA:sterol acyltransferase
secretory pathway regulator
pyruvate decarboxylase
unknown function
mitochondrial alpha-ketoglutarate dehydrogenase complex
Sc damage response, related to mammalian membrane progesterone
receptors
Sc DAP1/ SPAC26H5.15
Other genes involved in Mating, Karyogamy and Meiosis in Budding and Fission Yeast
Sc BIM1
microtubule-binding protein
Sc BNI1/Sp fus1
formin
Sc CDC31
spindle pole body component
Sc CMK2
calmodulin-dependent protein kinase
Sc CSM1
mediates accurate chromosome segregation during Meiosis I
Sc CSM3
mediates accurate chromosome segregation during meiosis
Sc DIT1
sporulation-specific enzyme required for spore wall maturation
Sc HO
endonuclease, mating type switching
Sc IDS2
modulator of Ime2 activity
Sc IME1
master transcriptional regulator of meiosis
Sc IME2/Sp mde3
inducer of meiosis, S/T kinase
Sc IME4
mRNA N6-adenosine methyltransferase, IME1 regulation
GSTUMT00008475001
GSTUMT00010048001
GSTUMT00004090001
GSTUMT00006030001
GSTUMT00002078001
NO
GSTUMT00010331001
GSTUMT00003859001
GSTUMT00006227001
GSTUMT00008349001
GSTUMT00003942001
GSTUMT00010887001
GSTUMT00002911001
GSTUMT00001433001
GSTUMT00009363001
GSTUMT00003835001
GSTUMT00002841001
NO
GSTUMT00000649001
GSTUMT00005701001
GSTUMT00001327001
GSTUMT00006399001
GSTUMT00006321001
GSTUMT00003778001
GSTUMT00000661001
TmelCYB2
TmelGto_like(1)
TmelGto_like(2)
TmelGto_like(3)
TmelPalI
19
4
14
no
2
AN3901.2
AN5831.2
TmelGTT1
TmelRIB5
TmelCHO1
TmelXKS1
TmelPCT1
TmelELC1
TmelSYF2
TmelPgmA
TmelRki1
TmelSUR4
TmelPIB1_like
6
10
4
1
3
no
4
4
12
7
7
Tmelfbp
Tmelgyg
TmelARE2_like
TmelRab-gdi
TmelPDC1
TmelOXR1
Tmelkgd1
149
10
13
10
270
7
19
palI
AN5378.2
AN0629.2
AN4231.2
AN5661.2
AN8790.2
AN1357.2
No
AN1861.2
AN2867.2
AN2440.2
AN8117.2
AN0627.2*
AN2995.2
AN5604.2
AN4082.2
AN4208.2
AN5895.2
AN4888.2
AN3004.2
AN5571.2
GSTUMT00001452001
TmelDap1
4
AN4939.2
GSTUMT00005677001
GSTUMT00006074001
GSTUMT00001917001
GSTUMT00008759001
no
GSTUMT00004513001
no
no
no
no
GSTUMT00001158001
no
TmelEB1_like
TmelsepA
TmelCDC31
TmelCmk2_like
14
7
2
18
TmelCSM3
1
TmelIme2_like
2
AN2862.2
AN6523.2
AN5618.2*
AN2412.2
No
No
AN2705.2
No
No
No
AN6243.2
No
Sc KAR1
Sc KAR3/Sp pkl1
Sc KAR4
Sc KAR5/Sp tht1
Sc KAR9
Sc MUM2
Sc NDT80
Sc RIM4
Sc RME1
Sc SPO1
Sc SPO13
Sc SST2/Sp rgs1
Sc SPO22
Sc SUM1
Sc UME3/SSN8/Sp ume3
Sc UME6
required for karyogamy
kinesin-like motor required for karyogamy
TF required for mating and meiosis
nuclear membrane fusion during karyogamy
cytoplasmic microtubule orientation during karyogamy
essential for meiotic DNA replication
meiosis-specific transcriptional activator
RNA-binding protein, early and middle sporulation gene expression
Zn finger transcriptional repressor of IME1
meiosis-specific phospholipase B
meiosis specific protein required for Meiosis I and II
RGS protein, regulates desensitization to alpha-factor
meiosis-specific phospholipase A2
transcriptional repressor of middle sporulation-specific genes
C-type cyclin
C6 Zn finger regulator of early meiotic genes
no
GSTUMT00010256001
no
GSTUMT00012548001
GSTUMT00006847001
no
GSTUMT00006371001
GSTUMT00011359001
no
no
no
GSTUMT00010835001
no
no
GSTUMT00010769001
no
Sp pat1
Sp atf21
Sp bgs1
Sp cmk1
Sp dhc1
Sp gpa1
Sp hsk1
Sp isp5
Sp isp6
Sp lid2
Sp map1
Sp map2
Sp mat1-Mc
topoisomerase II associated protein (Pat1)
bZip TF, involved in regulation of meiosis
1,3-beta-glucan synthase subunit, Sp Mei4 target gene
calmodulin-dependent protein kinase
dynein heavy chain, homologue of Sc DYN1
GTP binding (alpha-1 subunit) involved in conjugation
homologue of Sc CDC7, Dbf4-dependent kinase
amino acid permease involved in sexual differentiation
serine protease involved in sexual differentiation
homologue of Um rum1, Sc ECM5
MADS-box domain TF, pheromone receptor activator
P-factor pheromone
mating-type M-specific polypeptide Mc, HMG-box TF
Sp mat1-Mi
Sp mat1-Pc
Sp mat1-Pi
Sp mde5
mating-type M-specific polypeptide Mi
mating-type P-specific polypeptide Pc, HMG-box TF
mating-type P-specific polypeptide Pi, homeodomain TF
Mei4-dependent protein 5
Sp mde6
Sp mde7
Sp mfm1,mfm2,mfm3
Sp mei2
Sp mei3
kelch repeat protein, Sp Mei4 target gene
RNA-binding protein involved in meiosis, Sp Mei4 target
M-factor pheromone precursor
RNA-binding protein involved in meiosis
meiosis inducing protein, inactivates Sp Ran1
GSTUMT00010739001
GSTUMT00007619001
GSTUMT00007493001
GSTUMT00007542001
GSTUMT00000794001
GSTUMT00011103001
GSTUMT00000259001
GSTUMT00001488001
GSTUMT00000037001
GSTUMT00011726001
GSTUMT00012198001
no
GSTUMT00001090001*
GSTUMT00008644001*
no
no
no
GSTUMT00006610001
GSTUMT00005820001
no
GSTUMT00000091001
no
GSTUMT00005188001
no
No
AN6340.2
No
ambiguous
No
No
AN6015.2*
No
No
No
No
flbA
No
No
AN2172.2
No
TmelKlpA_like
10
TmelKAR5_like
TmelKAR9
no
2
TmelNDT80_like
TmelRIM4
27
no
TmelRGS_FLBA
69
TmelSSN8_like
115
TmelPat1
TmelATF21
Tmel_bgs1_like
TmelCmk1
Tmeldync
TmelGPA2
Tmelhsk1
MANTUM00001488001
TmelAlp2
Tmellid2
TmelMcm1like
14
2
8
25
6
6
1
1
55
4
27
AN2751.3
AN6849.2*
AN3729.2
AN3065.2
nudA
AN3090.2
AN3450.2
AN5678.2
AN0238.2
AN8211.2
AN8676.2*
TmelMAT1-2-1
MANTUM00008644001
1
6
AN1962.2*
Tmelamy1_like
Tmelamy1
no
2
Tmelmde7_like
6
TmelMei2
1
No
No
No
No
No
AN7700.2
No
AN6494.2*
No
Sp mei4
Sp mes1
Sp meu1
Sp meu14
Sp rad22
Sp ran1/pat1
Sp rec6
Sp rec10
Sp rec11
Sp rec15
Sp rem1
Sp rep1
Sp rhp51
Sp spo6
Sp ssm4
Sp sso1
Sp ste4
Sp ste6
Sp ste7
Sp sxa2
fork head domain TF, meiotic regulator
meiosis II protein, Sp Mei4 target gene
Sp Mei4 target gene
involved in Meiosis II nuclear division, Sp Mei4 target gene
DNA repair protein
serine/threonine protein kinase, negative regulator of meiosis
meiotic recombination protein
sister chromatid cohesion
sister chromatid cohesion
meiotic recombination protein
meiotic B-type cyclin
regulator of pre-meiotic DNA replication
homologue of Sc RAD51//52, Nc mei-3
homologue of Sc DBF4, required for origin of replication firing
dynactin complex, homologue of Sc NIP100
syntaxin
SAM domain, similar to Sc STE50
GEF involved in conjugation; related to Sc CDC25
meiotic suppressor protein
serine carboxypeptidase, degrades extracellular P-factor
GSTUMT00006159001
no
GSTUMT00007435001
GSTUMT00006892001
GSTUMT00010491001
GSTUMT00001466001
no
no
no
no
no
no
GSTUMT00002392001
GSTUMT00000807001
GSTUMT00001061001
GSTUMT00002243001
GSTUMT00010173001
GSTUMT00007746001
no
GSTUMT00010123001
*homology limited to functional domain; **Found in a different strain; Sc: Saccharomyces cerevisiae; Sp: Schizosaccharomyces pombe; An: Aspergillus nidulans.
TmelSep1
no
TmelApsB
TmelLSP1
Tmelrad22
TmelRan1
1
4
7
64
TmelRad51
TmelnimO
Tmelro-3
TmelPsy1_like
TmelSte50
TmelCdc25
4
1
2
45
7
6
TmelSxa2
no
AN8858.2*
No
apsB
AN3931.2
radC
AN4935.2
No
No
No
No
AN2137.2
No
uvsC
nimO
AN6323.2
AN3416.2
AN7252.2
AN2130.2
No
AN2555.2
Table S12. Genes involved in RNA silencing and DNA methylation in T. melanosporum genome.
Gene model
Name
Putative function
(length)
EST number
FLM
FB
Yeast BRH1
Acc. No.
(length)
Filamentous BRH2
% id.
Acc. No.
(length)
%
id.
BC1G_15614
(1174)
48
NCU021784
(1050)
37
AN2717
(1204)
40
NCU08435
(1352)
32
NCU07534
(1403)
32
BC1G_10104
(1843)
39
NCU08270
(1585)
38
RNA silencing
1. RNA-dependent RNA polymerase
GSTUMT00008250001
GSTUMT00008249001
GSTUMT00000785001
GSTUMT00001133001
TmelRRPA
TmelRRPB
TmelRRPC
RNA-dependent
RNA polymerase
(1542)
BEAH3:
Neosartorya
fischeri
(1599)
RNA-dependent
RNA polymerase
(1218)
BEAH:
Penicillium
marneffei
(1207)
RNA-dependent
RNA polymerase
(1700)
BEAH:
Neurospora crassa
(1402)
8
4
3
14
2
1
-----
-----
-----
-----
-----
-----
2. Dicer
GSTUMT00011356001
GSTUMT00001152001
TmelDCL1
TmelDCL2
RNA
helicase/RNAse III,
putative Dicer-like
protein 1 (1423)
BEAH:
Neosartorya fischeri
(1538)
RNA
helicase/RNAse III,
putative Dicer-like
protein 2 (1490)
BEAH:
Aspergillus clavatus
(1389)
-----
-----
No5
-----
33
8
-----
No5
-----
NCU06766
(1422)
26
95
-----
-----
NCU04730
(1090)
40
1
3
-----
-----
No
-----
BC1G_06939
(940)
39
NCU09434
(990)
34
3. Argonaute
GSTUMT00011067001
GSTUMT00011068001
TmelAGO1
GSTUMT00010783001
TmelAGO2
GSTUMT00007541001
TmelAGO3
Argonaute-like
protein
(944)
BEAH:
Laccaria bicolor
(961)
Argonaute
(863)
BEAH:
Schizosacc.
japonicus
(842)
Argonaute
(868)
BEAH:
Schizosacc. pombe
(834)
5
2
-----
-----
4. RNA silencing accessory proteins (Ago-binding protein, H3 K9 methyltransferase and RISC helicases)
GSTUMT00004385001
GSTUMT00006817001
GSTUMT00004046001
GSTUMT00004402001
GSTUMT00003241001
TmelARB1
putative AGO
binding protein-Arb16
(457)
TmelRQH1
-----
TmelDBP2
TmelCLR4
2
ATP-dependent
DEAD/DEAH box
helicase
(1325)
ATP-dependent
DEAD/DEAH box
helicase DEAD,
helicase
(898)
ATP-dependent
DEAD/DEAH
box RNA helicase
DBP2
(547)
-----
2
25
Histone H3 K9
methylase
(355)
12
0
-----
-----
48.27
2
-----
-----
25
NP_014287.1
( 546)
No
44
NCU06316
(537)
AN2087
(1535)
30
NCU08598
(1956)
53
BC1G_02690
(654)
42
MGG_12894
(549)
83
NCU07839
(547)
AN1170
(552)
82
NCU04402
(332)
43
-----
NP_013915.1
(1447)
1
AN0157
(178)
54
72.03
48
-----
1 Yeast BRH = "best reciprocal hit" against S. cerevisiae.
2 Filamentous BRH = "best reciprocal hit" against reference Ascomycetes (N. crassa, M. grisea, B. cinerea, A. nidulans).
3 BEAH = "best absolute hit" (i.e., against the entire sequence database).
4 Functionally characterized N. crassa homolog is indicated, even if not BRH.
5 Both genes have non-BRH homologs in S. cerevisiae belonging to the DEAH family of helicases (DNA replication) and RNA helicases (splicing), respectively.
6 Argonaute siRNA chaperone (ARC) complex subunit Arb1, required for histone H3 Lys9 (H3-K9) methylation, heterochromatin assembly and siRNA generation in fission yeast.
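The "best reciprocal hit" criterion named in the Table S12 footnotes can be illustrated with a minimal Python sketch. It assumes the best hit of each protein has already been determined in both search directions; the dictionaries and identifiers below are hypothetical placeholders, not results from this study, and the function name is illustrative.

```python
def reciprocal_best_hits(best_a_to_b, best_b_to_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    pairs = []
    for a, b in best_a_to_b.items():
        # Keep the pair only if the reverse search points back to the same query.
        if best_b_to_a.get(b) == a:
            pairs.append((a, b))
    return sorted(pairs)


# Hypothetical best-hit tables (placeholder identifiers, not real search output).
tuber_to_ref = {
    "tuber_gene_1": "ref_gene_A",
    "tuber_gene_2": "ref_gene_B",
    "tuber_gene_3": "ref_gene_B",
}
ref_to_tuber = {
    "ref_gene_A": "tuber_gene_1",
    "ref_gene_B": "tuber_gene_3",
}

brh = reciprocal_best_hits(tuber_to_ref, ref_to_tuber)
print(brh)
```

In this toy input, tuber_gene_2 has a best hit in the reference set but is not a reciprocal best hit, because ref_gene_B points back to tuber_gene_3. This corresponds to the "non-BRH homolog" case mentioned in footnote 5.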
DNA methylation
5. DNA methyltransferases
GSTUMT00012206001
GSTUMT00003328001
GSTUMT00003329001
putative cytosine
DNA methyltransferase
(857)
0
cytosine (C5)-DNA
methyltransferase
(1124)
1
1
-----
TmelPP1
PP1 protein
phosphatase
catalytic subunit
(325)
5
8
NP_011059
(312)
TmelSDS22
PP1 protein
phosphatase
regulatory subunit
(359)
2
Heterochromatin-associated, chromo
domain protein HP1
(225)
13
TmelDMT1
TmelDMT2
6
-----
MGG_02795
(893)
40
NCU02034
(845)
38
-----
NCU02247
(1455)
37
91
NCU00043
(309)
99
AN10088
(356)
66
NCU08385
(384)
61
NCU04017
(267)
44
-----
6. DNA methylation accessory components
GSTUMT00009673001
GSTUMT00005072001
GSTUMT00000912001
TmelHP1
2
1
NP_012728
(338)
49
-----
-----
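The “best reciprocal hit” (BRH) criterion named in the table footnotes can be illustrated with a minimal sketch. The tables do not state which search tool or scoring scheme was used, so the function names, the (query, subject, score) record layout and the toy identifiers below are assumptions made purely for illustration; this is not the authors' pipeline.

```python
def best_hits(hits):
    """Return {query: top-scoring subject} from (query, subject, score) records.

    Ties are resolved in favour of the first record seen (an arbitrary choice).
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def best_reciprocal_hits(forward, reverse):
    """Keep pairs (a, b) where b is a's best hit AND a is b's best hit."""
    fwd = best_hits(forward)   # e.g. query proteome searched against reference
    rev = best_hits(reverse)   # reference searched back against query proteome
    return {a: b for a, b in fwd.items() if rev.get(b) == a}


# Hypothetical toy data: t* = query genes, y* = reference genes.
forward = [("t1", "yA", 250.0), ("t1", "yB", 90.0),
           ("t2", "yB", 120.0), ("t3", "yA", 80.0)]
reverse = [("yA", "t1", 245.0), ("yA", "t3", 70.0),
           ("yB", "t1", 95.0), ("yB", "t2", 118.0)]

brh = best_reciprocal_hits(forward, reverse)
print(brh)
```

In this toy case t3 has a best hit (yA) but no BRH, because yA's own best hit is t1. This is the situation the tables record as “No” in a BRH column.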
Table S13. Distribution of genes encoding core RNA silencing components and DNA methyltransferases in different Ascomycetes.
core RNA silencing components (comparative abundance/distribution)

             Tuber   Neurospora   Magnaporthe   A. nidulans   Podospora   S. pombe, S. japonicus   S. cerevisiae
RdRP         3       3            3             2             2           1                        0
Dicer        2       2            2             1             2           1                        0
Argonaute    3       2            3             1             2           1                        0

DNA methyltransferases (comparative abundance/distribution)

                                Tuber   Ascobolus   Neurospora   Magnaporthe   Aspergillus spp.   Fusarium spp.   Podospora   S. cerevisiae, S. pombe
DMT1 (specialized methylases)   1       1           1            1             1                  1               1           0
DMT2 (general methylases)       1       1           1            1             0                  1               1           0
Table S14. Homologs of genes coding for the indicated mycotoxin biosynthesis components in the T. melanosporum genome.

Gene name     Function                                  Putative T. melanosporum homolog

1. Aflatoxin biosynthesis
aflF          Dehydrogenase                             YES
aflU          P450 monooxygenase                        YES
aflT          Putative ABC transporter                  YES
aflC          Polyketide synthase                       YES
aflD          Reductase                                 NO
aflB          Fatty acid synthase β                     YES
aflA          Fatty acid synthase α                     YES
aflR          Transcriptional activator                 NO
aflS          Transcriptional enhancer                  NO
aflH          Alcohol dehydrogenase                     YES
aflJ          Esterase                                  YES
aflE          NOR-reductase                             YES
aflM          Dehydrogenase                             YES
aflN          Monooxygenase                             YES
aflG          P450 monooxygenase                        YES
aflL          Desaturase                                YES
aflI          Oxidase                                   NO
aflO          O-methyltransferase B                     YES
aflP          O-methyltransferase A                     YES
aflQ          Oxidoreductase                            YES
aflK          VERB synthase                             YES
aflV          P450 monooxygenase                        YES
aflW          Monooxygenase                             YES
aflX          Monooxygenase/oxidase                     NO
nadA          NADH oxidase                              YES
hxtA          Hexose transporter                        YES
glcA          Glucosidase                               YES
sugR          Sugar regulator                           NO

2. Fumonisin biosynthesis
NPT           Nicotinate phosphoribosyl transferase     YES
WDR1          WD protein                                YES
PNG1          Peptide N-glycanase                       YES
ZNF1          Transcription factor/kinase               YES
ZBD1          Zinc-binding dehydrogenase/reductase      NO
FUM1          Polyketide synthase                       YES
FUM6          P450 monooxygenase                        YES
FUM7          Dehydrogenase                             NO
FUM8          Aminotransferase                          YES
FUM9          Dioxygenase                               NO
FUM10         Fatty acyl-CoA synthetase                 YES
FUM11         Tricarboxylate transporter                YES
FUM12         P450 monooxygenase                        YES
FUM13         Short-chain dehydrogenase/reductase       NO
FUM14         Peptide synthetase condensation domain    NO
FUM15         P450 monooxygenase                        YES
FUM16         Fatty acyl-CoA synthetase                 YES
FUM17         Longevity assurance factor                YES
FUM18         Longevity assurance factor                YES
FUM19         ABC transporter                           YES
ORF20         Transcription factor                      NO
ORF21         Transcription factor/kinase               NO
MPU1          Mannose-P-dolichol utilization            YES

3. Gliotoxin biosynthesis
AFUA6G09580   C6 finger domain protein                  YES
AFUA6G09590   Zinc alcohol dehydrogenase                YES
AFUA6G09600   Zinc metallo peptidase                    NO
AFUA6G09610   Non-ribosomal peptide synthase            YES
AFUA6G09620   Hypothetical protein                      NO
AFUA6G09630   C6 finger domain protein                  NO
AFUA6G09640   Aminotransferase GliI                     NO
AFUA6G09650   Membrane dipeptidase                      YES
AFUA6G09660   Non-ribosomal peptide synthase            YES
AFUA6G09670   P450 oxidoreductase                       NO
AFUA6G09680   O-methyltransferase                       NO
AFUA6G09690   Glutathione S-transferase                 YES
AFUA6G09700   Gliotoxin biosynthesis protein            YES
AFUA6G09710   MFS gliotoxin efflux transporter          YES
AFUA6G09720   Methyltransferase                         YES
AFUA6G09730   P450 oxidoreductase                       YES

4. Trichothecene biosynthesis
TRI5          Trichodiene synthase                      NO
TRI4          Trichodiene oxygenase                     YES
TRI11         Isotrichodermin C-15 hydroxylase          NO
TRI101        Trichothecene 3-O-acetyltransferase       NO
TRI3          15-O-acetyltransferase                    NO
TRI8          T-2 toxin biosynthesis protein            NO
TRI13         P450 monooxygenase                        NO
TRI7          T-2 toxin biosynthesis protein            NO
Table S15. Genes involved in sulfur assimilation, metabolism and S-volatile compounds in the T. melanosporum genome.
Gene model1 | Name2 | Putative function (length)2 | EST number3 (FLM, FB) | Yeast BRH4: Acc. No. (length), % id. | Filamentous BRH4: Acc. No. (length), % id.
Sulfur assimilation
1. Sulfate internalization & reduction
GSTUMT00000861001
TmelSUL1
Sulfate permease
(826)
0
77
NP_009853.1
(859)
41.97
GSTUMT00005125001
TmelSUL2
Sulfate permease
(520)
0
0
No
-----
Sulfate permease
(719)
1
24
NP_015328.1
(754)
35.19
TmelST2
Similar to sulfate (and other anions)
transporter
(1091)
1
69
NP_011641.1
(1036)
40.88
NCU02632
(1119)
56.18
TmelMET3
ATP sulfurylase
(566)
9
168
NP_012543.1
(511)
64.02
NCU01985
(574)
78.84
GSTUMT00003745001
TmelMET14
Adenosine 5'-phosphosulfate kinase
(200)
0
10
NP_012925.1
(202)
70.35
MGG_06348
(209)
76.10
GSTUMT00002663001
TmelMET16A
PAPS reductase
(273)
53
139
NP_015493.1
(261)
54.15
GSTUMT00001561001
TmelMET16B
1
9
No
-----
No
-----
GSTUMT00001411001
TmelTRX1
34
34
NP_013144.1
(103)
43.56
AN0170
(190)
53.66
GSTUMT00008708001
TmelTRR1
Thioredoxin reductase
(344)
20
3
NP_010640.1
(319)
70.49
MGG_01284
(329)
74.59
GSTUMT00006994001
TmelMET22
3'(2'),5'-bisphosphate nucleotidase
(342)
0
33
NP_014577.1
(357)
43.92
GSTUMT00001735001
GSTUMT00000585001
GSTUMT00002747001
TmelST1
PAPS reductase
(289)
Thioredoxin
(119)
BC1G_02187 (826)
No
BC1G_03076 (668)
BC1G_12227 (309)
MGG_04311 (355)
56.99
-----
55.85
62.45
57.88
GSTUMT00000610001
TmelMET10
GSTUMT00010016001
TmelECM17
Sulfite reductase alpha subunit
(1044)
Sulfite reductase beta subunit
(1517)
11
26
9
13
NP_116686.1
(1035)
NP_116579.1
(1442)
38.68
50.26
BC1G_04925
(1098)
BC1G_08712
(1531)
54.41
AN2229
(490)
59.34
BC1G_09615
(525)
69.72
AN8277
(438)
AN8057
(371)
AN1513
(429)
82.08
68.62
2. Cys/Met biosynthesis & interconversion
GSTUMT00000943001
TmelMET2
Homoserine acetyl-transferase
(438)
1
25
NP_014122.1
(486)
50.53
GSTUMT00004828001
TmelCYSA
Serine acetyl-transferase
(459)
2
1
No
-----
GSTUMT00009485001
TmelMET17
13
29
NP_013406.1
(444)
62.79
GSTUMT00003830001
TmelCYSB
4
1
No
-----
GSTUMT00008902001
TmelCYSSYNTH2
0
0
NP_011526.1
(393)
53.89
GSTUMT00009043001
TmelSTR2
Cystathionine gamma-synthase
(542)
7
7
NP_012664.1
(639)
40.72
AN3456
(670)
55.62
GSTUMT00006346001
TmelSTR3
Cystathionine beta-lyase
(447)
1
5
NP_011331.1
(465)
48.66
AN7051
(459)
77.28
GSTUMT00006909001
TmelCYS4
Cystathionine beta-synthase
(510)
1
4
NP_011671.1
(507)
52.48
BC1G_10936
(449)
79.69
GSTUMT00010528001
TmelCYS3
Cystathionine gamma-lyase
(411)
157
333
NP_009390.1
(394)
65.45
AN1446
(420)
75.44
GSTUMT00005323001
TmelMET6
4
8
NP_011015.1
(767)
60.65
BC1G_12307
(768)
74.48
GSTUMT00005542001
TmelMHMT
0
0
-----
-----
AN5019
(439)
24.34
GSTUMT00008874001
TmelSAM1
0
17
75.20
TmelSAH1
9
32
NCU02657
(396)
MGG_05155
(450)
87.37
GSTUMT00011415001
NP_010790.1
(384)
NP_010961.1
(449)
Homocysteine synthase
(452)
Cysteine synthase
(367)
Cysteine synthase, mitochondrial
(419)
cobalamin-independent
Met synthase
(774)
similar to MHMTase,
truncated cobalamin-independent
Met synthase
(376)
S-adenosyl-methionine synthetase
(382)
Adenosyl-homo-cysteinase
(417)
Cys/Met utilization & catabolism
3a. Cysteine catabolism
Sulfinoalanine-independent Cys degradation
75.30
77.51
68.98
83.21
GSTUMT00006350001
TmelAAT2_SPT
Aspartate amino-transferase
(417)
3
10
NP_013127.2
(418)
51.57
AN6048
(446)
69.93
GSTUMT00000780001
TmelTHIOTRANS
Thiosulfate sulfur-transferase
(336)
3
11
NP_014894.1
(304)
41.39
MGG_04087
(346)
50.77
Sulfinoalanine-dependent Cys degradation
GSTUMT00004091001
TmelCDI1
Cysteine dioxygenase
(223)
1
3
-----
-----
NCU06625
(228)
51.72
GSTUMT00003243001
TmelAAT
Aspartate amino-transferase
(410)
4
5
No
-----
AN1993
(430)
72.66
NCU06112.3 (545)
similar to glutamic
acid decarboxylase
isoform 67
Contig4544
-----
Sulfinoalanine decarboxylase
2
10
No
-----
Contig3853
-----
MGG_03869.6
(515)
cysteine sulfinic
acid decarboxylase
TmelNFR1
AIF-like (NAD/FAD)
oxido-reductase
(551)
14
21
-----
-----
BC1G_09030
(548)
59.27
GSTUMT00002834001
TmelTDI1B
Taurine dioxygenase
(361)
20
0
NP_013043.1
(412)
43.03
AN6739
(377)
56.39
GSTUMT00003308001
TmelTDI1A
Taurine dioxygenase
(381)
49
1
No
-----
MGG_03117
(372)
54.05
GSTUMT00010867001
TmelTDI1C
Taurine dioxygenase
(393)
3
99
No
-----
BC1G_03189
(395)
64.91
7
6
72.14
1
NCU04636
(506)
No
81.82
0
NP_009912.2
(497)
No
GSTUMT00001651001
Taurine degradation
3b. Cysteine desulfuration
GSTUMT00004448001
TmelNFS1_1
GSTUMT00010387001
TmelNFS1_2
Cysteine desulfurase
(404)
Cysteine desulfurase
-----
-----
GSTUMT00010387001
TmelNFS1_2
GSTUMT00004634001
TmelNFS1_3
GSTUMT00003703001
No expression data
TmelNFS1_4
GSTUMT00002637001
TmelNFS1_5
GSTUMT00002211001
No expression data
GSTUMT00001222001
No expression data
TmelNFS1_6
TmelNFS1_7
GSTUMT00010391001
TmelNFS1_8
GSTUMT00004633001
TmelNFS1_9
GSTUMT00008466001
TmelNFS1_10
GSTUMT00006565001
TmelNFS1_11
GSTUMT00008685001
TmelNFS1_12
GSTUMT00010389001
TmelNFS1_13
GSTUMT00007003001
TmelNFS1_14
GSTUMT00000388001
TmelNFS1_15
GSTUMT00008467001
TmelNFS1_16
GSTUMT00003702001
TmelNFS1_17
GSTUMT00001219001
TmelNFS1_18
GSTUMT00008793001
TmelNFS1_19
Cysteine desulfurase
(374)
Cysteine desulfurase
(562)
Cysteine desulfurase
(277)
Cysteine desulfurase
(379)
Cysteine desulfurase
(276)
Cysteine desulfurase
(345)
Cysteine desulfurase
(276)
Cysteine desulfurase
(635)
Cysteine desulfurase
(398)
Cysteine desulfurase
(234)
Cysteine desulfurase
(605)
Cysteine desulfurase
(247)
Cysteine desulfurase
(247)
Cysteine desulfurase
(194)
Cysteine desulfurase
(218)
Cysteine desulfurase
(107)
Cysteine desulfurase
(590)
Cysteine desulfurase
(199)
0
1
No
-----
No
-----
0
0
No
-----
No
-----
0
0
No
-----
No
-----
0
0
No
-----
No
-----
0
0
No
-----
No
-----
0
2
No
-----
No
-----
0
0
No
-----
No
-----
0
0
No
-----
No
-----
2
0
No
-----
No
-----
0
0
No
-----
No
-----
0
0
No
-----
No
-----
0
0
No
-----
No
-----
0
0
No
-----
No
-----
0
0
No
-----
No
-----
0
0
No
-----
No
-----
0
0
No
-----
No
-----
4
0
No
-----
No
-----
0
0
No
-----
No
-----
NP_011569.1
(574)
33.27
NCU10675
(604)
43.33
Methionine uptake, modification & catabolism
4. Methionine uptake & utilization
GSTUMT00005629001
TmelMUP1
Methionine permease
(638)
7
0
GSTUMT00011592001
TmelMUPA
Methionine permease
(690)
4
2
No
-----
BC1G_05423
(573)
38.11
GSTUMT00006065001
MANTUM00006065001
Methionine (& other) permease
(576)
2
3
No
-----
AN10825
(610)
58.59
GSTUMT00009497001
MANTUM00009497001
Methionine (& other) permease
(540)
0
34
No
-----
MGG_00285
(508)
36.34
GSTUMT00007355001
TmelTMT1
similar to thiol methyl-transferase
(267)
0
2
-----
-----
AN6094
(284)
47.55
GSTUMT00012494001
TmelTMT2
0
0
-----
-----
No
No
GSTUMT00012778001
TmelTMT3
0
0
-----
-----
No
No
6
403
NP_010960.1
(184)
44.77
AN10562
(214)
66.67
0
0
NP_009897.1
(168)
51.52
MGG_02496
(148)
67.69
5
44
NP_011313.1
(500)
38.63
BC1G_07219
(524)
60.59
5
5
No
-----
AN8172
(591)
45.56
50
220
NP_013145.1
(563)
51.61
NCU02397
(519)
57.17
4
179
NP_014032.1
(348)
60.68
AN2286
(353)
70.57
12
3
No
-----
NCU09285
(347)
61.00
539
1
NP_014051.1
(360)
43.29
AN5355
(360)
54.60
similar to thiol methyl-transferase
(250)
similar to thiol methyl-transferase
(273)
Methionine sulfoxide reduction
GSTUMT00004791001
TmelMSRA
Peptide methionine sulfoxide reductase
(175)
GSTUMT00000808001
TmelMSRB
similar to protein-methionine-R-oxide reductase
(199)
5. Met degradation via 2-oxo 4-methyl-thiobutanoate-Ehrlich/“fusel” pathway
GSTUMT00008805001
TmelARO8A
Aromatic amino acid amino-transferase
(575)
GSTUMT00009292001
TmelARO8B
Aromatic amino acid amino-transferase
(570)
GSTUMT00006321001
TmelPDC1ARO10
2-oxo acid/phenylpyruvate decarboxylase
(563)
GSTUMT00006862001
TmelALCC/ADH5A
Alcohol dehydrogenase, class V
(349)
GSTUMT00006980001
TmelADH5B
hypothetical protein similar to alcohol dehydrogenase
(364)
GSTUMT00001645001
TmelADH6
Alcohol dehydrogenase, class V
(346)
GSTUMT00001074001
TmelADH4
Alcohol dehydrogenase, class IV
(492)
26.37
AN1868
(495)
69.45
NP_013117.1
(337)
43.36
AN10230
(356)
62.46
0
NP_015443.1
(411)
41.79
AN4290
(380)
57.96
0
4
NP_012558.1
(244)
55.70
AN3593
(241)
61.18
1
0
NP_010876.2
(227)
35.51
BC1G_15439
(231)
45.70
2
0
NP_013722.1
(179)
45.93
AN9527
(179)
70.59
0
0
NP_014590.1
(396)
33.48
MGG_14297
(479)
55.21
0
1
0
3
1
NP_011258.1
(465)
6. Methionine salvage pathway
GSTUMT00007013001
TmelMEU1
GSTUMT00011972001
TmelMRI1
GSTUMT00008825001
TmelMDE1
GSTUMT00002414001
TmelUTR4
GSTUMT00006393001
TmelADI1
GSTUMT00011721001
TmelSPE2
Methylthio-adenosine
phosphorylase
(402)
Methyl-thioribose 1-phosphate
isomerase
(363)
Methyl-thioribulose-1-phosphate
dehydratase
(233)
2,3-diketo-5-methyl-thiopentyl-1-P
enolase-phosphatase
(256)
Acireductone dioxygenase
(Fe2+/Ni2+)
(173)
SAM decarboxylase proenzyme
(495)
Glutathione biosynthesis, reduction/oxidation & utilization
7. GSH biosynthesis
GSTUMT00005466001
TmelGSH1
γ-glutamyl-Cysteine synthetase
(655)
11
2
NP_012434.1
(678)
41.35
MGG_07317
(696)
62.48
GSTUMT00006508001
TmelGSH2
Glutathione synthetase
(512)
5
7
NP_014593.1
(491)
39.88
BC1G_07364
(510)
51.44
8. Glutathione reduction/oxidation & utilization
GSTUMT00003950001
TmelGLR1
Glutathione reductase- NADPH
(467)
10
28
NP_015234.1
(483)
56.87
MGG_12749
(485)
68.51
GSTUMT00005402001
TmelHYR1
Glutathione peroxidase
(158)
8
19
NP_012303.1
(163)
51.57
NCU09534
(168)
49.40
GSTUMT00006625001
TmelGST2
Glutathione S-transferase
(224)
307
1
NP_014170.1
(354)
38.56
MGG_09138
(224)
44.50
GSTUMT00006029001
GSTUMT00006030001
TmelGST3B
Glutathione S-transferase
(337)
0
0
No
-----
No
-----
GSTUMT00006718001
TmelGSTO1
Glutathione S-transferase
(234)
100
53
-----
-----
AN3299
(237)
45.21
GSTUMT00010331001
TmelGTT1
Glutathione S-transferase
(303)
0
6
NP_012304.1
(234)
33.82
BC1G_15218
(250)
46.12
GSTUMT00010048001
TmelECM4
Glutathione S-transferase
(330)
3
1
NP_013002.1
(370)
47.22
AN10273
(345)
63.35
GSTUMT00003178001
TmelGSTR1
Glutathione S-transferase related
(332)
2
11
-----
-----
MGG_12319
(330)
38.95
GSTUMT00004090001
TmelGST3A
Glutathione S-transferase
(360)
5
9
No
-----
NCU04368
(347)
59.16
GSTUMT00011418001
TmelGSTR2
3
1
No
-----
AN10695
(290)
66.39
GSTUMT00001002001
TmelPCS
0
0
-----
-----
-----
-----
GSTUMT00008111001
TmelGFA1
3
0
-----
-----
AN7594
(137)
32.14
GSTUMT00006398001
TmelGFA2
0
0
-----
-----
MGG_12759 (158)
GSTUMT00005441001
TmelFDH1
2
1
NP_010113.1
(386)
69.6
AN7632
(380)
83.11
GSTUMT00004188001
TmelECM38
0
21
NP_013402.1
(660)
36.01
MGG_02134
(696)
36.31
GSTUMT00003663001
TmelGGTASE
5
3
No
-----
AN5658
(607)
53.71
GSTUMT00010539001
TmelDAP_1
3
9
NP_011893.1
(818)
42.34
BC1G_13641
(922)
62.84
GSTUMT00002049001
TmelDAP2
2
0
No
-----
AN2572
(723)
47.81
GSTUMT00003350001
TmelDAP3
0
0
-----
-----
No
-----
GSTUMT00000118001
TmelDAP4
0
1
-----
-----
AN6438
(774)
53.00
GSTUMT00000119001
GSTUMT00000120001
TmelDAP5
0
0
-----
-----
-----
-----
Glutathione S-transferase related
(481)
Phytochelatin synthase
(416)
similar to glutathione-dependent
formaldehyde-activating enzyme
(159)
similar to glutathione-dependent
formaldehyde-activating enzyme
(120)
S-(hydroxy-methyl) GSH
dehydrogenase
(373)
γ-glutamyl-trans-peptidase
(657)
γ-glutamyl-trans-peptidase
(591)
Dipeptidyl amino-peptidase
(cytoplasmic)
(907)
Dipeptidyl amino-peptidase
(secreted)
(710)
Dipeptidyl amino-peptidase
(cytoplasmic)
(654)
Dipeptidyl amino-peptidase
(secreted)
(757)
Dipeptidyl amino-peptidase
(cytoplasmic)
31.62
GSTUMT00000119001
GSTUMT00000120001
TmelDAP5
Dipeptidyl amino-peptidase
(cytoplasmic)
(546)
Dipeptidyl amino-peptidase
GSTUMT00000121001
TmelDAP6
(secreted)
(443)
Dipeptidyl amino-peptidase
GSTUMT00000122001
TmelDAP7
(cytoplasmic)
(782)
9. Ancillary biosynthetic/interconversion reactions & accessory components
Choline sulfatase
(573)
0
0
-----
-----
-----
-----
0
0
-----
-----
-----
-----
1
-----
-----
-----
-----
-----
0
0
-----
-----
AN6847
(617)
52.9
3
7
-----
-----
AN3341
(511)
61.02
0
0
-----
-----
-----
-----
4
2
-----
-----
AN8058
(381)
50.26
GSTUMT00006469001
TmelCHS1
GSTUMT00004987001
TmelCHR1
GSTUMT00000370001
TmelCHR2
GSTUMT00003848001
TmelSUOX1
GSTUMT00009048001
TmelHOM6
Homoserine dehydrogenase
(367)
1
2
NP_012673.1
(359)
49.29
MGG_11450
(371)
60.56
GSTUMT00007110001
TmelMET7
Folyl polyglutamate synthase
(521)
2
7
NP_014884.1
(548)
46.00
AN3840
(518)
48.34
GSTUMT00003931001
TmelFOL3
4
4
38.68
TmelSHM2
0
7
GSTUMT00005470001
TmelGSTHM
3
5
No
-----
NCU01337
(413)
BC1G_06851
(478)
AN10745
(601)
45.35
GSTUMT00006315001
NP_013831.1
(427)
NP_013159.1
(469)
GSTUMT00003351001
TmelMET12
1
4
NP_015302.1
(657)
45.18
BC1G_02720
(664)
58.01
GSTUMT00010817001
TmelMET13
12
66
NP_011390.2
(600)
54.19
AN5883
(628)
64.55
GSTUMT00008511001
TmelMOCOBP
0
0
-----
-----
MGG_08902
(212)
47.59
GSTUMT00001509001
TmelMOCOSULF1
0
7
-----
-----
BC1G_15280
(814)
52.11
GSTUMT00008546001
TmelMOCOSULF2
0
0
-----
-----
No
-----
putative chromate efflux transporter
(518)
putative chromate efflux transporter
(487)
Sulfite oxidase
(402)
Dihydrofolate synthetase
(415)
Serine hydroxymethyl transferase
(473)
Glycine hydroxymethyl transferase
(502)
5,10-Methylene tetrahydrofolate
reductase 1
(608)
5,10-Methylene tetrahydrofolate
reductase 2
(600)
similar to Mo cofactor biosynth
protein
(254)
putative Mo cofactor sulfurase
(780)
putative Mo cofactor sulfurase/
SeCys lyase
68.80
76.81
70.25
GSTUMT00004264001
TmelMOBP
GSTUMT00002750001
TmelLIPA
putative molybdopterin-binding
protein
(296)
putative lipoate synthase
(426)
44.36
AN0183
(310)
58.50
NP_014839.1
(414)
65.8
BC1G_01056
(372)
73.58
20
-----
-----
MGG_11778
(162)
4
12
-----
-----
AN4361
(295)
10
7
NP_012594.1 (352)
3
0
NP_010539.1
(191)
47.46
AN8663
(731)
82.14
3
33
NP_012218.1
(640)
42.17
AN6359
(679)
57.55
54
25
NP_010615.1
(194)
48.19
BC1G_01974
(168)
79.25
1
2
NP_014508.1
(121)
54.87
MGG_08844
(120)
79.83
3
0
3
1
53
NP_013903.1
(274)
10. S-metabolism-related transcription factors & regulators
Sulfur metabolite activator
control protein
(289)
Sulfur metabolite activator
control protein
(257)
Sulfur metabolite activator
control protein
(351)
Zn-finger DNA-binding regulator of
Met biosynthetic genes
(413)
S-metabolite repression
control protein (SconB);
Ub-ligase component
(717)
S-metabolism negative regulator
SconC/SkpA component of
SCF Ub-ligase
(160)
HrtA subunit of
SCF Ub-ligase
(112)
61.68
GSTUMT00008609001
TmelCYS3-1
GSTUMT00004031001
TmelCYS3-2
GSTUMT00000814001
TmelCBF1
GSTUMT00012172001
TmelMET32
GSTUMT00000398001
TmelMET30
GSTUMT00006439001
TmelSKP
GSTUMT00002417001
TmelHRT1
GSTUMT00000778001
TmelAPC11
putative Apc11 subunit of
APC/C Ub ligase
(86)
1
4
NP_010276.1
(165)
46.46
AN10394
(105)
78.48
GSTUMT00006793001
TmelCUL3
CulC subunit of
SCF Ub-ligase
(757)
2
6
NP_011517.1
(744)
25.10
AN3939
(824)
47.57
49.56
BC1G_07375 (296)
49.22
82.61
Table S16. Genes involved in non-polyketide secondary metabolism in the T. melanosporum genome.
Gene model
Name
EST number
Putative function
(length)
Yeast BRH
Acc. No.
(length aa)
FLM
FB
5
44
NP_011313.1 (500)
5
5
No
3
3
NP_011079.1 (443)
3
49
2
%id.
Filamentous BRH
Acc. No.
%id.
(length aa)
1. Fusel alcohol/acid formation (Ehrlich pathway)
GSTUMT00008805001
TmelARO8A
GSTUMT00009292001
TmelARO8B
GSTUMT00008941001
MANTUM00008941001
GSTUMT00003753001
TmelBAT1
GSTUMT00011000001
TmelTOXF
GSTUMT00000969001
TmelECA39
GSTUMT00006321001
TmelPDC1
GSTUMT00005441001
TmelFDH1
GSTUMT00006862001
TmelADH5A
GSTUMT00006980001
TmelADH5B
GSTUMT00001645001
TmelADH6
GSTUMT00001074001
TmelADH4
GSTUMT00004330001
TmelALD5
aromatic amino acid aminotransferase
(575)
aromatic amino acid aminotransferase
(579)
putative aromatic amino acid
aminotransferase
(445)
branched-chain-amino-acid
aminotransferase (mitochondrial)
(405)
putative branched-chain amino acid
aminotransferase
(367)
branched-chain amino acid
aminotransferase (mitochondrial
precursor)
(449)
pyruvate decarboxylase
(563)
S-(hydroxymethyl) glutathione
dehydrogenase
(373)
alcohol dehydrogenase 3
(349)
hypothetical protein similar to alcohol
dehydrogenase
(364)
NADP-dependent alcohol dehydrogenase
6
(359)
NAD-dependent methanol dehydrogenase
(492)
aldehyde dehydrogenase
(496)
38.63
BC1G_07219 (524)
60.59
AN8172
(591)
45.56
40.00
AN0780
(438)
34.09
No
-----
No
-----
0
No
-----
AN0385
(395)
63.10
5
106
NP_012682.1 (376)
50
220
NP_013145.1 (563)
2
1
NP_010113.1 (386)
15
168
NP_014032.1 (348)
12
3
No
539
1
NP_014051.1 (360)
0
1
NP_011258.1 (465)
6
35
NP_010996.1 (520)
-----
41.47
51.61
69.60
60.68
-----
43.29
26.37
55.65
BC1G_10603 (406)
NCU02397 (519)
AN7632
(380)
AN2286 (353)
NCU09285
(347)
BC1G_09005 (402)
BC1G_02180 (299)
BC1G_06362 (497)
54.79
57.17
83.11
70.57
61.00
55.12
70.89
73.43
GSTUMT00012138001
TmelSNQ2L
GSTUMT00007904001
TmelSNQ2L1
putative ABC-cassette multidrug
transporter
(1514)
putative ABC-cassette multidrug
transporter
(1405)
1
1
NP_010294.1 (1501)
36.73
BC1G_03332 (1562)
0
0
No
-----
BC1G_00425 (1476)
4
2
NP_015297.1 (398)
3
6
No
-----
BC1G_02275 (435)
11
87
NP_013580.1
51.50
BC1G_09652
(460)
3
5
NP_013555.1 (1045)
1
1
NP_013935.1 (443)
2
5
NP_013947.1 (451)
2
8
NP_014441.1 (396)
0
0
No
4
13
NP_015208.1 (288)
7
3
NP_012368.1 (352)
5
54
NP_015256.1 (353)
0
0
NP_009557.1 (473)
55.81
69.30
2. Isoprenoids
2a. Five-carbon isoprene unit synthesis
GSTUMT00003447001
TmelERG10
GSTUMT00006667001
TmelACAT1
GSTUMT00001922001
GSTUMT00001923001
GSTUMT00001924001
TmelERG13
GSTUMT00009050001
TmelHMGR
GSTUMT00001515001
TmelRAR1
GSTUMT00008001001
TmelERG8
GSTUMT00005252001
TmelMVD1
GSTUMT00003692001
TmelMVD2
GSTUMT00006247001
TmelIDI1
acetyl-CoA C acetyltransferase
(acetoacetyl-CoA thiolase)
(397)
acetyl-CoA C acetyltransferase
(acetoacetyl-CoA thiolase)
(424)
3-hydroxy-3-methylglutaryl-CoA synthase
(472)
3-hydroxy-3-methylglutaryl-coenzyme A
reductase
(1092)
mevalonate kinase
(442)
phosphomevalonate kinase (425)
diphosphomevalonate decarboxylase
(390)
similar to diphosphomevalonate
decarboxylase
(174)
isopentenyl-diphosphate delta-isomerase
(299)
63.41
43.92
38.50
34.49
56.82
BC1G_11051 (401)
BC1G_01518 (1153)
AN3869
(736)
BC1G_07491 (527)
BC1G_14194 (383)
72.78
66.25
63.50
67.35
58.52
50.12
68.77
-----
No
-----
58.12
AN0579
(269)
73.59
2b. Polyisoprenoid/terpenoid biosynthesis
GSTUMT00003820001
TmelFPP1
GSTUMT00011059001
TmelBTS1
GSTUMT00003780001
TmelCOQ1
farnesyl pyrophosphate synthetase
(391)
geranyl-geranyl pyrophosphate
synthetase (343)
putative isoprenyl (decahexa-prenyl) pyrophosphate synthetase
(343)
47.36
44.48
44.97
NCU01175 (348)
NCU01427 (434)
BC1G_08074 (221)
64.45
72.30
54.82
GSTUMT00002915001
TmelNUS1
GSTUMT00000645001
TmelRER2
GSTUMT00004443001
TmelERG9
GSTUMT00004857001
TmelERG1
GSTUMT00008512001
putative isoprenyl (undecaprenyl)
diphosphate synthase
(324)
putative cis-prenyl transferase
(dehydrodolichyl diphosphate synthase)
(267)
putative squalene synthetase
(472)
32.11
3
NP_010088.1 (375)
3
3
NP_009556.1
(286)
1
1
NP_012060.1 (444)
putative squalene monooxygenase
(466)
0
5
NP_011691.1 (496)
-----
putative polyisoprenoid
(β-carotene/ lignostilbene) dioxygenase
(521)
0
0
-----
-----
MGG_08016
(558)
45.05
GSTUMT00006317001
MANTUM00006317001
putative prenylcysteine oxidase
(563)
6
11
-----
-----
AN3057
(555)
49.48
GSTUMT00011429001
MANTUM00011429001
putative prenylcysteine oxidase precursor
(705)
7
5
-----
-----
-----
-----
0
0
-----
-----
MGG_10859
(1109)
51.18
2
6
-----
-----
No
-----
AN9177
(389)
47.38
45.74
52.07
41.24
BC1G_13721 (95)
54.39
4
BC1G_01412
(354)
BC1G_01273 (483)
AN11008 (484)
71.97
64.33
59.19
3. Fatty acid peroxidation products
linoleate diol synthase/
fatty acid oxygenase
(1119)
linoleate diol synthase/
fatty acid oxygenase
(1079)
GSTUMT00000322001
TmelPPO1
GSTUMT00006891001
TmelPPO2
GSTUMT00007644001
TmelOYE2
12-oxophytodienoate reductase
(378)
37
0
NP_012049.1
(400)
41.88
GSTUMT00004213001
TmelCYP2C30
linoleic acid epoxygenase
(564)
2
7
No
-----
AN7399
(544)
36.36
GSTUMT00000137001
TmelCYP617
cytochrome P450 hydroxylase
(552)
0
3
No
-----
BC1G_11822
(486)
40.73
GSTUMT00009186001
TmelCYP52
4
12
NP_010690.1
(486)
29.24
AN7131
(475)
51.11
GSTUMT00004620001
TmelCYPNF5
0
1
No
-----
No
-----
cytochrome P450 52A3
(515)
cytochrome P450 52A13
(epoxygenase activity)
(388)
GSTUMT00003498001
TmelCYPNF3
cytochrome P450
(457)
54
10
No
-----
AN8615
(451)
35.08
GSTUMT00005386001
TmelEPHX2.1
similar to epoxide hydrolase 2
(248)
0
1
-----
-----
BC1G_13574
(281)
42.22
GSTUMT00010225001
TmelEPHX2.2
putative epoxide hydrolase
(305)
7
0
No
-----
MGG_05175 (367)
GSTUMT00005692001
TmelEPHX2.3
9
1
-----
-----
MGG_07232 (341)
GSTUMT00005966001
TmelLAP2
1
3
NP_014353.1
(671)
46.78
GSTUMT00008016001
TmelCYPNF4B
0
2
No
-----
GSTUMT00005402001
TmelHYR1
8
18
NP_012303.1
(163)
51.57
GSTUMT00010330001
TmelCBR1
0
1
No
-----
NCU05223 (880)
GSTUMT00006538001
TmelAKR1
0
3
No
-----
AN7708
(284)
57.97
GSTUMT00005097001
TmelGRE3
aldose reductase
(322)
0
0
NP_011972.1
(327)
53.46
AN0423
(320)
66.56
putative aryl-alcohol dehydrogenase
(353)
13
0
No
-----
MGG_12003 (308)
0
0
NP_015237.1 (342)
49.54
AN9474
(349)
48.41
0
0
NP_009560.1 (497)
48.68
AN3829
(532)
65.18
1
8
No
AN3591
(537)
71.62
putative epoxide hydrolase
(326)
leucyl aminopeptidase /
epoxide (leukotriene-A4) hydrolase
(615)
cytochrome P450 17-alpha-hydroxylase,
17,20-lyase
(521)
glutathione peroxidase
(involved in omega-6 fatty acid 20:4(ω-6)
metabolism)
(158)
carbonyl reductase (NADPH)
(involved in omega-6 fatty acid 20:4(ω-6)
metabolism)
(273)
aldehyde reductase
(283)
BC1G_09514 (670)
BC1G_10350
(461)
NCU09534 (168)
34.79
41.23
61.15
37.59
49.4
36.3
4. Other non-polyketide secondary metabolism-related enzymes
GSTUMT00010607001
TmelAAD
GSTUMT00009316001
TmelAAD1
GSTUMT00000440001
MANTUM00000440001
GSTUMT00008742001
TmelMMSDH1
putative aryl-alcohol dehydrogenase
(345)
succinate-semialdehyde dehydrogenase
(Glu and other aa catabolism/ butanoate
metabolism)
(449)
methylmalonate-semialdehyde
dehydrogenase
(Val, Leu, Ile degradation/propanoate
metabolism)
(541)
-----
58.77
GSTUMT00006975001
TmelBCKD_E1A
GSTUMT00000127001
TmelBCKDHB
GSTUMT00007426001
TmelECH3
GSTUMT00007620001
TmelHIBCH1
GSTUMT00008499001
TmelBHBD
GSTUMT00000078001
TmelMCCA
GSTUMT00000074001
TmelMCCB
GSTUMT00008529001
TmelNAHF1
GSTUMT00002943001
TmelNAHG1
GSTUMT00007846001
TmelNAHG2
putative 2-oxoisovalerate dehydrogenase
alpha
(Val, Leu, Ile degradation/ methyloxopentanoate metabolism)
(450)
putative 2-oxoisovalerate dehydrogenase
beta
(Val, Leu, Ile degradation/ methyloxopentanoate metabolism)
(394)
similar to methylglutaconyl-CoA hydratase
(Val, Leu, Ile degradation/methylbutanoate/oxopentanoate metabolism)
(290)
similar to 3-hydroxyisobutyryl-CoA
hydrolase
(Val, Leu, Ile degradation)
(474)
similar to 3-hydroxybutyryl-CoA
dehydrogenase
(butanoate metabolism)
(311)
3-methylcrotonyl-CoA carboxylase (alpha
subunit)
(Val, Leu, Ile degradation,
methylbutanoate metabolism)
(735)
3-methylcrotonyl-CoA carboxylase (beta
subunit)
(Val, Leu, Ile degradation,
methylbutanoate metabolism)
(566)
phenolic aldehyde
(salicylaldehyde/vanillin) dehydrogenase
(447)
phenolic acid
(salicylate)
hydroxylase
(399)
protein similar to monooxygenase
(putative salicylate hydroxylase)
(444)
1
11
No
-----
AN1726
(465)
64.32
0
6
No
-----
BC1G_14088
(305)
57.70
0
6
No
-----
BC1G_10324 (326)
2
18
NP_010321.1
(500)
35.91
4
44
-----
-----
2
81
NP_009767.1
(1835)
41.9
6
18
-----
-----
BC1G_08864 (567)
1
0
No
-----
AN4050
(483)
1
0
-----
-----
BC1G_05013 (688)
1
0
-----
-----
BC1G_03120
(479)
BC1G_06802 (503)
AN7008
(320)
BC1G_08870 (346)
58.01
53.19
70.43
72.78
72.44
46.91
50.42
59.66
GSTUMT00003194001
TmelNAHG3
protein similar to monooxygenase
(putative salicylate/phenolic acid
hydroxylase)
(438)
GSTUMT00012239001
TmelNAHG4
putative phenolic acid hydroxylase
(473)
1
0
-----
-----
MGG_06552
(431)
51.75
GSTUMT00012487001
TmelIRL
similar to isoflavone reductase
(272)
0
0
-----
-----
-----
-----
GSTUMT00004447001
TmelFRL1
putative flavonol reductase
(341)
0
1
NP_010830.1
(344)
38.53
AN5977
(335)
58.82
GSTUMT00007431001
TmelFRL2
putative flavonol reductase
(129)
0
0
No
-----
No
-----
0
0
-----
-----
AN7684
(507)
40.35
Table S17. Genes involved in light perception and potential photoresponses in the T. melanosporum genome.
Gene model
Putative function
(length)
Name
EST
number
FLM
Yeast BRH
FB
Acc.No.
(length)
%
id.
Filamentous
BRH
Acc. No.
%
(length)
id.
1. Photoreceptors and light-dependent regulators
GSTUMT00007548001
GSTUMT00001635001
GSTUMT00005055001
TmelWC-1
TmelWC-2
TmelLAEA
putative blue light
photoreceptor
(874)
Blue light regulator 2
(453)
Methyl transferase master
regulator of secondary
metabolism
(314)
GSTUMT00007375001
TmelVEA
Regulator of sexual
development
(696)
GSTUMT00009560001
TmelVEB
VeA–like protein
(366)
GSTUMT00012049001
TmelVOSA
GSTUMT00007042001
TmelPHY-1
GSTUMT00000117001
Regulator of secondary
metabolism
(490)
Sensory transduction
histidine kinase
(1008)
0
8
4
13
1
0
3
-----
NP_013856
(560)
-----
-----
----
50
----
----
54
NCU02356
(1131)
BC1G_01840
(510)
50
NCU00902
(532)
46
AN0807
(361)
51
47
AN1052
(538)
58
NCU01731
(554)
42
4
1
-----
----
MGG_01620
(455)
69
4
1
-----
----
MGG_00617
(473)
41
1
0
-----
----
NCU04834
(1536)
57
BC1G_13906
(339)
39
NCU01735
(293)
38
NCU01427
(433)
72
BC1G_08074
(221)
55
NCU02305
(449)
54
Bacteriorhodopsin-like
protein
(315)
0
TmelBTS1
Geranyl geranyl
pyrophosphate synthetase
(343)
5
TmelHPS
Probable hexaprenyl
pyrophosphate synthetase,
mitochondrial
(343)
TmelORP-1
3
BC1G_13505
(1160)
26
NP_009610
(344)
27
NP_015256
(355)
44
2. Accessory components & modulators
GSTUMT00011059001
GSTUMT00003780001
GSTUMT00005262001
TmelMAPKKK_Ste11
GSTUMT00005199001
TmelCAMK
GSTUMT00008887001
TmelCKA
Serine/threonine protein
kinase
(881)
Ca2+/calmodulin-dependent
protein kinase
(606)
Casein kinase II, alpha
subunit
(335)
0
54
0
NP_009557
(473)
45
2
5
NP_013466
(717)
56
MGG_12855
(660)
42
1
2
NP_011336
(560)
34
BC1G_06577
(704)
51
7
3
NP_014704
(339)
63
BC1G_01857
(336)
92
Table S18. Genes coding for transduction pathway proteins in T. melanosporum and their closest homologs in yeast and filamentous ascomycetes.
Gene Model # | Gene name | Putative function (length aa) | EST number (Mycelium, Fruiting body) | Yeast BRH: Accession # (length aa), % id | Filamentous BRH: Accession # (length aa), % id
29
BC1G_13906
(339)
AN7743
(427)
AN2520
(431)
AN4932
(319)
BC1G_11854
(500)
AN6680
(490)
AN8262
(405)
MGG_00258
(599)
BC1G_03450
(652)
AN5720
(319)
MGG_04698
(418)
AN10166
(425)
AN8548
(371)
AN0857
(393)
AN2249
(440)
NCU09823
(502)
MGG_05214
(372)
AN0751
39
G-protein coupled receptors (GPCR)
GSTUMT00000117001
Tmel BacRhodopsin
GPCR similar to bacterial Rhodopsin (315)
---
26
GSTUMT00012510001
Tmel Ste3p
Pheromone receptor Ste3p (374)
---
----
GSTUMT00009053001
Tmel Ste2p
Pheromone receptor Ste2p (367)
---
1
GSTUMT00001471001
Tmel mPR 1
GPCR similar to mPR receptor (306)
---
1
GSTUMT00012001001
Tmel mPR 2
GPCR similar to mPR receptor (543)
3
---
GSTUMT00008690001
Tmel animalGPCR like
GPCR similar to animal GPCR (590)
11
1
NP_009610
(344)
NP_012743
(470)
NP_116627
(431)
NP_014641
(317)
NP_013123
(543)
---
GSTUMT00010085001
Tmel cAMP_R 1
cAMP receptor (385)
---
---
---
---
GSTUMT00012270001
Tmel cAMP_R 2
cAMP receptor (404)
---
---
---
---
GSTUMT00009991001
Tmel Gpr1 like
Glucose sensor (596)
2
1
30
GSTUMT00003917001
Tmel PQloop 1
3
3
GSTUMT00011683001
Tmel PQloop 2
7
2
GSTUMT00007380001
Tmel PQloop 3
3
6
GSTUMT00000156001
Tmel Pth11 related 1
13
2
GSTUMT00004902001
Tmel_Pth11 related 2
GPCR of unknown function with PQ loop
domain (320)
GPCR of unknown function with PQ loop
domain (307)
GPCR of unknown function with PQ loop
domain (347)
GPCR related to Magnaporthe grisea Pth11p
(397)
GPCR related to M. grisea Pth11p (374)
NP_010249
(961)
NP_014549
(308)
NP_014549
(308)
NP_010639
(317)
----
---
---
---
---
GSTUMT00002940001
Tmel_Pth11 related 3
GPCR related to M. grisea Pth11p (369)
3
2
---
---
GSTUMT00012481001
Tmel_Pth11 related 4
GPCR related to M. grisea Pth11p (388)
---
---
---
---
GSTUMT00004192001
Tmel_Pth11 related 5
GPCR related to M. grisea Pth11p (383)
1
---
---
---
GSTUMT00010450001
Tmel_Pth11 related 6
GPCR related to M. grisea Pth11p (341)
1
---
---
---
27
41
39
---
31
40
28
---
30
32
56
53
39
44
44
39
42
50
38
38
42
29
24
39
33
GSTUMT00012382001
Tmel_Pth11 related 7
GPCR related to M. grisea Pth11p (266)
---
---
---
---
GSTUMT00000531001
Tmel_Pth11 related 8
GPCR related to M. grisea Pth11p (337)
2
---
---
---
GSTUMT00012745001
Tmel_Pth11 related 9
GPCR related to M. grisea Pth11p (213)
---
---
---
---
GSTUMT00012376001
Tmel_Pth11 related 10
GPCR related to M. grisea Pth11p (284)
---
---
---
---
GSTUMT00012373001
Tmel_Pth11 related 11
GPCR related to M. grisea Pth11p (354)
---
---
---
---
GSTUMT00012424001
Tmel_Pth11 related 12
GPCR related to M. grisea Pth11p (387)
---
---
---
---
GSTUMT00010364001
Tmel_Pth11 related 13
GPCR related to M. grisea Pth11p (282)
1
---
---
---
GSTUMT00009022001
Tmel_Pth11 related 14
GPCR related to M. grisea Pth11p (250)
---
---
---
---
GSTUMT00003188001
Tmel_Pth11 related 15
GPCR related to M. grisea Pth11p (336)
---
4
---
---
GSTUMT00003592001
Tmel_Pth11 related 16
GPCR related to M. grisea Pth11p (369)
6
---
---
---
GSTUMT00003181001
Tmel_Pth11 related 17
GPCR related to M. grisea Pth11p (408)
---
3
---
---
GSTUMT00004274001
Tmel_Pth11 related 18
GPCR related to M. grisea Pth11p (308)
10
12
---
---
GSTUMT00004223001
Tmel_Pth11 related 19
GPCR related to M. grisea Pth11p (481)
---
2
---
---
GSTUMT00009616001
Tmel_Pth11 related 20
GPCR related to M. grisea Pth11p (253)
---
18
---
---
GSTUMT00012518001
Tmel_Pth11 related 21
GPCR related to M. grisea Pth11p (219)
---
---
---
---
GSTUMT00012596001
Tmel_Pth11 related 22
GPCR related to M. grisea Pth11p (323)
---
---
---
---
Adenylate cyclase (AC) and associated protein (CAP)
GSTUMT00006808001
Tmel AC
Adenylate cyclase (2015)
8
8
41
GSTUMT00007790001
Tmel CAP
AC associated protein (520)
2
2
NP_012529
(2026)
NP_014261
(526)
Phosphodiesterases (PDE)
GSTUMT00009828001
Tmel PDEase I
High affinity cAMP phosphodiesterase (794)
5
---
NP_015005
(526)
GSTUMT00003009001
Tmel PDEase II
Low affinity cAMP phosphodiesterase (684)
---
4
NP_011266
(545)
AN10886
(372)
AN10886
(372)
BC1G_12534
(335)
AN1930
(361)
MGG_09070
(372)
NCU02903
(483)
AN9387
(313)
BC1G_09796
(309)
NCU00700
(438)
NCU06531
(541)
BC1G_10169
(363)
NCU00700
(438)
AN9387
(313)
BC1G_01571
(243)
BC1G_12534
(335)
BC1G_09188
(371)
27
27
32
34
30
31
36
38
29
26
26
27
52
39
36
26
BC1G_02865
(1740)
AN0999
(530)
59
28
BC1G_08417
(946)
43
28
NCU00237
49
38
46
(369)
Ras (monomeric G-protein)
GSTUMT00004380001
Tmel Ras1
Ras small G-protein Ras1p like (382)
4
3
GSTUMT00002293001
Tmel Ras2
Ras small G-protein, Ras2p like (224)
1
1
GSTUMT00000073001
Tmel Rsr1 (ras like)
Ras-like protein Rsr1 (203)
31
3
GSTUMT00007546001
Tmel Ras like
Ras-like protein (232)
---
1
Heterotrimeric G-proteins
GSTUMT00011064001
Tmel Gpa1
30
19
GSTUMT00011103001
Tmel Gpa2
5
1
GSTUMT00005851001
Tmel Gpa3
9
5
GSTUMT00010108001
Tmel Gpb
6
5
GSTUMT00012095001
Tmel Gpg
Gpa1, alpha subunit of heterotrimeric G-protein (353)
Gpa2, alpha subunit of heterotrimeric G-protein (351)
Gpa3, alpha subunit of heterotrimeric G-protein (352)
Gpb, beta subunit of heterotrimeric G-protein (347)
Gpg, gamma subunit of heterotrimeric G-protein (88)
19
7
31
38
30
76
Regulator of G-protein signaling (RGS)
GSTUMT00010835001
Tmel Rgs FlbA
GSTUMT00004040001
Tmel RgsA
Regulator of G-protein, GTPase activating protein (508)
Regulator of G-protein (376)
GSTUMT00008293001
Tmel RgsB
Regulator of G-protein (350)
6
4
GSTUMT00002373001
Tmel RgsC
Regulator of G-protein (1230)
---
2
cAMP-dependent protein kinase catalytic subunit (356)
cAMP-dependent protein kinase catalytic subunit (430)
Regulatory subunit of the cyclic AMP-dependent protein kinase (514)
Protein kinase similar to PKA (940)
2
4
3
2
4
6
2
5
Mitogen activated protein kinase similar to
Fus3/Kss1 (351)
Mitogen activated protein kinase similar to
6
6
5
2
cAMP-dependent protein kinase (PKA)
GSTUMT00000447001
Tmel PkaA
GSTUMT00000085001
Tmel PkaB
GSTUMT00007366001
Tmel PkaR
GSTUMT00004530001
Tmel Pka like
Mitogen Activated protein kinase (MAPK)
GSTUMT00000551001
Tmel MAPK Fus3 like
GSTUMT00001034001
Tmel MAPK MpkA like
(687)
NP_014301
(322)
NP_014301
(322)
NP_011668
(272)
NP_014744
(309)
72
NP_010937
(449)
NP_010937
(449)
NP_010937
(449)
NP_014855
(423)
NP_012619
(110)
43
NP_013557
(698)
---
63
NP_014945
(435)
NP_013603
(1127)
27
NP_015121
(380)
NP_012371
(397)
NP_012231
(416)
NP_012075
(824)
72
NP_011554
(368)
NP_011895
58
58
60
38
44
54
41
51
---
31
46
53
69
69
AN0182
(213)
BC1G_04437
(233)
AN4685
(203)
AN3434
(198)
92
MGG_00365
(354)
BC1G_08985
(354)
AN1016
(357)
AN0081
(353)
AN2742
(96)
92
AN5893
(720)
MGG_03146
(581)
MGG_03726
(365)
NCU03937
(1261)
21
MGG_06368
(532)
AN4717
(397)
AN4987
(413)
BC1G_14873
(978)
84
AN3719
(355)
AN5666
93
82
64
61
62
78
87
72
51
62
47
56
68
70
80
GSTUMT00005852001
Tmel MAPK Hog1 like
MpkA/Slt2 (427)
Mitogen activated protein kinase similar to
Hog1 (332)
60
17
MAPK kinase (MAPKK)
GSTUMT00011865001
Tmel MAPKK Ste7 like
MAPK kinase similar to Ste7 (573)
3
1
GSTUMT00003004001
Tmel MAPKK Pbs2 like
MAPK kinase similar to Pbs2 (346)
7
2
GSTUMT00005681001
Tmel MAPKK Mkk2 like
MAPK kinase similar to Mkk1/Mkk2 (438)
11
8
MAPKK kinase (MAPKKK)
GSTUMT00002374001
Tmel MAPKKK Ssk like
MAPKK Kinase similar to Ssk2 (1356)
6
6
GSTUMT00005262001
Tmel MAPKKK Ste11 like
MAPKK Kinase similar to Ste11p (881)
2
5
GSTUMT00006322001
Tmel MAPKKK Bck1 like
MAPKK Kinase similar to Bck1 (1927)
3
3
(484)
NP_013214
(435)
74
NP_010122
(515)
NP_012407
(668)
NP_015185
(506)
45
NP_014428
(1579)
NP_013466
(717)
NP_012440
(1478)
43
57
56
56
43
(331)
NCU07024
(359)
83
AN3422
(540)
BC1G_07633
(605)
BC1G_11713
(495)
61
AN10153
(1314)
MGG_12855
(660)
AN4887
(1559)
55
70
59
49
63
Table S19. Genes coding for heat shock- and stress-related proteins in the T. melanosporum genome.
Gene name
Putative function
(length)
GSTUMT00008363001
CLP FAMILY
Tmelbag102
BAG family molecular chaperone regulator 1B
(266)
-
5
-
-
NCU01220 (358)
38.46
GSTUMT00004219001*
HSP90 FAMILY-associated
Tmelhsp98
Heat shock protein HSP98
(931)
51
15
NP_013074.1
(908)
50.94
NCU00104 (811)
71.40
GSTUMT00012112001**
Tmelhsp90
2
-
no file BRH
Tmelhsp90
718
26
72.62
no file BRH
BC1G_07315
(702)
no file BRH
GSTUMT00012114001**
no file BRH
NP_013911.1
(705)
GSTUMT00012113001**
Tmelhsp90
-
-
no file BRH
no file BRH
no file BRH
12
2
40.34
MGG_00814 (329)
59.02
46
8
37.93
11
4
6
29.13
AN6921 (211)
BC1G_01523
(588)
BC1G_10364
(497)
56.00
72
1
1
no file BRH
NP_010500.1
(350)
NP_012805.1
(216)
NP_014670.1
(589)
NP_010452.1
(506)
NP_009713.1
(385)
38.98
NCU06340 (472)
41.86
Gene model
EST number
Fruiting
Mycelium
body
Yeast BRH
Acc. No. (length
aa)
% id
Filamentous BRH
Acc. No. (length
aa)
% id
BAG FAMILY
GSTUMT00012030001
MANTUM00012030001
GSTUMT00005825001
Tmelwos2
GSTUMT00000199001
Tmelsti1
GSTUMT00010261001
TmelCDC37
GSTUMT00008279001
FES FAMILY
TmelCNS1
Heat shock protein 90, putative
(701)
Heat shock protein 90, putative
(701)
Heat shock protein 90, putative
(701)
Uncharacterized protein C1711.08
(324)
Protein wos2
(232)
Heat shock protein sti1 homolog
(570)
Hsp90 co-chaperone Cdc37
(515)
Cyclophilin seven suppressor 1
(394)
GSTUMT00006161001
PAM16 FAMILY
TmelFES1
Hsp70 nucleotide exchange factor FES1
(213)
4
6
NP_009659.1
(290)
48.5
BC1G_08200
(218)
49.77
GSTUMT0001150200*
GRPE FAMILY
Tmelpam16
Mitochondrial import inner membrane translocase subunit
tim16
(136)
-
3
NP_012431.1
(149)
39.39
BC1G_00158
(161)
62.50
GSTUMT00002348001
HSP20 FAMILY
Tmelmge-1
GrpE protein homolog, mitochondrial precursor
(255)
1
1
NP_014875.1
(228)
50.29
NCU01516 (239)
57.78
57.81
75.37
63.41
53.33
GSTUMT00011894001
Tmelhsp30_3
GSTUMT00008756001
Tmelhsp30_2
GSTUMT00000685001
HSP9/12 FAMILY
Tmelhsp30_1
GSTUMT00000236001
Tmelhsp9_1
GSTUMT00004694001
DNAJ FAMILY
Tmelhsp9_2
GSTUMT00001300001
GSTUMT00004488001
GSTUMT00005543001
GSTUMT00006401001
GSTUMT00009811001
GSTUMT00010039001
GSTUMT00011043001
GSTUMT00003606001
GSTUMT00007480001
GSTUMT00010431001
GSTUMT00004954001
GSTUMT00005686001
GSTUMT00005206001
GSTUMT00007528001
30 kDa heat shock protein
(161)
30 kDa heat shock protein
(149)
30 kDa heat shock protein
(207)
Heat shock protein hsp9
(87)
Heat shock protein hsp9
(100)
DnaJ protein homolog xdj1
(433)
Uncharacterized J domain-containing protein C1778.01c
MANTUM00004488001
(442)
DnaJ-related protein SCJ1 precursor
TmelSCJ1
(404)
DnaJ homolog 1, mitochondrial precursor
Tmelmdj1
(489)
Chaperone protein dnaJ
TmeldnaJ
(343)
J domain-containing protein spf31
Tmelspf31
(212)
Uncharacterized J domain-containing protein C2E1P5.03
precursor
(383)
MANTUM00011043001
Uncharacterized J domain-containing protein C1071.09c
MANTUM00003606001
(367)
DnaJ homolog subfamily C member 3
Tmeldnajc3
(521)
Mitochondrial import inner membrane translocase subunit
tim14
(99)
Tmelpam18
DnaJ homolog subfamily B member 12
TmelDnajb12
(349)
DnaJ homolog subfamily C member 7
TmelDNAJC7
(553)
Protein psi1
Tmelpsi1
(373)
Translocation protein sec63
Tmelsec63
(684)
Tmelxdj1
476
161
No hit
No hit
AN3555 (181)
43.89
-
1
-
-
no
no
7
-
-
-
no
no
1
1
No hit
2
-
No hit
NP_116640.1
(109)
31.63
no
no
2
6
no
BC1G_09575
(429)
54.46
7
4
2
-
6
-
no
NP_011801.1
(433)
NP_013941.2
(377)
NP_116638.1
(511)
4
5
no
6
4
1
6
2
-
-
1
-
-
1
1
20
11
20
12
8
9
(90)
52.81
63.76
no
AN7143 (448)
BC1G_09129
(417)
BC1G_11653
(460)
BC1G_10020
(358)
-
-
AN0590 (211)
69.15
NP_116699.1
(295)
19.82
BC1G_00200
(394)
43.40
no
AN6233 (300)
48.80
52.17
AN3463 (520)
50.22
53.01
BC1G_10252
(107)
70.00
50.67
AN4441 (340)
67.59
no
NCU00170 (785)
BC1G_06525
(381)
BC1G_03757
(697)
56.14
no
NP_012462.1
(645)
NP_013108.1
(168)
NP_013884.1
(224)
no
NP_014391.1
(352)
NP_014897.1
(663)
46.60
AN3725
43.86
36.52
38.68
27.30
65.81
52.61
42.54
49.22
53.96
CPN60-TCP1 FAMILY
GSTUMT00005900001
Tmelcct1
GSTUMT00009564001*
Tmelcct2
GSTUMT00007765001
Tmelcct3
GSTUMT00011767001
Tmelcct4
GSTUMT00007689001
Tmelcct5
GSTUMT00008182001
Tmelcct6
GSTUMT00000052001
Tmelcct7
GSTUMT00011157001
Tmelcct8
GSTUMT00009341001
HSP70 FAMILY
TmelHSP60
GSTUMT00000289001**
Tmelhsp70
GSTUMT00000288001**
Tmelhsp70
GSTUMT00000866001
TmelHSPa12b
GSTUMT00001960001
TmelHSPa12a_1
GSTUMT00004279001
TmelHSPa12a_2
GSTUMT00009214001*
TmelHSPa12a_3
GSTUMT00001987001
TmelHSPa12a_4
GSTUMT00010876001
TmelHSPa12a_5
GSTUMT00001748001
TmelHSP88
GSTUMT00008839001*
TmelSSB
GSTUMT00010477001*
TmelbipA
T-complex protein 1 subunit alpha
(506)
Probable T-complex protein 1 subunit beta
(567)
T-complex protein 1 subunit gamma
(538)
T-complex protein 1 subunit delta
(531)
T-complex protein 1 subunit epsilon
(550)
T-complex protein 1 subunit zeta
(641)
Probable T-complex protein 1 subunit eta
(555)
Probable T-complex protein 1 subunit theta
(547)
Heat shock protein 60, mitochondrial precursor
(592)
Heat shock 70 kDa protein
(648)
Heat shock 70 kDa protein
(648)
Heat shock 70 kDa protein 12B
(587)
Heat shock 70 kDa protein 12A
(649)
Heat shock 70 kDa protein 12A
(642)
Heat shock 70 kDa protein 12A
(637)
Heat shock 70 kDa protein 12A
(587)
Heat shock 70 kDa protein 12A
(587)
Heat shock protein Hsp88
(692)
Heat shock protein SSB
(613)
78 kDa glucose-regulated protein homolog precursor
(666)
34
11
8
16
-
6
8
5
2
12
1
28
5
10
2
2
56
15
3
-
532
NP_010498.1
(559)
NP_012124.1
(527)
NP_012520.1
(534)
NP_010138.1
(528)
NP_012598.1
(562)
NP_010474.1
(546)
NP_012424.1
(550)
NP_011493.1
(718)
NP_013360.1
(572)
73.22
BC1G_07480
(567)
BC1G_11867
(539)
71.92
NCU01843 (541)
84.48
69.77
NCU02839 (534)
79.42
68.07
81.75
67.46
MGG_11889 (552)
BC1G_07745
(541)
BC1G_14004
(545)
BC1G_02512
(754)
71.76
NCU01589 (491)
84.75
75.69
61.55
69.57
81.32
81.37
86.38
79.55
76.13
no
no
no
35
no
NP_013076.1
(639)
81.59
NCU09602 (647)
88.78
4
-
-
-
AN2646 (688)
32.39
4
5
-
-
no
2
-
no
no
no
BC1G_12181
(741)
93.80
2
1
no
no
MGG_07156 (613)
58.79
1
-
no
no
AN0587 (586)
35.54
1
-
no
-
69.94
6
10
76.14
no
BC1G_09769
(712)
BC1G_10846
(425)
no
18
35
36
no
NP_009728.1
(693)
NP_014190.1
(613)
NP_012500.1
(682)
70.24
NCU03982 (662)
82.18
50.29
77.67
3
6
TmelSSC1
Ribosome-associated complex subunit SSZ1
(542)
Heat shock protein SSC1, mitochondrial precursor
(679)
24
6
NP_011931.2
(538)
NP_012579.1
(654)
GSTUMT00002165001*
CLPA/B FAMILY
TmelHSP10
10 kDa heat shock protein, mitochondrial
(104)
12
5
GSTUMT00009533001*
USP FAMILY
TmelHSP78
Heat shock protein 78, mitochondrial precursor
(793)
8
GSTUMT00002779001
MANTUM00002779001
GSTUMT00004495001
CYCLOPHILIN FAMILY
MANTUM00004495001
GSTUMT00002739001
TmelSSZ1
GSTUMT00011349001*
CPN10 FAMILY
GSTUMT00005334001
GSTUMT00008642001
GSTUMT00008657001
Uncharacterized protein C167.05
(731)
Universal stress protein A family protein C25B2.10
(357)
Peptidyl-prolyl cis-trans isomerase-like 1
(162)
peptidyl prolyl cis-trans isomerase Cyclophilin, putative
MANTUM00008642001
(152)
peptidyl-prolyl cis-trans isomerase, mitochondrial precursor
MANTUM00008657001
(168)
Tmelcyp1
GSTUMT00008859001
Tmelcyp3
GSTUMT00007865001
Tmelcyp15
GSTUMT00010900001
Tmelcyp4
GSTUMT00010103001*
FKBP FAMILY
Tmelcyp8
GSTUMT00010499001
Tmelfpr1A
GSTUMT00000766001
NO FAMILY
Tmelfkbp22
GSTUMT00009281001
TmelAGP2
GSTUMT00004833001
GSTUMT00007543001*
TmelSSU81
TmelYAR1
Peptidyl-prolyl cis-trans isomerase H (174)
Peptidyl-prolyl cis-trans isomerase cyp15
(645)
Peptidyl-prolyl cis-trans isomerase-like 3
(167)
Peptidyl-prolyl cis-trans isomerase-like 2
(567)
FK506-binding protein 1A
(108)
FK506-binding protein 2
(139)
General amino acid permease AGP2
(555)
Protein SSU81
(318)
Ankyrin repeat-containing protein YAR1
43.89
AN4616 (564)
58.56
63.29
AN6010 (667)
68.89
NP_014663.1
(106)
63.11
MGG_13221 (106)
78.85
1
NP_010544.1
(811)
59.14
12
3
-
-
MGG_09507 (719)
49.73
3
2
-
-
AN6061 (440)
70.30
1
1
no
no
74.48
2
3
no
no
29
9
no
1
1
no
NP_013633.1
(182)
50.00
AN8680 (163)
BC1G_15349
(154)
BC1G_01740
(182)
BC1G_01528
(181)
-
2
no
no
AN0380 (630)
65.34
3
-
no
no
68.86
2
2
no
no
MGG_03239 (169)
BC1G_11448
(754)
4
1
62.38
AN3598 (109)
70.37
4
3
48.31
AN8343 (136)
63.00
2
27
49.9
BC1G_09689
(463)
77.41
2
5
1
4
31.12
32.5
AN7698 (288)
-
53.04
-
NP_014264.1
(114)
NP_010807.1
(135)
NP_009690.1
(596)
NP_011043.1
(367)
NP_015085.1
AN1163
(801)
66.38
56.21
64.90
67.46
55.29
GSTUMT00002556001*
TmelDHN1
GSTUMT00006625001
Tmelgst2
GSTUMT00006718001
TmelGsto1
GSTUMT00001990001
MANTUM00001990001
GSTUMT00001866001
Tmelucp7
GSTUMT00007789001*
TmelNAM9
GSTUMT00000027001
TmeldgoD
(233)
DHN1
(264)
Glutathione S-transferase 2
(224)
Glutathione transferase omega-1
(234)
J domain-containing protein C21orf55
(512)
UBA domain-containing protein 7
(922)
37S ribosomal protein NAM9, mitochondrial precursor
(311)
D-galactonate dehydratase
(387)
*The gene model has been edited.
** The gene models have been merged.
(200)
-
6
-
NCU04667 (328)
42.44
1
NP_014170.1
(354)
307
38.56
44.50
100
53
-
-
1
-
-
2
2
35.18
2
42.29
AN8798 (884)
BC1G_14515
(424)
38.71
2
NP_010606.1
(668)
NP_014262.1
(486)
MGG_09138 (225)
BC1G_12605
(137)
BC1G_01357
(565)
-
5
-
-
MGG_07935 (385)
67.18
55.36
42.35
55.32
Table S20. Aminoacyl-tRNA synthetase genes in T. melanosporum.
Gene Model
Name
Putative function
(length)
EST number
Mycelium
GSTUMT00007997001 TmelAARS
GSTUMT00006820001 TmelCARS
GSTUMT00002673001 TmelDARS
GSTUMT00008559001 TmelDARS2
GSTUMT00010264001 TmelEARS
GSTUMT00001301001 TmelEARS2
GSTUMT00002349001 TmelFARS2
GSTUMT00004266001 TmelFARSA
GSTUMT00000015001 TmelFARSB
GSTUMT00003052001 TmelGARS
GSTUMT00004084001 TmelHARS
GSTUMT00005330001 TmelIARS
GSTUMT00011254001 TmelIARS2
GSTUMT00002716001 TmelKARS
GSTUMT00009944001 TmelLARS
GSTUMT00006079001 TmelLARS2
GSTUMT00008601001 TmelMARS
GSTUMT00011730001 TmelMARS2
Alanyl-tRNA
synthetase, cytoplasmic
(959)
Cysteinyl-tRNA
synthetase
(718)
Aspartyl-tRNA
synthetase, cytoplasmic
(552)
Aspartyl-tRNA
synthetase,
mitochondrial
(593)
Glutamyl-tRNA
synthetase, cytoplasmic
(721)
Glutamyl-tRNA
synthetase,
mitochondrial
(381)
Phenylalanyl-tRNA
synthetase,
mitochondrial
(421)
Phenylalanyl-tRNA
synthetase alpha chain
(484)
Phenylalanyl-tRNA
synthetase beta chain
(592)
Glycyl-tRNA synthetase
(609)
Histidyl-tRNA
synthetase
(490)
Isoleucyl-tRNA
synthetase, cytoplasmic
(1075)
Isoleucyl-tRNA
synthetase,
mitochondrial
(956)
Lysyl-tRNA synthetase,
cytoplasmic
(618)
Leucyl-tRNA
synthetase, cytoplasmic
(1066)
Leucyl-tRNA
synthetase,
mitochondrial
(838)
Methionyl-tRNA
synthetase, cytoplasmic
(817)
Methionyl-tRNA
synthetase,
mitochondrial
Fruiting body
Yeast BRH Acc. No. (length aa)
% id
Filamentous BRH Acc. No. (length aa)
% id
6
10
NP_014980
AN9419
62%
(958)
(962)
2
8
NP_014152
BC1G_05496
44%
42%
(767)
(797)
2
5
NP_013083
AN4550
61%
(557)
(556)
54%
0
1
NP_015221
AN1710
38%
(658)
(680)
46%
6
22
NP_011269
NCU08894
54%
(708)
(637)
57%
0
1
NP_014609
BC1G_09576
47%
54%
(563)
(613)
0
1
NP_015372
BC1G_07094
53%
55%
(469)
(493)
2
5
NP_116631
NCU05095
54%
(503)
(518)
3
2
NP_013161
BC1G_10189
55%
70%
(595)
(610)
8
17
NP_009679
MGG_06321
58%
(667)
(666)
4
4
NP_015358
BC1G_07247
55%
64%
(546)
(535)
8
4
NP_009477
BC1G_13385
61%
74%
(1072)
(1070)
4
0
NP_015285
AN10393
38%
(1002)
(1001)
46%
1
4
NP_010322
AN1913
63%
(591)
(607)
69%
1
4
NP_015165
BC1G_15093
55%
60%
(1090)
(1125)
1
1
NP_013486
MGG_04042
42%
(894)
(918)
4
24
NP_011780
BC1G_04159
62%
62%
(751)
(643)
0
1
NP_011687
AN3865
38%
(575)
(580)
68%
62%
61%
47%
55%
(589)
Asparaginyl-tRNA
GSTUMT00011574001 TmelNARS synthetase, cytoplasmic
(579)
Prolyl-tRNA synthetase
GSTUMT00009788001 TmelPARS
(317)
Glutaminyl-tRNA
GSTUMT00008811001 TmelQARS synthetase, cytoplasmic
(791)
Asparaginyl-tRNA
synthetase,
GSTUMT00000298001 TmelQARS2
mitochondrial
(492)
Arginyl-tRNA
GSTUMT00011569001 TmelRARS synthetase, cytoplasmic
(633)
Seryl-tRNA synthetase,
GSTUMT00005319001 TmelSARS cytoplasmic
(444)
Seryl-tRNA synthetase,
GSTUMT00001005001 TmelSARS2 mitochondrial
(527)
Threonyl-tRNA
GSTUMT00010648001 TmelTARS
synthetase, cytoplasmic
(658)
Threonyl-tRNA
synthetase,
GSTUMT00009747001 TmelTARS2
mitochondrial
(459)
Valyl-tRNA synthetase
GSTUMT00009839001 TmelVARS
(905)
Tryptophanyl-tRNA
GSTUMT00002674001 TmelWARS synthetase, cytoplasmic
(444)
Tryptophanyl-tRNA
synthetase,
GSTUMT00007805001 TmelWARS2
mitochondrial
(369)
Tyrosyl-tRNA
GSTUMT00012225001 TmelYARS synthetase, cytoplasmic
(418)
Tyrosyl-tRNA
synthetase,
GSTUMT00011504001 TmelYARS2
mitochondrial
(449)
2
11
NP_011883
AN7479
55%
(554)
(575)
11
8
NP_011884
BC1G_07479
39%
55%
(688)
(589)
2
3
NP_014811
NCU07926
50%
(809)
(652)
62%
4
9
NP_009953
NCU06866
39%
(492)
(490)
48%
5
5
NP_010628
AN6368
61%
(607)
(651)
58%
2
2
NP_010306
BC1G_06103
61%
71%
(462)
(472)
0
0
NP_011875
NCU09594
38%
(446)
(552)
4
0
NP_116578
BC1G_11336
56%
65%
(734)
(788)
0
1
NP_012727
NCU03129
42%
(462)
(499)
55%
6
2
NP_011608
MGG_04396
48%
(1104)
(1100)
49%
1
1
NP_014544
BC1G_07147
60%
61%
(432)
(475)
1
1
NP_010554
AN6488
46%
(379)
(385)
54%
1
3
NP_011701
MGG_02449
65%
(394)
(389)
58%
1
0
NP_015228
NCU03030
38%
(492)
(670)
36%
61%
50%
Table S21. Translation factor genes in T. melanosporum.
EST number
Gene Model
Name
Putative function
(length)
Yeast BRH
Acc. No.
(length)
FLM
FB
79
71
17
9
1
0
3
10
8
10
11
30
8
5
-----
-----
3
3
6
3
4
1
2
4
9
5
2
2
4
3
6
5
5
11
-----
3
6
NO
Filamentous BRH
% id
Acc. No.
(length)
% id
1. Initiation
GSTUMT90000712001
TmelSUI1
GSTUMT00004265001
TmelTIF11
GSTUMT00001399001
TmelEIF1A2
GSTUMT00002346001
TmelSUI2
GSTUMT00006311001
TmeleIF2A
GSTUMT00006825001
TmelSUI3
GSTUMT00008017001
TmelGCD11
GSTUMT00004384001
TmelGCN3
GSTUMT00004293001
TmelCDC7
GSTUMT00011372001
TmelEIF2B3
GSTUMT00010167001
TmelGCD2
GSTUMT00003358001
TmelGCD6
GSTUMT00012096001
TmelTIF32
GSTUMT00005362001 TmelSPAC25G10.08
GSTUMT00004436001
TmelTIF33
GSTUMT00000237001
TMELTIF35
GSTUMT00008033001
GSTUMT00008034001
TMELMOE1
GSTUMT00004946001 TMELSPBC4C3.07
Initiation factor 1
(114)
Initiation factor 1A
(145)
Initiation factor 1A-like
(148)
Initiation factor 2α
(306)
Initiation factor 2A
(713)
Initiation factor 2β
(316)
Initiation factor 2γ
(530)
Initiation factor 2Bα
(332)
Initiation factor 2Bβ
(137)
Initiation factor 2Bγ
(573)
Initiation factor 2Bδ
(506)
Initiation factor 2Bε
(683)
Initiation factor 3A
(1072)
Initiation factor 3B
(725)
Initiation factor 3C
(854)
Initiation factor 3G
(289)
Initiation factor 3D
(578)
Initiation factor 3F
(332)
NP_014155.1
(108)
NP_013987.1
(145)
-----
NP_012540.1
(304)
NP_011568.1
(642)
60.00
64.06
NP_015087.1
(285)
NP_010942.1
(527)
NP_012951.1
(305)
45.83
NP_013394.1
(381)
NP_014903.1
(578)
NP_011597.1
(651)
38.37
NP_010497.1
(712)
NP_009635.1
(964)
NP_015006.1
(763)
34.23
NP_014040.1
(812)
NP_010717.1
(274)
35.86 BC1G_06212
(870)
35.77
AN10765
(290)
BC1G_12797
-----
(601)
AN10182
-----
(346)
81.57
39.14
AN2992
(305)
AN4470
(515)
AN0167
(353)
63.90
83.74
35.63
AN1344
40.00
(434)
22.40 BC1G_00730 35.46
(557)
35.94 BC1G_08380 40.83
(478)
AN10459
41.67
(705)
37.24
AN2743
69.60
(1036)
35.83 BC1G_11866 60.98
(745)
Initiation factor 3E
(454)
5
13
-----
-----
GSTUMT00005308001
TMELGCD10
Initiation factor 3γ
(534)
1
4
NP_014337.1
(478)
29.18
GSTUMT00011424001
TMELSPAC821.05
3
6
-----
GSTUMT00009054001
TMELSUM1
6
11
NP_013866.1
(347)
GSTUMT00009595001
TMELEIF3J
1
1
-----
GSTUMT00007524001
TMELEIF3K
-----
2
-----
GSTUMT00003100001
TMELEIF3X
2
13
NP_013725.1
(1277)
82.93
AN0105
42.36
(136)
68.04 NCU08277 81.72
(311)
37.79 MGG_00847 50.41
(705)
TMELINT6
Initiation factor 3K
(233)
Initiation factor 3X
(1296)
62.71
-----
GSTUMT00006261001
GSTUMT00006262001
GSTUMT00006263001
Initiation factor 3H
(366)
Initiation factor 3I
(333)
Initiation factor 3J
(283)
AN4742
(115)
AN8712
(150)
65.70
66.67
73.17
67.97
AN2907
(449)
54.55
AN8066
(551)
42.97
AN1270
59.49
(367)
54.30 BC1G_04452 73.64
(336)
MGG_05134 45.52
-----
(273)
-----
BC1G_11363 64.98
(246)
24.94 BC1G_11770 57.75
(1307)
-----
GSTUMT00004430001
TMELEIF3L
GSTUMT00008763001
TMELEIF4A3
GSTUMT00004300001
TMELEIF4A3B
GSTUMT00007732001
TMELSCE3
GSTUMT00011734001
TMELEIF4EA
GSTUMT00009184001
TMELCDC33
GSTUMT00007631001
TMELEIF4EB
GSTUMT00004418001
TMELTIF471
Initiation factor 3L
(478)
Initiation factor 4A-III
(472)
Initiation factor 4A-IIIb
Initiation factor 4B
(493)
Initiation factor 4Ea
(340)
Initiation factor 4Eb
(249)
Initiation factor 4Ec
(253)
Initiation factor 4G
(1068)
Initiation factor 5
(420)
Initiation factor 5A
(158)
Initiation factor 5B
(1075)
Initiation factor 6
(246)
NCU06279 69.87
(475)
70.30 BC1G_00466 92.66
(399)
6
4
-----
20
36
NP_012397.1
(395)
6
3
11
5
3
2
NO
9
3
NP_014502.1
(213)
49.20
4
2
NO
-----
10
6
40.46
23
13
10
11
2
6
1
8
NP_011466.1
(914)
NP_015366.1
(405)
NP_012581.1
(157)
NP_009365.1
(1002)
NP_015341.1
(245)
NP_010304.1
(399)
NP_015489.1
(436)
-----
62.28 BC1G_07971 91.73
(400)
28.80 BC1G_07588 39.13
(575)
AN8191
48.86
-----
(351)
41.56
AN3411
68.72
(245)
BC1G_00854 45.73
(245)
AN6060
(1519)
AN6067
(424)
NCU05274
(164)
AN4038
(1073)
MGG_01671
(246)
48.33
60.58
GSTUMT00004696001
TMELTIF5
GSTUMT00002158001
TMELHYP2
GSTUMT00009928001
TMELFUN12
GSTUMT00006515001
TMELTIF6
GSTUMT00011471001
TMELPABP
Poly-A binding protein
(750)
3
2
NP_011092.1
(577)
60.16 MGG_09505 83.40
(762)
GSTUMT00000021001
TMELTEF1A
Elongation factor 1A
(421)
129
42
NP_015405.1
(458)
74.47 BC1G_09492 80.41
(461)
GSTUMT00010479001
GSTUMT00010480001
TMELTEF1B
6
4
NO
GSTUMT00010962001
TMELTEF1C
7
4
NO
GSTUMT00001146001
TMELEFB1
14
1
NP_009398.1
(206)
BC1G_04388 55.46
(757)
NCU09513 58.78
-----
(651)
50.47 NCU06035 59.31
(232)
GSTUMT00005165001
GSTUMT00005166001
TMELTEF4
10
8
NP_015277.1
(415)
47.94 BC1G_00939 57.86
(416)
GSTUMT00001342001
GSTUMT00001343001
GSTUMT00001344001
TMELRIA1A
Elongation factor 2
(830)
30
14
NP_014776.1
(842)
79.35
GSTUMT00007649001
TMELRIA1B
Elongation factor 2
(1075)
1
-----
NP_014236.1
(1110)
46.65 BC1G_09557 47.89
(1042)
TMELHEF3
Elongation factor 3
(1064)
135
17
NP_014384.1
(1044)
62.16 BC1G_15638 73.82
(947)
GSTUMT00006283001
TMELSUP45B
Release factor 1
(445)
2
3
NO
-----
GSTUMT00002415001
TMELSUP45A
8
3
71.90
GSTUMT00006992001
TMELSUP35
5
17
NP_009701.1
(437)
NP_010457.1
(685)
71.43
66.61
77.14
73.89
77.06
84.55
2. Elongation
GSTUMT00001743001
GSTUMT00001744001
3. Termination
Elongation factor 1A
(589)
Elongation factor 1A
(615)
Elongation factor 1Bα
(220)
Elongation factor 1Bγ
(402)
Release factor 1
(434)
Release factor 3
(730)
-----
AN6330
(845)
NO
89.68
-----
AN8853
85.31
(436)
65.19 BC1G_14662 64.56
(727)
Table S22. Comparison of T. melanosporum translation factor genes with those of other eukaryotic organisms.
BLAST RESULT
Gene name
GSTUMT No.
Best match of Neurospora crassa homolog
Best match (a)
S. cerevisiae
S. pombe
00000712001
00004265001
00001399001
00002346001
00006311001
00006825001
00008017001
00004384001
00004293001
00011372001
00010167001
00003358001
00012096001
00005362001
00004436001
00000237001
00008034001
00004946001
00006263001
00005308001
00011424001
00009054001
00009595001
00007524001
00003100001
00004430001
00008763001
00004300001
00007732001
00011734001
00009184001
00007631001
00004418001
00004696001
00002158001
00009928001
00006515001
00011471001
Sp
Sp
Sp
Sc
Sp
Sp
Sc
Sp
Sc
Animal
Sp
Sp
Sp
Sp
Sp
Sp
Sp
Sp
Animal
Sc
Plant
Sp
NF
Animal
Sp
Animal
Sp
Sp
Sp
Plant
Sp
Plant
Sp
Sp
Sc
Sp
Sc
Sp
9e-31
3e-40
NF (d)
3e-90
2e-76
1e-47
0.00
2e-54
6e-11
6e-13
7e-55
e-109
2e-93
e-121
e-118
3e-32
NF
NF
NF
6e-47
NF
e-102
NF
NF
2e-80
NF
e-143
e-139
NF
e-23
2e-45
NF
e-58
6e-84
6e-57
0.00
e-107
2e-88
7e-31
4e-46
2e-09
5e-89
e-142
3e-49
0.00
3e-58
5e-07
8e-23
1e-60
e-143
0.00
e-175
0.00
e-55
2e-98
9e-86
NF
2e-46
6e-47
e-132
NF
NF
0.00
NF
0.00
0.00
5e-29
2e-21
3e-51
9e-20
e-86
2e-87
5e-53
0.00
e-100
e-106
4e-26
6e-45
6e-05
7e-64
4e-87
3e-43
0.00
2e-38
2e-05
2e-25
2e-54
3e-78
e-143
e-132
e-120
2e-39
4e-80
4e-45
2e-18
2e-23
3e-47
6e-98
NF
6e-22
e-141
e-107
e-159
e-177
8e-12
2e-40
e-36
4e-28
3e-41
2e-68
6e-51
0.00
2e-97
4e-84
5e-26
1e-37
NF (d)
e-68
e-67
4e-41
e-172
7e-46
2e-06
4e-14
8e-51
8e-95
e-117
e-123
e-140
e-34
3e-83
4e-43
2e-16
8e-12
5e-51
e-78
NF
2e-17
9e-90
4e-83
e-150
e-163
9e-9
2e-44
5e-32
2e-30
3e-38
7e-38
4e-42
0.00
5e-93
e-73
Sc
Animal
Sp
Sp
Sp
Sc
Sc
Sp
Sc
Sp
Sp
Sp
Sp
Sp
Sp
Animal
Sp
Sp
Animal
Sc
Sp
Sp
Animal
Animal
Sp
Animal
Sp
Sp
Sp
Plant
Sp
Animal
Sp
Sp
Sc
Sc
Sc
Sp
00000021001
00010479001
00010962001
00001146001
00005165001
00001342001
00007649001
00001743001
Sp
Animal
Animal
Sp
Sc
Sc
Sp
Sp
e-139
NF
NF
2e-44
3e-72
0.00
e-120
0.00
e-140
5e-63
NF
8e-51
3e-54
0.00
0.00
0.00
NF
8e-72
e-112
e-47
2e-29
0.00
e-113
6e-57
NF
NF
NF
3e-27
3e-34
e-170
e-143
3e-52
Sp
Animal
Animal
Sp
Sc
Sc
Sp
Sc
00006283001
00006992001
00002415001
Animal
Sc
Animal
e-169
0.00
e-176
e-155
0.00
e-166
e-172
4e-42
0.00
e-165
6e-42
e-173
--Sc
Animal
Animals (b)
Plants (c)
1. Initiation
EIF1
EIF1A
EIF1A-like
EIF2α
EIF2A
EIF2β
EIF2γ
EIF2Bα
EIF2Bβ
EIF2Bγ
EIF2Bδ
EIF2Bε
EIF3A
EIF3B
EIF3C
EIF3G
EIF3D
EIF3F
EIF3E
EIF3γ
EIF3H
EIF3I
EIF3J
EIF3K
EIF3X
EIF3L
EIF4A3
EIF4A3B
EIF4B
EIF4EA
EIF4EB
EIF4EC
EIF4G
EIF5
EIF5A
EIF5B
EIF6
PABP
2. Elongation
EEF1A
EEF1A
EEF1A
EEF1Bα
EEF1Bγ
EEF2
EEF2
EEF3
3. Termination
ERF1
ERF3
ERF1
(a) Sp, S. pombe; Sc, S. cerevisiae.
(b) Anopheles gambiae, Drosophila melanogaster, Homo sapiens, Mus musculus, Oryctolagus cuniculus, Rattus norvegicus, Xenopus laevis.
(c) Arabidopsis thaliana, Oryza sativa, Triticum aestivum, Zea mays.
(d) NF, not found.
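The E-value cells of Table S22 use an abbreviated notation: "e-109" stands for 1e-109, "0.00" is read as zero, and "NF" marks a sequence that was not found. The following is a minimal Python sketch of how such cells could be parsed and the best match picked as the lowest E-value. The function names are hypothetical. The example row takes the first entry of each of the four E-value columns and assumes they appear in the order S. cerevisiae, S. pombe, animals, plants; that alignment is an assumption about the flattened table, not something the table states.

```python
import re
from typing import Dict, Optional

# Matches "9e-31" as well as the abbreviated form "e-109" (mantissa of 1 implied).
_EVALUE = re.compile(r"(\d+(?:\.\d+)?)?e-(\d+)", re.IGNORECASE)


def parse_evalue(cell: str) -> Optional[float]:
    """Convert one E-value cell to a float, or None for "NF" (not found)."""
    text = cell.strip()
    if text.upper().startswith("NF"):
        return None
    match = _EVALUE.fullmatch(text)
    if match:
        mantissa = match.group(1) or "1"
        return float(f"{mantissa}e-{match.group(2)}")
    # Plain decimals such as "0.00"; anything else raises ValueError.
    return float(text)


def best_match(evalues: Dict[str, Optional[float]]) -> Optional[str]:
    """Return the label with the lowest E-value, ignoring "NF" entries."""
    found = {label: value for label, value in evalues.items() if value is not None}
    if not found:
        return None
    return min(found, key=found.get)


# Assumed alignment: first entry of each E-value column, taken as one row.
example_row = {
    "Sc": parse_evalue("9e-31"),
    "Sp": parse_evalue("7e-31"),
    "Animals": parse_evalue("4e-26"),
    "Plants": parse_evalue("5e-26"),
}
```

Under these assumptions, `best_match(example_row)` selects the S. pombe entry, which has the lowest E-value of the four.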
Table S23. Major classes of carbohydrate-active enzymes in T. melanosporum compared with other
sequenced ascomycetous and basidiomycetous fungi.
Species
Pezizomycetes
Tuber melanosporum
GH*
GT
CBM
CE
PL
95
96
18
13
3
Saccharomycetes
Saccharomyces cerevisiae S288C
Candida albicans SC5314
Candida glabrata CBS138
45
58
38
67
69
73
12
4
12
3
3
3
0
0
0
Eurotiomycetes
Aspergillus nidulans FGSC A4
Aspergillus oryzae RIB40
Aspergillus fumigatus Af293
247
285
263
91
114
103
36
30
55
29
26
29
19
21
13
Sordariomycetes
Magnaporthe grisea 70-15
Trichoderma reesei
Fusarium graminearum PH-1
Neurospora crassa 74A
Podospora anserina S mat+
231
200
243
171
229
94
103
110
76
88
58
36
61
39
75
47
16
42
21
16
4
3
20
3
7
Archiascomycetes
Schizosaccharomyces pombe 972H
46
61
5
5
0
Basidiomycetes
Cryptococcus neoformans JEC21
Ustilago maydis 521
Laccaria bicolor S238
Phanerochaete chrysosporium RP78
Coprinopsis cinerea Okayama 7 (#130)
75
97
165
179
210
68
64
88
66
72
10
9
14
47
90
9
3
1
7
4
13
20
7
* GH, glycosyl hydrolase family; GT, glycosyl transferase family; CBM, carbohydrate-binding module family; CE, carbohydrate esterase family; PL, polysaccharide lyase family. Grey cells
indicate the species having the largest set of enzymes. The highest number of entries in each category is indicated in bold.
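The per-category comparison described in the footnote can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and only the two rows of Table S23 whose class contains a single species are included, since their five values (GH, GT, CBM, CE, PL) can be read without ambiguity.

```python
from typing import Dict, Tuple

CATEGORIES: Tuple[str, ...] = ("GH", "GT", "CBM", "CE", "PL")

# Two rows copied from Table S23, in the column order GH, GT, CBM, CE, PL.
CAZYME_COUNTS: Dict[str, Tuple[int, ...]] = {
    "Tuber melanosporum": (95, 96, 18, 13, 3),
    "Schizosaccharomyces pombe 972H": (46, 61, 5, 5, 0),
}


def largest_per_category(table: Dict[str, Tuple[int, ...]]) -> Dict[str, Tuple[str, int]]:
    """For each CAZyme category, return the species with the most entries."""
    best = {}
    for index, category in enumerate(CATEGORIES):
        species = max(table, key=lambda name: table[name][index])
        best[category] = (species, table[species][index])
    return best


def total_entries(table: Dict[str, Tuple[int, ...]]) -> Dict[str, int]:
    """Sum the five category counts for every species."""
    return {name: sum(row) for name, row in table.items()}
```

Adding further species only requires extending `CAZYME_COUNTS` with rows in the same column order.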
Table S24. Carbohydrate-active enzymes (CAZymes) encoded in the T. melanosporum genome.
Gene model ID
Family
Definition
Metabolism
Carbohydrate Esterase Family
GSTUMT00009714001
GSTUMT00012611001
GSTUMT00004210001
GSTUMT00007845001
GSTUMT00009302001
GSTUMT00005594001
GSTUMT00000534001
GSTUMT00012125001
GSTUMT00003032001
GSTUMT00007976001
GSTUMT00006497001
GSTUMT00012789001
GSTUMT00003723001
CE1
CE1
CE4
CE4
CE4
CE4
CE4
CE4
CE4
CE4
CE4
CE8
CE12
candidate esterase related to S-formylglutathione hydrolase and feruloyl esterase
candidate esterase distantly related to feruloyl esterase
candidate esterase related to chitin deacetylase; N-term CBM18 module
candidate esterase related to cyclic imide hydrolase
candidate esterase related to chitin deacetylase
candidate esterase distantly related to chitin deacetylase
candidate esterase related to chitin deacetylase
candidate esterase related to chitin deacetylase
candidate esterase related to chitin deacetylase
candidate esterase related to chitin deacetylase; N-term CBM18 module
candidate esterase related to chitin deacetylase; 3 N-term CBM18 modules
candidate esterase distantly related to pectin methylesterase; possibly GPI-anchored
candidate esterase related to rhamnogalacturonan acetylesterase
cell-wall chitin
cell-wall chitin
cell-wall chitin
cell-wall chitin
cell-wall chitin
cell-wall chitin
cell-wall chitin
cell-wall chitin
pectin
pectin
Carbohydrate-binding Module Family
GSTUMT00001211001
GSTUMT00012353001
GSTUMT00004210001
GSTUMT00002591001
GSTUMT00000524001
GSTUMT00010448001
GSTUMT00007976001
GSTUMT00009270001
GSTUMT00006497001
GSTUMT00004733001
GSTUMT00007975001
GSTUMT00000402001
GSTUMT00007852001
GSTUMT00010976001
GSTUMT00007490001
GSTUMT00006246001
GSTUMT00007342001
GSTUMT00007945001
CBM1
CBM18
CBM18
CBM18
CBM18
CBM18
CBM18
CBM18
CBM18
CBM18
CBM19
CBM21
CBM21
CBM32
CBM43
CBM48
CBM48
CBM48
candidate ß-glycosidase related to endoglucanase; C-term CBM1 module
Chitin binding domain
candidate esterase related to chitin deacetylase; N-term CBM18 module
candidate esterase related to chitin deacetylase; N-term CBM18 module
candidate esterase related to chitin deacetylase; 3 N-term CBM18 modules
candidate ß-1,3/6-glucan-active enzyme; N-term CBM18 module; GPI-anchored
distantly related to protein phosphatase glycogen-binding regulatory subunit
candidate α-glycosidase distantly related to α-amylase; N-term CBM21 module
candidate α,α-trehalase; C-term CBM32 module
candidate ß-1,3-glucanosyltransferase; C-term CBM43 module; GPI-anchored
candidate α-1,4-glucan branching enzyme
Glycosyl Hydrolase Family
GSTUMT00001302001
GSTUMT00003864001
GSTUMT00005036001
GSTUMT00010944001
GSTUMT00007083001
GSTUMT00003224001
GSTUMT00005119001
GSTUMT00001003001
GSTUMT00006007001
GSTUMT00005599001
GSTUMT00008973001
GSTUMT00008530001
GSTUMT00006999001
GSTUMT00010324001
GSTUMT00003767001
GSTUMT00003582001
GSTUMT00009198001
GSTUMT00009298001
GSTUMT00008058001
GSTUMT00004119001
GSTUMT00006246001
GSTUMT00002531001
GSTUMT00004257001
GSTUMT00005820001
GSTUMT00007852001
GSTUMT00006610001
GSTUMT00004366001
GSTUMT00002130001
GSTUMT00006794001
GH1
GH1
GH2
GH2
GH3
GH3
GH3
GH3
GH3
GH3
GH5
GH5
GH5
GH5
GH5
GH5
GH10
GH12
GH13
GH13
GH13
GH13
GH13
GH13
GH13
GH13
GH15
GH16
GH16
candidate ß-glucosidase
candidate ß-glycosidase distantly related to ß-galactosidase/ß-glucosidase
candidate ß-mannosidase
candidate ß-glycosidase related to ß-mannosidase
candidate ß-glucosidase
candidate ß-glucosidase
candidate ß-glycosidase distantly related to bacterial ß-N-acetylhexosaminidase
candidate ß-glucosidase or exo-ß-1,3-glucosidase
candidate ß-glycosidase
candidate ß-glucosidase
candidate endoglucanase
candidate exo-1,3-ß-glucanase
candidate exo-1,3-ß-glucanase
candidate ß-1,6-glucanase
candidate ß-glycosidase distantly related to bacterial endoglycoceramidase
candidate ß-glycosidase distantly related to bacterial endoglycoceramidase
candidate ß-xylanase
candidate endo-1,4-glucanase (xyloglucan-specific)
candidate α-glycosidase related to oligo-1,6-glucosidase
candidate glycogen-debranching enzyme
candidate α-1,4-glucan branching enzyme
candidate α-1,3/4-glucan synthase
candidate maltotriose-producing α-amylase
candidate α-amylase
candidate α-glycosidase distantly related to α-amylase
candidate α-amylase
candidate glucoamylase
candidate cell-wall ß-1,6-glucan active enzyme; membrane-anchored
candidate ß-1,3/6-glucan-active enzyme; GPI-anchored
cell-wall ß-glucan
cellulose/xyloglucan
cell-wall ß-glucan
cell-wall ß-glucan
cell-wall ß-glucan
cell-wall
xylan
xyloglucan
starch/glycogen
starch/glycogen
starch/glycogen
cell-wall α-glucan
cell-wall α-glucan
starch/glycogen
starch/glycogen
starch/glycogen
starch/glycogen
cell-wall ß-glucan
cell-wall ß-glucan
GSTUMT00003034001
GSTUMT00001660001
GSTUMT00004630001
GSTUMT00002321001
GSTUMT00004733001
GSTUMT00003996001
GSTUMT00008448001
GSTUMT00010908001
GSTUMT00007484001
GSTUMT00011326001
GSTUMT00011529001
GSTUMT00011196001
GSTUMT00001975001
GSTUMT00006229001
GSTUMT00010084001
GSTUMT00006876001
GSTUMT00012061001
GSTUMT00010928001
GSTUMT00012011001
GSTUMT00004199001
GSTUMT00005998001
GSTUMT00008333001
GSTUMT00004616001
GSTUMT00010862001
GSTUMT00003340001
GSTUMT00007119001
GH16
GH16
GH16
GH16
GH16
GH17
GH17
GH17
GH17
GH18
GH18
GH18
GH18
GH18
GH20
GH20
GH24
GH25
GH28
GH28
GH31
GH31
GH31
GH31
GH31
GH32
GSTUMT00007318001
GSTUMT00001902001
GSTUMT00006536001
GSTUMT00003229001
GSTUMT00011089001
GSTUMT00006238001
GSTUMT00006375001
GSTUMT00001932001
GSTUMT00008220001
GSTUMT00006578001
GSTUMT00011889001
GSTUMT00008775001
GSTUMT00012573001
GSTUMT00008986001
GSTUMT00001211001
GSTUMT00012497001
GSTUMT00006099001
GSTUMT00010976001
GSTUMT00004023001
GSTUMT00010889001
GSTUMT00007490001
GSTUMT00002553001
GSTUMT00000527001
GSTUMT00012299001
GSTUMT00004785001
GSTUMT00008680001
GSTUMT00004651001
GSTUMT00004301001
GSTUMT00000874001
GSTUMT00006913001
GSTUMT00005050001
GSTUMT00005750001
GSTUMT00008143001
GSTUMT00007295001
GSTUMT00011606001
GSTUMT00007970001
GH36
GH36
GH38
GH43
GH45
GH47
GH47
GH47
GH47
GH47
GH55
GH55
GH61
GH61
GH61
GH61
GH63
GH65
GH72
GH72
GH72
GH72
GH75
GH76
GH76
GH76
GH76
GH76
GH76
GH78
GH78
GH81
GH81
GH81
GH85
GH92
GSTUMT00008776001
GSTUMT00001255001
GSTUMT00004007001
GSTUMT00005391001
–
–
–
–
GSTUMT00010384001
GSTUMT00011693001
GSTUMT00007486001
GSTUMT00009030001
GT1
GT1
GT1
GT1
candidate ß-(trans)glycosidase
candidate ß-(trans)glycosidase; membrane-anchored
candidate ß-(trans)glycosidase distantly related to bacterial endo-1,3-ß-glucanase
candidate cell-wall ß-1,6-glucan active enzyme; membrane-anchored
candidate ß-1,3/6-glucan-active enzyme; N-term CBM18 module; GPI-anchored
candidate ß-(trans)glycosidase related to ß-1,3-glucanase; membrane-anchored
candidate ß-(trans)glycosidase distantly related to exo-ß-1,3-glucanase
candidate ß-(trans)glycosidase distantly related to exo-ß-1,3-glucanase
candidate ß-(trans)glycosidase distantly related to exo-ß-1,3-glucanase
candidate ß-glycosidase related to chitinase
candidate ß-glycosidase distantly related to chitinase
candidate ß-glycosidase related to chitinase
candidate ß-glycosidase distantly related to bacterial chitinase
candidate chitinase
candidate ß-glycosidase distantly related to N-acetylglucosaminidase
candidate ß-glycosidase related to exochitinase
candidate ß-glycosidase
candidate ß-glycosidase
candidate polygalacturonase
candidate ß-glycosidase related to rhamnogalacturonase
candidate α-1,4-glucan lyase
candidate α-1,4-glucan lyase
candidate α-glucosidase
candidate α-glucosidase
candidate α-glycosidase related to α-glucosidase
candidate ß-(trans)glycosidase distantly related to ß-fructosidase and ß-fructosyltransferase
candidate α-glycosidase distantly related to plant α-galactosidase
candidate α-galactosidase distantly related to α-galactosidase
candidate α-mannosidase
candidate arabinanase
candidate endoglucanase
candidate α-glycosidase related to α-mannosidase; transmembrane-anchored
candidate α-glycosidase related to α-1,2-mannosidase; transmembrane-anchored
candidate α-1,2-mannosidase; transmembrane-anchored
candidate α-1,2-mannosidase
candidate α-glycosidase distantly related to animal α-1,2-mannosidase
candidate exo-ß-1,3-glucanase
candidate ß-glycosidase related to exo-ß-1,3-glucanase
candidate ß-glycosidase distantly related to endoglucanase
candidate ß-glycosidase distantly related to endoglucanase
candidate ß-glycosidase related to endoglucanase; C-term CBM1 module
candidate ß-glycosidase distantly related to endoglucanase
candidate processing α-glucosidase; ER-retention signal
candidate α,α-trehalase; C-term CBM32 module
candidate ß-1,3-glucanosyltransferase; GPI-anchored
candidate ß-1,3-glucanosyltransferase; GPI-anchored
candidate ß-1,3-glucanosyltransferase; C-term CBM43 module; GPI-anchored
candidate ß-1,3-glucanosyltransglycosylase; GPI-anchored
fragment of candidate chitosanase
candidate α-(trans)glycosidase
candidate α-(trans)glycosidase
candidate α-(trans)glycosidase
candidate cell-wall α-(trans)glycosidase; GPI-anchored
candidate cell-wall α-(trans)glycosidase; GPI-anchored
candidate cell-wall α-(trans)glycosidase; GPI-anchored
candidate α-L-rhamnosidase
candidate ß-glycosidase related to α-L-rhamnosidase
candidate ß-glycosidase related to ß-1,3-glucanase
candidate ß-glycosidase related to ß-1,3-glucanase
candidate ß-glycosidase distantly related to ß-1,3-glucanase
candidate ß-glycosidase distantly related to endo-ß-N-acetylglucosaminidase
candidate α-glycosidase distantly related to bacterial α-1,2-mannosidase; transmembrane-anchored
not assigned
not assigned
not assigned
not assigned
Glycosyl Transferase Family
candidate ß-glycosyltransferase related to plant UDP-Glc:sterol ß-glucosyltransferase
candidate ß-glycosyltransferase; N-terminal domain
candidate UDP-Glc: sterol ß-glucosyltransferase
candidate ß-glycosyltransferase; C-terminal domain
cell-wall ß-glucan
cell-wall ß-glucan
cell-wall ß-glucan
cell-wall ß-glucan
cell-wall ß-glucan
cell-wall ß-glucan
cell-wall ß-glucan
cell-wall chitin
cell-wall chitin
cell-wall chitin
cell-wall chitin
cell-wall chitin
cell-wall chitin
cell-wall chitin
pectin
pectin
starch/glycogen; ?
starch/glycogen; ?
sucrose?
cellulose/xyloglucan
N-glycan
N-glycan
N-glycan
cell-wall α-mannan
N-glycan
cell-wall ß-glucan
cell-wall ß-glucan
cellulose?
cellulose?
cellulose
cellulose?
N-glycans
trehalose
cell-wall ß-glucan
cell-wall ß-glucan
cell-wall ß-glucan
cell-wall ß-glucan
cell-wall chitin
cell-wall
cell-wall
cell-wall
cell-wall
cell-wall
pectin
pectin
cell-wall ß-glucan
cell-wall ß-glucan
cell-wall ß-glucan
cell-wall
GSTUMT00003671001
GT1
GSTUMT00001747001
GSTUMT00009079001
GSTUMT00002409001
GSTUMT00008447001
GSTUMT00011849001
GSTUMT00008119001
GSTUMT00005561001
GSTUMT00008120001
GSTUMT00004894001
GT2
GT2
GT2
GT2
GT2
GT2
GT2
GT2
GT2
candidate ß-glycosyltransferase distantly related to plant UDP-Glc: sterol ß-glucosyltransferase
candidate chitin synthase
candidate chitin synthase
candidate chitin synthase
candidate chitin synthase
candidate chitin synthase
candidate chitin synthase
candidate chitin synthase
candidate chitin synthase
candidate dolichyl-phosphate ß-glucosyltransferase
GSTUMT00008671001
GT2
candidate ß-glycosyltransferase related to dolichyl-phosphate ß-mannosyltransferase
GSTUMT00003828001
GSTUMT00001241001
GT3
GT4
candidate glycogen synthase
candidate NDP-sugar α-glycosyltransferase related to α-1,3/6-mannosyltransferases
GSTUMT00011323001
GSTUMT00003184001
GT4
GT4
candidate UDP-GlcNAc: phosphatidylinositol-α-N-acetylglucosaminyltransferase
candidate NDP-sugar α-glycosyltransferase related to α-1,2-mannosyltransferase
GSTUMT00000764001
GSTUMT00000765001
GSTUMT00000797001
GSTUMT00000799001
GSTUMT00000802001
GSTUMT00001113001
GSTUMT00001114001
GSTUMT00001115001
GSTUMT00003187001
GSTUMT00004809001
GSTUMT00006327001
GSTUMT00006328001
GSTUMT00006329001
GSTUMT00006330001
GSTUMT00006331001
GSTUMT00006332001
GSTUMT00006333001
GSTUMT00008118001
GSTUMT00009893001
GSTUMT00009901001
GSTUMT00009971001
GSTUMT00009975001
GSTUMT00011362001
GSTUMT00011363001
GSTUMT00011364001
GSTUMT00011366001
GSTUMT00011486001
GSTUMT00011488001
GSTUMT00011489001
GSTUMT00011491001
GSTUMT00011493001
GSTUMT00011494001
GSTUMT00011496001
GSTUMT00011497001
GSTUMT00011507001
GSTUMT00002531001
GSTUMT00005701001
GSTUMT00011097001
GT4
GT4
GT4
GT4
GT4
GT4
GT4
GT4
GT4
GT4
GT4
GT4
GT4
GT4
GT4
GT4
GT4
GT4
GT4
GT4
GT4
GT4
GT4
GT4
GT4
GT4
GT4
GT4
GT4
GT4
GT4
GT4
GT4
GT4
GT4
GT5
GT8
GT8
GSTUMT00011376001
GT15
candidate trehalose phosphorylase (TP)
candidate trehalose phosphorylase
candidate trehalose phosphorylase
candidate trehalose phosphorylase
candidate trehalose phosphorylase
candidate NDP-sugar α-glycosyltransferase/α-glycan phosphorylase related to TP
candidate trehalose phosphorylase
candidate trehalose phosphorylase
candidate trehalose phosphorylase
candidate trehalose phosphorylase
candidate NDP-sugar α-glycosyltransferase/α-glycan phosphorylase related to TP
candidate trehalose phosphorylase
candidate NDP-sugar α-glycosyltransferase/α-glycan phosphorylase related to TP
candidate trehalose phosphorylase
candidate trehalose phosphorylase
candidate trehalose phosphorylase
candidate NDP-sugar α-glycosyltransferase/α-glycan phosphorylase related to TP
candidate NDP-sugar α-glycosyltransferase/α-glycan phosphorylase related to TP
candidate trehalose phosphorylase
candidate trehalose phosphorylase
candidate trehalose phosphorylase
candidate NDP-sugar α-glycosyltransferase/α-glycan phosphorylase related to TP
candidate trehalose phosphorylase
candidate α-glycosyltransferase
candidate NDP-sugar α-glycosyltransferase/α-glycan phosphorylase related to TP
candidate trehalose phosphorylase
candidate NDP-sugar α-glycosyltransferase/α-glycan phosphorylase related to TP
candidate trehalose phosphorylase
candidate NDP-sugar α-glycosyltransferase/α-glycan phosphorylase related to TP
candidate NDP-sugar α-glycosyltransferase/α-glycan phosphorylase related to TP
candidate trehalose phosphorylase
candidate NDP-sugar α-glycosyltransferase/α-glycan phosphorylase related to TP
candidate trehalose phosphorylase
candidate trehalose phosphorylase
candidate trehalose phosphorylase
candidate α-1,3/4-glucan synthase
candidate glycogenin
candidate NDP-sugar α-glycosyltransferase distantly related to inositol 1-α-galactosyltransferase
candidate α-1-2-mannosyltransferase
GSTUMT00008926001
GT15
candidate α-1-2-mannosyltransferase
GSTUMT00000550001
GSTUMT00001845001
GSTUMT0001196100
GT20
GT20
GT20
GSTUMT00011684001
GSTUMT00006689001
GT21
GT22
GSTUMT00002928001
GT22
GSTUMT00006440001
GT22
candidate α,α-trehalose-phosphate synthase
candidate α,α-trehalose-phosphate synthase
candidate bifunctional α,α-trehalose-phosphate synthase / α,α-trehalose-phosphate phosphatase
candidate NDP-sugar ß-glycosyltransferase related to ceramide glucosyltransferase
candidate Dol-P-sugar α-glycosyltransferase distantly related to Dol-P-Man: α-1,2-mannosyltransferase
candidate Dol-P-sugar α-glycosyltransferase distantly related to α-1,2/6-mannosyltransferases
candidate Dol-P-sugar α-glycosyltransferase related to Dol-P-Man: Man3-GlcNAc-phosphatidylinositol α-1,2-mannosyltransferase
chitin
chitin
chitin
chitin
chitin
chitin
chitin
chitin
N-glycans; O-glycans
N-glycans; O-glycans
glycogen
N-glycans; O-glycans
GPI
N-glycans; O-glycans
trehalose
trehalose
trehalose
trehalose
trehalose
trehalose
trehalose
trehalose
trehalose
trehalose
trehalose
trehalose
trehalose
trehalose
trehalose
trehalose
trehalose
trehalose
trehalose
trehalose
trehalose
trehalose
trehalose
trehalose
trehalose
trehalose
trehalose
trehalose
trehalose
trehalose
trehalose
trehalose
trehalose
trehalose
trehalose
α-1,3/4-glucan
glycogen
N-glycans; O-glycans
N-glycans; O-glycans
trehalose
trehalose
trehalose
N-glycans; O-glycans
GPI
GSTUMT00011236001
GT22
GSTUMT00003782001
GT24
candidate Dol-P-sugar α-glycosyltransferase related to Dol-P-Man: Man2-GlcNAc-phosphatidylinositol α-1,2-mannosyltransferase
candidate UDP-Glc: glycoprotein α-glucosyltransferase; ER-retention signal
GSTUMT00004829001
GT32
candidate α-1-6-mannosyltransferase
GSTUMT00006149001
GSTUMT00011329001
GSTUMT00011393001
GSTUMT00002332001
GT32
GT33
GT34
GT34
GSTUMT00005850001
GSTUMT00002656001
GSTUMT00001582001
GSTUMT00002470001
GSTUMT00009848001
GT35
GT39
GT39
GT39
GT41
GSTUMT00007491001
GSTUMT00007493001
GSTUMT00003075001
GSTUMT00009264001
GT48
GT48
GT50
GT57
GSTUMT00001621001
GT57
GSTUMT00001398001
GSTUMT00010811001
GT58
GT59
GSTUMT00011096001
GSTUMT00004018001
GSTUMT00010033001
GT62
GT62
GT62
GSTUMT00011157001
GSTUMT00006001001
GSTUMT00007196001
GSTUMT00003042001
GT66
GT69
GT69
GT76
candidate NDP-sugar α-glycosyltransferase
candidate NDP-sugar ß-glycosyltransferase distantly related to ß-mannosyltransferase
candidate NDP-sugar α-glycosyltransferase
candidate NDP-sugar α-glycosyltransferase distantly related to α-1,2-galactosyltransferase
candidate glycogen phosphorylase
candidate Dol-P-Man: protein-O-mannosyltransferase
candidate protein-O-mannosyltransferase
candidate Dol-P-Man: protein O-mannosyltransferase
candidate NDP-sugar ß-glycosyltransferase related to plant and animal peptide N-acetylglucosaminyltransferase
candidate 1,3-β-glucan synthase
candidate 1,3-β-glucan synthase
candidate Dol-P-sugar α-glycosyltransferase related to α-1,4-mannosyltransferase
candidate Dol-P-sugar α-glycosyltransferase related to Dol-P-Glc: GlcMan9GlcNAc2-PP-Dol α-1,3-glucosyltransferase
candidate Dol-P-sugar α-glycosyltransferase related to Dol-P-Glc: Man9GlcNAc2-PP-Dol α-1,3-glucosyltransferase
candidate Dol-P-sugar α-glycosyltransferase related to α-1,2/3-mannosyltransferases
candidate Dol-P-sugar α-glycosyltransferase distantly related to α-1,2-glucosyltransferase
candidate α-1/6-mannosyltransferase
candidate NDP-sugar α-glycosyltransferase related to α-1,2/6-mannosyltransferases
candidate NDP-sugar α-glycosyltransferase distantly related to α-1,2/6-mannosyltransferases
candidate oligosaccharyltransferase
GSTUMT00010168001
GT90
candidate NDP-sugar α-glycosyltransferase
candidate Dol-P-sugar α-glycosyltransferase distantly related to α-1,6-mannosyltransferase
candidate NDP-sugar ß-glycosyltransferase
GPI
N-glycans; O-glycans
N-glycans; O-glycans
starch/glycogen
O-glycans
O-glycans
O-glycans
N-glycans; O-glycans
β-1,3-glucan
β-1,3-glucan
GPI
N-glycans
N-glycans
N-glycans
N-glycans
GPI
Polysaccharide Lyase Family
GSTUMT00006239001
GSTUMT00008534001
GSTUMT00004952001
PL1
PL1
PL4
candidate polysaccharide lyase related to pectin lyase (EC 4.2.2.10)
candidate pectate lyase
candidate polysaccharide lyase related to rhamnogalacturonan lyase
pectin
pectin
pectin
Carbohydrate-active enzymes (CAZymes) are categorized into different classes and families in the CAZy database (http://www.cazy.org).
Table S25. Distribution of genes coding for membrane transporter families in T. melanosporum, other sequenced ascomycetes
and Laccaria bicolor.
Transporter type Family
ATP-Dependent
ABC (ATP-binding Cassette)
ArsAB (Arsenite-Antimonite Efflux Family)
F-ATPase (H+- or Na+-translocating F-type, V-type and A-type ATPase)
MPT (Mitochondrial Protein Translocase)
P-ATPase (P-type ATPase)
Sec (General Secretory Pathway)
TOTAL PROTEINS (ATP-Dependent)
Ion Channels
Amt (Ammonium Transporter)
Annexin
ClC (Chloride Channel)
Mid1 (Yeast Stretch-Activated, Cation-Selective, Ca2+ Channel)
MIP (Major Intrinsic Protein)
MIT (CorA Metal Ion Transporter)
MPP (Mitochondrial and Plastid Porin)
MscS (Small Conductance Mechanosensitive Ion Channel)
NSCC2 (Non-selective Cation Channel-2)
TRP-CC (Transient Receptor Potential Ca2+ Channel)
VIC (Voltage-gated Ion Channel)
TOTAL PROTEINS (Ion Channels)
Secondary Transporter
AAAP (Amino Acid/Auxin Permease)
ACR3 (Arsenical Resistance-3)
AE (Anion Exchanger)
AEC
APC (Amino Acid-Polyamine-Organocation)
ACT (Amino Acid/Choline Transporter)
LAT (L-type Amino Acid Transporter)
YAT (Yeast Amino Acid Transporter)
BASS
CaCA (Ca2+:Cation Antiporter)
Evolutionary changes in Tuber
Tuber1
Magnaporthe2
Neurospora
Botrytis
Aspergillus
Saccharomyces
Laccaria
27
38
29
41
39
22
53
c
1
2
1
1
1
1
1
n
21
20
21
17
28
25
23
n
20
17
15
23
17
18
18
21
16
21
24
16
18
24
n
n
7
7
7
6
7
10
8
n
4
1
3
3
1
3
1
3
3
1
3
4
2
3
3
0
1
7
1
3
n
n
n
1
3
1
6
1
1
1
10
1
5
1
4
1
7
n
c
7
5
3
4
7
5
5
e
1
1
1
1
1
2
1
n
2
3
2
2
2
0
3
n
1
1
1
1
1
1
1
n
3
4
5
4
3
1
2
n
5
6
5
7
4
2
4
e
4
10
8
10
15
7
3
c
1
2
2
1
1
2
1
2
2
1
2
3
1
2
2
1
1
0
3
2
2
n
n
n
14
26
17
30
47
24
27
c
5
5
4
13
18
4
9
c
4
6
4
3
6
2
7
n
5
0
15
1
9
0
14
0
23
1
17
1
11
0
c
n
5
7
8
9
6
4
7
c
4
CCC (Cation-Chloride Cotransporter)
CDF (Cation Diffusion Facilitator)
CHR (Chromate Ion Transporter)
CNT (Concentrative Nucleoside Transporter)
CPA1 (Monovalent Cation:Proton Antiporter-1)
CPA2 (Monovalent Cation:Proton Antiporter-2)
DAACS
DASS (Divalent Anion:Na+ Symporter)
DMT (Drug/Metabolite Transporter)
ENT (Equilibrative Nucleoside Transporter)
FNT (Formate-Nitrite Transporter)
GPH (Glycoside-Pentoside-Hexuronide:Cation Symporter)
GUP (Glycerol Uptake)
KUP (K+ Uptake Permease)
LCT (Lysosomal Cystine Transporter)
MC (Mitochondrial Carrier)
MFS (Major Facilitator Superfamily)
MOP (Multidrug/Oligosaccharidyl-lipid/Polysaccharide Flippase)
MTC (Mitochondrial Tricarboxylate Carrier)
NCS1 (Nucleobase:Cation Symporter-1)
NCS2 (Nucleobase:Cation Symporter-2)
NiCoT (Ni2+-Co2+ Transporter)
Nramp (Metal Ion (Mn2+-iron) Transporter)
NSS
OPT (Oligopeptide Transporter)
Oxa1 (Cytochrome Oxidase Biogenesis)
PiT (Inorganic Phosphate Transporter)
POT (Proton-dependent Oligopeptide Transporter)
RND (Resistance-Nodulation-Cell Division)
SSS (Solute:Sodium Symporter)
SulP (Sulfate Permease)
TDT (Tellurite-resistance/Dicarboxylate Transporter)
ThrE
Trk (K+ Transporter)
ZIP (Zinc (Zn2+)-Iron (Fe2+) Permease)
TOTAL PROTEINS
1
1
1
1
1
1
0
n
5
8
8
8
6
5
8
c
1
1
1
0
2
0
4
n
1
1
1
0
1
0
1
n
2
4
3
4
6
2
5
c
1
0
3
1
2
0
2
0
3
1
1
0
2
0
c
c
1
1
1
1
1
3
1
n
11
10
10
6
11
9
10
e
1
1
1
1
1
1
1
n
0
1
1
1
1
1
0
c
1
4
1
3
1
1
2
1
1
3
1
1
2
1
0
0
2
0
2
1
1
c
e
n
2
34
2
41
2
34
2
38
2
38
1
34
0
36
n
c
91
251
141
204
356
85
96
c*
2
3
2
1
2
2
6
n
1
1
1
1
1
1
0
e
4
8
3
8
11
10
11
c
3
2
2
2
2
0
3
e
1
2
1
1
1
0
1
n
1
0
0
2
2
0
11
1
0
11
1
1
3
0
1
0
n
c
3
12
6
3
10
c
1
1
1
1
1
1
1
n
1
3
1
1
4
1
0
n
3
4
2
3
4
2
3
n
1
1
1
1
1
1
1
n
1
4
3
6
2
4
2
4
4
4
1
4
2
5
c
n
3
1
2
3
1
4
3
1
2
2
1
2
3
3
3
1
2
0
1
3
2
n
n
n
7
8
5
8
7
3
5
e
(Secondary Transporter)
Incompletely Characterized Transport Systems
ATP-E (ATP Exporter)
Ctr (Copper Transporter)
FP (Ferroportin)
ILT (Iron/Lead Transporter)
MgtE (Mg2+ Transporter-E)
MHP (Metal Homeostasis Protein)
MnHP (Mn2+ Homeostasis Protein)
PF27
PLI (Phospholipid Importer)
PPI (Protein Importer)
SHP (Stress-Induced Hydrophobic Peptide)
YaaH (ATO)
1
2
1
1
0
1
6
1
1
0
1
2
0
1
0
1
2
1
1
0
1
3
1
0
0
1
1
1
1
1
0
1
2
7
0
1
1
7
0
1
1
7
0
1
1
5
0
1
1
8
0
0
0
0
0
0
1
1
3
3
1
3
0
2
0
1
1
4
2
2
2
1
1
0
1
3
8
3
4
1
8
5
2
2
Family names are based on the Transport Classification Database (http://www.tcdb.org/). Lineage-specific gene gain and loss in the membrane transporter
families were estimated using CAFE (33). Listed are 74 transporter families containing 31 expansions (8), 42 no change (n), 24 contraction (c) families in
the T. melanosporum lineage. “*” indicates significant expansion (P<0.001) in the Tuber branch (SOM). Abbreviations: xxx
1 http://genome.jgi-psf.org/Lacbi1/Lacbi1.home.html
Table S26. T. melanosporum tRNA genes grouped by anticodons. The tRNAscan-SE algorithm was used with default parameters and the
eukaryotic model.
Table S27. Overall codon usage of T. melanosporum genes
Table S28. Codon usage of T. melanosporum genes coding for ribosomal proteins
The Black Truffle Genome Uncovers Evolutionary Origins and Mechanisms of Symbiosis
Supplementary Tables
Figure S1. The life cycle of the black truffle of Perigord (Tuber melanosporum). (1) The spores released from mature truffles germinate (2)
in the following spring, producing a vegetative mycelium that colonises tree root tips and develops the symbiosis (3). In the
ectomycorrhizal symbiotic relationship, long, branching fungal filaments known as hyphae ramify between cells of the host root’s outer layers, form a sheath
around the root, and radiate outwards into the surrounding soil and litter. In early summer, extramatrical hyphae aggregate to form fruit body initials (4). The
latter develop into fruit bodies during fall and early winter, giving rise to mature truffles (5).
Figure S2. Phylogeny of some representative sequenced Ascomycetes and Basidiomycetes. The alignment was built from the two
best phylogenetic marker genes (MS277 and MS456) identified in (83). Bayesian inference (BI), based on the posterior probability distribution of trees, was
performed with MrBayes (84) with the following settings: Lset rates=gamma; Prset aamodelpr=mixed; mcmcp ngen=100,000 samplefreq=50; other settings
= default. The sump burnin=500 command was used to verify the stationarity of the analysis. The sumt burnin=500 command was used to produce summary
statistics for trees sampled during the Bayesian analysis. The consensus tree was visualized and edited with FigTree v1.2.1 (http://tree.bio.ed.ac.uk/).
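As a rough check on the MrBayes settings quoted above, the following sketch works out how many samples the chain draws and what fraction the burn-in discards. The helper name `mcmc_samples` is hypothetical, and the count deliberately ignores the extra sample MrBayes records at generation 0, so the numbers are approximate.

```python
def mcmc_samples(ngen: int, samplefreq: int, burnin: int) -> tuple[int, int, float]:
    """Return (samples drawn, samples kept after burn-in, fraction discarded).

    Assumption: one sample every `samplefreq` generations, and `burnin`
    counted in samples (not generations), as in `sump burnin=500`.
    """
    drawn = ngen // samplefreq      # generation-0 sample not counted
    kept = drawn - burnin
    return drawn, kept, burnin / drawn


# Settings from the Figure S2 legend.
drawn, kept, discarded = mcmc_samples(ngen=100_000, samplefreq=50, burnin=500)
print(drawn, kept, discarded)
```

Under these assumptions, a quarter of the sampled trees are discarded before summary statistics are computed.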
Figure S3. The diversity and distribution of class I and class II transposable elements (TEs) in T. melanosporum. TEs were identified
using the REPET pipeline (19). TE structures are depicted according to (85). The number of TE occurrences and the % genome coverage were determined
with RepeatMasker (www.repeatmasker.org) using the 846 consensus sequences produced by the TEdenovo pipeline (19).
Figure S4. Genome coverage (%) of the different T. melanosporum transposable element families.
Figure S5. Major cycles of LTR retrotransposon activity. T. melanosporum underwent at least two cycles of LTR retrotransposon (LTR-R) amplification. The most
recent activity peaks at an estimated 2.5 Mya, preceded by a gradual increase starting 5.5 Mya. An older burst of activity occurred >10 Mya. The decrease between
10 and 5 Mya probably reflects element deterioration, which makes these elements harder to detect. Consensus full-length copies of each element are
shown. A substitution rate of 1.3 × 10^-8 was used.
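The legend gives a substitution rate but not the dating formula. A common approach, assumed here and not confirmed by the legend, dates each insertion from the divergence K between the two LTRs of one element as T = K / (2r). The sketch below applies that formula; the function name and the reading of the rate as substitutions per site per year are assumptions.

```python
def ltr_insertion_age_mya(divergence: float, rate: float = 1.3e-8) -> float:
    """Estimate insertion age in million years from LTR-LTR divergence.

    Assumptions (not stated in the legend): `divergence` is the number of
    substitutions per site between the 5' and 3' LTRs of one element, and
    `rate` is in substitutions per site per year. Both LTRs accumulate
    changes independently, hence the factor of 2.
    """
    return divergence / (2.0 * rate) / 1e6


# An illustrative divergence of 6.5% would correspond to roughly 2.5 Mya.
print(round(ltr_insertion_age_mya(0.065), 2))
```

If ages were instead derived from the divergence of each copy to its family consensus, the factor of 2 would not apply.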
Figure S6. Phylogenetic relationships amongst RNA silencing- and DNA methylation-related gene products from T.
melanosporum and other fungi. Functionally characterized components from N. crassa and S. pombe as well as putative homologs from T.
melanosporum are in bold; different types of siRNA-related processes are indicated on the right of the neighbor-joining trees.
Figure S7. Distribution of gene density in T. melanosporum and in representative ascomycetes from the Eurotiomycetes and
Sordariomycetes.
Figure S8. (A) Analysis of molecular divergence between the Pezizomycete T. melanosporum and selected fungi: the Zygomycete Rhizopus oryzae,
Ascomycetes from the Saccharomycetes (Saccharomyces cerevisiae), Eurotiomycetes (Aspergillus nidulans), Leotiomycetes (Botrytis cinerea),
Sordariomycetes (Neurospora crassa, Magnaporthe grisea), and Basidiomycetes (Laccaria bicolor). The truffle–yeast pair displays the lowest amino acid
identity (41.8%), in agreement with their proposed ancient separation, > 450 Myr ago (13). The figure shows the cumulative frequencies of amino
acid identity across each set of potential orthologous pairs.
(B) Distribution of protein similarity (%) between T. melanosporum and representative ascomycetes, based on BLASTP best reciprocal hits.
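To illustrate what "best reciprocal hits" means in panel (B), here is a minimal sketch that pairs proteins from two proteomes when each is the other's top-scoring hit. The input format (query, subject, bit score) and the protein identifiers are hypothetical, and the authors' actual filtering thresholds are not given in the legend.

```python
def best_reciprocal_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a.

    Each argument is an iterable of (query, subject, bitscore) tuples, e.g.
    parsed from tabular BLASTP output (an assumed input format).
    """
    def best(hits):
        top = {}
        for query, subject, score in hits:
            # Keep only the highest-scoring subject for each query.
            if query not in top or score > top[query][1]:
                top[query] = (subject, score)
        return {q: s for q, (s, _) in top.items()}

    ab, ba = best(a_vs_b), best(b_vs_a)
    return sorted((a, b) for a, b in ab.items() if ba.get(b) == a)


# Toy example with made-up identifiers.
tuber_vs_yeast = [("tm1", "sc1", 250.0), ("tm1", "sc2", 90.0),
                  ("tm2", "sc2", 120.0), ("tm3", "sc2", 300.0)]
yeast_vs_tuber = [("sc1", "tm1", 240.0), ("sc2", "tm3", 310.0),
                  ("sc2", "tm2", 100.0)]
print(best_reciprocal_hits(tuber_vs_yeast, yeast_vs_tuber))
```

In the toy data, tm2 is dropped because its best hit (sc2) prefers tm3.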
Fig. S9. Synteny between Tuber melanosporum and selected ascomycete genomes. (A) Oxford plot of T. melanosporum scaffolds (x
axis) plotted against Coccidioides immitis chromosomes (y axis). In such a presentation, conserved segmental homologies are visualized
by diagonally oriented clusters, or at least by co-clustering of genes on genomic scaffolds. The lack of such clusters indicates the lack of
any major synteny between the two fungal genomes, although several microsyntenic regions can be seen, e.g., between T.
melanosporum scaffold 6 and Coccidioides immitis chromosome 12. (B) Table summarizing the number of genomic blocks and genes
showing synteny between Tuber melanosporum and selected Ascomycetes belonging to the major Ascomycete phylogenetic groups.
Figure S10. The largest syntenic region between T. melanosporum (Pezizomycetes) and Coccidioides immitis (Eurotiomycetes).
This region contains only 99 genes with 39 orthologs. The scheme shows the central part of this syntenic region.
Figure S11. Gene families in the truffle genome. (A) The percentage of amino-acid identity of the top-scoring self-matches for protein-coding
genes in T. melanosporum, Saccharomyces cerevisiae, Aspergillus nidulans, Neurospora crassa, Magnaporthe grisea, and Botrytis cinerea. For each
fungus, the protein-coding regions for each gene were compared with those of every other gene in the same genome using BLASTX. Top-scoring matches
were aligned and the percentages of identity were calculated. N. crassa and T. melanosporum possess few highly similar (>90%) gene pairs, 13 and
7, respectively. (B) Expanding and contracting gene families (as determined by TRIBE-MCL) in T. melanosporum and representative sequenced
ascomycetes. The numbers on the branches show the numbers of expanded (left, red), unchanged (middle, black) or contracted (right, blue) protein families
along the lineages.
Figure S12. Distribution of multigene families (Tribe-MCL) in T. melanosporum, representative ascomycetes and the
ectomycorrhizal basidiomycete Laccaria bicolor.
Figure S13. Functional comparison of the PFAM protein domains of T. melanosporum with other sequenced ascomycetes. Hierarchical clustering based on the relative frequency of PFAM protein domains. The top
100 PFAM domains found in T. melanosporum were selected. The frequency values were transformed into z-scores, which are a measure of relative enrichment (red) and depletion (green). The data were clustered
according to species and PFAM domains by using a Euclidean distance metric (Cluster 3.0) (http://www.falw.vu/~huik/cluster.htm). The results were visualized by using Java Treeview
(http://sourceforge.net/projects/jtreeview/). The T. melanosporum proteome is enriched for proteins containing TPR1, histone, and PPR domains (see section 5.3).
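The two numerical steps named in this legend, z-score transformation and Euclidean distance, can be sketched as follows. This is an illustration only: it assumes z-scores are computed per domain across species using the population standard deviation, which the legend does not specify, and it does not reproduce Cluster 3.0 itself.

```python
import math
from statistics import mean, pstdev


def zscores(values):
    """Transform one domain's frequencies across species into z-scores.

    Assumption: normalisation is per domain (row-wise) with the population
    standard deviation; a constant row maps to all zeros.
    """
    mu, sd = mean(values), pstdev(values)
    return [0.0 if sd == 0 else (v - mu) / sd for v in values]


def euclidean(u, v):
    """Euclidean distance between two equal-length z-score profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical frequencies of two domains in three species.
profile_a = zscores([1.0, 2.0, 3.0])
profile_b = zscores([3.0, 2.0, 1.0])
print(profile_a, euclidean(profile_a, profile_b))
```

Positive z-scores correspond to relative enrichment and negative ones to depletion, matching the red and green coding described in the legend.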
Figure S14. Distribution of orphan genes coding for lineage-specific proteins on the largest genomic supercontigs of
T. melanosporum. Orphans are in yellow, whereas gene models are in blue and TEs in red. Orphan genes are randomly scattered over
the protein-coding regions.
Fig. S15. Heat map showing the identity of 92 established fungal allergens to their best homologs in the predicted
T. melanosporum proteome and in seven additional fungal genomes, including the reference GRAS fungi S. cerevisiae and N.
crassa, and the highly allergenic A. fumigatus. The map was constructed from an MS Excel file using EPClust. Identities are coded by increasing
colour saturation, with bright red corresponding to the highest degree of identity and black indicating the lack of a given allergen homolog in a given
genome.
Figure S16. (Top panel) Outline of sulfur metabolism and of the corresponding genes and pathways in T. melanosporum. Numbers in bold to the right of the names of the 10 main S-pathways indicate the number of the corresponding genes; numbers in italics indicate the number of gene products
involved in specific reactions. Orphan reactions are crossed out; pathways or components that are more represented in Tuber than in Neurospora are indicated
with green numbers. (Bottom panel) Relative expression levels of genes involved in sulfate internalization and reduction (A) and
Cys/Met biosynthesis and interconversion (B) in different lifecycle stages of T. melanosporum. The specific log2 ratios used for
EPCLUST analysis, represented in a false color scale, are indicated above each column; gene names, shown on the right, are as specified in the metabolic
map.
Figure S17. Metabolic map of the “Cys/Met biosynthesis & interconversion” pathway and mRNA expression levels of the corresponding
enzymes. Expression levels are the mean of filtered and normalized hybridization signals derived from multireplicate experiments: expression values for
free-living mycelia (FLM) and fruiting bodies (FB) are shown in green and black, respectively. Of note: i) the preference for homocysteine (rxn. #2) vs
cysteine (rxn. #1) biosynthesis in FB; and ii) the disproportionately high expression levels of cystathionine γ-lyase (CGL, #5) and cystathionine β-lyase
(CBL, #4) in FBs compared to those of the preceding cystathionine synthase enzymes (#3 and #5), both of which are less expressed (3-6 fold) in FBs than
in FLM. Alternative reactions supported by CGL and CBL homologs from lactic bacteria (Liu et al. 2008), potentially relevant for S-VOC formation in T.
melanosporum, are indicated by dashed arrows. OAS, O-acetylserine; OAH, O-acetylhomoserine; SAM, S-adenosylmethionine; SAH, S-adenosylhomocysteine; DMS, dimethylsulfide; DMDS, dimethyl-disulfide; DMTS, dimethyl-trisulfide.
Figure S18. Predicted Ehrlich pathways leading to characteristic truffle volatile organic compounds (VOC). Based on known yeast
pathways, the catabolism of five amino acids in T. melanosporum could lead to the formation of the aldehydes and alcohols (given with their structures) that
are key contributors to the truffle aroma. Compounds with a high volatility are in bold. Genes identified in T. melanosporum that are potentially involved in the
Ehrlich pathway (transamination, decarboxylation and reduction steps, full arrows) are listed on the right part of the figure. The formation of dimethyl sulfide
(DMS), dimethyl disulfide (DMDS) and dimethyl trisulfide (DMTS) from 3-(methylthio)propanal (dashed arrow) can occur through non-enzymatic chemical
degradation.
Figure S19. Schematic representation of the mating type locus in T. melanosporum. The idiomorphic regions are in light grey, the
black lines indicate the common flanking regions. The MAT1-2 idiomorph is 4770 bp long and contains the MAT1-2-1 gene (red arrowed
box). The MAT1-1 idiomorph is 7470 bp long and contains the MAT1-1-1 gene (blue arrowed box). Within each MAT gene, the position of
introns and conserved HMG-box and α-box regions are indicated. The arrows indicate the direction of transcription. An additional putative
ORF was detected in the MAT1-2 idiomorph (pink arrowed box; gene model: GSTUMT00001089001). The yellow, green and dark-grey
boxes represent regions sharing sequence similarities between idiomorphs. Yellow boxes indicate inverted repeat elements with about
82% of sequence similarity. The dark-grey and green boxes indicate regions with 71% and 76% of sequence similarity, respectively. The
regions marked with the green boxes show high similarity (BlastX: Score = 242, E-value = 2e-62) with an ankyrin repeat-containing
protein. The 45° grid pattern indicates the opposite orientation of similar sequences between the idiomorphs. The flanking region
downstream of the MAT1-1 idiomorph contains a 493 bp long insertion used to design a backward primer to specifically amplify the MAT1-1
idiomorph. A putative ORF (white arrowed box; gene model: GSTUMT00001092001) with no significant similarity in the GenBank
database but highly conserved (94% of sequence identity) between idiomorphs is present downstream of the MAT locus.
Figure S20. Schematic comparison of the genomic regions flanking the MAT1-2 locus in T. melanosporum, Fusarium graminearum,
Botrytis cinerea and Coccidioides immitis, drawn using Chromomapper software (Niculita-Hirzel & Hirzel 2008). This analysis shows
the low level of synteny in the T. melanosporum mating type region compared with other ascomycetes. In T. melanosporum, the only conserved gene in linkage
with MAT1-2 is the gene coding for cytochrome c oxidase subunit VIa/COX13 (GSTUMT00001086001).
Figure S21. Hierarchical clustering tree view of transcripts coding for carbohydrate-active enzymes (CAZymes) from T. melanosporum in free-living mycelium, ectomycorrhizal root tips and fruiting bodies. Clustering analysis was carried out using the EPCLUST clustering tool. Each horizontal
line displays the expression ratio for one gene in symbiotic tissues, fruiting bodies or free-living mycelium vs. a mean expression reference calculated from
all arrays. Each gene is represented by a row of coloured boxes and each stage is represented by a single column. Regulation levels range from pale to
saturated colours (red for induction; green for repression). Black indicates no change in gene expression. ECM, T. melanosporum/Corylus avellana
ectomycorrhizae. See also Supplementary section 9.
Figure S22. Distribution of secreted peptidase gene models in T. melanosporum and other saprotrophic or pathogenic fungi.
Secreted peptidase classification is based on the MEROPS database (http://merops.sanger.ac.uk). Signal peptides were predicted using TargetP
(http://www.cbs.dtu.dk/services/TargetP/).
Figure S23. Distribution of expression levels for the predicted T. melanosporum gene models. Low expression, <500; medium expression,
500-5000; high expression, >5000. FLM, free-living mycelium; ECM, ectomycorrhizal root tips; FB, fruiting bodies.
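The three expression classes used in Figure S23 can be illustrated with a minimal Python sketch. The thresholds (<500, 500-5000, >5000) come from the caption; the function name, the treatment of the boundary values 500 and 5000 as "medium", and the example expression values are assumptions made for illustration only and are not data from this study.

```python
from collections import Counter

def expression_class(level: float) -> str:
    """Assign an expression level to the low/medium/high classes of Figure S23.

    Assumption: the boundary values 500 and 5000 fall in the "medium" class,
    following the caption's "500-5000" range.
    """
    if level < 500:
        return "low"
    if level <= 5000:
        return "medium"
    return "high"

def class_distribution(levels):
    """Count how many gene models fall into each expression class."""
    return Counter(expression_class(v) for v in levels)

# Hypothetical expression values for one tissue (not real measurements).
example_levels = [120, 499, 500, 2300, 5000, 5001, 18000]
distribution = class_distribution(example_levels)
print(dict(distribution))
```

Applying `class_distribution` separately to the FLM, ECM and FB expression values would give one set of class counts per tissue, which is the kind of summary the figure displays.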