Downloaded from genome.cshlp.org on May 27, 2020 - Published by Cold Spring Harbor Laboratory Press
Letter
Assessing the Drosophila melanogaster
and Anopheles gambiae Genome Annotations
Using Genome-Wide Sequence Comparisons
Olivier Jaillon,1 Carole Dossat,1 Ralph Eckenberg,1 Karin Eiglmeier,2
Béatrice Segurens,1 Jean-Marc Aury,1 Charles W. Roth,2 Claude Scarpelli,1
Paul T. Brey,2 Jean Weissenbach,1 and Patrick Wincker1,3
1Genoscope/Centre National de Séquençage and CNRS UMR 8030, 91057 Evry Cedex, France; 2Unité de Biochimie
et Biologie Moléculaire des Insectes, Institut Pasteur, Paris 75724 Cedex 15, France
We performed genome-wide sequence comparisons at the protein coding level between the genome sequences of
Drosophila melanogaster and Anopheles gambiae. Such comparisons detect evolutionarily conserved regions (ecores)
that can be used for a qualitative and quantitative evaluation of the available annotations of both genomes.
They also provide novel candidate features for annotation. The percentage of ecores mapping outside
annotations in the A. gambiae genome is about fourfold higher than in D. melanogaster. The A. gambiae genome
assembly also contains a high proportion of duplicated ecores, possibly resulting from artefactual sequence
duplications in the genome assembly. The occurrence of 4063 ecores in the D. melanogaster genome outside
annotations suggests that some genes are not yet or only partially annotated. The present work illustrates the
power of comparative genomics approaches towards an exhaustive and accurate establishment of gene models
and gene catalogues in insect genomes.
Whole-genome sequence comparisons between genomes
from metazoans can be used to detect sequence conservation
both in coding and noncoding regions. Whereas conservation
of coding regions can be detected between species separated
by large evolutionary distances (e.g., between mammals and
fish; Roest Crollius et al. 2000), the conservation of noncoding regions is usually much weaker and mainly detected between species that are separated by shorter evolutionary distances (e.g., within mammals; Kent 2002; Mural et al. 2002).
In other words, the kind and amount of information that can
be deduced from genomic DNA comparisons depend on the
evolutionary distance between the species.
The annotation process used for Drosophila (Rubin et al.
2000) relied on protein database searches, cDNA, and EST
matches and ab initio gene predictions. The power of protein
comparisons was high, but not exhaustive, because they concerned mainly species such as yeast, Caenorhabditis elegans
and mammals that are relatively distant from the fruit fly.
However, ab initio predictions and cDNA sequencing could
notably complement the annotations beyond conserved
genes, and a total of 13,666 genes was proposed for the analysis of the fly genome (Adams et al. 2000; Misra et al. 2002).
While finishing and analysis of the fly genome sequence was
still in progress, an additional set of genes was proposed (Gopal et al. 2001). The establishment of a draft sequence of the
genome of Anopheles gambiae (Holt et al. 2002) offers the possibility of reevaluation of the present D. melanogaster gene
inventory using a rationale that we used previously to compare a fraction of the human genome to that of a teleost fish,
Tetraodon nigroviridis (Roest Crollius et al. 2000). Conversely,
it will also provide an evaluation of the initial Anopheles genome annotations. We therefore carried out this type of global comparison between these two insect genomes.
3Corresponding author.
E-MAIL pwincker@genoscope.cns.fr; FAX 33 1 60 87 25 89.
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.922503.
RESULTS AND DISCUSSION
The Drosophila Annotation
The Exofish procedure (for EXOn FInding by Sequence Homology) that we developed for large-scale genome comparisons is based on the BLAST algorithm (Altschul et al. 1990). To
minimize background of false positive alignments outside
coding regions and to maximize the detection of evolutionarily conserved regions (ecores), TBLASTX parameters and filter conditions were adjusted on a set of reference sequences
(see Methods).
The available sequence assembly of A. gambiae (http://www.ensembl.org/Anopheles_gambiae) and the last two versions of the D. melanogaster genome (http://www.fruitfly.org/annot/release2.html and http://www.fruitfly.org/annot/release3.html) were compared using the adjusted settings of
Exofish. A whole-genome comparison between the two genomes resulted in a total of 47,134 ecores (for release 2) or
46,742 ecores (for release 3) in the D. melanogaster genome
(Table 1; available at www.genoscope.cns.fr/Exofish/Fly).
These numbers are slightly different as the genome sequence
has changed between the two releases (Celniker et al. 2002).
The ecores created using release 3 were mapped on the collection of gene models defined by the annotations of full-length cDNAs designated as the “Drosophila Gene Collection”
(Stapleton et al. 2002; we used a subset of 6,006 transcripts as
explained in the Methods section). We only considered ecores
located between the start and the end positions of the models.
We detected ecores in 87.7% of the genes and in 57.7% of the
exons. Six hundred thirty-seven (3.2%) ecores mapped outside the boundaries of annotated exons, and may correspond
to alternative exons, nested genes or false positives. In other
words, the specificity in this large set was higher than 96%.
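The mapping step described above can be illustrated with a minimal sketch, assuming inclusive (start, end) coordinates and a single gene model. The class and function names (GeneModel, classify_ecore, exon_specificity) and the toy coordinates are hypothetical and are not taken from the Exofish implementation. An ecore is considered only if it lies between the start and end of the model, and it is then counted as exonic if it overlaps at least one annotated exon.

```python
from dataclasses import dataclass
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), inclusive coordinates (assumed convention)


@dataclass
class GeneModel:
    start: int
    end: int
    exons: List[Interval]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two inclusive intervals share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def classify_ecore(ecore: Interval, gene: GeneModel) -> str:
    """Label an ecore relative to one gene model."""
    # Only ecores located between the start and end of the model are considered.
    if ecore[0] < gene.start or ecore[1] > gene.end:
        return "not_considered"
    if any(overlaps(ecore, exon) for exon in gene.exons):
        return "exon"
    return "non_exon"


def exon_specificity(ecores: List[Interval], gene: GeneModel) -> float:
    """Fraction of considered ecores that overlap an annotated exon."""
    labels = [classify_ecore(e, gene) for e in ecores]
    considered = [label for label in labels if label != "not_considered"]
    if not considered:
        raise ValueError("no ecore lies within the gene model")
    return considered.count("exon") / len(considered)


# Toy data (hypothetical coordinates, for illustration only).
gene = GeneModel(start=100, end=1000, exons=[(100, 200), (400, 500), (900, 1000)])
ecores = [(120, 180), (450, 520), (600, 650), (950, 1100)]
labels = [classify_ecore(e, gene) for e in ecores]
print(labels, exon_specificity(ecores, gene))
```

With the figure reported above (3.2% of ecores within models falling outside annotated exons), the same ratio gives a specificity of about 96.8%.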
13:1595–1599 ©2003 by Cold Spring Harbor Laboratory Press ISSN 1088-9051/03 $5.00; www.genome.org
Genome Research
www.genome.org
1595
The mean number of ecores per gene was equal to 3.22 when we considered only ecores overlapping exons, and to 3.33 when considering all ecores within a gene model. Applying these ratios of ecores per gene to the whole genome provides a gene number estimate in Drosophila between 14,036 (46,742/3.33) and 14,516 (46,742/3.22).
Ecores were also compared to the two last BDGP (Berkeley Drosophila Genome Project) genome annotations (http://www.fruitfly.org/annot/release2.html and http://www.fruitfly.org/annot/release3.html, Misra et al. 2002; Table 1). We observed a significant increase in the percentage of ecores falling inside gene models between the two releases (93.5% versus 90.5%). This provides an independent verification of the improvement of the D. melanogaster annotation between the two versions.
Table 1. Distributions of Ecores in the Sequence of Drosophila in Two Successive Annotations
Set                                              BDGP Release 2   BDGP Release 2   BDGP Release 3   BDGP Release 3
                                                 (number)         (%)              (number)         (%)
Ecores                                           47,134           n.d.             46,742           n.d.
Genes                                            13,468           n.d.             13,666           n.d.
Genes detected                                   11,147           82.8             11,167           81.7
Ecores within genes                              42,633           90.5             43,705           93.5
Exons                                            54,771           n.d.             61,085           n.d.
Exons detected                                   31,751           58.0             33,996           55.7
Ecores overlapping exons                         41,332           87.7             42,679           91.3
Ecores overlapping genes not overlapping exons   1,072            2.3              1,026            2.2
Ecores/gene                                      3.17             n.d.             3.2              n.d.
Genes and exons stand for annotated genes and exons in the corresponding versions.
The gene number estimate is based on a ratio of ecores per gene determined using existing annotations and, as a consequence, could reflect a bias in this set. This bias would in particular depend on the level of sequence conservation of genes and on their structure (length and number of exons). However, the collection of 6006 full-length cDNAs from the Drosophila Gene Collection is based on biologic observations, and hence considered as representative. Altogether, these genome comparisons reveal the presence of 4063 ecores outside of annotated exons in the Drosophila genome. Because the mean ecore number in the Drosophila Gene Collection is higher than in other annotated genes, we expect that some gene models are still incomplete or fragmented. We expect that most of these would correspond to additional exons of partially annotated genes. Conversely, it is not expected that these 4063 ecores will contribute to a substantial increase in the total gene number of Drosophila. A verified example of a modification of a predicted gene indicated by Exofish is shown in Figure 1. In this case, a series of additional exons in the release 2 annotation is predicted by Exofish, suggesting that a significant number of exons were missed in this region (Fig. 1A). We reexamined the same region in release 3, and observed that after the new annotation, all ecores are placed in two gene models (Fig. 1B). A second example is seen in Figure 2, where the presence of two ecores in a region without annotation in the two insect genomes revealed the existence of a totally new gene, confirmed by a spliced mosquito cDNA.
Figure 1 Exofish analysis on a region on arm 2L of the genome of Drosophila from two different releases of annotations, and around the same ecores. (Top) Results from release 2 of BDGP. (Bottom) Results from release 3 of BDGP. (A, D) BDGP annotations on the 5′-3′ strand. (B, E) BDGP annotations on the 3′-5′ strand. The genes are represented by boxes, with vertical lines separating exons (red) and introns (white). (C, F) Ecores (blue). In release 2 (top), five ecores (numbers 7, 8, 9, 11, 12) overlap four gene models, and seven ecores (numbers 1, 2, 3, 4, 5, 6, 10) do not overlap any annotation. In release 3, a large gene model overlaps all the ecores that fall exclusively in exons except ecore number 9. This ecore is part of a gene model on the 5′-3′ strand, which is predicted inside one intron on the 5′-3′ strand.
Figure 2 Ecores detecting a new gene model. The scale refers to the position on the chromosome arm 2L of the genome of Anopheles. (A) Ensembl gene predictions on the 5′-3′ strand. (B) Ensembl gene predictions on the 3′-5′ strand. The genes are represented by boxes, with vertical lines separating exons (red) and introns (white). (C) Ecores (blue). (D) A confirmatory cDNA sequence is in green, with a potential intron in white. Only one cDNA, matching with two consecutive unannotated ecores, is represented here. This cDNA (corresponding to the assembly of entries BX034944 and BX034945) matches a region unannotated in both Drosophila and Anopheles genomes.
We also ran Exofish against the additional 1042 candidate genes recently proposed for Drosophila (Gopal et al. 2001; http://genomes.rockefeller.edu/dm). We obtained ecores on 18.7% of these new gene models (the list of the matches can be found at www.genoscope.cns/externe/Fly). This low fraction could result from a very low conservation of these genes between Anopheles and Drosophila, possibly representing a subset of rapidly evolving genes, or from a substantial number of false-positive predictions. However, Exofish can serve to validate a number of these potential genes.
The Anopheles Annotation
We also attempted to use Exofish in a reverse mode to identify ecores in Anopheles, assuming that if one ecore in the genome of Drosophila flags a coding sequence, the corresponding ecore in Anopheles should flag a coding sequence. To test the reverse mode, we applied Exofish to a 585-kb region from the Pen1 locus of Anopheles using the whole genome of Drosophila. This region had been independently annotated manually (unpublished results). We detected 100 ecores in this region, with only six of them lying outside of annotated exons, while 83% of the annotated genes are confirmed by at least one ecore. This shows that the expected sensitivity of Exofish should be comparable in this reverse mode. A genome-wide analysis was then performed with the whole A. gambiae assembly.
Figure 3 Ecores defining a new gene model on A. gambiae chromosome 2R. The scale refers to the position on the chromosome. (A) Ensembl gene predictions on the 5′-3′ strand. The genes are represented by boxes, with vertical lines separating exons (red) and introns (white). (B) Ecores (blue). (C) Anopheles cDNA clone (green), with potential introns in white. Only one cDNA, matching with three consecutive unannotated ecores, is represented here. This cDNA (corresponding to the assembly of entries BX063894 and BX063895) matches all along its sequence with the Drosophila Innexin-7 gene. This gene is not annotated in either release of the Anopheles annotation.
We found more ecores in the Anopheles assembly (54,069 for release 6.1a) than in the Drosophila genome (ratio = 1.16). The mean size of the ecores is identical for both species (251 nucleotides). Several explanations that are not mutually exclusive may account for this observation. The high number of ecores could be the consequence of (1) an increased coding capacity in the genome of Anopheles, or (2) a larger number of pseudogenes or unmasked transposable elements in Anopheles, or (3) problems in the sequence assembly. Explanations (1) and (2) were not supported by a previous comparative analysis (Zdobnov et al. 2002). The presence of at least two different haplotypes in the A. gambiae strain sequenced is known to have introduced a number of redundancies in the assembly, essentially as linked artefactual duplications and unanchored duplicated scaffolds (Holt et al. 2002). We analyzed the redundancy in both genomes looking for multiple occurrences of two ecores in one genome created by a single common region in the other genome. A striking result was observed for the alignments occurring once in Drosophila and twice in Anopheles (n = 3476), which were more abundant than the reverse (once in Anopheles and twice in Drosophila, n = 1650, see Methods). We observed significantly more duplicated ecores in the same chromosome in Anopheles (77% of the duplicated cases) than with Drosophila (60%). One exception was noted for chromosome X, where duplicated ecores have their second copy randomly present in the Anopheles genome. This corresponds to the expectation, because chromosome X is the only Anopheles chromosome assembled essentially from a single haplotype (Holt et al. 2002), apparently because of selection in the sequenced strain. An even more striking result is obtained when looking at small, unmapped scaffolds. These sequences represent only 16% of the size of the whole assembly, but contained about 35% of the duplicated ecores. Taken together, these results indicate that an important fraction of the excess ecores resides in regions with
potential assembly problems. Further improvements of the A. gambiae genome annotation will be greatly dependent on resolution of the misassembled regions.
We compared the 54,069 ecores from the assembly of Anopheles to release 6.1a of the Celera-Ensembl joint annotations of Anopheles (http://www.ensembl.org/Anopheles_gambiae). We found that 79% of the ecores matched with 79.1% of the gene candidates (Table 2). The fraction of annotated Anopheles genes that is detected by Exofish is thus slightly lower than in Drosophila. Conversely, a large fraction (21%) of Anopheles ecores map outside of annotations. These observations indicate that a substantial fraction of exons were not annotated and that a number of gene models should be revised.
Table 2. Comparisons Between Ecores on the Assembly of Anopheles and the Successive Ensembl Annotations
Set                                              EnsEMBL Release   EnsEMBL Release   EnsEMBL Release   EnsEMBL Release
                                                 6.1a (number)     6.1a (%)          10.2.1 (number)   10.2.1 (%)
Ecores                                           54,069            n.d.              53,132            n.d.
Genes                                            15,088            n.d.              14,658            n.d.
Genes detected                                   11,929            79.1              10,759            73.4
Ecores overlapping genes                         42,693            79.0              39,749            74.8
Exons                                            53,693            n.d.              56,573            n.d.
Exons detected                                   32,553            60.6              32,610            57.6
Ecores overlapping exons                         40,278            74.5              39,247            73.9
Ecores overlapping genes not overlapping exons   2,415             4.5               502               0.9
Genes and exons stand for annotated genes and exons in the corresponding versions.
A new version of the Anopheles assembly and annotation was recently released (version 10.2.1). This new version addressed some misassembly problems and corrected a number of automatic gene predictions using recent data. Surprisingly, the percentage of ecores outside of annotation increased from 21% to 25.6% (Table 2). However, an improvement between the two versions was seen at the level of the redundant ecores. We found that a significant fraction of the duplicated ecores that were present in the release 6.1a have been discarded as haplotype variants. This explained in large part the net disappearance of 937 ecores between the two versions.
Three main types of annotation problems were observed that remained in the two versions. They are exemplified here: absence of annotation in both genomes (Fig. 2); absence of annotation in Anopheles of a known gene in Drosophila (Fig. 3); incorrectly predicted gene lacking some exons and integrating incorrect ones (Fig. 4). In the three examples shown in the figures, the ecores were confirmed by the existence of Anopheles cDNA clones.
Figure 4 Ecores correcting a gene model. The scale refers to the position on the chromosome arm 3L of the genome of Anopheles. (A) Ensembl gene predictions (release 6.1a) on the 5′-3′ strand. The genes are represented by boxes, with vertical lines separating exons (red) and introns (white). (B) Ecores (blue). (C) A cDNA sequence is in green, with potential introns in white. Only one cDNA, matching with unannotated ecores, is represented here. This cDNA (corresponding to the assembly of entries BX062803 and BX062804) matches two of the three orphan ecores. It is homologous throughout to a Drosophila tetraspanin family member. The version 6.1a of the annotation apparently fused the two last exons of the gene with two putative exons, originating from a transposable element. The large sizes of the two first introns may induce such erroneous model constructions. In release 10.2.1, the region is entirely unannotated.
This study shows how whole genome comparisons based on a tool like Exofish can be used as an efficient method to evaluate the quality and to improve existing annotations of insect genomes. In particular, it provides an independent assessment of the improvement of the Drosophila annotation across the successive releases. The fact that 4,063 ecores do not overlap annotated Drosophila exons illustrates the potential of interspecies comparisons, even for extensively studied species like Drosophila. The number of ecores outside annotations in A. gambiae (13,791; Table 2) is higher than for Drosophila, showing
that the present automated annotation is probably missing a
substantial number of coding sequences. Two successive versions of the annotation gave globally comparable results, reflecting the slow progress in the acquisition of functional and comparative data for annotating this organism. Anopheles/Drosophila
ecores can clearly serve to refine and improve the next versions of the Anopheles annotation. The precise locations of
ecores in each genome are available for improving both annotations (http://www.genoscope.cns.fr/Exofish/Fly). More
generally, this study illustrates the power of whole-genome
comparisons, and could be extended to other species combinations with the availability of newly sequenced genomes.
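The headline figures quoted in Results and Discussion follow from simple arithmetic on the totals reported in Tables 1 and 2, and can be rechecked with the short sketch below. The function and constant names are illustrative only, and truncating the quotient to a whole number of genes is an assumption that happens to reproduce the published range of 14,036 to 14,516.

```python
def gene_number_estimate(total_ecores: int, ecores_per_gene: float) -> int:
    """Gene count implied by a mean ecores-per-gene ratio (truncated, by assumption)."""
    return int(total_ecores / ecores_per_gene)


# Totals taken from Table 1 (Drosophila, release 3) and Table 2 (Anopheles, release 6.1a).
DROSOPHILA_ECORES = 46_742
DROSOPHILA_ECORES_ON_EXONS = 42_679
ANOPHELES_ECORES = 54_069
ANOPHELES_ECORES_ON_EXONS = 40_278

# Ratio of 3.33 counts all ecores within a gene model; 3.22 counts only exon-overlapping ecores.
low_estimate = gene_number_estimate(DROSOPHILA_ECORES, 3.33)
high_estimate = gene_number_estimate(DROSOPHILA_ECORES, 3.22)

# Ecores that do not overlap any annotated exon in each genome.
drosophila_outside_exons = DROSOPHILA_ECORES - DROSOPHILA_ECORES_ON_EXONS
anopheles_outside_exons = ANOPHELES_ECORES - ANOPHELES_ECORES_ON_EXONS

# Excess of ecores in the Anopheles assembly relative to Drosophila.
ecore_ratio = round(ANOPHELES_ECORES / DROSOPHILA_ECORES, 2)

print(low_estimate, high_estimate)
print(drosophila_outside_exons, anopheles_outside_exons, ecore_ratio)
```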
METHODS
Exofish Procedure
To determine the conditions that would generate alignment in
coding regions, we first tested a large range of TBLASTX (Altschul
et al. 1990) conditions (W, X, scoring matrix) between the ADH
region of Drosophila that contains 222 transcripts (Ashburner et
al. 1999), and a collection of 16 Mb of shotgun reads from the
Anopheles genome. All sequences were masked against known
repeats. For each condition, we kept an alignment if all of the
alignments with the same length and percent identity were located in a coding region. We selected the conditions that provided the highest sensitivity (match score = 15, mismatch
score = −3, W = 4, X = 19). We created a general filter based on
the combination of length and percent identity that distinguishes
alignments falling exclusively in exons from others. For this
purpose, we added a collection of sequences of 591 introns of
chromosome X of Drosophila (Benos et al. 2001) to the ADH
region. We compared this resource to a collection of 310 Mb
of shotgun reads from the Anopheles genome. Applying these
criteria a series of alignments was selected. We joined overlapping alignments to form ecores. Exofish is a three-step process: compute alignments/filter/create ecores.
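The last two steps of the procedure (filter, then create ecores) can be sketched as follows. This is a minimal illustration and not the Exofish code: the Alignment type, the function names, and in particular the threshold values in hypothetical_min_identity are placeholders, because the actual length and percent identity cut-offs are not given here. The alignment step itself (TBLASTX) is assumed to have been run beforehand.

```python
from typing import Callable, List, NamedTuple, Tuple


class Alignment(NamedTuple):
    start: int               # inclusive coordinates on the genome of interest
    end: int
    percent_identity: float


def hypothetical_min_identity(length: int) -> float:
    """Placeholder filter: shorter alignments must be more conserved (values invented)."""
    return 80.0 if length < 60 else 50.0


def passes_filter(aln: Alignment, min_identity: Callable[[int], float]) -> bool:
    """Keep an alignment if its identity reaches the threshold for its length."""
    length = aln.end - aln.start + 1
    return aln.percent_identity >= min_identity(length)


def build_ecores(
    alignments: List[Alignment],
    min_identity: Callable[[int], float] = hypothetical_min_identity,
) -> List[Tuple[int, int]]:
    """Filter alignments, then join overlapping ones into ecores."""
    kept = sorted((a.start, a.end) for a in alignments if passes_filter(a, min_identity))
    ecores: List[Tuple[int, int]] = []
    for start, end in kept:
        if ecores and start <= ecores[-1][1]:
            # Overlaps the previous ecore: extend it.
            ecores[-1] = (ecores[-1][0], max(ecores[-1][1], end))
        else:
            ecores.append((start, end))
    return ecores


# Toy alignments (hypothetical values).
alignments = [
    Alignment(100, 200, 60.0),
    Alignment(150, 260, 55.0),
    Alignment(300, 330, 70.0),   # short and below the placeholder threshold: discarded
    Alignment(400, 450, 90.0),
]
ecores = build_ecores(alignments)
print(ecores)
```

Sorting the retained alignments by start coordinate lets the merge run in a single pass, since any alignment that overlaps an existing ecore must overlap the most recent one.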
Reverse Mode and Duplicated Ecores
Ecores can be built either on the sequence of one species, or
on the sequence of the other one among the two genomes
being compared (reverse mode). We can link one ecore on one
genome to one ecore (possibly more than one) on the other
genome if they have common local alignments. To investigate duplications, we selected situations where one ecore on
one genome is linked to two ecores (on the other genome)
that are both exclusively linked to the same ecore.
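The selection rule for duplicated ecores can be written down as a small sketch, assuming that links between ecores are available as pairs of identifiers (one ecore per genome sharing a local alignment). The identifiers and the function name find_duplicated are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def find_duplicated(links: Iterable[Tuple[str, str]]) -> Dict[str, Tuple[str, str]]:
    """Return ecores of genome A linked to exactly two ecores of genome B,
    where both B ecores are linked to that A ecore and to no other."""
    a_to_b: Dict[str, Set[str]] = defaultdict(set)
    b_to_a: Dict[str, Set[str]] = defaultdict(set)
    for a, b in links:
        a_to_b[a].add(b)
        b_to_a[b].add(a)

    duplicated: Dict[str, Tuple[str, str]] = {}
    for a, partners in a_to_b.items():
        if len(partners) == 2 and all(b_to_a[b] == {a} for b in partners):
            first, second = sorted(partners)
            duplicated[a] = (first, second)
    return duplicated


# Toy links (hypothetical identifiers): d* on Drosophila, a* on Anopheles.
links = [
    ("d1", "a1"), ("d1", "a2"),   # once in Drosophila, twice in Anopheles
    ("d2", "a3"),                 # one-to-one, not a duplication
    ("d3", "a4"), ("d3", "a5"),   # rejected: a5 is also linked to d4
    ("d4", "a5"),
]
duplicated = find_duplicated(links)
print(duplicated)
```

Swapping the two elements of each pair gives the reverse count (once in Anopheles, twice in Drosophila).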
Selection of a Drosophila Reference Gene Set
To have a good estimate of sensitivity and specificity of
Exofish, we needed a collection of nonredundant and complete genes. We chose the BDGP gene models that correspond to a DGC reference (Stapleton et al. 2002). From this set, we eliminated the genes that have at least one intron overlapped by
another BDGP annotation. Hence, we
retained 6006 gene models.
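The exclusion rule for the reference set can be sketched as below. This is an illustration under stated assumptions: each gene model is reduced to a list of inclusive exon intervals, "another annotation" is taken to mean the full span of any other gene model, and strand is ignored. The names select_reference_models, introns and gene_span are hypothetical.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # (start, end), inclusive coordinates (assumed convention)


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two inclusive intervals share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def introns(exons: List[Interval]) -> List[Interval]:
    """Gaps between consecutive exons of one gene model."""
    ordered = sorted(exons)
    return [
        (left_end + 1, right_start - 1)
        for (_, left_end), (right_start, _) in zip(ordered, ordered[1:])
        if right_start - left_end > 1
    ]


def gene_span(exons: List[Interval]) -> Interval:
    """Interval from the first exon start to the last exon end."""
    return (min(s for s, _ in exons), max(e for _, e in exons))


def select_reference_models(models: Dict[str, List[Interval]]) -> List[str]:
    """Keep models with no intron overlapped by the span of another model."""
    kept = []
    for name, exons in models.items():
        other_spans = [gene_span(ex) for other, ex in models.items() if other != name]
        if not any(overlaps(i, s) for i in introns(exons) for s in other_spans):
            kept.append(name)
    return kept


# Toy models (hypothetical): geneB is nested in the intron of geneA.
models = {
    "geneA": [(100, 200), (500, 600)],
    "geneB": [(300, 350)],
    "geneC": [(1000, 1100), (1200, 1300)],
}
reference = select_reference_models(models)
print(reference)
```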
Computations
Anopheles cDNAs were mapped on the genomic sequence using Sim4 (Florea et al. 1998).
The series of BLAST comparisons were performed using
the Lassap package (Glemet and Codani 1997). All the computations were performed on a cluster of 40 Alpha CPUs (EV6.8/1 GHz)
organized in eight nodes (7 ES45 + 1 GS160-12) using
the Cluster File System.
ACKNOWLEDGMENTS
The publication costs of this article were defrayed in part by
payment of page charges. This article must therefore be
hereby marked “advertisement” in accordance with 18 USC
section 1734 solely to indicate this fact.
REFERENCES
Adams, M.D., Celniker, S.E., Holt, R.A., Evans, C.A., Gocayne, J.D.,
Amanatides, P.G., Scherer, S.E., Li, P.W., Hoskins, R.A., Galle,
R.F., et al. 2000. The genome sequence of Drosophila
melanogaster. Science 287: 2185–2195.
Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. 1990.
Basic local alignment search tool. J. Mol. Biol. 215: 403–410.
Ashburner, M., Misra, S., Roote, J., Lewis, S.E., Blazej, R., Davis, T.,
Doyle, C., Galle, R., George, R., Harris, N., et al. 1999. An
exploration of the sequence of a 2.9-Mb region of the genome of
Drosophila melanogaster: The Adh region. Genetics 153: 179–219.
Benos, P.V., Gatt, M.K., Murphy, L., Harris, D., Barrell, B., Ferraz, C.,
Vidal, S., Brun, C., Demaille, J., and Cadieu, E. 2001. From first
base: The sequence of the tip of the X chromosome of Drosophila
melanogaster, a comparison of two sequencing strategies. Genome
Res. 11: 710–730.
Celniker, S.E., Wheeler, D.A., Kronmiller, B., Carlson, J.W., Halpern,
A., Patel, S., Adams, M., Champe, M., Dugan, S.P., and Frise, E.
2002. Finishing a whole-genome shotgun: Release 3 of the
Drosophila melanogaster euchromatic genome sequence. Genome
Biol. 3: 7901–7914.
Florea, L., Hartzell, G., Zhang, Z., Rubin, G.M., and Miller, W. 1998.
A computer program for aligning a cDNA sequence with a
genomic DNA sequence. Genome Res. 8: 967–974.
Glemet, E. and Codani, J.J. 1997. LASSAP, a LArge Scale Sequence
compArison Package. Comput. Appl. Biosci. 13: 137–143.
Gopal, S., Schroeder, M., Pieper, U., Sczyrba, A., Aytekin-Kurban, G.,
Bekiranov, S., Fajardo, J.E., Eswar, N., Sanchez, R., Sali, A., et al.
2001. Homology-based annotation yields 1,042 new candidate genes
in the Drosophila melanogaster genome. Nat. Genet. 27: 337–340.
Holt, R.A., Subramanian, G.M., Halpern, A., Sutton, G.G., Charlab, R.,
Nusskern, D.R., Wincker, P., Clark, A.G., Ribeiro, J.M., Wides, R., et
al. 2002. The genome sequence of the malaria mosquito Anopheles
gambiae. Science 298: 129–149.
Kent, W.J. 2002. BLAT—The BLAST-like alignment tool. Genome Res.
12: 656–664.
Misra, S., Crosby, M.A., Mungall, C.J., Matthews, B.B., Campbell, K.S.,
Hradecky, P., Huang, Y., Kaminker, J.S., Millburn, G.H., Prochnik, S.E.,
et al. 2002. Annotation of the Drosophila melanogaster euchromatic
genome: A systematic review. Genome Biol. 3: 8301–8322.
Mural, R.J., Adams, M.D., Myers, E.W., Smith, H.O., Miklos, G.L.,
Wides, R., Halpern, A., Li, P.W., Sutton, G.G., Nadeau, J., et al.
2002. A comparison of whole-genome shotgun-derived mouse
chromosome 16 and the human genome. Science 296: 1161–1171.
Roest Crollius, H., Jaillon, O., Bernot, A., Dasilva, C., Bouneau, L.,
Fischer, C., Fizames, C., Wincker, P., Brottier, P., Quetier, F., et al.
2000. Estimate of human gene number provided by genome-wide
analysis using Tetraodon nigroviridis DNA sequence. Nat. Genet.
25: 235–238.
Rubin, G.M., Yandell, M.D., Wortman, J.R., Gabor Miklos, G.L.,
Nelson, C.R., Hariharan, I.K., Fortini, M.E., Li, P.W., Apweiler, R.,
Fleischmann, W., et al. 2000. Comparative genomics of the
eukaryotes. Science 287: 2204–2215.
Stapleton, M., Liao, G., Brokstein, P., Hong, L., Carninci, P., Shiraki, T.,
Hayashizaki, Y., Champe, M., Pacleb, J., Wan, K., et al. 2002. The
Drosophila gene collection: Identification of putative full-length
cDNAs for 70% of D. melanogaster genes. Genome Res. 12: 1294–1300.
Zdobnov, E.M., Von Mering, C., Letunic, I., Torrents, D., Suyama,
M., Copley, R.R., Christophides, G.K., Thomasova, D., Holt, R.A.,
Subramanian, G.M., et al. 2002. Comparative genome and
proteome analysis of Anopheles gambiae and Drosophila
melanogaster. Science 298: 149–159.
WEB SITE REFERENCES
http://www.fruitfly.org/DGC; BDGP; Drosophila gene collection.
http://www.fruitfly.org/annot/release2.html; BDGP; Drosophila
genome annotation release 2.
http://www.ensembl.org/Anopheles_gambiae; ENSEMBL mosquito
genome server.
http://www.genoscope.cns.fr/Exofish/Fly; Genoscope
Anopheles/Drosophila Exofish database.
http://genomes.rockefeller.edu/dm; A collection of additional
candidate genes in Drosophila.
Received October 24, 2002; accepted in revised form April 25, 2003.