Abstract
The genome of the flowering plant Arabidopsis thaliana has five chromosomes (refs 1, 2). Here we report the sequence of the largest, chromosome 1, in two contigs of around 14.2 and 14.6 megabases. The contigs extend from the telomeres to the centromeric borders, regions rich in transposons, retrotransposons and repetitive elements such as the 180-base-pair repeat. The chromosome represents 25% of the genome and contains about 6,850 open reading frames, 236 transfer RNAs (tRNAs) and 12 small nuclear RNAs. There are two clusters of tRNA genes at different places on the chromosome. One consists of 27 tRNA(Pro) genes and the other contains 27 tandem repeats of tRNA(Tyr)-tRNA(Tyr)-tRNA(Ser) genes. Chromosome 1 contains about 300 gene families with clustered duplications. There are also many repeat elements, representing 8% of the sequence.
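As a rough check on the figures quoted in the abstract, the following minimal Python sketch derives the total sequenced length, an approximate open-reading-frame density, and the number of tRNA genes held in the two clusters. The constant and function names are illustrative and do not come from the paper. The sketch assumes that each tandem repeat contributes three separate tRNA genes, and it measures density against the two contigs only, which stop at the centromeric borders.

```python
# Figures as quoted in the abstract; all are approximate ("around", "about").
CONTIG_MB = (14.2, 14.6)      # the two sequenced contigs, in megabases
ORF_COUNT = 6850              # open reading frames on chromosome 1
PRO_CLUSTER_GENES = 27        # genes in the tRNA(Pro) cluster
TYR_SER_REPEATS = 27          # tandem repeats of tRNA(Tyr)-tRNA(Tyr)-tRNA(Ser)
GENES_PER_REPEAT = 3          # assumption: each repeat unit holds three tRNA genes


def summarize_chromosome1():
    """Derive simple summary statistics from the abstract's numbers."""
    sequenced_mb = sum(CONTIG_MB)
    return {
        "sequenced_mb": sequenced_mb,
        # Average number of ORFs per megabase of sequenced DNA.
        "orfs_per_mb": ORF_COUNT / sequenced_mb,
        # Average spacing between ORFs, in kilobases.
        "kb_per_orf": sequenced_mb * 1000 / ORF_COUNT,
        # tRNA genes contained in the two clusters combined.
        "clustered_trna_genes": PRO_CLUSTER_GENES
        + TYR_SER_REPEATS * GENES_PER_REPEAT,
    }


if __name__ == "__main__":
    for name, value in summarize_chromosome1().items():
        print(f"{name}: {value:.1f}")
```

Under these assumptions the two contigs total about 28.8 megabases, giving roughly 238 open reading frames per megabase (one every 4.2 kilobases or so), and the two clusters together hold 108 tRNA genes.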
References
Goodman, H., Ecker, J. R. & Dean, C. The genome of Arabidopsis thaliana. Proc. Natl Acad. Sci. USA 93, 10831–10835 (1995).
Meyerowitz, E. M. in Arabidopsis (eds Meyerowitz, E. M. & Somerville, C.) 21–36 (Cold Spring Harbor Press, Cold Spring Harbor, NY, 1994).
Goffeau, A. et al. Life with 6000 genes. Science 274, 546–567 (1996).
C. elegans Sequencing Consortium. Genome sequence of the nematode C. elegans: A platform for investigating biology. Science 282, 2012–2046 (1998).
Adams, M. D. et al. The genome sequence of Drosophila melanogaster. Science 287, 2185–2195 (2000).
Lin, X. et al. Sequence and analysis of chromosome 2 of the plant Arabidopsis thaliana. Nature 402, 761–768 (1999).
Mayer, K. et al. Sequence and analysis of chromosome 4 of the plant Arabidopsis thaliana. Nature 402, 769–777 (1999).
Mozo, T. et al. A complete BAC-based physical map of the Arabidopsis thaliana genome. Nature Genet. 22, 271–275 (1999).
Marra, M. et al. A map for sequence analysis of the Arabidopsis thaliana genome. Nature Genet. 22, 265–270 (1999).
Creusot, F. et al. The CIC library: a large insert YAC library for genome mapping in Arabidopsis thaliana. Plant J. 8, 763–770 (1995).
Ewens, W. J. et al. Genome mapping with anchored clones: theoretical aspects. Genomics 11, 799–805 (1991).
Venter, J. C., Smith, H. O. & Hood, L. A new strategy for sequencing. Nature 381, 364–366 (1996).
Choi, S., Creelman, R. A., Mullet, J. E. & Wing, R. Construction and characterization of a bacterial artificial chromosome library of Arabidopsis thaliana. Plant Mol. Biol. Rep. 13, 124–128 (1995).
Mozo, T., Fischer, S., Shizuya, H. & Altmann, T. Construction and characterization of the IGF Arabidopsis BAC library. Mol. Gen. Genet. 258, 562–570 (1998).
Round, E. K., Flowers, S. K. & Richards, E. J. Arabidopsis thaliana centromere regions: genetic map positions and repetitive DNA structure. Genome Res. 7, 1045–1053 (1997).
Richards, E. J., Chao, S., Vongs, A. & Yang, J. Characterization of Arabidopsis thaliana telomeres isolated in yeast. Nucleic Acids Res. 20, 4039–4046 (1992).
Altschul, S. F. et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402 (1997).
Lister, C. & Dean, C. Recombinant inbred lines for mapping RFLP and phenotypic markers in Arabidopsis thaliana. Plant J. 4, 745–750 (1993).
The Arabidopsis Genome Initiative. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408, 796–815 (2000).
Beier, D., Stange, N., Gross, H. J. & Beier, H. Nuclear tRNA(Tyr) genes are highly amplified at a single chromosomal site in the genome of Arabidopsis thaliana. Mol. Gen. Genet. 225, 72–80 (1991).
Copenhaver, G. P. et al. Genetic definition and sequence analysis of Arabidopsis centromeres. Science 286, 2468–2474 (1999).
Conner, J. A., Conner, P., Nasrallah, M. E. & Nasrallah, J. B. Comparative mapping of the Brassica S locus region and its homolog in Arabidopsis: implications for the evolution of mating systems in the Brassicaceae. Plant Cell 10, 801–812 (1998).
Rottmann, W. E. et al. 1-aminocyclopropane-1-carboxylate synthase in tomato is encoded by a multigene family whose transcription is induced during fruit and floral senescence. J. Mol. Biol. 222, 937–961 (1991).
Salanoubat, M. et al. Sequence and analysis of chromosome 3 of the plant Arabidopsis thaliana. Nature 408, 820–822 (2000).
Tabata, S. et al. Sequence and analysis of chromosome 5 of the plant Arabidopsis thaliana. Nature 408, 823–826 (2000).
Chory, J. et al. National Science Foundation-sponsored workshop report: “The 2010 Project” functional genomics and the virtual plant. A blueprint for understanding how plants are built and how to improve them. Plant Physiol. 123, 423–426 (2000).
Lockhart, D. J. & Winzeler, E. A. Genomics, gene expression and DNA arrays. Nature 405, 827–836 (2000).
Ecker, J. R. PFGE and YAC analysis of the Arabidopsis genome. Methods 1, 186–194 (1990).
Oefner, P. J. et al. Efficient random subcloning of DNA sheared in a recirculating point-sink flow system. Nucleic Acids Res. 24, 3879–3886 (1996).
Dietrich, F. S. et al. The nucleotide sequence of Saccharomyces cerevisiae chromosome V. Nature (Suppl.) 387, 78–81 (1997).
Marziali, A., Willis, T. D., Federspiel, N. A. & Davis, R. W. An automated sample preparation system for large-scale DNA sequencing. Genome Res. 9, 457–462 (1999).
Ewing, B., Hillier, L., Wendl, M. C. & Green, P. Base-calling of automated sequencer traces using Phred I. Accuracy assessment. Genome Res. 8, 175–185 (1998).
Ewing, B. & Green, P. Base-calling of automated sequencer traces using Phred II. Error probabilities. Genome Res. 8, 186–194 (1998).
Gordon, D., Abajian, C. & Green, P. Consed: a graphical tool for sequence finishing. Genome Res. 8, 195–202 (1998).
Uberbacher, E. C. & Mural, R. J. Locating protein-coding regions in human DNA sequences by a multiple sensor-neural network approach. Proc. Natl Acad. Sci. USA 88, 11261–11265 (1991).
Burge, C. & Karlin, S. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 268, 78–94 (1997).
Salzberg, S. L., Pertea, M., Delcher, A. L., Gardner, M. J. & Tettelin, H. Interpolated Markov models for eukaryotic gene finding. Genomics 59, 24–31 (1999).
Lukashin, A. V. & Borodovsky, M. GeneMark.hmm: new solutions for gene finding. Nucleic Acids Res. 26, 1107–1115 (1998).
Hebsgaard, S. M. et al. Splice site prediction in Arabidopsis thaliana pre-mRNA by combining local and global sequence information. Nucleic Acids Res. 24, 3430–3452 (1996).
Huang, X., Adams, M. D., Zhou, H. & Kerlavage, A. R. A tool for analyzing and annotating genomic sequences. Genomics 46, 37–45 (1997).
Frishman, D. & Mewes, H.-W. PEDANTic genome analysis. Trends Genet. 13, 415–416 (1997).
Lowe, T. M. & Eddy, S. R. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25, 955–964 (1997).
Emanuelsson, O., Nielsen, H., Brunak, S. & von Heijne, G. Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. J. Mol. Biol. 300, 1005–1016 (2000).
Acknowledgements
We thank K. Mayer and H. Schoof of MIPS for discussions; S. Rhee and E. Huala of TAIR for sequences for the RI markers; and R. Wells for editing the manuscript. This work was funded by National Science Foundation/US Department of Energy/US Department of Agriculture (NSF/DOE/USDA) grants to the SPP Consortium and TIGR.
Cite this article
Theologis, A., Ecker, J., Palm, C. et al. Sequence and analysis of chromosome 1 of the plant Arabidopsis thaliana. Nature 408, 816–820 (2000). https://doi.org/10.1038/35048500
DOI: https://doi.org/10.1038/35048500