Almost all human genes resulted from ancient duplication

Proc Natl Acad Sci U S A. 2006 Dec 12;103(50):19027-32. doi: 10.1073/pnas.0608796103. Epub 2006 Dec 4.

Abstract

Results of protein sequence comparison at an open criterion show a very large number of relationships that have, up to now, gone unreported. The relationships suggest many ancient events of gene duplication. It is well known that gene duplication has been a major process in the evolution of genomes. A collection of human genes of known function has been examined for a history of gene duplications, detected by means of amino acid sequence similarity using BLASTp with an expectation of two or less (the open criterion). Because the collection of genes in build 35 includes sets of transcript variants, all genes of known function were collected and only the longest transcript variant of each was included, yielding a 13,298-member library called KGMV (for known genes maximum variant). When matches of all lengths are accepted, >97% of human genes show significant matches to each other. Many form matches with a large number of other, different proteins, showing that most genes are made up from parts of many others as a result of ancient events of duplication. To support the use of the open criterion, all of the members of the KGMV library were twice replaced with random protein sequences of the same length and average composition, and all were compared with each other with BLASTp at an expectation of two or less. The set of matches averaged 0.35% of that observed for the KGMV set of proteins.
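
The random-sequence control described above can be illustrated with a minimal sketch. It is not the authors' code, and it rests on assumptions: "average composition" is read here as the amino acid frequencies pooled over the whole library (the abstract does not say whether composition was matched per library or per protein), and the function names `average_composition` and `random_library` are hypothetical. The sketch only builds the randomized library; the all-against-all BLASTp comparison would be run separately.

```python
import random
from collections import Counter


def average_composition(sequences):
    """Return amino acid frequencies pooled over every sequence in the library."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq)
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def random_library(sequences, seed=0):
    """Replace each sequence with a random one of the same length.

    Residues are drawn independently from the pooled composition
    (an assumption about what "average composition" means).
    """
    rng = random.Random(seed)
    freqs = average_composition(sequences)
    residues = sorted(freqs)
    weights = [freqs[aa] for aa in residues]
    return [
        "".join(rng.choices(residues, weights=weights, k=len(seq)))
        for seq in sequences
    ]


if __name__ == "__main__":
    # Toy sequences for illustration only; these are not KGMV entries.
    toy_library = ["MKTAYIAKQR", "GAVLIMFWPS", "ACDE"]
    for original, shuffled in zip(toy_library, random_library(toy_library, seed=1)):
        print(len(original), original, shuffled)
```

Using two different seeds would give two independent randomized libraries, mirroring the "twice replaced" design; the match count from each could then be compared with the count obtained for the real library.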

MeSH terms

  • Amino Acid Sequence
  • Computational Biology
  • Computer Simulation
  • Databases, Genetic
  • Evolution, Molecular*
  • Gene Duplication*
  • Humans
  • Proteins / genetics

Substances

  • Proteins