
Conservation of intrinsic disorder in protein domains and families: I. A database of conserved predicted disordered regions

J Proteome Res. 2006 Apr;5(4):879-87. doi: 10.1021/pr060048x.

Abstract

Many protein regions have been shown to be intrinsically disordered, lacking unique structure under physiological conditions. These intrinsically disordered regions are not only very common in proteomes, but also crucial to the function of many proteins, especially those involved in signaling, recognition, and regulation. The goal of this work was to identify the prevalence, characteristics, and functions of conserved disordered regions within protein domains and families. A database was created to store the amino acid sequences of nearly one million proteins and their domain matches from the InterPro database, a resource integrating eight different protein family and domain databases. Disorder prediction was performed on these protein sequences. Regions of sequence corresponding to domains were aligned using a multiple sequence alignment tool. From this initial information, regions of conserved predicted disorder were found within the domains. The methodology for this search consisted of finding regions of consecutive positions in the multiple sequence alignments in which 90% or more of the sequences were predicted to be disordered. This procedure was constrained to find such regions of conserved disorder prediction that were at least 20 amino acids in length. The results of this work included 3,653 regions of conserved disorder prediction, found within 2,898 distinct InterPro entries. Most regions of conserved predicted disorder detected were short, with fewer than 10% of those found exceeding 30 residues in length.
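The search step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `conserved_disorder_regions`, the per-column encoding (`D` for predicted disorder, `O` for predicted order, `-` for an alignment gap), and the choice to count gaps as not disordered are all assumptions made for illustration; only the 90% threshold and the 20-position minimum length come from the abstract.

```python
from typing import List, Tuple


def conserved_disorder_regions(
    disorder_rows: List[str],
    min_fraction: float = 0.9,   # threshold stated in the abstract
    min_length: int = 20,        # minimum region length stated in the abstract
) -> List[Tuple[int, int]]:
    """Return half-open (start, end) alignment-column ranges of conserved predicted disorder.

    Each row is one aligned sequence, encoded per column (hypothetical encoding):
    'D' = predicted disordered, 'O' = predicted ordered, '-' = gap.
    Assumption: gaps count as not disordered, and the fraction is taken
    over all sequences in the alignment.
    """
    if not disorder_rows:
        return []
    n_cols = len(disorder_rows[0])
    if any(len(row) != n_cols for row in disorder_rows):
        raise ValueError("all aligned rows must have the same length")

    n_seqs = len(disorder_rows)
    regions: List[Tuple[int, int]] = []
    start = None  # first column of the run currently being extended, if any

    for col in range(n_cols):
        n_disordered = sum(1 for row in disorder_rows if row[col] == "D")
        conserved = n_disordered / n_seqs >= min_fraction
        if conserved and start is None:
            start = col                      # a new run begins here
        elif not conserved and start is not None:
            if col - start >= min_length:    # keep only sufficiently long runs
                regions.append((start, col))
            start = None

    # Close a run that extends to the last alignment column.
    if start is not None and n_cols - start >= min_length:
        regions.append((start, n_cols))
    return regions
```

Called on a list of equal-length strings, one per aligned sequence, the function returns the column ranges that meet both criteria. How gaps were treated in the original study is not stated in the abstract, so that part of the sketch may need adjusting.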

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Amino Acids / chemistry
  • Animals
  • Conserved Sequence*
  • Databases, Factual
  • Entropy
  • Molecular Sequence Data
  • Plants
  • Predictive Value of Tests
  • Protein Structure, Tertiary*
  • Proteins / chemistry*
  • Proteins / genetics
  • Sequence Homology, Amino Acid
  • Software
  • Structure-Activity Relationship

Substances

  • Amino Acids
  • Proteins