Enzyme-mediated modifications at the wobble position of tRNAs are essential for the translation of the genetic code. We report the genetic, biochemical and structural characterization of CmoB, the enzyme that recognizes the unique metabolite carboxy-S-adenosine-L-methionine (Cx-SAM) and catalyzes a carboxymethyl transfer reaction resulting in formation of 5-oxyacetyluridine at the wobble position of tRNAs. CmoB is distinctive in that it is the only known member of the SAM-dependent methyltransferase (SDMT) superfamily that utilizes a naturally occurring SAM analog as the alkyl donor to fulfill a biologically meaningful function. Biochemical and genetic studies define the in vitro and in vivo selectivity for Cx-SAM as alkyl donor over the vastly more abundant SAM. Complementary high-resolution structures of the apo- and Cx-SAM bound CmoB reveal the determinants responsible for this remarkable discrimination. Together, these studies provide mechanistic insight into the enzymatic and n...
The gene encoding rabbit muscle creatine kinase (CK) has been subcloned into a single plasmid, and the protein expressed in a soluble and functional form in Escherichia coli. The amino terminus, specific activity, and electrophoretic mobility of the E. coli-expressed creatine kinase are all identical with that of creatine kinase purified from rabbit skeletal muscle. Surprisingly, isoelectric focusing shows that the expressed protein displays no less heterogeneity than the tissue-purified material. The identification of the source(s) of this heterogeneity is important for the preparation of highly homogeneous material needed for structural studies and clinical applications. This issue also has implications for studies of the developmental regulation and tissue localization of the various CK genes. Our results allow us to eliminate some of the proposals, such as the presence of multiple alleles, alternative ribosomal initiation sites, and post-translational glycosylation or phosphoryl...
As the volume of data relating to proteins increases, researchers rely more and more on the analysis of published data, thus increasing the importance of good access to these data that vary from the supplemental material of individual papers, all the way to major reference databases with professional staff and long-term funding. Specialist protein resources fill an important middle ground, providing interactive web interfaces to their databases for a focused topic or family of proteins, using specialised approaches that are not feasible in the major reference databases. Many are labours of love, run by a single lab with little or no dedicated funding and there are many challenges to building and maintaining them. This perspective arose from a meeting of several specialist protein resources and major reference databases held at the Wellcome Trust Genome Campus (Cambridge, UK) on the 11th and 12th of August 2014. During this meeting some common key challenges involved in creating and ...
Understanding how enzymes have evolved offers clues about their structure-function relationships and mechanisms. Here, we describe evolution of functionally diverse enzyme superfamilies, each representing a large set of sequences that evolved from a common ancestor and that retain conserved features of their structures and active sites. Using several examples, we describe the different structural strategies nature has used to evolve new reaction and substrate specificities in each unique superfamily. The results provide insight about enzyme evolution that is not easily obtained from studies of one or only a few enzymes.
Superfamily and family analyses provide an effective tool for the functional classification of proteins, but must be automated for use on large datasets. We describe a 'gold standard' set of enzyme superfamilies, clustered according to specific sequence, structure, and functional criteria, for use in the validation of family and superfamily clustering methods. The gold standard set represents four fold classes and differing clustering difficulties, and includes five superfamilies, 91 families, 4,887 sequences and 282 structures.
Identification and functional characterization of the genes in the human genome remain a major challenge. A principal source of publicly available information used for this purpose is the National Center for Biotechnology Information database of expressed sequence tags (dbEST), which contains over 4 million human ESTs. To extract the information buried in this data more effectively, we have developed a semiautomated method to mine dbEST for uncharacterized human genes. Starting with a single protein input sequence, a family of related proteins from all species is compiled. This entire family is then used to mine the human EST database for new gene candidates. Evaluation of putative new gene candidates in the context of a family of characterized proteins provides a framework for inference of the structure and function of the new genes. When applied to a test data set of 28 families within the major facilitator superfamily (MFS) of membrane transporters, our protocol found 73 previous...
Pacific Symposium on Biocomputing, 2001
Visualization interfaces for high performance computing systems pose special problems due to the complexity and volume of data these systems manipulate. In the post-genomic era, scientists must be able to quickly gain insight into structure-function problems, and require flexible computing environments to quickly create interfaces that link the relevant tools. Feature, a program for analyzing protein sites, takes a set of 3-dimensional structures and creates statistical models of sites of structural or functional significance. Until now, Feature has provided no support for visualization, which can make understanding its results difficult. We have developed an extension to the molecular visualization program Chimera that integrates Feature's statistical models and site predictions with 3-dimensional structures viewed in Chimera. We call this extension ViewFeature, and it is designed to help users understand the structural features that define a site of interest. We applied ViewFe...
Gene knockouts in a model organism such as mouse provide a valuable resource for the study of basic biology and human disease. Determining which gene has been inactivated by an untargeted gene trapping event poses a challenging annotation problem because gene trap sequence tags, which represent sequence near the vector insertion site of a trapped gene, are typically short and often contain unresolved residues. To understand better the localization of these sequences on the mouse genome, we compared stand-alone versions of the alignment programs BLAT, SSAHA, and MegaBLAST. A set of 3,369 sequence tags was aligned to build 34 of the mouse genome using default parameters for each algorithm. Known genome coordinates for the cognate set of full-length genes (1,659 sequences) were used to evaluate localization results. In general, all three programs performed well in terms of localizing sequences to a general region of the genome, with only relatively subtle errors identified for a small ...
This review examines the enzymes of the 4-chlorobenzoate to 4-hydroxybenzoate converting pathway found in certain soil bacteria. This pathway consists of three enzymes: 4-chlorobenzoate:Coenzyme A ligase, 4-chlorobenzoyl-Coenzyme A dehalogenase and 4-hydroxybenzoyl-Coenzyme A thioesterase. Recent progress made in the cloning and expression of the pathway genes from assorted bacterial strains is described. Gene order and sequence found among these strains are
Metabolic pathways in eubacteria and archaea often are encoded by operons and/or gene clusters (genome neighborhoods) that provide important clues for assignment of both enzyme functions and metabolic pathways. We describe a bioinformatic approach (genome neighborhood network; GNN) that enables large scale prediction of the in vitro enzymatic activities and in vivo physiological functions (metabolic pathways) of uncharacterized enzymes in protein families. We demonstrate the utility of the GNN approach by predicting in vitro activities and in vivo functions in the proline racemase superfamily (PRS; InterPro IPR008794). The predictions were verified by measuring in vitro activities for 51 proteins in 12 families in the PRS that represent ∼85% of the sequences; in vitro activities of pathway enzymes, carbon/nitrogen source phenotypes, and/or transcriptomic studies confirmed the predicted pathways. The synergistic use of sequence similarity networks and GNNs will facilitate the discov...
To construct the entry clone, blunt-end FVII cDNA and subsequent polymerase chain reaction (PCR) product isolated from a HepG2 cell line was TOPO-cloned into a pENTR TOPO vector. To construct the expression clone, an LR recombination reaction was carried out ...
... IOS Press, 1999. Structure-Based Sequence Alignment of Mandelate Racemase and Muconate Lactonizing Enzyme: Superposition on Reality. Miriam S. Hasson, Patricia C. Babbitt, Dagmar Ringe. ... Mol Biol Evol 11 (1994) 571-92. [3] DJ Neidhart. ...
INTRODUCTION TO INFORMATICS APPLICATIONS IN STRUCTURAL GENOMICS. SEAN D. MOONEY, Stanford Medical Informatics, Department of Genetics, Stanford University, Stanford, CA 94305. PATRICIA C BABBIT ...
Papers by Patricia Babbitt