Patricia Babbitt

UCSF, Bioengineering and Therapeutic Sciences, Faculty Member

Papers by Patricia Babbitt

Understanding Enzyme Superfamilies: Chemistry as the Fundamental Determinant in the Evolution of New Catalytic Activities

Journal of Biological Chemistry, 1997

The Nucleophilic Attack Six-Bladed β-Propeller (N6P) Superfamily

Relating Protein Sequence, Structure, and Function, 2013

Determinants of the CmoB carboxymethyl transferase utilized for selective tRNA wobble modification

by Patricia Babbitt and Jungwook Kim

Nucleic acids research, Jan 8, 2015

Enzyme-mediated modifications at the wobble position of tRNAs are essential for the translation of the genetic code. We report the genetic, biochemical and structural characterization of CmoB, the enzyme that recognizes the unique metabolite carboxy-S-adenosine-L-methionine (Cx-SAM) and catalyzes a carboxymethyl transfer reaction resulting in formation of 5-oxyacetyluridine at the wobble position of tRNAs. CmoB is distinctive in that it is the only known member of the SAM-dependent methyltransferase (SDMT) superfamily that utilizes a naturally occurring SAM analog as the alkyl donor to fulfill a biologically meaningful function. Biochemical and genetic studies define the in vitro and in vivo selectivity for Cx-SAM as alkyl donor over the vastly more abundant SAM. Complementary high-resolution structures of the apo- and Cx-SAM bound CmoB reveal the determinants responsible for this remarkable discrimination. Together, these studies provide mechanistic insight into the enzymatic and n...

Cloning and expression of functional rabbit muscle creatine kinase in Escherichia coli. Addressing the problem of microheterogeneity

The Journal of biological chemistry, Jan 25, 1991

The gene encoding rabbit muscle creatine kinase (CK) has been subcloned into a single plasmid, and the protein expressed in a soluble and functional form in Escherichia coli. The amino terminus, specific activity, and electrophoretic mobility of the E. coli-expressed creatine kinase are all identical with that of creatine kinase purified from rabbit skeletal muscle. Surprisingly, isoelectric focusing shows that the expressed protein displays no less heterogeneity than the tissue-purified material. The identification of the source(s) of this heterogeneity is important for the preparation of highly homogeneous material needed for structural studies and clinical applications. This issue also has implications for studies of the developmental regulation and tissue localization of the various CK genes. Our results allow us to eliminate some of the proposals, such as the presence of multiple alleles, alternative ribosomal initiation sites, and post-translational glycosylation or phosphoryl...

Key challenges for the creation and maintenance of specialist protein resources

Proteins, Jan 27, 2015

As the volume of data relating to proteins increases, researchers rely more and more on the analysis of published data, thus increasing the importance of good access to these data that vary from the supplemental material of individual papers, all the way to major reference databases with professional staff and long-term funding. Specialist protein resources fill an important middle ground, providing interactive web interfaces to their databases for a focused topic or family of proteins, using specialised approaches that are not feasible in the major reference databases. Many are labours of love, run by a single lab with little or no dedicated funding and there are many challenges to building and maintaining them. This perspective arose from a meeting of several specialist protein resources and major reference databases held at the Wellcome Trust Genome Campus (Cambridge, UK) on the 11th and 12th of August 2014. During this meeting some common key challenges involved in creating and ...

Using the structure-function linkage database to characterize functional domains in enzymes

Current protocols in bioinformatics / editorial board, Andreas D. Baxevanis ... [et al.], 2014

The Structure-Function Linkage Database (SFLD; http://sfld.rbvi.ucsf.edu/) is a Web-accessible database designed to link enzyme sequence, structure, and functional information. This unit describes the protocols by which a user may query the database to predict the function of uncharacterized enzymes and to correct misannotated functional assignments. The information in this unit is especially useful in helping a user discriminate functional capabilities of a sequence that is only distantly related to characterized sequences in publicly available databases. © 2014 by John Wiley & Sons, Inc.

New insights about enzyme evolution from large scale studies of sequence and structure relationships

The Journal of biological chemistry, Jan 31, 2014

Understanding how enzymes have evolved offers clues about their structure-function relationships and mechanisms. Here, we describe evolution of functionally diverse enzyme superfamilies, each representing a large set of sequences that evolved from a common ancestor and that retain conserved features of their structures and active sites. Using several examples, we describe the different structural strategies nature has used to evolve new reaction and substrate specificities in each unique superfamily. The results provide insight about enzyme evolution that is not easily obtained from studies of one or only a few enzymes.

A gold standard set of mechanistically diverse enzyme superfamilies

Genome biology, 2006

Superfamily and family analyses provide an effective tool for the functional classification of proteins, but must be automated for use on large datasets. We describe a 'gold standard' set of enzyme superfamilies, clustered according to specific sequence, structure, and functional criteria, for use in the validation of family and superfamily clustering methods. The gold standard set represents four fold classes and differing clustering difficulties, and includes five superfamilies, 91 families, 4,887 sequences and 282 structures.
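To illustrate how a gold standard set of this kind can be used to validate a clustering method, here is a minimal sketch that scores predicted clusters against curated family labels using pairwise precision and recall. The metric, the function name and the sequence and family identifiers are assumptions chosen for illustration; the abstract does not specify which evaluation measure the authors used.

```python
from itertools import combinations


def pairwise_precision_recall(gold, predicted):
    """Compare two labelings of the same sequences pair by pair.

    gold, predicted: dicts mapping sequence ID -> family/cluster label.
    Only sequences present in both dicts are scored.
    """
    ids = sorted(set(gold) & set(predicted))
    tp = fp = fn = 0
    for a, b in combinations(ids, 2):
        same_gold = gold[a] == gold[b]
        same_pred = predicted[a] == predicted[b]
        if same_pred and same_gold:
            tp += 1  # pair correctly placed together
        elif same_pred:
            fp += 1  # pair merged but belongs to different families
        elif same_gold:
            fn += 1  # pair split but belongs to the same family
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy data: two curated families, one over-merged cluster.
gold = {"seq1": "famA", "seq2": "famA", "seq3": "famB", "seq4": "famB"}
predicted = {"seq1": "c1", "seq2": "c1", "seq3": "c1", "seq4": "c2"}
precision, recall = pairwise_precision_recall(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Pairwise scoring is only one option; it avoids having to match predicted cluster names to family names, at the cost of weighting large families more heavily.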

A semiautomated approach to gene discovery through expressed sequence tag data mining: discovery of new human transporter genes

AAPS pharmSci, 2003

Identification and functional characterization of the genes in the human genome remain a major challenge. A principal source of publicly available information used for this purpose is the National Center for Biotechnology Information database of expressed sequence tags (dbEST), which contains over 4 million human ESTs. To extract the information buried in this data more effectively, we have developed a semiautomated method to mine dbEST for uncharacterized human genes. Starting with a single protein input sequence, a family of related proteins from all species is compiled. This entire family is then used to mine the human EST database for new gene candidates. Evaluation of putative new gene candidates in the context of a family of characterized proteins provides a framework for inference of the structure and function of the new genes. When applied to a test data set of 28 families within the major facilitator superfamily (MFS) of membrane transporters, our protocol found 73 previous...

ViewFeature: integrated feature analysis and visualization

Pacific Symposium on Biocomputing, 2001

Visualization interfaces for high performance computing systems pose special problems due to the complexity and volume of data these systems manipulate. In the post-genomic era, scientists must be able to quickly gain insight into structure-function problems, and require flexible computing environments to quickly create interfaces that link the relevant tools. Feature, a program for analyzing protein sites, takes a set of 3-dimensional structures and creates statistical models of sites of structural or functional significance. Until now, Feature has provided no support for visualization, which can make understanding its results difficult. We have developed an extension to the molecular visualization program Chimera that integrates Feature's statistical models and site predictions with 3-dimensional structures viewed in Chimera. We call this extension ViewFeature, and it is designed to help users understand the structural Features that define a site of interest. We applied ViewFe...

Comparison of methods for genomic localization of gene trap sequences

BMC genomics, 2006

Gene knockouts in a model organism such as mouse provide a valuable resource for the study of basic biology and human disease. Determining which gene has been inactivated by an untargeted gene trapping event poses a challenging annotation problem because gene trap sequence tags, which represent sequence near the vector insertion site of a trapped gene, are typically short and often contain unresolved residues. To understand better the localization of these sequences on the mouse genome, we compared stand-alone versions of the alignment programs BLAT, SSAHA, and MegaBLAST. A set of 3,369 sequence tags was aligned to build 34 of the mouse genome using default parameters for each algorithm. Known genome coordinates for the cognate set of full-length genes (1,659 sequences) were used to evaluate localization results. In general, all three programs performed well in terms of localizing sequences to a general region of the genome, with only relatively subtle errors identified for a small ...
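The evaluation step described above, checking each tag's aligned position against the known coordinates of its cognate gene, might be sketched as follows. This is an illustrative assumption, not the authors' procedure: the interval-overlap criterion, the `slack` tolerance, the function names and the coordinates are all hypothetical, and parsing of BLAT, SSAHA or MegaBLAST output is left out.

```python
def overlaps(hit, gene, slack=0):
    """Return True if a hit falls on the gene's chromosome and overlaps
    the gene interval, optionally widened by `slack` bases on each side.

    hit, gene: (chromosome, start, end) tuples with start <= end.
    """
    h_chrom, h_start, h_end = hit
    g_chrom, g_start, g_end = gene
    if h_chrom != g_chrom:
        return False
    return h_start <= g_end + slack and h_end >= g_start - slack


def localization_accuracy(hits, known, slack=0):
    """Fraction of tags whose best hit overlaps the known gene location.

    hits:  dict mapping tag ID -> best-hit interval from one aligner.
    known: dict mapping tag ID -> interval of the cognate full-length gene.
    Tags without a known location are skipped.
    """
    scored = [tag for tag in hits if tag in known]
    if not scored:
        return 0.0
    correct = sum(overlaps(hits[tag], known[tag], slack) for tag in scored)
    return correct / len(scored)


# Hypothetical coordinates for two tags; one hit lands on the wrong chromosome.
known = {"tagA": ("chr1", 1000, 5000), "tagB": ("chr2", 200, 900)}
hits = {"tagA": ("chr1", 4800, 5100), "tagB": ("chr3", 250, 400)}
accuracy = localization_accuracy(hits, known)
print(f"accuracy={accuracy:.2f}")
```

Running the same function on the hits from each aligner would give one comparable figure per program; a nonzero `slack` relaxes the criterion from gene overlap to localization within a general region.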

Representing Structure-Function Relationships in Mechanistically Diverse Enzyme Superfamilies

Biocomputing 2005 - Proceedings of the Pacific Symposium, 2005

On the origins and functions of the enzymes of the 4-chlorobenzoate to 4-hydroxybenzoate converting pathway

Biodegradation, 1994

This review examines the enzymes of 4-chlorobenzoate to 4-hydroxybenzoate converting pathway found in certain soil bacteria. This pathway consists of three enzymes: 4-chlorobenzoate: Coenzyme A ligase, 4-chlorobenzoyl-Coenzyme A dehalogenase and 4-hydroxybenzoyl-Coenzyme A thioesterase. Recent progress made in the cloning and expression of the pathway genes from assorted bacterial strains is described. Gene order and sequence found among these strains are

Prediction and characterization of enzymatic activities guided by sequence similarity and genome neighborhood networks

eLife, 2014

Metabolic pathways in eubacteria and archaea often are encoded by operons and/or gene clusters (genome neighborhoods) that provide important clues for assignment of both enzyme functions and metabolic pathways. We describe a bioinformatic approach (genome neighborhood network; GNN) that enables large scale prediction of the in vitro enzymatic activities and in vivo physiological functions (metabolic pathways) of uncharacterized enzymes in protein families. We demonstrate the utility of the GNN approach by predicting in vitro activities and in vivo functions in the proline racemase superfamily (PRS; InterPro IPR008794). The predictions were verified by measuring in vitro activities for 51 proteins in 12 families in the PRS that represent ∼85% of the sequences; in vitro activities of pathway enzymes, carbon/nitrogen source phenotypes, and/or transcriptomic studies confirmed the predicted pathways. The synergistic use of sequence similarity networks and GNNs will facilitate the discov...
