J. Mol. Biol. (2005) 354, 630–641
doi:10.1016/j.jmb.2005.09.048

Structures of Mycobacterium tuberculosis DosR and DosR–DNA Complex Involved in Gene Activation during Adaptation to Hypoxic Latency

Goragot Wisedchaisri1,2,3, Meiting Wu1,2, Adrian E. Rice1,2, David M. Roberts4, David R. Sherman4 and Wim G. J. Hol1,2,3,5*

1 Department of Biochemistry, University of Washington, Seattle, WA 98195, USA
2 Biomolecular Structure Center, University of Washington, Seattle, WA 98195, USA
3 Biomolecular Structure and Design (BMSD) Graduate Program, University of Washington, Seattle, WA 98195, USA
4 Department of Pathobiology, University of Washington, Seattle, WA 98195, USA
5 Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
*Corresponding author

On encountering low oxygen conditions, DosR activates the transcription of 47 genes, promoting long-term survival of Mycobacterium tuberculosis in a non-replicating state. Here, we report the crystal structures of the DosR C-terminal domain and its complex with a consensus DNA sequence of the hypoxia-induced gene promoter. The DosR C-terminal domain contains four α-helices and forms tetramers consisting of two dimers with non-intersecting dyads. In the DNA-bound structure, each DosR C-terminal domain in a dimer places its DNA-binding helix deep into the major groove, causing two bends in the DNA. DosR makes numerous protein–DNA base contacts using only three amino acid residues per subunit: Lys179, Lys182, and Asn183. The DosR tetramer is unique among response regulators with known structures.

© 2005 Elsevier Ltd. All rights reserved.

Keywords: tuberculosis; persistence; two-component system; crystal structures; protein–DNA complex

Present address: A. E. Rice, Department of Biochemistry and Molecular Biophysics, California Institute of Technology, M/C 114-96, Pasadena, CA 91125, USA.
Abbreviations used: SAD, single-wavelength anomalous dispersion.
E-mail address of the corresponding author: wghol@u.washington.edu

Introduction

Tuberculosis is a major global infectious disease responsible for two million deaths per year worldwide,1 with eight million new cases reported annually.2 The causative agent, Mycobacterium tuberculosis, is remarkably successful because of its ability to persist in infected individuals without producing any symptoms for extended periods of time. It is estimated that one-third of the world’s population is latently infected with this pathogen.2 Latent tuberculosis is of major public health concern, because it acts as a reservoir of disease that can remain dormant for decades before re-emerging as active disease. The current treatments for tuberculosis are highly effective only when the bacteria are growing actively. Latent tuberculosis infections require prolonged drug therapy, presumably due to the persistence of dormant tubercle bacilli that are refractory to current treatment regimens.

Reduced oxygen tension and nitric oxide (NO) exposure are two conditions frequently associated with the onset and maintenance of latent tuberculosis.3,4 An identical set of 47 M. tuberculosis genes is up-regulated rapidly in response to each of these stimuli.5,6 A two-component regulatory system dosS-dosR (also called devS-devR7) is among the genes induced by hypoxia or NO exposure.5,6 Sherman et al.
proposed that dosS and dosR form a two-component signaling system involved in the adaptation of bacilli to low oxygen conditions within the host.5 We and others have demonstrated that the sensor kinases DosS and DosT phosphorylate their conserved histidine residue and then transfer a phosphoryl group to Asp54 of DosR.8–10 The phosphorylation of Asp54 enhances the binding affinity of DosR to its cognate DNA sequence (the DosR box).8 DosR binds to the two copies of the DosR box in a promoter region of the hypoxic response gene acr (also called hspX), encoding the chaperone protein α-crystallin (Acr).11 Mutations within the binding sites abolish DosR binding as well as hypoxic induction of a downstream reporter gene.11 In addition, computer analysis identified a consensus promoter sequence recognized by DosR, a variant of which is located upstream of nearly all M. tuberculosis genes induced rapidly by hypoxia, including several genes with multiple DosR boxes.11,12

0022-2836/$ - see front matter © 2005 Elsevier Ltd. All rights reserved.

DosR is believed to mediate the transition of M. tuberculosis into dormancy,13 and may contribute to latency. In fact, nearly all M. tuberculosis genes up-regulated rapidly in response to low levels of oxygen and NO require DosR for their induction.6,11 Disruption of dosR was reported to increase the virulence of M. tuberculosis in mice,14 but leads to attenuation of M. tuberculosis virulence in guinea pigs.15 DosR is essential for M.
bovis BCG survival under hypoxia.16 Therefore, DosR could be a good target for developing drugs against latent tuberculosis.17,18

Here, we report a 2.0 Å resolution crystal structure of the DosR C-terminal domain (DosRC) and a 3.1 Å resolution crystal structure of a complex of this domain with a DNA duplex containing the consensus sequence of hypoxic response gene promoters.11 These are the first structures of a response regulator/transcription activator from 11 putative M. tuberculosis two-component regulatory systems identified in the genome.19 Both in the absence and in the presence of double-stranded DNA, DosRC forms a dimer of dimers, yielding a tetramer that has not been seen before among two-component regulators.

Table 1. Data collection and refinement statistics

                                              SeMet derivative DosRC    Native DosRC–DNA complex
A. Data collection
  Space group                                 P1                        C2
  Unit cell dimensions a, b, c (Å)            33.07, 60.49, 74.23       142.40, 58.79, 82.93
  Unit cell angles α, β, γ (deg.)             89.90, 89.91, 90.99       90.00, 125.50, 90.00
  Wavelength (Å)                              0.9796                    0.9641
  Resolution range (Å)                        50.0–2.0                  50.0–3.1
  Total reflections                           282,185                   25,449
  Unique reflections                          37,716                    8211
  I/σ(I)ᵃ                                     14.7 (3.7)                10.7 (2.0)
  Completenessᵃ (%)                           97.3 (91.1)               79.5 (60.7)
  Rsymᵃ (%)                                   10.8 (41.2)               9.2 (48.2)
B. SAD phasing
  Resolution range (Å)                        20.0–2.5
  No. methionine residues                     18
  No. heavy-atom sites found                  15
  FOMᵇ                                        0.41
  Solvent flattening FOMᵇ                     0.73
C. Refinement
  Resolution range (Å)                        50.0–2.0                  50.0–3.1
  Reflections used (working/free)             35,816/1894               7142/409
  Rwork (%)                                   18.6                      27.2
  Rfree (%)                                   21.1                      28.8
  Average B-factor of all atoms (Å²)          26.4                      82.9
D. Model statistics
  No. DosRC molecules per asymmetric unit     8                         2
  No. DNA molecules/a.u.                      –                         2
  r.m.s. deviation from ideal geometry
    Bond lengths (Å)                          0.015                     0.007
    Bond angles (deg.)                        1.32                      1.08
    Torsion angles (deg.)                     4.29                      4.24

ᵃ Values in parentheses are for the highest-resolution shell.
ᵇ According to RESOLVE.40

Figure 1. Sequence alignment for the C-terminal DNA-binding domain of DosR with its homologues. The secondary structure including four α-helices, based on the crystal structure of DosRC, is shown at the top. Red denotes completely conserved residues and blue denotes conservative substitutions. An asterisk (*) signifies residues interacting across the α10 dimer interface, with asterisks in purple highlighting the critical α10 Thr198, Val202, and Thr205 residues. A ‡ symbol indicates residues at the α7/α8 tetramer interface, with magenta symbols marking the residues that form the Phe175 pocket. An arrow symbol shows residues involved in interactions with DNA, where green arrows represent residues contacting nucleotide bases and orange arrows indicate residues making DNA phosphate oxygen contacts. Lys179 contacts both phosphate oxygen and nucleotide bases. Sequences highlighted in yellow represent proteins with known structures: DosR (this work), GerE (PDB 1FSE),26 NarL (PDB 1JE8),29 RcsB (PDB 1P4W),30 and TraR (PDB 1H0M and 1L3L).31,32

Results

Overall structure of DosR C-terminal domain

The crystal structure of M. tuberculosis DosRC was determined using the single-wavelength anomalous dispersion (SAD) method and refined at 2.0 Å resolution (Table 1). The asymmetric unit contains eight molecules, assembled into two DosRC tetramers. The DosRC subunit contains four α-helices designated as α7, α8, α9 and α10 on the basis of sequence alignment of the protein family (Figure 1). The crystal structure reveals that DosRC forms two dimers, AB and CD, which assemble into a tetramer (Figure 2(a)). Two essentially identical dimerization interfaces are organized about non-crystallographic 2-fold axes between two subunits. A third interface brings the AB and CD dimers together by a non-crystallographic 2-fold axis between the B and C subunits.
This 2-fold is perpendicular to, but non-intersecting with, the two dimer dyads, generating a non-crystallographic 2-fold screw axis perpendicular to all three 2-folds in the tetramer.

Subunit interfaces of DosRC

The α10 dimer interface

The dimerization interface between subunits A and B (and between C and D) involves mainly the α10 helix plus additional residues from the α7-α8 loop (Figure 2(b)). A pseudo 2-fold symmetry axis with a 178.6° rotation operation is observed in each dimer, while the angle between the α10 helices of two adjacent subunits is about 31°. This interface has a buried surface area of 1000 Å² upon dimerization, or about 10% of the solvent-accessible surface per subunit. Thr198 contributes the most to the interface, accounting for about 17% of the total buried surface area, while Gly164, Arg196, Val202, and Thr205 each contribute 10–14%. Thr198, Val202, and Thr205 are the key residues responsible for the contacts between the two α10 helices in the dimer (Figure 2(b)).

The α7/α8 tetramer interface

The second type of interface occurs between subunits B and C, arranging dimers AB and CD, which are related by a non-crystallographic 2-fold axis of about 179.5° rotation, into a tetramer. The dimer–dimer contacts involve mainly residues from helices α7 and α8 plus the α8-α9 loop (Figure 2(a)). This interface buries approximately 1500 Å² of solvent-accessible surface area, which is significantly larger than the dimer interface. There are several direct and water-mediated hydrogen bonds and numerous van der Waals contacts. Arg173 and Phe175 contribute the most to the interface, about 12% and 24%, respectively. Phe175 of each subunit inserts its side-chain moiety into a hydrophobic pocket created by the non-polar parts of the side-chains of Arg155, Leu158, Gly159, Ala204, Leu207, and Lys208 (plus Leu147 in subunit B or Pro213 in subunit C) of the interacting subunit (Figure 2(c) and (d)).
Without such a hydrophobic cavity of the interacting subunit, the side-chain of Phe175 would extend into the solvent. Additionally, an identical α7/α8 interface is found in crystal contacts.

Crystal contacts

The DosRC subunits are arranged as continuous strings of ABCD tetramers in the crystal. Two types of contacts are present between neighboring unit cells. The first interface connects subunits D and A′, where A′ is translated by one unit cell along the crystallographic b axis. This interface is virtually identical with the α7/α8 interface within the tetramer between subunits B and C. For the second crystal contact, the interactions occur mostly between residues in the α9-loop-α10 region of subunit B and the α9 and α10 helices of subunit A″, where A″ is translated by one unit cell along the a axis. Identical interactions are observed at another location between subunits D″ and C, where D″ is also translated along the a axis. We describe this interface as the α9 interface. This interface has a total buried solvent-accessible area of 1500 Å², which is as large as that of the α7/α8 tetramer interface, but the contacts are somewhat looser, as shown by a larger gap volume index (gap volume per interface-accessible surface area) (Table 2).20,21 The α9 interface in the DosRC crystals is absent from the DNA complex crystals because the α9 helix makes contacts with DNA. Therefore, the extensive α9 interface in the uncomplexed DosRC structure is most likely not physiologically important.

Figure 2. The DosRC dimers and tetramer. (a) The tetramer ABCD is made up of the AB and CD dimers related by the central 2BC 2-fold relating subunits B and C. Note that the 2-fold axes do not intersect. Subunits are colored with A yellow, B magenta, C gold, and D pink. (b) Stereo view of the α10 dimer interface between subunits A and B. Thr198, Val202, and Thr205 are the key contacting residues on helix α10. (c) Stereo view of residues making contacts at the α7/α8 tetramer interface between subunits B and C. For clarity, only Leu147, Arg155, Leu158, Leu207, and Lys208 of subunit B and Phe175 of subunit C are labeled. (d) Close-up view of the solvent-accessible surface of DosRC subunit B showing a hydrophobic pocket interacting with Phe175 of subunit C at the α7/α8 tetramer interface. The surface is colored by atom type: gray for carbon, red for oxygen, and blue for nitrogen.

Table 2. Buried solvent-accessible surfaces and gap volumes of DosRC interfaces

Structure            Interface type   Buried solvent-accessible   Interface gap    Gap volume
                                      surface area (Å²)           volume (Å³)      indexᵃ (Å)
DosRC                α10              1.0×10³                     1.8×10³          1.8
DosRC                α7/α8            1.5×10³                     1.9×10³          1.2
DosRC                α9               1.5×10³                     2.6×10³          1.7
DosRC–DNA complex    α10              1.0×10³                     1.3×10³          1.3
DosRC–DNA complex    α7/α8            1.5×10³                     1.7×10³          1.1

Values were calculated by the protein–protein interaction server.20,21
ᵃ Gap volume per interface-accessible surface area.

Overall structure of the DosRC–DNA complex

The crystal structure of DosRC, cocrystallized with a 43mer DNA duplex containing the 20 bp consensus promoter of hypoxia-induced genes (the DosR box), was determined to 3.1 Å resolution (Table 1). The asymmetric unit contains two DosRC subunits, comprising residues 145–209, and oligonucleotide strands 1 and 2, forming a double-stranded DNA duplex. The electron density of the DNA is of good quality in the protein-bound region (Figure 3(a)), allowing 22 bp of DNA nucleotides (covering the 20 bp DosR box plus one additional base-pair on both ends) to be modeled in the structure.

The dimer and tetramer of the DosRC–DNA complex

The DosRC–DNA complex reveals that the AB dimer, containing the α10 interface, is responsible for DNA binding (Figure 3(b)). This interface is similar to the α10 interface in the AB and CD dimers observed in the uncomplexed DosRC crystal structure (Figure 2(a) and (b)). The DosRC dimer in the asymmetric unit contacts a neighboring DosRC dimer in an adjacent asymmetric unit using the α7/α8 tetramer interface (Figure 3(c)), generating an ABA′B′ tetramer that is very similar to the ABCD tetramer in the uncomplexed structure.

DosRC–DNA interactions

The interactions between DosRC and DNA are similar in the two DosRC subunits due to the non-crystallographic 2-fold symmetry of the two protein subunits and the pseudo-palindromic sequence of the DNA in the 20 bp DosR box (Figure 3(b) and (d)). Each DosRC domain in the dimer places its helix α9 into the major groove on the same side of the DNA and makes numerous contacts. A total of 12 phosphate oxygen moieties contribute to the DNA backbone–protein interactions, while ribose sugars make essentially no contact with the protein. No less than 16 nucleotides in the DNA duplex make interactions with DosRC when using a distance cutoff of 3.3 Å for hydrogen bonds and 3.8 Å for van der Waals contacts.

DNA phosphate interactions

Six nucleotides of each DNA strand make hydrogen bonds, salt-bridges, or van der Waals interactions via their non-bridging phosphate oxygen atoms with 11 amino acid residues of each DosRC subunit (Figure 3(d), which shows the nucleotide nomenclature): Gln153 in the α7 helix, eight residues in the α8-α9 HTH motif, and Arg196 and Arg197 in the α10 helix (Figure 1).

Nucleotide interactions

Remarkably, the DosRC dimer interacts with 16 nucleotides of the DNA duplex, using only three amino acid residues in the α9 helix from each subunit: Lys179, Lys182, and Asn183 (Figures 1 and 3(a) and (d)). Lys179A makes van der Waals interactions with G5I, A7I, T14J, and C15J, and a hydrogen bond to the O6 atom of G6I with its side-chain amino group (Figure 3(a)). The amino group of the Lys182A side-chain makes hydrogen bonds with the N7 atom of A12J, and the O6 and N7 atoms of G13J. Asn183A makes van der Waals contacts with G4I, G5I, T14J, and C15J. The nucleotide interactions of the three amino acid residues in subunit B are very similar to those of subunit A but involve the opposite DNA strands (Figure 3(d)). Therefore, the base-pairs interacting with DosR are grouped into two recognition motifs, G4G5G6A7C8T9, one in each half of the 20 bp DosR box.

DNA conformation

One characteristic feature in the DosRC–DNA complex structure is that the DNA is bent significantly compared to canonical B-DNA. The helical axis calculated by the program CURVES22,23 reveals two kinks of approximately 25° and 30° at G4G5G6A7 of each half, where the α9 helix of DosRC interacts extensively with the DNA (Figure 3(b)). Interestingly, the two ends of the helical axis are not in the same plane but are in a staggered conformation with a “dihedral angle” of about 40°. The analysis by the program 3DNA24 reveals B-like DNA with local deformed conformations at T2G3G4G5G6A7 of the first half palindrome (interacting with DosRC subunit A) and G4G5G6A7 of the second half palindrome (interacting with DosRC subunit B), exactly the same region where the DNA is bent. Therefore, the DNA conformation observed in our crystal structure is considerably different from straight canonical B-DNA. Nevertheless, DNA parameters over its entire length are within the range of values compatible with B-like DNA observed in a number of high-resolution crystal structures.24,25

Comparison of the DosRC and DosRC–DNA complex structures

The monomers

Pairwise least-squares superpositions of DosRC monomers in the two crystal structures reveal r.m.s. deviations of 0.4–0.5 Å for 58 Cα atoms from residues 151 to 208. This indicates that there is no significant conformational change within the core structure of the DosRC monomer upon DNA binding.

Figure 3. DosRC–DNA complex structure.
(a) Stereo view of σA-weighted 2Fobs−Fcalc electron density map contoured at the 1σ level where Lys179, Lys182, and Asn183 interact with nucleotide bases. Due to the limited resolution and high anisotropy of the data, the side-chain densities are not as well defined as would have been possible with better diffracting crystals. Such crystals were not obtained in this case, however. (b) Structure of the DosRC–DNA complex. DosRC uses the α10 helices to form a functional dimer for DNA binding. Arg196, Thr198, Val202, and Thr205 are residues contributing to this dimerization interface. The DNA clearly has a bent conformation, as shown by its helical axis in gray. (c) Stereo view of the DosR tetramer–DNA complex. The α7/α8 dimer interface is formed between DosRC subunit B and its neighboring crystallographic symmetry-related subunit A′, forming a tetramer ABA′B′ similar to tetramer ABCD in the uncomplexed DosRC structure. (d) Contacts between the DosRC dimer and the 20 bp consensus promoter of hypoxic response genes. Amino acid residues making DNA phosphate oxygen interactions are colored by subunit: yellow for subunit A and magenta for subunit B. Residues contacting nucleotide bases as well as the nucleotide bases interacting with the protein are colored blue for hydrogen bonds and red for van der Waals interactions. The backbone of the DNA is colored by DNA strand, cyan for strand I, and green for strand J. (e) Comparison of uncomplexed DosRC (white) and DNA-bound DosRC (green). The base-contacting residues Lys179, Lys182 and Asn183 are shown in stick model. A 5° rotation of the entire subunit B with respect to subunit A occurs upon binding to the DNA duplex. This decreases the distances between Cα atoms of the two Lys179 residues in the dimer by 1.5 Å, the two Lys182 residues by 1.0 Å, and the two Asn183 residues by 0.4 Å.

The dimers

The uncomplexed DosRC AB dimer differs from the DosRC dimer in the DNA complex by an r.m.s.
deviation of 0.7 Å. However, when only one subunit is superimposed, the second subunits in the dimers display a significant difference with an r.m.s. deviation of 1.4–1.8 Å caused by an approximately 5° rotation of the whole second subunit with respect to the first (Figure 3(e)). This causes the α9 helix of the second subunit in the DNA complex structure to shift by about 2 Å along its helical axis without moving significantly closer to the α9 helix of the first subunit. Nevertheless, equivalent base-contacting residues in two subunits, which cluster in the first half of the α9 helices, are brought closer together in order to make proper interactions with DNA bases. For instance, the two Lys179 Cα atoms in the dimer are 1.5 Å closer to each other, and the two Lys182 Cα atoms are 1.0 Å closer in the DNA complex structure than in the uncomplexed DosRC dimer (Figure 3(e)). Residues making contacts at the α10 dimer interface are similar in both structures. The interface gap volume between two DosRC subunits in the uncomplexed dimer is about 1800 Å³ with a gap volume index of 1.8 Å, while the DosRC dimer in the DNA complex has an interface gap volume of only 1300 Å³ with a gap volume index of 1.3 Å, or about 70% of that in the uncomplexed DosRC, despite the fact that the buried surface areas are about the same (Table 2). This indicates that upon binding to the DNA, a slight alteration of the intersubunit orientation makes the two subunits form a tighter DNA-binding AB dimer.

The tetramer

The coordinates of 116 Cα atoms from residues 151–208 in each of two DosRC subunits interacting via the α7/α8 interface in the DNA complex (the BA′ contacts) differ from equivalent coordinates of the uncomplexed DosRC α7/α8-interface dimer (the BC dimer) by an r.m.s. deviation of about 0.5 Å.
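The gap-volume comparison quoted from Table 2 is simple arithmetic and can be re-derived as a sanity check. The sketch below is illustrative only: it assumes the gap volume index is the tabulated gap volume divided by the tabulated buried surface area (the table footnote defines it per "interface-accessible surface area", which may not be exactly the same quantity), and it uses the rounded published values, so the recomputed indices are expected to agree only approximately. The names `TABLE_2`, `gap_volume_index` and `dimer_gap_ratio` are hypothetical.

```python
# Values transcribed from Table 2 (rounded as published).
# Each row: (structure, interface, buried area in A^2, gap volume in A^3, reported index in A)
TABLE_2 = [
    ("DosRC", "alpha10", 1.0e3, 1.8e3, 1.8),
    ("DosRC", "alpha7/alpha8", 1.5e3, 1.9e3, 1.2),
    ("DosRC", "alpha9", 1.5e3, 2.6e3, 1.7),
    ("DosRC-DNA", "alpha10", 1.0e3, 1.3e3, 1.3),
    ("DosRC-DNA", "alpha7/alpha8", 1.5e3, 1.7e3, 1.1),
]


def gap_volume_index(gap_volume, interface_area):
    """Gap volume per unit interface area (A^3 / A^2 = A).

    Assumption: the tabulated buried area is used as the denominator.
    """
    return gap_volume / interface_area


def recomputed_indices(rows=TABLE_2):
    """Map (structure, interface) to the index recomputed from rounded values."""
    return {(s, i): gap_volume_index(gap, area) for s, i, area, gap, _ in rows}


def dimer_gap_ratio(rows=TABLE_2):
    """Gap volume of the DNA-bound alpha10 dimer relative to the uncomplexed one."""
    gaps = {(s, i): gap for s, i, _, gap, _ in rows}
    return gaps[("DosRC-DNA", "alpha10")] / gaps[("DosRC", "alpha10")]


if __name__ == "__main__":
    reported = {(s, i): idx for s, i, _, _, idx in TABLE_2}
    for key, value in recomputed_indices().items():
        print(key, "recomputed %.2f" % value, "reported %.1f" % reported[key])
    print("alpha10 gap volume ratio (complex/uncomplexed): %.2f" % dimer_gap_ratio())
```

With the rounded inputs, the α10 ratio comes out near 0.72, consistent with the "about 70%" stated in the text.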
Amino acid residues making contacts in the α7/α8 tetramer interface are similar in both structures, and the buried solvent-accessible surface areas and the interface gap volumes are also comparable (Table 2). This indicates that the α7/α8 tetramer interface is well maintained in both structures and does not undergo a significant change upon DNA binding, suggesting that this interface may be physiologically relevant for DosR function.

Comparison with homologous proteins and protein–DNA complexes

Protein structure comparison

DosRC is a member of the response regulator C-terminal effector domain family containing four helices. There are four members of this family of transcription regulators with known structures: the transcriptional regulator GerE,26 the two-component response regulator NarL,27–29 the two-component response regulator RcsB,30 and the quorum-sensing transcription factor TraR,31,32 with 45%, 44%, 28%, and 23% sequence identity to DosRC, respectively. Among all these structures, the α7/α8 tetramer interface seen three times in our two crystal structures of DosR is unique. Therefore, comparisons can be made only at the monomer and dimer levels.

The least-squares superpositions of 58 Cα atoms of the core region of the DosRC monomer and the corresponding coordinates of the homologous proteins reveal r.m.s. deviations of 1.6–2.0 Å for GerE, 1.0 Å for NarL, 1.0 Å for RcsB, and 1.5–1.8 Å for TraR. This suggests that the C-terminal domains of NarL and RcsB, response regulators of two-component regulatory systems, are the closest structural homologues of DosRC, with NarL sharing the highest level of sequence identity of 44% between these two homologues.

DosR shares with GerE, NarL, and TraR a similar AB dimer via the α10 dimer interface (RcsB is a monomer). The DosRC dimer is different from homologous dimers by an r.m.s. deviation of 2.0 Å for GerE, 1.3 Å for NarL, and 2.0–2.2 Å for TraR for 116 equivalent Cα atoms per dimer.
Several amino acid residues from the DosRC α10 helix used for making the interface are, nevertheless, different from those observed in GerE, NarL, and TraR dimers (Figure 1). For instance, the contacting residues Thr198, Val202, and Thr205 in DosR are Ser60, Val64, and Leu67 in GerE,26 and Val204, Val208, and His211 in NarL.29 However, an interesting feature observed in all structures is that the positions of these contacting residues are identical in the sequence alignment (Figure 1) while the central valine residue is conserved among DosR, GerE, and NarL.

DNA structure comparison

Superpositions of phosphate moieties from the DNA backbone of the 20 bp consensus promoter of hypoxic response genes recognized by DosR (this structure) with corresponding coordinates of the nirB promoter recognized by NarL29 and the tra box recognized by TraR32 reveal r.m.s. deviations of 1.6 Å and 2.5 Å, respectively, for 38 DNA phosphate atoms. In all three cases, the DNA molecules are bent, but in distinctly different manners.

Differences in DNA recognition between DosR and NarL

Because NarL is the closest homologue of DosRC with a known structure of a complex with DNA, a comparison of nucleotide recognition may shed light on differences between the sequences recognized by these two response regulators. DosR binds to a 20 bp palindromic sequence containing inverted repeats of the G4G5G6A7C8T9 recognition motif, where G4G5G6 has a distorted B-DNA conformation. NarL also recognizes a 20 bp palindromic sequence containing, however, a different inverted repeat sequence T3A4C5C6C7A8T9, where C5C6C7 undergoes a local B/A conformation transition.29 The DNA bound to DosRC makes two kinks of about 25°–30° at the G4G5G6A7 regions while, in contrast, the DNA in the NarL–DNA complex bends gradually by about 42° over its entire length.29 DosR and NarL show both similarities and distinct differences in how they recognize their target DNA sequences.
The three amino acid residues contacting nucleotide bases in DosR are Lys179, Lys182, and Asn183; and in NarL the base-contacting residues are Lys188, Val189, and Lys192. According to the structure-based sequence alignment (Figure 1), two of these three residues, Lys182 and Asn183 of DosRC, are equivalent to Lys188 and Val189 of NarL, respectively. Lys182 of DosRC contacts bases A12G13 while Lys188 of NarL interacts with A12T13. Asn183 in DosRC makes contacts with G4G5 and T14C15 of the complementary strand, while Val189 of NarL differently contacts T3A4C5. In contrast to these two residues with similar global functions in the two regulators, the important base-contacting residue Lys179 of DosR, which interacts with base moieties of G5G6A7 and T14C15 of the complementary strand, is equivalent to Ser185 in NarL, which makes a hydrogen bond only with a phosphate oxygen atom of C5. On the other hand, Lys192 of NarL, which makes hydrogen bonds to G15G16, is equivalent to Ser186 in DosR, which makes a hydrogen bond with a T14 phosphate oxygen atom instead. Although one of the three base-contacting residues is identical in the two regulators, the other two are different, which is likely to contribute significantly to the difference in DNA sequences recognized by the two regulators.

Some non-conserved residues have similar functions for interacting with DNA phosphate oxygen atoms in both DosR and NarL. For instance, Thr166 in DosR aligns with Pro172 in NarL and both contact a phosphate oxygen atom of A12. Also, Tyr184 in DosR is equivalent to His190 in NarL and both make a hydrogen bond to a phosphate oxygen atom, G3 in DosR or T3 in NarL. Arg196 in DosR is changed into Ser202 in NarL but both use their main-chain amide nitrogen to contact a phosphate oxygen atom of G13 in DosR or T13 in NarL. Interestingly, some identical or homologous residues do not make the same phosphate oxygen interactions.
For example, the Thr151 side-chain of DosR is 5 Å away from a phosphate oxygen atom of T2 and does not make any contact with the DNA, but the equivalent Thr157 in NarL makes a hydrogen bond with a phosphate oxygen atom of G2. Conversely, Leu176 in DosR contacts a phosphate oxygen atom of G4 but the equivalent Ile182 of NarL contacts that of T3 instead.

In conclusion, our comparison shows that DosR and NarL recognize different promoter sequences using: (i) one equivalent and identical Lys residue to contact similar bases in the same positions; (ii) one equivalent but non-identical (Asn versus Lys) residue to contact dissimilar bases in similar positions; (iii) a third base-contacting residue that is not equivalent and contacts entirely different nucleotides; (iv) some equivalent but non-identical residues making interactions with equivalent phosphate oxygen atoms; and (v) some conserved residues contacting non-equivalent phosphate oxygen atoms. Interestingly, in both the DosR–DNA and in the NarL–DNA complexes, ribose moieties of the DNA are not involved in the protein–DNA contacts.

Discussion

Many DNA-binding proteins recognize their target promoter by DNA local conformation and sequence-dependent deformability in addition to base-specific interactions, i.e. via both “indirect” and “direct” readout mechanisms.33,34 Our crystal structure of the DosRC–DNA complex defines the G4G5G6A7C8T9 sequence as the motif recognized directly by one DosR subunit, while the deviations of the bound DNA from canonical DNA are suggestive of indirect readout.
It has been shown that crystal structures of oligonucleotides containing repeated G sequences adopt conformations between those of A and B-DNA.35,36 Furthermore, NMR structures of the HIV-1 κB binding site with a 16 bp oligonucleotide duplex containing the T2G3G4G5G6A7C8T9 sequence, identical with almost half of the DNA in our crystal structure, show two major B-DNA conformations, one with a rather straight helical axis (form I) and the other with a curvature of 25° (form II).37 This suggests that the DNA sequence recognized by DosR may adopt multiple conformations in solution. The G4G5G6 sequence could be a basis for promoter recognition by DosR as a result of its bendability, yielding a stable DosR–DNA complex with DNA in a specific bent conformation.

For direct readout, DosR–DNA base interactions occur in the G4G5G6A7C8T9 motif, which is present twice as inverted repeats in the 20 bp palindromic DosR box. When we analyzed the frequency of each nucleotide in the putative DosR-binding promoters,11 only four bases per half-palindrome are highly conserved: G4G5G6 and C8. These four conserved nucleotide positions may provide crucial base-specific contacts for DosR. Definitive conclusions on nucleotide base-specificity cannot be made at present, as the resolution of our crystal structure is not sufficiently high. However, previous experiments by Park et al.11 suggest the importance of the conserved G4 nucleotide. When mutations including G4 of either DosR box in the acr promoter were introduced, DosR lost its ability to bind to the DNA.11 The G4 position may be essential for DosR–DNA interactions, and may serve, together with the conserved G5G6 base-pairs, as a hinge for DNA bending.

Clearly, despite insight into dimerization, tetramerization, and DNA binding reported here, several questions regarding this important regulator need further investigation.
The tetramer, observed multiple times in our structures, is particularly intriguing. Tetramerization could act to bring spatially distant DosR boxes together to alter gene regulation. Consistent with this idea, the hypoxia-induced genes preceded by multiple DosR boxes, such as the acr gene, are among the most powerfully induced of the DosR regulon.11 However, size-exclusion chromatography and dynamic light-scattering experiments of unphosphorylated DosR, phosphorylated DosR and DosRC have not provided evidence for tetramerization in solution (data not shown). Obviously, further studies are needed to establish the relevance of the DosR tetramers observed in both our structures. Nevertheless, we present here the first structures of a response regulator/transcription activator of any M. tuberculosis two-component system, and we provide a potential platform for development of novel DosR-deregulating compounds that can be effective by themselves, or, when co-administered with other drugs, might increase the efficacy of the latter and decrease the long duration of current therapies.

Materials and Methods

Protein expression and purification of DosRC

A DNA fragment of the dosRC gene encoding amino acid residues 144–217 was amplified by PCR from a full-length dosR plasmid DNA using a primer 3N (5′-CGGACCCATATGCAGGACCCGCTATCAGGC-3′) and a primer 3C (5′-GGGTCCGAGCTCTCATGGTCCATCACCGGG-3′) (Invitrogen). The PCR product was subcloned into pET-28(+) (Novagen) using NdeI and SacI restriction sites. The plasmid was introduced into Escherichia coli BL21(DE3) (Novagen). The selenomethionine-substituted (SeMet) DosRC with an N-terminal His-tag followed by a thrombin cleavage site was expressed in M9 minimal medium supplemented with selenomethionine by induction with 1 mM IPTG. Cells were harvested and the pellets were frozen at −80 °C.
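The cloning step above relies on the NdeI and SacI sites carried by primers 3N and 3C. As a small illustrative check (not part of the published protocol), the sketch below looks for the standard recognition sequences CATATG (NdeI) and GAGCTC (SacI) in the two primers and translates the frame that starts at the NdeI site. The reading frame, the helper names (`reverse_complement`, `translate`, `check_primers`) and the deliberately partial codon table are assumptions made for this example.

```python
NDEI_SITE = "CATATG"  # NdeI recognition sequence
SACI_SITE = "GAGCTC"  # SacI recognition sequence

# Primer sequences as given in the text (5' to 3').
PRIMER_3N = "CGGACCCATATGCAGGACCCGCTATCAGGC"
PRIMER_3C = "GGGTCCGAGCTCTCATGGTCCATCACCGGG"

# Partial codon table: only the codons needed for the frames examined here.
PARTIAL_CODON_TABLE = {
    "CAT": "H", "ATG": "M", "CAG": "Q", "GAC": "D", "CCG": "P",
    "CTA": "L", "TCA": "S", "GGC": "G", "CCC": "P", "GGT": "G",
    "GAT": "D", "GGA": "G", "CCA": "P", "TGA": "*",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons of seq using the partial table above."""
    usable = len(seq) - len(seq) % 3
    return "".join(PARTIAL_CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def check_primers():
    """Locate both restriction sites and translate the assumed forward frame."""
    nde_index = PRIMER_3N.find(NDEI_SITE)
    sac_index = PRIMER_3C.find(SACI_SITE)
    sense_of_3c = reverse_complement(PRIMER_3C)
    return {
        "ndei_index": nde_index,
        "saci_index": sac_index,
        # Frame assumed to start at the CAT of the NdeI site.
        "forward_frame": translate(PRIMER_3N[nde_index:]),
        # A TGA stop codon directly upstream of the SacI site on the sense strand.
        "stop_before_saci": ("TGA" + SACI_SITE) in sense_of_3c,
    }


if __name__ == "__main__":
    print(check_primers())
```

Under the assumed frame, the NdeI site itself reads His-Met, which matches the last two residues of the GSHM remnant described for the thrombin-cleaved protein.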
The cell pellets were thawed with 20 mM Tris–HCl (pH 8.0), 100 mM NaCl, 1 mM PMSF, Complete EDTA-free Protease Inhibitor Cocktail (Roche), and benzonase nuclease, and lysed with a French press. Protamine sulfate was added to the lysate, which was subsequently incubated on ice for 30 min. The mixture was centrifuged at 20,000 g for 30 min and the supernatant was clarified by filtration followed by Ni-NTA affinity chromatography. The nonspecifically bound proteins were eluted with 20 mM imidazole in 20 mM Tris–HCl, 100 mM NaCl. Subsequently, DosRC was eluted with 200 mM imidazole in the same buffer. Protein fractions were pooled and cleaved with thrombin (Novagen) at room temperature overnight, which left the four amino acid residues GSHM at the N terminus. The cleaved product was concentrated and subsequently purified using a Superdex75 HR10/30 size-exclusion column (Amersham Biosciences) equilibrated in 20 mM Tris–HCl (pH 8.0), 0.25 M NaCl, 1 mM EDTA, 1 mM Tris(2-carboxyethyl)-phosphine (TCEP). The pure protein (>99% purity) was concentrated to 4–6 mg/ml for crystallization.

The expression of native DosRC was carried out using the same expression system and the culture was grown in LB medium. Native DosRC was purified using a Ni-NTA affinity column followed by collection of the flow-through fraction from a MonoQ column (Amersham Biosciences) without cleaving off the N-terminal His6 tag. The protein was concentrated and then dialyzed in 10 mM Hepes (pH 8.0), 50 mM NaCl, 1 mM EDTA, 1 mM TCEP.

Crystallization

Crystals were grown by the sitting-drop, vapor-diffusion method at room temperature. The SeMet DosRC at 4 mg/ml crystallized as plates with 0.1 M Mes (pH 5.5), 30% (w/v) polyethylene glycol monomethyl ether (PEG 5000mme), 0.2 M ammonium sulfate, 5% (v/v) glycerol. The crystals were transferred to a cryoprotectant solution containing 0.1 M Mes (pH 5.5), 30% PEG 5000mme, 0.2 M ammonium sulfate, 0.25 M NaCl, 20% glycerol, and flash-frozen in liquid nitrogen.
The DosRC–DNA complex crystals were obtained after trying several DNA variants from 20 nucleotides to 43 nucleotides containing a 20 bp consensus promoter of hypoxia response genes (underlined below).11 The DNA used in the present structure is a 43mer oligonucleotide duplex of 41 bp and one AT dinucleotide overhang on each 3′-end (5′-GGCCCGCGCTTTGGGGACTAAAGTCCCTAACCCTGGCCACGAT-3′ and 5′-CGTGGCCAGGGTTAGGGACTTTAGTCCCCAAAGCGCGGGCCAT-3′). The DosRC–DNA complex was prepared by mixing 8.4 mg/ml (0.8 mM) of native protein with 0.44 mM DNA duplex. The crystal used in the current structure was crystallized with 0.1 M Hepes (pH 8.0), 24% PEG 400, 0.2 M CaCl2. The crystal was transferred to a cryoprotectant solution containing 0.1 M Hepes (pH 8.0), 35% PEG 400, 0.2 M CaCl2, 2 mM TCEP, and flash-frozen in liquid nitrogen.

Data collection and structure determination

A data set for SeMet DosRC was collected at the peak wavelength at the Advanced Light Source (ALS) beamline 8.2.2 of the Lawrence Berkeley National Laboratory with a 720° oscillation range per wavelength and processed with the programs DENZO and SCALEPACK38 to a resolution of 2.0 Å. The structure was solved by the Se-SAD method with the program SOLVE,39 using the data to 2.5 Å resolution. Subsequently, the program RESOLVE was used for solvent flattening, density averaging, and automatic model building.40–42 The model from RESOLVE was further refined to 2.0 Å resolution using the program Refmac5.43 Subsequent cycles of manual model building and placement of water molecules using XtalView,44 followed by refinement using the program Refmac5,43 completed the structure determination: the statistics are provided in Table 1. The data set for the DosRC–DNA complex was collected at the Advanced Photon Source (APS) beamline 19ID of Argonne National Laboratory and processed with the programs DENZO and SCALEPACK38 to 3.1 Å.
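The description of the crystallization oligonucleotides given under Crystallization (a 43mer duplex with 41 paired bases and an AT overhang on each 3′-end) can be checked directly from the two sequences. The sketch below is a minimal illustration, assuming plain Watson–Crick pairing; the function names and the returned dictionary are hypothetical and were not part of the original work. It also counts occurrences of the GGGACT motif (G4G5G6A7C8T9 in the numbering of the Discussion) on each strand.

```python
# The two strands of the crystallization duplex, written 5'->3' as in the text.
STRAND_1 = "GGCCCGCGCTTTGGGGACTAAAGTCCCTAACCCTGGCCACGAT"
STRAND_2 = "CGTGGCCAGGGTTAGGGACTTTAGTCCCCAAAGCGCGGGCCAT"

# Motif contacted by DosR on each half of the DosR box (from the Discussion).
HALF_SITE = "GGGACT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def duplex_summary(top: str, bottom: str, overhang: int = 2) -> dict:
    """Summarize a duplex whose strands each carry a 3' overhang.

    The paired region of each strand is everything except its last
    `overhang` bases; the two paired regions must be reverse complements.
    """
    paired_top = top[:-overhang]
    paired_bottom = bottom[:-overhang]
    return {
        "length": len(top),
        "paired_bp": len(paired_top),
        "is_complementary": reverse_complement(paired_top) == paired_bottom,
        "overhangs": (top[-overhang:], bottom[-overhang:]),
        "half_site_hits": (top.count(HALF_SITE), bottom.count(HALF_SITE)),
    }


if __name__ == "__main__":
    for key, value in duplex_summary(STRAND_1, STRAND_2).items():
        print(f"{key}: {value}")
```

Finding the motif once on each strand is what one would expect for a pair of inverted repeats, since the second half-site reads as GGGACT only on the complementary strand.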
Intensities were converted to structure factors with Truncate,45 showing significant anisotropy of the data plus a relatively high Wilson’s B-factor of 73 Å². The structure was solved by the molecular replacement technique with the program Molrep,46 using a dimer AB of the uncomplexed structure (truncated to residues 150–209) as a search model. After several cycles of rigid-body refinement and restrained refinement using Refmac5,43 a σA-weighted Fobs−Fcalc map revealed electron density of the DNA in a bent conformation. An initial model of a 20 bp DNA duplex with the DosR box consensus sequence11 was generated using the program InsightII version 97.0 (Molecular Simulations Inc.) from a DNA duplex in the NarL C-terminal domain–DNA complex structure (PDB accession number 1JE8).29 Several cycles of manual model building using the program XtalView44 were followed by restrained refinement using Refmac5,43 alternated with simulated annealing and group B-factor refinement with DNA conformation restraints (sugar puckering and Watson–Crick base-pairing) using CNS.47 The final refinement cycle was completed by restrained refinement with an overall B-factor refinement using Refmac5,43 with 2-fold NCS restraints for main-chain protein atoms (0.05 Å r.m.s. deviation), side-chain protein atoms (0.19 Å r.m.s. deviation), and DNA atoms of nucleotides 4–9 and 12–17 (0.14 Å r.m.s. deviation).

Structure analysis

The buried solvent-accessible surface and contact residues were calculated with CNS,47 the protein–protein interaction server,20,21 and the program Contact from the CCP4 program suite.48 The protein–DNA interactions were analyzed with Nucplot.49 The DNA conformation was evaluated with the programs CURVE22,23 and 3DNA.24 The least-squares superpositions were performed with LSQKAB,50 LSQMAN,51 and CNS.47 Molecules are rendered with Pymol (Delano Scientific†).
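For readers unfamiliar with the r.m.s. deviations quoted above for the NCS-related atoms, the following sketch shows how such a value is computed for two matched sets of atomic coordinates. It assumes the two copies have already been superposed (the superposition itself, as performed by programs such as LSQKAB, is not reproduced here), and the coordinates in the usage example are invented purely for illustration.

```python
import math
from typing import Sequence, Tuple

Coordinate = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Coordinate], coords_b: Sequence[Coordinate]) -> float:
    """Root-mean-square deviation between two matched, pre-superposed atom sets.

    Atom i of coords_a is compared with atom i of coords_b; no fitting
    is performed, so the inputs must already share a reference frame.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared_sum / len(coords_a))


if __name__ == "__main__":
    # Hypothetical coordinates (in Å) for two atoms in each NCS copy.
    copy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    copy_b = [(0.0, 0.0, 0.1), (1.0, 0.0, -0.1)]
    print(f"r.m.s. deviation: {rmsd(copy_a, copy_b):.3f} Å")
```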
Protein Data Bank accession codes

The atomic coordinates and structure factors for DosRC and the DosRC–DNA complex have been deposited in the RCSB Protein Data Bank with accession codes 1ZLJ and 1ZLK, respectively.

Acknowledgements

We thank Stewart Turley for assistance during data collection in the early stage of the project. We gratefully acknowledge the use of the Advanced Photon Source (APS) beamline 19-ID (SBC-CAT) at Argonne National Laboratory and the Advanced Light Source (ALS) beamline 8.2.2 (HHMI) at Lawrence Berkeley National Laboratory and their staffs for technical assistance. Use of the APS and the ALS are supported by the U.S. Department of Energy. This work is sponsored by grant CA65656 to W.G.J.H. and National Institute of Health grant AI47744 to D.R.S.

† http://pymol.sourceforge.net/

References

1. Frieden, T. R., Sterling, T. R., Munsiff, S. S., Watt, C. J. & Dye, C. (2003). Tuberculosis. Lancet, 362, 887–899.
2. Corbett, E. L., Watt, C. J., Walker, N., Maher, D., Williams, B. G., Raviglione, M. C. & Dye, C. (2003). The growing burden of tuberculosis: global trends and interactions with the HIV epidemic. Arch. Intern. Med. 163, 1009–1021.
3. Nathan, C. & Shiloh, M. U. (2000). Reactive oxygen and nitrogen intermediates in the relationship between mammalian hosts and microbial pathogens. Proc. Natl Acad. Sci. USA, 97, 8841–8848.
4. Wayne, L. G. & Sohaskey, C. D. (2001). Nonreplicating persistence of Mycobacterium tuberculosis. Annu. Rev. Microbiol. 55, 139–163.
5. Sherman, D. R., Voskuil, M., Schnappinger, D., Liao, R., Harrell, M. I. & Schoolnik, G. K. (2001). Regulation of the Mycobacterium tuberculosis hypoxic response gene encoding α-crystallin. Proc. Natl Acad. Sci. USA, 98, 7534–7539.
6. Voskuil, M. I., Schnappinger, D., Visconti, K. C., Harrell, M. I., Dolganov, G. M., Sherman, D. R. & Schoolnik, G. K. (2003). Inhibition of respiration by nitric oxide induces a Mycobacterium tuberculosis dormancy program. J. Expt. Med. 198, 705–713.
7. Dasgupta, N., Kapur, V., Singh, K. K., Das, T. K., Sachdeva, S., Jyothisri, K. & Tyagi, J. S. (2000). Characterization of a two-component system, devR-devS, of Mycobacterium tuberculosis. Tuber. Lung Dis. 80, 141–159.
8. Roberts, D. M., Liao, R. P., Wisedchaisri, G., Hol, W. G. & Sherman, D. R. (2004). Two sensor kinases contribute to the hypoxic response of Mycobacterium tuberculosis. J. Biol. Chem. 279, 23082–23087.
9. Saini, D. K., Malhotra, V., Dey, D., Pant, N., Das, T. K. & Tyagi, J. S. (2004). DevR-DevS is a bona fide two-component system of Mycobacterium tuberculosis that is hypoxia-responsive in the absence of the DNA-binding domain of DevR. Microbiology, 150, 865–875.
10. Saini, D. K., Malhotra, V. & Tyagi, J. S. (2004). Cross talk between DevS sensor kinase homologue, Rv2027c, and DevR response regulator of Mycobacterium tuberculosis. FEBS Letters, 565, 75–80.
11. Park, H. D., Guinn, K. M., Harrell, M. I., Liao, R., Voskuil, M. I., Tompa, M. et al. (2003). Rv3133c/dosR is a transcription factor that mediates the hypoxic response of Mycobacterium tuberculosis. Mol. Microbiol. 48, 833–843.
12. Florczyk, M. A., McCue, L. A., Purkayastha, A., Currenti, E., Wolin, M. J. & McDonough, K. A. (2003). A family of acr-coregulated Mycobacterium tuberculosis genes shares a common DNA motif and requires Rv3133c (dosR or devR) for expression. Infect. Immun. 71, 5332–5343.
13. Karakousis, P. C., Yoshimatsu, T., Lamichhane, G., Woolwine, S. C., Nuermberger, E. L., Grosset, J. & Bishai, W. R. (2004). Dormancy phenotype displayed by extracellular Mycobacterium tuberculosis within artificial granulomas in mice. J. Expt. Med. 200, 647–657.
14. Parish, T., Smith, D. A., Kendall, S., Casali, N., Bancroft, G. J. & Stoker, N. G. (2003). Deletion of two-component regulatory systems increases the virulence of Mycobacterium tuberculosis. Infect. Immun. 71, 1134–1140.
15. Malhotra, V., Sharma, D., Ramanathan, V. D., Shakila, H., Saini, D. K., Chakravorty, S. et al. (2004). Disruption of response regulator gene, devR, leads to attenuation in virulence of Mycobacterium tuberculosis. FEMS Microbiol. Letters, 231, 237–245.
16. Boon, C. & Dick, T. (2002). Mycobacterium bovis BCG response regulator essential for hypoxic dormancy. J. Bacteriol. 184, 6760–6767.
17. Zhang, Y. (2005). The magic bullets and tuberculosis drug targets. Annu. Rev. Pharmacol. Toxicol. 45, 529–564.
18. Saini, D. K. & Tyagi, J. S. (2005). High-throughput microplate phosphorylation assays based on DevR-DevS/Rv2027c 2-component signal transduction pathway to screen for novel antitubercular compounds. J. Biomol. Screen. 10, 215–224.
19. Cole, S. T., Brosch, R., Parkhill, J., Garnier, T., Churcher, C., Harris, D. et al. (1998). Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence. Nature, 393, 537–544.
20. Jones, S. & Thornton, J. M. (1995). Protein–protein interactions: a review of protein dimer structures. Prog. Biophys. Mol. Biol. 63, 31–65.
21. Jones, S. & Thornton, J. M. (1996). Principles of protein–protein interactions. Proc. Natl Acad. Sci. USA, 93, 13–20.
22. Lavery, R. & Sklenar, H. (1988). The definition of generalized helicoidal parameters and of axis curvature for irregular nucleic acids. J. Biomol. Struct. Dynam. 6, 63–91.
23. Lavery, R. & Sklenar, H. (1989). Defining the structure of irregular nucleic acids: conventions and principles. J. Biomol. Struct. Dynam. 6, 655–667.
24. Lu, X. J. & Olson, W. K. (2003). 3DNA: a software package for the analysis, rebuilding and visualization of three-dimensional nucleic acid structures. Nucl. Acids Res. 31, 5108–5121.
25. Olson, W. K., Bansal, M., Burley, S. K., Dickerson, R. E., Gerstein, M., Harvey, S. C. et al. (2001). A standard reference frame for the description of nucleic acid base-pair geometry. J. Mol. Biol. 313, 229–237.
26. Ducros, V. M., Lewis, R. J., Verma, C. S., Dodson, E. J., Leonard, G., Turkenburg, J. P. et al. (2001). Crystal structure of GerE, the ultimate transcriptional regulator of spore formation in Bacillus subtilis. J. Mol. Biol. 306, 759–771.
27. Baikalov, I., Schroder, I., Kaczor-Grzeskowiak, M., Grzeskowiak, K., Gunsalus, R. P. & Dickerson, R. E. (1996). Structure of the Escherichia coli response regulator NarL. Biochemistry, 35, 11053–11061.
28. Baikalov, I., Schroder, I., Kaczor-Grzeskowiak, M., Cascio, D., Gunsalus, R. P. & Dickerson, R. E. (1998). NarL dimerization? Suggestive evidence from a new crystal form. Biochemistry, 37, 3665–3676.
29. Maris, A. E., Sawaya, M. R., Kaczor-Grzeskowiak, M., Jarvis, M. R., Bearson, S. M., Kopka, M. L. et al. (2002). Dimerization allows DNA target site recognition by the NarL response regulator. Nature Struct. Biol. 9, 771–778.
30. Pristovsek, P., Sengupta, K., Lohr, F., Schafer, B., von Trebra, M. W., Ruterjans, H. & Bernhard, F. (2003). Structural analysis of the DNA-binding domain of the Erwinia amylovora RcsB protein and its interaction with the RcsAB box. J. Biol. Chem. 278, 17752–17759.
31. Vannini, A., Volpari, C., Gargioli, C., Muraglia, E., Cortese, R., De Francesco, R. et al. (2002). The crystal structure of the quorum sensing protein TraR bound to its autoinducer and target DNA. EMBO J. 21, 4393–4401.
32. Zhang, R. G., Pappas, T., Brace, J. L., Miller, P. C., Oulmassov, T., Molyneaux, J. M. et al. (2002). Structure of a bacterial quorum-sensing transcription factor complexed with pheromone and DNA. Nature, 417, 971–974.
33. Dickerson, R. E. & Chiu, T. K. (1997). Helix bending as a factor in protein/DNA recognition. Biopolymers, 44, 361–403.
34. Dickerson, R. E. (1998). DNA bending: the prevalence of kinkiness and the virtues of normality. Nucl. Acids Res. 26, 1906–1926.
35. Ng, H. L., Kopka, M. L. & Dickerson, R. E. (2000). The structure of a stable intermediate in the A↔B DNA helix transition. Proc. Natl Acad. Sci. USA, 97, 2035–2039.
36. Ng, H. L. & Dickerson, R. E. (2002). Mediation of the A/B-DNA helix transition by G-tracts in the crystal structure of duplex CATGGGCCCATG. Nucl. Acids Res. 30, 4061–4067.
37. Tisne, C., Hantz, E., Hartmann, B. & Delepierre, M. (1998). Solution structure of a non-palindromic 16 base-pair DNA related to the HIV-1 κB site: evidence for BI-BII equilibrium inducing a global dynamic curvature of the duplex. J. Mol. Biol. 279, 127–142.
38. Otwinowski, Z. & Minor, W. (1997). Processing of X-ray diffraction data collected in oscillation mode. Methods Enzymol. 276, 307–326.
39. Terwilliger, T. C. & Berendzen, J. (1999). Automated MAD and MIR structure solution. Acta Crystallog. sect. D, 55, 849–861.
40. Terwilliger, T. C. (2002). Automated structure solution, density modification and model building. Acta Crystallog. sect. D, 58, 1937–1940.
41. Terwilliger, T. C. (2003). Automated main-chain model building by template matching and iterative fragment extension. Acta Crystallog. sect. D, 59, 38–44.
42. Terwilliger, T. C. (2003). Automated side-chain model building and sequence assignment by template matching. Acta Crystallog. sect. D, 59, 45–49.
43. Murshudov, G. N., Vagin, A. A. & Dodson, E. J. (1997). Refinement of macromolecular structures by the maximum-likelihood method. Acta Crystallog. sect. D, 53, 240–255.
44. McRee, D. E. (1999). XtalView/Xfit - a versatile program for manipulating atomic coordinates and electron density. J. Struct. Biol. 125, 156–165.
45. French, A. & Wilson, K. (1978). On the treatment of negative intensity observations. Acta Crystallog. sect. A, 34, 517–525.
46. Vagin, A. & Teplyakov, A. (1997). MOLREP: an automated program for molecular replacement. J. Appl. Crystallog. 30, 1022–1025.
47. Brunger, A. T., Adams, P. D., Clore, G. M., DeLano, W. L., Gros, P., Grosse-Kunstleve, R. W. et al. (1998). Crystallography and NMR system: a new software suite for macromolecular structure determination. Acta Crystallog. sect. D, 54, 905–921.
48. Collaborative Computational Project Number 4. (1994). The CCP4 suite: programs for protein crystallography. Acta Crystallog. sect. D, 50, 760–763.
49. Luscombe, N. M., Laskowski, R. A. & Thornton, J. M. (1997). NUCPLOT: a program to generate schematic diagrams of protein–nucleic acid interactions. Nucl. Acids Res. 25, 4940–4945.
50. Kabsch, W. (1976). A solution for the best rotation to relate two sets of vectors. Acta Crystallog. sect. A, 32, 922–923.
51. Kleywegt, G. J. (1996). Use of non-crystallographic symmetry in protein structure. Acta Crystallog. sect. D, 52, 842–857.

Edited by K. Morikawa

(Received 28 July 2005; received in revised form 14 September 2005; accepted 15 September 2005)
Available online 3 October 2005