2014 Glick Medical Biotechnology CH 6 PDF
Modulation of Gene Expression

Manipulating Gene Expression in Prokaryotes
  Promoters
  Translational Regulation
  Codon Usage
  Protein Stability
  Fusion Proteins
  Metabolic Load
  Chromosomal Integration
  Increasing Secretion
  Overcoming Oxygen Limitation
  Reducing Acetate
  Protein Folding
Heterologous Protein Production in Eukaryotic Cells
  Eukaryotic Expression Systems
  Saccharomyces cerevisiae Expression Systems
  Other Yeast Expression Systems
  Baculovirus–Insect Cell Expression Systems
  Mammalian Cell Expression Systems
Directed Mutagenesis
  Oligonucleotide-Directed Mutagenesis with M13 DNA
  Oligonucleotide-Directed Mutagenesis with Plasmid DNA
  PCR-Amplified Oligonucleotide-Directed Mutagenesis
  Error-Prone PCR
  Random Mutagenesis
  DNA Shuffling
  Examples of Modified Proteins
SUMMARY
REVIEW QUESTIONS
REFERENCES

The major objective of gene cloning for biotechnological applications is the
expression of a cloned gene in a selected host organism. In addition, for many
purposes, a high rate of production of the protein encoded by the cloned gene
is needed. To this end, a wide range of expression vectors that provide genetic
elements for controlling transcription and translation of the cloned gene as
well as enhanced stability, facilitated purification, and facilitated secretion
of the protein product of the cloned gene have been constructed. There is not
one single strategy for obtaining maximal protein expression from every cloned
gene. Rather, there are a number of different biological parameters that can be
manipulated to yield an optimal level of expression.

The level of foreign gene expression also depends on the host organism.
Although a wide range of both prokaryotic and eukaryotic organisms have been
used to express foreign genes, initially, many of the commercially important
proteins produced by recombinant DNA technology were synthesized in Escherichia
coli. The early dependence on E. coli as a host organism occurred because of
the extensive knowledge of its genetics, molecular biology, biochemistry, and
physiology (Milestone 6.1). To date, recombinant proteins have been produced
using different strains of bacteria (including E. coli), yeasts, and mammalian
cells grown in culture and transgenic plants (Table 6.1). Each of these systems
has its particular advantages and disadvantages, so, again, there is no
universal optimal host for the expression of recombinant proteins, even those
that will eventually be used as therapeutic agents or vaccines. Thus, for
example, E. coli cells may be engineered to produce high levels of foreign
proteins; however, these proteins are not glycosylated and are sometimes
misfolded. On the other hand, with mammalian cells in culture, recombinant
proteins are correctly glycosylated and folded, although the yield of proteins
is much lower than in E. coli. Notwithstanding the very large differences
between organisms, the strategies that have been elaborated for E. coli, in
principle, are applicable to all systems.
doi:10.1128/9781555818890.ch6
In 1972, Paul Berg and his coworkers (Jackson et al., 1972) demonstrated that
fragments of bacteriophage λ DNA could be spliced into SV40. They reported, for
the first time, that fragments of DNA could be covalently joined with other DNA
molecules. This joining of “unrelated DNA molecules to one another” by Jackson
et al. is arguably the first demonstration of the possibility of recombinant
DNA technology. However, while SV40 was at the time thought to be safe in
humans, the prospect of an altered form of the virus spreading unchecked,
through the common bacterium E. coli, caused Berg to delay a portion of his
research program. Thus, contrary to his original plan, Berg did not insert the
recombinant virus into bacterial cells.

Soon after Berg published the results of his experiments, Stanley Cohen and
Herbert Boyer and their colleagues (at Stanford University and the University
of California at San Francisco) showed that a recombinant DNA molecule could be
created without the use of viruses. They demonstrated that foreign DNA could be
inserted into plasmid DNA and subsequently perpetuated in E. coli. As they
state in the abstract to their research article, “The construction of new
plasmid DNA species by in vitro joining of restriction endonuclease-generated
fragments of separate plasmids is described. Newly constructed plasmids that
are inserted into Escherichia coli by transformation are shown to be
biologically functional replicons that possess genetic properties and
nucleotide base sequences from both of the parent DNA molecules. Functional
plasmids can be obtained by reassociation of endonuclease-generated fragments
of larger replicons, as well as by joining of plasmid DNA molecules of entirely
different origins.”

With the publishing of this article, recombinant DNA technology had truly
arrived. The technology spread, first slowly to a few labs and then to dozens
and, eventually, to tens of thousands of labs worldwide. In the 40 years since
the groundbreaking experiments of Cohen and Boyer and their colleagues, more
than 200 new drugs produced by recombinant DNA technology have been used to
treat over 300 million people for a wide range of human diseases. In addition,
today more than 400 additional drugs produced using this technology are in
various stages of clinical trials, with many expected to be on the market
within the next 5 to 10 years. Today, Cohen and Boyer are widely regarded as
the founders of the scientific revolution that has become modern biotechnology.
codons used by the host organism and the target gene, the stability of both
the recombinant protein and its mRNA, the metabolic functioning of the
host cell, and the localization of the introduced foreign gene as well as the
protein that it encodes.
Promoters
The minimum requirement for an effective gene expression system is
the presence of a strong and regulatable promoter sequence upstream
from a cloned gene. A strong promoter is one that has a high affinity
for the enzyme RNA polymerase, with the consequence that the adjacent
downstream region is frequently transcribed. The ability to regulate the
functioning of a promoter allows the researcher to control the extent of
transcription.
The rationale behind the use of strong and regulatable promoters is
that the expression of a cloned gene under the control of a continuously
activated (i.e., constitutive) strong promoter would likely yield a high level
of continual expression of a cloned gene, which is often detrimental to
the host cell because it creates an energy drain, thereby impairing essential
host cell functions. In addition, all or a portion of the plasmid carrying a
constitutively expressed cloned gene may be lost after several cell division
cycles, since cells without a plasmid grow faster and eventually take over
the culture. To overcome this potential problem, it is desirable to control
transcription so that a cloned gene is expressed only at a specific stage in
the host cell growth cycle and only for a specified duration. This may be
achieved by using a strong regulatable promoter. The plasmids constructed
to accomplish this task are called expression vectors.
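The takeover of a culture by plasmid-free cells can be illustrated with a toy calculation. The sketch below is only an illustration of the argument above, not a model taken from the text: the function name, the per-generation plasmid loss probability (loss_prob), and the growth advantage of plasmid-free cells (growth_advantage) are hypothetical values chosen to show the trend.

```python
def plasmid_fraction(generations, loss_prob=0.01, growth_advantage=1.2):
    """Return the fraction of plasmid-bearing cells after each generation.

    Hypothetical toy model (all parameter values are illustrative assumptions):
    - each generation, plasmid-bearing cells double;
    - a fraction `loss_prob` of their daughters lose the plasmid;
    - plasmid-free cells multiply by 2 * growth_advantage per generation.
    """
    bearing, free = 1.0, 0.0
    fractions = []
    for _ in range(generations):
        daughters = 2.0 * bearing
        lost = daughters * loss_prob
        free = free * 2.0 * growth_advantage + lost
        bearing = daughters - lost
        # Renormalize so the numbers stay bounded over many generations.
        total = bearing + free
        bearing, free = bearing / total, free / total
        fractions.append(bearing)
    return fractions


if __name__ == "__main__":
    for generation, fraction in enumerate(plasmid_fraction(30), start=1):
        if generation % 5 == 0:
            print(f"generation {generation:2d}: {fraction:.3f} plasmid-bearing")
```

Under these assumed values, the plasmid-bearing fraction falls steadily with each generation, which is the behavior that a regulatable promoter is meant to postpone by keeping the cloned gene silent during most of the growth phase.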
For the production of foreign proteins in E. coli cells, a few strong
and regulatable promoters are commonly used, including those from the
E. coli lac (lactose) and trp (tryptophan) operons; the tac promoter, which
is constructed from the −10 region (i.e., 10 nucleotide pairs upstream
from the site of initiation of transcription) of the lac promoter and the
−35 region of the trp promoter; the leftward and rightward, or pL and pR,
promoters from bacteriophage λ; and the gene 10 promoter from bacte-
riophage T7. Each of these promoters interacts with regulatory proteins
(i.e., repressors or inducers), which provide a controllable switch for ei-
ther turning on or turning off the transcription of the adjacent cloned
genes. With the exception of the T7 gene 10 promoter, each of these promoters is recognized by the major form of the E.
coli RNA polymerase holoenzyme. This holoenzyme is formed when a
protein, called sigma factor, combines with the core proteins (i.e., two α,
one β, and one β′ subunit) of RNA polymerase. The sigma factor directs
the binding of the holoenzyme to promoter regions on the DNA.
One commonly used expression system utilizes the gene 10 promoter
from bacteriophage T7 (Fig. 6.1). This promoter is not recognized by E.
coli RNA polymerase but, rather, requires T7 RNA polymerase for tran-
scription to occur. For this system to work in E. coli, the gene encoding
T7 polymerase is often inserted into the E. coli chromosome under the
transcriptional control of the E. coli lac promoter and operator. The E.
coli host cells must also contain the E. coli lacI gene, which encodes the
lac repressor. The lac repressor forms a tetramer that binds to the lac
[Figure residue, caption not recovered. Labeled elements: prhaBAD, T7 lysozyme gene, TT; plac, olac, T7 RNA polymerase gene, TT; pT7, olac, target gene, TT. Labeled products: lac repressor, T7 RNA polymerase, mRNA, target protein.]
Translational Regulation
Placing a cloned gene under the control of a regulatable, strong promoter,
although essential, may not be sufficient to maximize the yield of the
cloned gene product. Other factors, such as the efficiency of translation
and the stability of the newly synthesized cloned gene protein, may also
affect the amount of product.
In prokaryotic cells, various proteins are not necessarily synthesized
with the same efficiency. In fact, they may be produced at very different
levels (up to several hundredfold) even if they are encoded within the
same polycistronic mRNA. Differences in translational efficiency and in
transcriptional regulation enable the cell to have hundreds or even thou-
sands of copies of some proteins and only a few copies of others.
The molecular basis for differential translation of bacterial mRNAs is
the presence of a translational initiation signal called a ribosome-binding
site which precedes the protein-coding portion of the mRNA. A
ribosome-binding site is a sequence of six to eight nucleotides in mRNA
that can base-pair with a complementary nucleotide sequence on the 16S
rRNA component of the small ribosomal subunit. Generally, the stronger
the binding of the mRNA to the rRNA, the greater the efficiency of trans-
lational initiation.
Thus, many E. coli expression vectors have been designed to ensure that the
mRNA of a cloned gene contains a strong ribosome-binding site. Inclusion of an
E. coli ribosome-binding site just upstream from the protein-coding open
reading frame ensures that heterologous prokaryotic and eukaryotic genes can be
translated readily in E. coli. However, certain conditions must be satisfied
for this approach to function properly. First, the ribosome-binding sequence
must be located within a short distance (generally 2 to 20 nucleotides) from
the translational start codon of the cloned gene. At the RNA level, the
translational start codon is usually AUG (adenosine, uridine, and guanosine).
In DNA, the coding strand contains the ATG sequence (where T is thymidine) that
functions as a start codon, and the complementary noncoding strand is a
template for transcription. Second, the DNA sequence that includes the
ribosome-binding site through the first few codons of the gene of interest
should not contain nucleotide sequences that have regions of complementarity
and can fold back (form intrastrand loops) after transcription (Fig. 6.3),
thereby blocking the interaction of the mRNA with the ribosome. The local
secondary structure of the mRNA, which can either shield or expose the
ribosome-binding site, determines the extent to which the mRNA can bind to the
appropriate sequence on the ribosome and initiate translation. Thus, for each
cloned gene, it is important to ensure that the mRNA contains a strong
ribosome-binding site and that the secondary structure of the mRNA does not
prevent its access to the ribosome. However, since the nucleotide sequences
that encode the amino acids at the N-terminal region of the target protein vary
from one gene to another, it is not possible to design a vector that will
eliminate the possibility of mRNA fold-back in all instances. Therefore, no
single optimized translational initiation region can guarantee a high rate of
translation initiation for all cloned genes. Consequently, the optimization of
translation initiation needs to be on a gene-by-gene basis.

Figure 6.3 Example of secondary structure of the 5′ end of an mRNA that would
prevent efficient translation. In this example, the ribosome-binding site is
GGGGG, the initiator codon is AUG (shown in red), and the first few codons are
CAG-CAU-GAU-UUA-UUU. The mRNA is oriented with its 5′ end to the left and its
3′ end to the right. Note that in addition to the traditional A⋅U and G⋅C base
pairs in mRNA, G can also base-pair to some extent with U.
doi:10.1128/9781555818890.ch6.f6.3
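The first condition, a ribosome-binding site located 2 to 20 nucleotides from the start codon, lends itself to a simple sequence check. The following is a minimal sketch under stated assumptions: the consensus motif AGGAGG (a commonly cited E. coli Shine-Dalgarno consensus) is not given in the text, the function name and the min_matches threshold are hypothetical, and the check says nothing about the second condition (mRNA secondary structure).

```python
SD_CONSENSUS = "AGGAGG"  # assumed consensus motif; not specified in the text


def find_rbs(mrna, start_index, min_gap=2, max_gap=20, min_matches=4):
    """Find the best ribosome-binding-site-like motif upstream of a start codon.

    mrna        -- mRNA sequence (5' to 3') written with A, C, G, U
    start_index -- 0-based index of the A of the AUG start codon
    gap         -- number of nucleotides between the motif and the AUG
    Returns (motif_start, gap, matches) for the best window, or None.
    """
    if mrna[start_index:start_index + 3] != "AUG":
        raise ValueError("start_index does not point at an AUG codon")
    k = len(SD_CONSENSUS)
    best = None
    for gap in range(min_gap, max_gap + 1):
        motif_start = start_index - gap - k
        if motif_start < 0:
            break  # ran off the 5' end of the mRNA
        window = mrna[motif_start:motif_start + k]
        matches = sum(a == b for a, b in zip(window, SD_CONSENSUS))
        if matches >= min_matches and (best is None or matches > best[2]):
            best = (motif_start, gap, matches)
    return best


if __name__ == "__main__":
    print(find_rbs("GCUAGGAGGUAACAUAUGCAGCAU", 15))
```

A result of None suggests that the construct lacks a recognizable ribosome-binding site within the allowed distance; a positive result is only a first screen, since a strong site can still be masked by fold-back of the mRNA.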
Codon Usage
While each amino acid is encoded, on average, by about three
different codons (any particular amino acid may have from one to six
codons), these codons are used to different extents in various living
organisms. Any organism, e.g., E. coli, produces cognate tRNAs for each
codon in approximately the same relative amount as that particular co-
don is used in the production of its proteins. Various organisms prefer-
entially use different subsets of codons (Table 6.2) and contain various
amounts of the cognate aminoacyl-tRNAs for the synthesis of proteins
encoded by their mRNAs. Thus, expressing a foreign gene in a particular
host organism may result in a cellular incompatibility that can interfere
with efficient translation when a cloned gene has codons that are rarely
used by the host cell. For example, AGG, AGA, AUA, CUA, and CGA
are the least-used codons in E. coli. When a foreign protein is expressed
at high levels in E. coli, the host cell may not produce enough of the
aminoacyl-tRNAs that recognize these rarely used codons, and either the
yield of the cloned gene protein is much lower than expected or incorrect
amino acids may be inserted into the protein. Any codon that is used
less than 5 to 10% of the time by the host organism may cause prob-
lems. Particularly detrimental to high levels of expression are regions of
mRNA where two or more rarely used codons are close or adjacent to, or
appear in, the sequence encoding the N-terminal portion of the protein.
Fortunately, there are several experimental approaches that can be used
to alleviate this problem. First, if the target gene is eukaryotic, it may be
cloned and expressed in a eukaryotic host cell. Second, a new version of
the target gene containing codons more commonly used by the host cell
may be chemically synthesized (i.e., codon optimization). Third, a host E.
coli cell engineered to overexpress several rare tRNAs may be employed.
In fact, some E. coli strains have been transformed with plasmids that
encode genes that lead to the overproduction of some tRNAs which are
specific for certain rare E. coli codons. These transformed E. coli cell lines
are available commercially and can often facilitate a high level of expres-
sion of foreign proteins that use these rare E. coli codons (Fig. 6.4). For
example, with one of the commercially available E. coli cell lines, it was
possible to overexpress the Ara h2 protein, a peanut allergen, approxi-
mately 100-fold over the amount that was synthesized in conventional
E. coli cells. With this approach, it should be possible to produce large
quantities of a variety of heterologous proteins that are otherwise difficult
to express in different hosts.
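Before choosing among these approaches, it can help to see how many rare codons a target gene actually contains. The sketch below flags the five least-used E. coli codons named above. The function name, the size of the N-terminal window (n_terminal_codons), and the reporting of directly adjacent rare codons as a separate list are illustrative assumptions, not criteria from the text.

```python
# The least-used E. coli codons listed in the text (RNA alphabet).
RARE_CODONS = {"AGG", "AGA", "AUA", "CUA", "CGA"}


def rare_codon_report(coding_sequence, n_terminal_codons=10):
    """Report rare E. coli codons in an in-frame coding sequence (DNA or RNA).

    n_terminal_codons is a hypothetical window size used to flag rare codons
    near the start of the coding region.
    """
    seq = coding_sequence.upper().replace("T", "U")
    if not seq or len(seq) % 3 != 0:
        raise ValueError("coding sequence must be a nonempty multiple of 3")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    rare = [i for i, codon in enumerate(codons) if codon in RARE_CODONS]
    return {
        "rare_positions": rare,  # 0-based codon indices
        "adjacent_pairs": [(i, j) for i, j in zip(rare, rare[1:]) if j - i == 1],
        "n_terminal": [i for i in rare if i < n_terminal_codons],
        "fraction_rare": len(rare) / len(codons),
    }


if __name__ == "__main__":
    print(rare_codon_report("AUGAGGAGACUGCGAUAA"))
```

A gene with many flagged positions, especially clustered or near the start, is a candidate for codon optimization or for expression in a strain that overproduces the corresponding tRNAs.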
Protein Stability
The expression of some foreign proteins in E. coli host strains, which
are typically grown at 37°C, often results in the formation of inclusion
bodies of inactive protein. This occurs because the foreign protein mis-
folds when it cannot attain its native active conformation. A variety of
strategies have been developed, albeit with limited success, to circumvent
this problem. Cultivation of recombinant strains at lower temperatures
sometimes facilitates slower, and hence proper, protein folding, often sig-
nificantly increasing the amount of recoverable active protein. However,
mesophilic bacteria like E. coli grow extremely slowly at low tempera-
tures. In one study, the chaperonin 60 gene (cpn60) and the cochaperonin
10 gene (cpn10) from the psychrophilic bacterium Oleispira antarctica
were introduced into a host strain of E. coli, with the result that the E.
coli strain gained the ability to grow and to express foreign proteins at
a high rate at temperatures of 4 to 10°C (Fig. 6.5). It has been suggested
that at temperatures below around 20°C, E. coli cells are unable to grow
to any appreciable extent as a consequence of the cold-induced inactiva-
tion of several E. coli chaperonins that normally facilitate protein folding
in this bacterium. Thus, transforming E. coli with chaperonins from a
cold-tolerant bacterium allowed the introduced proteins to perform the
functions at low temperature that E. coli proteins perform at higher tem-
peratures. Although very high levels of expression of the cloned gene were
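[Figure residue, caption not recovered. Maps of plasmids pSJS1244 and pRARE (p15A origin labels), marked with the tRNA gene labels metT, leuW, lys, proL, argW, argU, thrT, thrU, ileX, glyT, and tryU and with spectinomycin and chloramphenicol resistance markers.]
[Labels preceding the Figure 6.5 caption: plasmid with foreign gene; low level of foreign gene expression; cpn10, cpn60.]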
Figure 6.5 The ability of nontransformed E. coli and E. coli transformed to express
plasmid-borne chaperonin genes cpn10 and cpn60, which were isolated from a psy-
chrophilic bacterium, and consequently grow at low temperatures.
doi:10.1128/9781555818890.ch6.f6.5
Fusion Proteins
Occasionally, foreign proteins are found in smaller-than-expected amounts
when they are produced in heterologous host cells. This apparent low level
of expression may be due to degradation of the foreign protein within the
host cell. One solution is to engineer a DNA construct encoding a target
protein in frame with DNA encoding a stable host protein (Fig. 6.6). The
combined, single protein that is produced is called a fusion protein, and
it protects the cloned foreign gene product from attack by host cell pro-
teases. In general, fusion proteins are stable because the target proteins
are fused with proteins that are not especially susceptible to proteolysis.
Fusion proteins are constructed by ligating a portion of the DNA
coding regions of two or more genes. In its simplest form, a fusion vector
system entails the insertion of a target gene into the coding region of a
cloned host gene, or fusion partner (Fig. 6.6). The fusion partner may be
positioned at either the N- or C-terminal end of the target protein. Knowl-
edge of the nucleotide sequences of the various coding segments joined at
the DNA level is essential to ensure that the ligation product maintains
the correct reading frame. If the combined DNA has an altered reading
frame, i.e., a sequence of successive codons that yields either an incom-
plete or an incorrect translation product, then a functional version of the
protein encoded by the cloned target gene will not be produced.
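The reading-frame requirement can be checked mechanically before a construct is built. The following is a minimal sketch, assuming that the fusion partner is supplied without its stop codon and that the target ends with one; the function names and the example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(dna):
    """Split DNA into successive complete triplets, reading from position 0."""
    return [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]


def fuse_in_frame(partner_orf, linker, target_orf):
    """Join partner + linker + target coding DNA and verify the reading frame.

    Assumptions (illustrative): partner_orf starts with ATG and has no stop
    codon; target_orf ends with a stop codon; all parts are coding-strand DNA.
    """
    parts = {"partner": partner_orf.upper(),
             "linker": linker.upper(),
             "target": target_orf.upper()}
    for name, dna in parts.items():
        if len(dna) % 3 != 0:
            raise ValueError(f"{name} length {len(dna)} is not a multiple of 3; "
                             "downstream codons would be read out of frame")
    fused = parts["partner"] + parts["linker"] + parts["target"]
    triplets = split_codons(fused)
    if not triplets or triplets[0] != "ATG":
        raise ValueError("fusion does not begin with an ATG start codon")
    if any(codon in STOP_CODONS for codon in triplets[:-1]):
        raise ValueError("internal stop codon would truncate the fusion protein")
    if triplets[-1] not in STOP_CODONS:
        raise ValueError("fusion does not end with a stop codon")
    return fused


if __name__ == "__main__":
    # Linker ATC-GAA-GGT-CGT encodes Ile-Glu-Gly-Arg.
    print(fuse_in_frame("ATGAAA", "ATCGAAGGTCGT", "GTTCACCTGTAA"))
```

A single missing or extra nucleotide in any segment raises an error here, mirroring the way a frameshift at the junction yields an incomplete or incorrect translation product.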
When the protein encoded by the cloned gene is intended for human
use, it is generally necessary to remove the fusion partner from the final
product. This is because fusion proteins require more extensive testing
before being approved by regulatory agencies, such as the U.S. Food and
Drug Administration (FDA). Therefore, strategies have been developed to
remove the unwanted amino acid sequence from the target protein. One
way to do this is to join the gene for the target protein to the gene for
the stabilizing fusion partner with specific oligonucleotides that encode
short stretches of amino acids that are recognized by a particular non-
bacterial protease. For example, an oligonucleotide linker encoding the
amino acid sequence Ile-Glu-Gly-Arg may be joined to the cloned gene.
Following synthesis and purification of the fusion protein, a blood coag-
ulation factor called Xa can be used to release the target protein from the
fusion partner, because factor Xa is a specific protease that cleaves peptide
bonds uniquely on the C-terminal side of the Ile-Glu-Gly-Arg sequence
(Fig. 6.7). Moreover, because this peptide sequence occurs rather infre-
quently in native proteins, this approach can be used to recover many
different cloned gene products.
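The cleavage rule described above can be expressed as a short in silico digest. This is an idealized sketch: it assumes that factor Xa cuts only, and completely, on the C-terminal side of every Ile-Glu-Gly-Arg (IEGR in one-letter code) and ignores any off-target cleavage. The function name is hypothetical, and the example sequence is the one-letter form of the linker region Thr-Ala-Glu-Gly-Gly-Ser-Ile-Glu-Gly-Arg-Val-His-Leu.

```python
FACTOR_XA_SITE = "IEGR"  # Ile-Glu-Gly-Arg in one-letter amino acid code


def cleave_factor_xa(protein):
    """Split a protein (one-letter codes) after every IEGR recognition site."""
    fragments = []
    start = 0
    pos = protein.find(FACTOR_XA_SITE)
    while pos != -1:
        cut = pos + len(FACTOR_XA_SITE)  # cleavage is C-terminal to the Arg
        fragments.append(protein[start:cut])
        start = cut
        pos = protein.find(FACTOR_XA_SITE, cut)
    if start < len(protein):
        fragments.append(protein[start:])
    return fragments


if __name__ == "__main__":
    # Fusion partner end + linker + start of target protein.
    print(cleave_factor_xa("TAEGGSIEGRVHL"))
```

Running the same function on the target protein alone is a quick way to check that it contains no internal IEGR sequence that would be cut along with the linker.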
The proteases most commonly used to cleave a fusion partner from
a target protein of interest are enterokinase, tobacco etch virus protease,
thrombin, and factor Xa. However, following this cleavage, it is neces-
sary to perform additional purification steps in order to separate both
Figure 6.7 (A) Proteolytic cleavage of a fusion protein by blood coagulation factor
Xa. The factor Xa recognition sequence (Xa linker sequence) lies between the amino
acid sequences of two different proteins. A functional cloned gene protein (with Val at
its N terminus) is released after cleavage. (B) Schematic representation of a tripartite
fusion protein including a stable fusion partner, a linker peptide, and the cloned target
protein. doi:10.1128/9781555818890.ch6.f6.7
(A) Xa linker sequence, with the site of cleavage by factor Xa marked by a vertical bar:
. . . Thr-Ala-Glu-Gly-Gly-Ser-Ile-Glu-Gly-Arg | Val-His-Leu . . .
(B) Fusion partner, peptide linker, target protein
Table 6.3 Some protein fusion systems used to facilitate the purification of foreign
proteins in E. coli and other host organisms

Fusion partner              Size              Ligand                 Elution conditions
ZZ                          14 kDa            Immunoglobulin G       Low pH
Histidine tail              6–10 amino acids  Ni2+                   Imidazole
Strep tag                   10 amino acids    Streptavidin           Iminobiotin
Pinpoint                    13 kDa            Streptavidin           Biotin
Maltose-binding protein     40 kDa            Amylose                Maltose
GST                         26 kDa            Glutathione            Reduced glutathione
Flag                        8 amino acids     Specific MAb           EDTA or low pH
Polyarginine                5–6 amino acids   SP-Sephadex            High salt at pH >8.0
c-myc                       11 amino acids    Specific MAb           Low pH
S tag                       15 amino acids    S fragment of RNase A  Low pH
Calmodulin-binding peptide  26 amino acids    Calmodulin             EGTA and high salt
Cellulose-binding domain    4–20 kDa          Cellulose              Urea or guanidine hydrochloride
Chitin-binding domain       51 amino acids    Chitin                 SDS or guanidine hydrochloride
SBP tag                     38 amino acids    Streptavidin           Biotin

ZZ, a fragment of Staphylococcus aureus protein A; Strep tag, a peptide with affinity for streptavidin;
Pinpoint, a protein fragment that is biotinylated and binds streptavidin; GST, glutathione S-transferase; Flag,
a peptide recognized by enterokinase; EDTA, ethylenediaminetetraacetic acid; c-myc, a peptide from a protein
that is overexpressed in many cancers; S tag, a peptide fragment of ribonuclease (RNase) A; EGTA,
ethylene glycol-bis(β-aminoethyl ether)-N,N,N′,N′-tetraacetic acid; SBP (streptavidin-binding protein), a
peptide with affinity for streptavidin; SP-Sephadex, a cation-exchange resin composed of sulfopropyl groups
covalently attached to Sephadex beads; SDS, sodium dodecyl sulfate.
the protease and the fusion protein from the protein of interest. Unfortu-
nately, sometimes proteases also cleave the protein of interest. When this
occurs to any significant extent, it is necessary to change either the linker
peptide or the digestion conditions.
In addition to reducing the degradation of cloned foreign proteins,
a number of fusion proteins have been developed to simplify the purifi-
cation of recombinant proteins (Table 6.3). This approach is useful for
purification of proteins expressed in either prokaryotic or eukaryotic host
organisms. For example, a vector that contains the human interleukin-2
(IL-2) cytokine gene joined to DNA encoding the fusion partner (marker
peptide) sequence Asp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys serves the dual
function of reducing the degradation of the expressed IL-2 gene prod-
uct and facilitating the purification of the product. Following expression
of this construct, the secreted fusion protein can be purified in a single
step by immunoaffinity chromatography, in which monoclonal antibodies
directed against the marker peptide have been immobilized on a solid
support and act as ligands to bind the fusion protein (Fig. 6.8). Because
this particular marker peptide is relatively small, it does not significantly
decrease the amount of host cell resources that are available for the pro-
duction of IL-2; thus, the yield of IL-2 is not compromised by the con-
comitant synthesis of the marker peptide. In addition, while the fusion
protein has the same biological activity as native IL-2, as mentioned
above, to more readily satisfy the government agencies that regulate the
[Figure 6.8 panel labels: (A) marker peptide, interleukin-2, other proteins; (B) marker peptide/interleukin-2 fusion protein bound to specific antibodies on a column.]
Figure 6.8 Immunoaffinity chromatographic purification of a fusion protein that in-
cludes a marker peptide and IL-2. A monoclonal antibody that binds to the marker
peptide of the fusion protein (anti-marker peptide antibody) is attached to a solid
matrix support. The secreted proteins (A) are passed through the column containing
the bound antibody. The marker peptide portion of the fusion protein is bound to
the antibody (B), and the other proteins pass through. The immunopurified fusion
protein can then be eluted from the column by the addition of pure marker peptide.
doi:10.1128/9781555818890.ch6.f6.8
Metabolic Load
The introduction and expression of foreign DNA in a host organism often
change the metabolism of the organism in ways that may impair normal
cellular functioning (Fig. 6.9). This biological response is due to a meta-
bolic load (metabolic burden, metabolic drain) imposed upon the host by
the presence and expression of foreign DNA. A metabolic load can occur
as the result of a variety of conditions, including the following.
• Increasing plasmid copy number and/or size requires increasing
amounts of cellular energy for plasmid replication and maintenance.
• The limited amount of dissolved oxygen in the growth medium is
often insufficient for both host cell metabolism and plasmid main-
tenance and expression (see the section on overcoming oxygen lim-
itation below).
• Overproduction of both target and marker proteins may deplete
the pools of certain aminoacyl-tRNAs (see the section on codon
usage above) and/or drain the host cell of its energy (in the form of
ATP or guanosine 5′-triphosphate [GTP]).
[Figure residue, caption not recovered. Labels: nontransformed E. coli; cellular building blocks and energy.]
Chromosomal Integration
As a consequence of metabolic load, a fraction of the cell population often
loses its plasmids during cell growth. In addition, cells that lack plasmids
grow faster than those that retain them, so plasmidless cells eventually
dominate the culture. After a number of generations of cell growth, the
loss of plasmid-containing cells diminishes the yield of the cloned gene
product. Plasmid-containing cells may be maintained by growing the cells
[Figure residue, caption not recovered. Labels: marker gene, target gene, chromosomal DNA.]
crossover. In this case, transformants are selected for the acquisition of the
marker gene (often an antibiotic resistance gene). Then, the target gene,
under the control of a regulatable promoter, is inserted in the middle of
the cloned chromosomal integration site on a different plasmid. This plas-
mid construct is used to transform host cells that contain the marker gene
integrated into their chromosomes, and following a host enzyme-catalyzed
double crossover, the target gene and its transcriptional regulatory region
are inserted into the chromosome in place of the marker gene. The final
construct is selected for the loss of the marker gene.
Several other methods can also be used to integrate foreign genes into
host chromosomal DNA. For example, when a marker gene is flanked
by certain short specific DNA sequences and then inserted into either a
plasmid or chromosomal DNA, the gene may be excised by treatment of
the construct with an enzyme that recognizes the flanking DNA sequences
and removes them (Fig. 6.11). One combination of an enzyme and DNA
sequence that is useful for this sort of manipulation is the Cre–loxP re-
combination system, which consists of the Cre recombinase enzyme and
two 34-bp loxP recombination sites. The marker gene to be removed is
flanked by loxP sites, and after integration of the plasmid into the chro-
mosomal DNA, the marker gene is removed by the Cre enzyme. A gene
[Figure residue, caption not recovered. Labels: plasmid, cloned gene, site of recombination, homologous chromosomal DNA, homologous recombination, + Cre protein.]
encoding the Cre enzyme is located on its own plasmid, which can be
introduced into the chromosomally transformed host cells. Marker gene
excision is triggered by the addition of isopropyl-β-D-thiogalactopyranoside (IPTG) to the growth medium; IPTG
inactivates the lac repressor (encoded by the lacI gene), which derepresses
the E. coli lac promoter–operator that is present upstream of the
Cre gene and causes the Cre enzyme to be synthesized. Once there is no
longer any need for the Cre enzyme, the plasmid that contains the gene
for this enzyme under the control of the lac promoter may be removed
from the host cells merely by raising the temperature. This plasmid has a
temperature-sensitive replicon that allows it to be maintained in the cell
at 30°C but not above 37°C.
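The outcome of Cre-mediated marker removal can be sketched as a string operation on the DNA sequence. The code below is a simplified illustration: the 34-bp loxP sequence is supplied as an assumption (the text gives only its length), the function name and flanking sequences are hypothetical, and only two sites in the same orientation on one strand are handled. Excision leaves a single loxP site behind and releases the marker together with the other loxP copy.

```python
# Assumed wild-type loxP sequence (13-bp arm, 8-bp spacer, 13-bp arm);
# the text states only that the site is 34 bp long.
LOXP = "ATAACTTCGTATA" + "ATGTATGC" + "TATACGAAGTTAT"


def cre_excise(dna):
    """Simulate excision between the first two same-orientation loxP sites.

    Returns (remaining_dna, excised_segment). If fewer than two loxP sites
    are found, the DNA is returned unchanged with an empty excised segment.
    """
    first = dna.find(LOXP)
    if first == -1:
        return dna, ""
    first_end = first + len(LOXP)
    second = dna.find(LOXP, first_end)
    if second == -1:
        return dna, ""
    second_end = second + len(LOXP)
    excised = dna[first_end:second_end]          # marker plus one loxP copy
    remaining = dna[:first_end] + dna[second_end:]  # one loxP site stays behind
    return remaining, excised


if __name__ == "__main__":
    construct = "GGGG" + LOXP + "ACGTACGT" + LOXP + "CCCC"
    remaining, excised = cre_excise(construct)
    print(remaining)
    print(excised)
```

In this example the placeholder marker ACGTACGT is removed and the chromosome retains the flanking sequences joined by one loxP site.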
Increasing Secretion
For most E. coli proteins, secretion entails transit through the inner (cy-
toplasmic) cell membrane to the periplasm. Directing a foreign protein
to the periplasm, rather than the cytoplasm, makes its purification easier
and less costly, as many fewer proteins are present here than in the cyto-
plasm. Moreover, the stability of a cloned protein depends on its cellular
location in E. coli. For example, recombinant proinsulin is approximately
10 times more stable if it is secreted (exported) into the periplasm than
if it is localized in the cytoplasm. In addition, secretion of proteins to the
periplasm often facilitates the correct formation of disulfide bonds be-
cause the periplasm provides an oxidative environment, as opposed to the
more reducing environment of the cytoplasm. Table 6.4 provides some ex-
amples of the amounts of secreted recombinant pharmaceutical proteins
attainable with various bacterial host cells.
Normally, an amino acid sequence called a signal peptide (also called
a signal sequence or leader peptide), located at the N-terminal end of a
newly synthesized protein, facilitates its export by enabling the protein
to pass through the cell membrane (Fig. 6.12). It is sometimes possible
to engineer a protein for secretion to the periplasm by adding the DNA
sequence encoding a signal peptide to the cloned gene. When the recombinant
protein is secreted to the periplasm, the signal peptide is precisely removed
by the cell’s secretion apparatus, so the N-terminal end of the target protein
is identical to the natural protein.

Figure 6.12 Schematic representation of protein secretion. The ribosome is
attached to a cellular membrane, and the signal peptide at the N terminus is
transported, by the secretion apparatus, across the cytoplasmic membrane,
followed by the rest of the amino acids that constitute the mature protein.
Once the signal peptide has crossed the membrane, it is cleaved from the
remainder of the protein by an enzyme associated with the membrane called a
signal peptidase. Membrane proteins as well as secreted proteins generally
contain a signal peptide (prior to removal by processing). Figure labels:
ribosome, mRNA, growing peptide chain, aminoacyl-tRNA, cytoplasm, cell
membrane, periplasm, signal peptide. doi:10.1128/9781555818890.ch6.f6.12

Unfortunately, the fusion of a target gene to a DNA fragment encoding a signal
peptide sequence does not necessarily guarantee a high rate of secretion. When
this simple strategy is found to be ineffective in producing a secreted protein
product, alternative strategies need to be employed. One approach that was
found to be successful for the secretion of the IL-2 cytokine was the fusion of
the IL-2 gene downstream from the gene for the entire propeptide
maltose-binding protein, rather than just the maltose-binding protein signal
sequence, with DNA encoding the factor Xa recognition site as a linker peptide
separating these two genes (Fig. 6.13). When this genetic fusion, on a plasmid
vector, was used to transform E. coli cells, as expected, a large fraction of
the fusion protein was found localized in the host cell periplasm. Functional
IL-2 could then be released from the fusion protein by digestion with factor
Xa.

Sometimes too high a level of translation of a foreign protein can overload the
cell’s secretion machinery and inhibit the secretion of that protein. Thus, to
ensure that secretion of a target protein occurs most efficiently, it is
necessary to lower the level of expression of that protein.

E. coli and other gram-negative microorganisms generally cannot secrete
proteins into the surrounding medium because of the presence of
an outer membrane (in addition to the inner or cytoplasmic membrane)
that restricts this process. Of course, it is possible to use as host organ-
isms gram-positive prokaryotes or eukaryotic cells, both of which lack
an outer membrane and therefore can secrete proteins directly into the
medium. Alternatively, it is possible to take advantage of the fact that
some gram-negative bacteria can secrete a bactericidal protein called a
bacteriocin into the medium. A cascade mechanism is responsible for this
specific secretion. A bacteriocin release protein activates phospholipase A,
which is present in the bacterial inner membrane and cleaves membrane
phospholipids so that both the inner and outer membranes are permeabilized
(Fig. 6.14A). This results in some cytoplasmic and periplasmic
proteins being released into the culture medium. Thus, by putting the
bacteriocin release protein gene onto a plasmid under the control of a
regulatable promoter, E. coli cells may be permeabilized at will. E. coli
cells that carry the bacteriocin release protein gene on a plasmid are
transformed with another plasmid carrying a cloned gene that has been fused
to a secretion signal peptide sequence that causes the target protein to
be secreted into the periplasm. The cloned gene is placed under the same
transcriptional regulatory control as the bacteriocin release protein gene
so that the two genes can be induced simultaneously, with the cloned gene
protein being secreted into the medium (Fig. 6.14B and C).
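The fusion strategy described earlier in this section for IL-2 (a carrier protein, a factor Xa recognition site as linker, then the target protein) can be illustrated with a short in silico sketch. It is a minimal illustration and not the published construct: `CARRIER` and `TARGET` are arbitrary placeholder strings rather than the real maltose-binding protein or IL-2 sequences, and the function names are invented for this example. The one biochemical assumption is that factor Xa recognizes the tetrapeptide Ile-Glu-Gly-Arg (IEGR) and cuts immediately after the arginine.

```python
FACTOR_XA_SITE = "IEGR"  # assumed recognition site; the cut falls after the Arg

# Arbitrary placeholder sequences (NOT the real carrier or target proteins).
CARRIER = "MKTAYIAKQR"
TARGET = "GSHMLEDPVT"


def build_fusion(carrier: str, target: str, linker: str = FACTOR_XA_SITE) -> str:
    """Join carrier and target with the protease recognition site as the linker."""
    return carrier + linker + target


def cleave_after_site(protein: str, site: str = FACTOR_XA_SITE) -> list[str]:
    """Split a protein after every occurrence of the recognition site."""
    fragments = []
    start = 0
    while True:
        hit = protein.find(site, start)
        if hit == -1:
            break
        cut = hit + len(site)          # the site stays on the upstream fragment
        fragments.append(protein[start:cut])
        start = cut
    fragments.append(protein[start:])  # whatever follows the last cut
    return fragments


fusion = build_fusion(CARRIER, TARGET)
carrier_part, released_target = cleave_after_site(fusion)
```

Because the cut falls after the last residue of the linker, the released fragment begins with the first residue of the target, which is why a linker of this kind leaves no extra amino acids on the N terminus of the product.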
Figure 6.15 (A) Schematic representation of the binding of O2 by Vitreoscilla
hemoglobin, the utilization of this O2 in pumping (by proteins such as
cytochromes) H+ from the cytoplasm to the periplasm, and the subsequent
coupling of H+ uptake (by ATPase) to ATP generation. (B) Host cells engineered
to express Vitreoscilla hemoglobin are more efficient at taking up oxygen from
the growth medium.
doi:10.1128/9781555818890.ch6.f6.15
Reducing Acetate
It is often difficult to achieve high levels of foreign-gene expression and
a high host cell density at the same time because of the accumulation
of harmful waste products, especially acetate, which inhibits both cell
growth and protein production and also wastes available carbon and en-
ergy resources. Since acetate is often associated with the use of glucose
as a carbon source, lower levels of acetate, and hence higher yields of
protein, are generally obtained when fructose or mannose is used as a
carbon source. In addition, several different types of genetically manip-
ulated E. coli host cells that produce lower levels of acetate have been
developed. One of these modified strains was produced by introducing a
gene (from B. subtilis) encoding the enzyme acetolactate synthase into E.
coli host cells. This enzyme catalyzes the formation of acetolactate from
pyruvate, thereby decreasing the flux through acetyl coenzyme A to ace-
tate (Fig. 6.16). In practice, the acetolactate synthase genes are introduced
into the cell on one plasmid, while the target gene (encoding the protein
Figure 6.17 Replenishment of the TCA cycle in E. coli by the introduction of a gene
encoding pyruvate carboxylase. This avoids the conversion of pyruvate to acetate.
The TCA cycle may also be replenished by the introduction of a gene encoding
aspartase, converting aspartate in the medium to fumarate. Note that the conver-
sion of glucose to biomass is a multistep process. Acetyl-CoA, acetyl coenzyme A.
doi:10.1128/9781555818890.ch6.f6.17
late log phase of growth. When the recombinant E. coli cells were grown
in minimal medium containing aspartate, the production of different re-
combinant proteins could be increased up to fivefold, with 30 to 40%
more biomass production.
Protein Folding
The use of conditions that result in very high rates of foreign gene
expression in E. coli often also leads to the production of misfolded proteins
that can aggregate and form insoluble inclusion bodies within the host
cell. While it is possible to solubilize inclusion bodies and subsequently
establish conditions that allow at least a portion of the recombinant
protein to fold correctly, this is typically a tedious, inefficient, expensive,
and time-consuming process, one that is best avoided if possible. A simple
strategy to avoid the formation of misfolded proteins and hence inclusion
bodies involves reducing the rate of synthesis of the target gene product
so that it has more time to fold properly. This may be achieved by various
means, including using weaker promoters, decreasing the concentration
of inducers (such as IPTG), or lowering the growth temperature to 20 to
30°C. These strategies are sometimes, but not always, effective in
preventing the formation of inclusion bodies.
An alternative strategy to improve the yield of properly folded (and
therefore active) recombinant proteins in E. coli involves the coexpression
of one or more molecular chaperones (proteins that facilitate the correct
folding of other proteins) by the host E. coli strain (Table 6.5). The “fold-
ing chaperones” utilize ATP cleavage to promote conformational changes
to mediate the refolding of their substrates. The “holding chaperones”
bind to partially folded proteins until the folding chaperones have done
their job. The “disaggregating chaperone” promotes the solubilization of
proteins that have become aggregated. Protein folding also involves the
“trigger factor,” which binds to nascent polypeptide chains, acting as a
holding chaperone. Although there are a large number of chaperone mol-
ecules that are involved in the proper folding and secretion of proteins
Conceptually, the development of a eukaryotic expression system appears to
be a relatively simple matter of assembling the appropriate regulatory
sequences, cloning them in the correct order into a vector, and then putting
the gene of interest into the precise location that enables it to be
expressed. In reality, the development of the first generation of eukaryotic
expression vectors was a painstaking process following a trial-and-error
approach. Before the study of Mulligan et al., a number of genes had been
cloned into the mammalian SV40 vectors, but mature, functional mRNAs were
never detected after infection of host cells. This problem was overcome by
inserting the rabbit cDNA for β-globin into an SV40 gene that had nearly all
of its coding region deleted but retained “all the regions implicated in
transcriptional initiation and termination, splicing and
polyadenylation. . . .” Both rabbit β-globin mRNA and protein were
synthesized in cells that were transfected with this β-globin cDNA–SV40
construct. Mulligan et al. concluded, “The principal conceptual innovation
is the decision to leave intact the regions of the vector implicated
in . . . mRNA processing.” This study established that an effective
eukaryotic expression system could be created by placing the cloned gene
under the control of transcription and translation regulatory sequences. It
also stimulated additional research that pinpointed in detail the structural
prerequisites for the next generation of eukaryotic expression vectors.
S. cerevisiae Vectors
There are three main classes of S. cerevisiae expression vectors: episomal,
or plasmid, vectors (yeast episomal plasmids [YEps]), integrating vectors
(yeast integrating plasmids [YIps]), and YACs. Of these, episomal vectors
have been used extensively for the production of either intra- or
extracellular heterologous proteins. Typically, the vectors contain features
that allow them to function in both bacteria and S. cerevisiae. An E. coli
origin of replication and bacterial antibiotic resistance genes are usually
included on the vector, enabling all manipulations to first be performed
in E. coli before the vector is transferred to S. cerevisiae for expression.
YEp
yeast episomal plasmid
YIp
yeast integrating plasmid
The YEp vectors are based on the 2μm plasmid, a small, indepen-
dently replicating circular plasmid found in about 30 copies per cell in
most natural strains of S. cerevisiae. Many S. cerevisiae selection schemes
rely on mutant host strains that require a particular amino acid (e.g., his-
tidine, tryptophan, or leucine) or nucleotide (e.g., uracil) for growth. Such
auxotrophic strains cannot grow on minimal growth medium unless it is
supplemented with a specific nutrient. In practice, the vector is equipped
Figure 6.19 YAC cloning system. The YAC plasmid (pYAC) has an E. coli selectable
marker (Ampr) gene; an origin of replication that functions in E. coli (ori E); and yeast
DNA sequences, including URA3, CEN, TRP1, and ARS. CEN provides centro-
mere function, ARS is a yeast autonomous replicating sequence that is equivalent
to a yeast origin of replication, URA3 is a functional gene of the uracil biosynthesis
pathway, and TRP1 is a functional gene of the tryptophan biosynthesis pathway. The
T regions are yeast chromosome telomeric sequences. The SmaI site is the cloning
insertion site. pYAC is first treated with SmaI, BamHI, and alkaline phosphatase and
then ligated with size-fractionated (100-kb) input DNA. The final construct carries
cloned DNA and can be stably maintained in double-mutant ura3 and trp1 cells.
doi:10.1128/9781555818890.ch6.f6.19
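The selection logic in this caption can be written out as a small truth table. The sketch below is an illustration only: the function name `grows_on` and the set-based representation of genotypes are assumptions made for this example. It assumes that a ura3 trp1 double-mutant host grows on minimal medium only if each missing biosynthetic function is either complemented by a vector-borne gene (URA3, TRP1) or bypassed by a supplement in the medium.

```python
# Nutrient that each functional gene allows the cell to make for itself.
GENE_TO_NUTRIENT = {"URA3": "uracil", "TRP1": "tryptophan"}


def grows_on(host_mutations: set[str], vector_genes: set[str],
             supplements: set[str]) -> bool:
    """Return True if every nutrient the host cannot make is supplied or complemented."""
    for mutation in host_mutations:
        gene = mutation.upper()            # mutant allele ura3 -> wild-type gene URA3
        nutrient = GENE_TO_NUTRIENT[gene]
        if gene not in vector_genes and nutrient not in supplements:
            return False
    return True


host = {"ura3", "trp1"}

no_vector = grows_on(host, set(), set())
both_markers = grows_on(host, {"URA3", "TRP1"}, set())
one_marker_only = grows_on(host, {"TRP1"}, set())
one_marker_plus_uracil = grows_on(host, {"TRP1"}, {"uracil"})
```

On unsupplemented minimal medium only the case with both markers grows. Since URA3 and TRP1 sit on opposite sides of the insert in the final construct, requiring both is a simple way to select cells that carry both arms.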
Figure 6.20 Summary of protein folding in the endoplasmic reticulum (ER) of yeast
cells. During synthesis on ribosomes associated with the ER, nascent proteins are
bound by the chaperones BiP and calnexin, which aid in the correct folding of the
protein. Protein disulfide isomerases (PDI) catalyze the formation of disulfide bonds
between cysteine amino acids that are nearby in the folded protein. Quality control
systems ensure that only correctly folded proteins are released from the ER. Proteins
released from the ER are transported to the Golgi apparatus for further processing.
Prolonged binding of BiP to misfolded proteins leads to activation of the S. cerevisiae
transcription factor Hac1, which controls the expression of several proteins that me-
diate the unfolded-protein response (UPR). Adapted from Gasser et al., Microb. Cell
Fact. 7:11–29, 2008. doi:10.1128/9781555818890.ch6.f6.20
due to the alkaline gut environment, and the virions enter midgut cells to
begin the infection cycle in the nucleus. Within the insect midgut, the
infection can spread from cell to cell as viral particles (single
nucleocapsids) bud off from an infected cell and infect other midgut cells.
The budding form is not embedded in a polyhedrin matrix and is not infectious
to other individual insect hosts, although it can infect cultured insect cells.
Plaques produced in insect cell cultures by the budding form of baculovirus
have a morphology different from that of the occluded form. During
the late stages of the infection cycle in the insect host, about 36 to 48 h
after infection, the polyhedrin protein is produced in massive quantities
and production continues for 4 to 5 days, until the infected cells rupture
and the host organism dies. Occluded virions are released and can infect
new hosts.
Figure 6.23 Budded (A) and occluded (B) forms of AcMNPV. During budding, a
nucleocapsid becomes enveloped by the membrane of an infected cell. A
polyhedron consists of clusters of nucleocapsids (occluded virions) embedded
in various orientations in a polyhedrin matrix.
doi:10.1128/9781555818890.ch6.f6.23
Transcription from the promoter for the polyhedrin (polyh) gene can account
for as much as 25% of the mRNA produced in cells infected with the virus.
However, the polyhedrin protein is not required for virus production, so
replacement of the polyhedrin gene with a coding sequence for a heterologous
protein, followed by infection of cultured insect cells, results in the
production of large amounts of the heterologous protein. Furthermore,
because of the similarity of posttranslational modification systems between
insects and mammals, it was thought that the recombinant protein
would mimic closely the authentic form of the original protein.
Baculoviruses have been highly successful as delivery systems for introducing
target genes for production of heterologous proteins in insect cells. More
than a thousand different proteins have been produced using this system,
including enzymes, transport proteins, receptors, and secreted proteins.
The specific baculovirus that has been used extensively as an expression
vector is Autographa californica multiple nucleopolyhedrovirus
(AcMNPV). A. californica (the alfalfa looper) and over 30 other insect
species are infected by AcMNPV. This virus also grows well on many insect
cell lines. The most commonly used cell line for genetically engineered
AcMNPV is derived from the fall armyworm, Spodoptera frugiperda.
In these cells, the polyhedrin promoter is exceptionally active, and during
infections with wild-type baculovirus, high levels of polyhedrin are
synthesized.
AcMNPV
Autographa californica multiple nucleopolyhedrovirus
Figure 6.25 Construction of a recombinant bacmid. (A) An E. coli plasmid is incorpo-
rated into the AcMNPV genome by a double-crossover event (dashed lines) between
DNA segments (5′ and 3′) that flank the polyhedrin gene to create a shuttle vector
(bacmid) that replicates in both E. coli and insect cells. The gene for resistance to kan-
amycin (Kanr), an attachment site (att) that is inserted in frame in the lacZ′ sequence,
and an E. coli origin of replication (ori E) are introduced as part of the plasmid DNA.
(B) The transposition proteins encoded by genes of the helper plasmid facilitate the
integration (transposition) of the DNA segment of the transfer vector that is bounded
by two attachment sequences (attR and attL). The gene for resistance to gentamicin
(Genr) and a gene of interest (GOI) that is under the control of the promoter (p) and
transcription terminator (t) elements of the polyhedrin gene are inserted into the at-
tachment site (att) of the bacmid. The helper plasmid and transfer vector carry the
genes for resistance to tetracycline (Tetr) and ampicillin (Ampr), respectively. (C) The
recombinant bacmid has a disrupted lacZ′ gene (*). The right-angled arrow denotes
the site of initiation of transcription of the cloned gene after transfection of the re-
combinant bacmid into an insect cell. Cells that are transfected with a recombinant
bacmid are not able to produce functional β-galactosidase.
doi:10.1128/9781555818890.ch6.f6.25
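The markers named in this caption suggest a simple way to reason about which E. coli colonies carry a recombinant bacmid. The sketch below is a hypothetical decision rule, not a laboratory protocol: the `Colony` fields and the category labels are invented for this example. It relies only on what the caption states, namely that Kanr is carried on the bacmid, that Genr travels with the gene of interest, and that transposition into the attachment site disrupts lacZ′.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Colony:
    kan_resistant: bool    # Kanr is carried on the bacmid
    gen_resistant: bool    # Genr is transposed together with the gene of interest
    beta_gal_active: bool  # an intact lacZ' sequence gives functional beta-galactosidase


def classify(colony: Colony) -> str:
    """Assign a colony to a category from its three observable phenotypes."""
    if not colony.kan_resistant:
        return "no bacmid"
    if colony.gen_resistant and not colony.beta_gal_active:
        return "recombinant bacmid"
    if not colony.gen_resistant and colony.beta_gal_active:
        return "parental bacmid (no transposition)"
    return "ambiguous: rescreen"


candidates = [
    Colony(kan_resistant=True, gen_resistant=True, beta_gal_active=False),
    Colony(kan_resistant=True, gen_resistant=False, beta_gal_active=True),
    Colony(kan_resistant=True, gen_resistant=True, beta_gal_active=True),
]
calls = [classify(c) for c in candidates]
```

The third candidate is flagged rather than guessed at, because gentamicin resistance together with β-galactosidase activity is not explained by either a clean parental or a clean recombinant bacmid.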
Figure 6.26 N glycosylation of proteins in the Golgi apparatus of insect, human, and
humanized insect cells. While the sugar residues added to N-glycoproteins in the en-
doplasmic reticulum are similar in insect and human cells, further processing in the
Golgi apparatus yields a trimmed oligosaccharide (paucimannose) in insect cells and
an oligosaccharide that terminates in sialic acid in human cells. To produce recombi-
nant proteins for use as human therapeutic agents, humanized insect cells have been
engineered to express several enzymes that process human glycoproteins accurately.
Blue squares, N-acetylglucosamine; red circles, mannose; green squares, galactose; or-
ange squares, sialic acid. doi:10.1128/9781555818890.ch6.f6.26
Vector Design
Many cloning vectors for the expression of heterologous genes in mammalian
cells are based on simian virus 40 (SV40) DNA (Table 6.6) that
can replicate in several mammalian species. However, its use is restricted
to small inserts because only a limited amount of DNA can be packaged
into the viral capsid. The genome of this virus is a double-stranded DNA
molecule of 5.2 kb that carries genes expressed early in the infection
cycle that function in the replication of viral DNA (early genes) and
genes expressed later in the infection cycle that function in the
production of viral capsid proteins (late genes). Other vectors are derived
from adenovirus, which can accommodate relatively large inserts; bovine
papillomavirus, which can be maintained as a multicopy plasmid in some
mammalian cells; and adeno-associated virus, which can integrate into
specific sites in the host chromosome.
SV40
simian virus 40
All mammalian expression vectors tend to have similar features and
are not very different in design from other eukaryotic expression vectors.
A typical mammalian expression vector (Fig. 6.27) contains a eukary-
otic origin of replication, usually from an animal virus. The promoter
sequences that drive expression of the cloned gene(s) and the selectable
marker gene(s), and the transcription termination sequences (polyadeny-
lation signals), must be eukaryotic and are frequently taken from either
Table 6.6 Genomes of some animal viruses that are used as cloning vectors in
mammalian cells in culture
Virus                     Genome                 Genome size (kb)
SV40                      Double-stranded DNA    5.2
Adenovirus                Double-stranded DNA    26–45
Bovine papillomavirus     Double-stranded DNA    7.3–8.0
Adeno-associated virus    Single-stranded DNA    4.8
Epstein–Barr virus        Double-stranded DNA    170
Figure 6.28 Translation control elements. A target gene can be fitted with various se-
quences that enhance translation and facilitate both secretion and purification, such as
a Kozak sequence (K), signal sequence (S), protein affinity tag (T), proteolytic cleavage
site (P), and stop codon (SC). The 5′ and 3′ UTRs increase the efficiency of translation
and contribute to mRNA stability. doi:10.1128/9781555818890.ch6.f6.28
5' UTR K S T P Target gene SC 3' UTR
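The element order shown above can be checked mechanically: everything from the start codon through the stop codon has to stay in one reading frame. The following is a minimal sketch under stated assumptions. All sequences are short invented placeholders except the Kozak context (GCCACC directly upstream of ATG, with G at the next position); the choice of six histidine codons for the tag and Ile-Glu-Gly-Arg for the cleavage site is an illustrative assumption, since the figure does not name specific elements. The function names are likewise hypothetical.

```python
# Order of elements as drawn in the figure (5' to 3').
PART_ORDER = ["5' UTR", "K", "S", "T", "P", "target", "SC", "3' UTR"]

# Placeholder sequences for illustration only.
PARTS = {
    "5' UTR": "GGAGACCCAAGC",
    "K": "GCCACCATG",           # Kozak context plus the start codon
    "S": "GCTCTGCTGGTT",        # signal sequence placeholder; begins with G
    "T": "CATCACCATCACCATCAC",  # six histidine codons (assumed affinity tag)
    "P": "ATCGAAGGTCGT",        # codons for Ile-Glu-Gly-Arg (assumed cleavage site)
    "target": "GCACCTACTTCA",
    "SC": "TAA",
    "3' UTR": "GCTTAAGTTTAAAC",
}

STOP_CODONS = {"TAA", "TAG", "TGA"}


def assemble(parts: dict[str, str], order: list[str]) -> str:
    """Concatenate the parts in the given 5'-to-3' order."""
    return "".join(parts[name] for name in order)


def open_reading_frame(parts: dict[str, str], order: list[str]) -> str:
    """Return the stretch from the ATG in the Kozak element through the stop codon."""
    cassette = assemble(parts, order)
    start = len(parts["5' UTR"]) + parts["K"].index("ATG")
    end = len(cassette) - len(parts["3' UTR"])
    return cassette[start:end]


def in_frame(orf: str) -> bool:
    """True if the ORF is whole codons, starts with ATG, and stops only at the end."""
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    return (len(orf) % 3 == 0
            and codons[0] == "ATG"
            and codons[-1] in STOP_CODONS
            and not any(c in STOP_CODONS for c in codons[:-1]))


orf = open_reading_frame(PARTS, PART_ORDER)
frame_ok = in_frame(orf)
```

Adding or deleting a single nucleotide in any element between K and SC makes `in_frame` return False, which is the kind of error such a check is meant to catch before a construct is ordered or cloned.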
Figure 6.30 Strategy to increase yields of recombinant mammalian cells.
Cell death (apoptosis), stimulated by the transcription factor p53, can lead
to decreased yields of recombinant mammalian cells grown under stressful
conditions in large bioreactors. To prevent cell death, the gene encoding
MDM2 (the mouse double minute 2 protein) is introduced into mammalian
cells. The MDM2 protein binds to p53 and prevents it from inducing
expression of proteins required for apoptosis. Engineered cells not only
showed delayed cell death but also achieved higher cell densities in
bioreactors.
doi:10.1128/9781555818890.ch6.f6.30
stably integrated into the CHO genome and expressed, the enzyme was
detected in the mitochondria, where glucose is degraded. After 7 days
in culture, the rate of lactate production decreased by up to 40% in the
engineered cells.
Many eukaryotic DNA viruses from which the vectors used in mam-
malian cells are derived maintain their genomes as multicopy episomal
DNA (plasmids) in the host cell nucleus. These viruses produce proteins,
such as the large-T antigen in SV40 and the nuclear antigen 1 protein in
Epstein–Barr virus, that help to maintain the plasmids in the host nucleus
and to ensure that each host cell produced after cell division receives a
copy of the plasmid. To increase the copy number of the target gene by in-
creasing the plasmid copy number, HEK 293 cells have been engineered to
express the SV40 large-T antigen or Epstein–Barr virus nuclear antigen 1.
Many proteins of therapeutic value are secreted. However, the high
levels of these proteins that are desirable from a commercial standpoint
can overwhelm the capacity of the cell secretory system. Thus, protein
processing is a major limiting step in the achievement of high recombi-
nant protein yields. Researchers have therefore engineered cell lines with
enhanced production of components of the secretion apparatus. In this
regard, an effective strategy may be to simultaneously overexpress several,
if not all, of the proteins that make up the secretory mechanism. This can
be achieved through the enhanced production of the transcription factor
X box protein 1 (Xbp-1), a key regulator of the secretory pathway.
Normally, full-length, unspliced xbp-1 mRNA is found in nonstressed cells
and is not translated into a stable, functional protein (Fig. 6.32). However,
Xbp-1
X box protein 1
Figure 6.32 Strategy to increase yields of secreted recombinant proteins from mam-
malian cells by simultaneously upregulating the expression of several proteins in the
secretion apparatus. The expression of chaperones and other proteins of the secre-
tion apparatus is controlled by the transcription factor Xbp-1. In nonstressed cells,
the intron is not cleaved from the xbp-1 transcript, and therefore, functional Xbp-1
transcription factor is not produced. In stressed cells with accumulated misfolded
proteins, an endoribonuclease cleaves the transcript to remove the intron and yield
mature xbp-1 mRNA that is translated into transcription factor Xbp-1. Recombinant
CHO cells transfected with a gene including only the xbp-1 exons overproduced a
functional Xbp-1 transcription factor that directed the production of high levels of
proteins required for protein secretion. doi:10.1128/9781555818890.ch6.f6.32
Directed Mutagenesis
It is possible with recombinant DNA technology to isolate the gene (or
cDNA) for any protein that exists in nature, to express it in a specific
host organism, and to produce a purified product. Unfortunately, the
properties of some of these “naturally occurring” proteins are
not well suited for a particular end use. On the other hand, it is sometimes
possible, using traditional mutagenesis (often ionizing radiation or
DNA-altering chemicals) and selection schemes, to create a mutant form
of a gene that encodes a protein with the desired properties. However, in
practice, the mutagenesis–selection strategy only very rarely results in any
significant beneficial changes to the targeted protein, because most amino
acid changes decrease the activity of a target protein.
By using a variety of different directed mutagenesis techniques that
change the amino acids encoded by a cloned gene, proteins with prop-
erties that are better suited than naturally occurring counterparts can be
created. For example, using directed mutagenesis techniques, it is possible
to change the specificity, stability, or regulation of target proteins.
Determining which amino acids of a protein should be changed to
attain a specific property is much easier if the three-dimensional structure
of the protein, or a similar protein, has been characterized by X-ray crys-
tallographic analysis. But for many proteins, such detailed information is
often lacking, so directed mutagenesis becomes a trial-and-error strategy
in which changes are made to those nucleotides that are most likely to
yield a particular change in a protein property. Moreover, it is not always
possible to know in advance which individual amino acid(s) contributes
to a particular physical, biological, or chemical property. Regardless of
what types of alterations are made to a target gene, the protein encoded
by each mutated gene has to be tested to ascertain whether the mutagene-
sis process has indeed generated the desired activity change.
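The idea of changing a chosen amino acid by editing the cloned gene can be sketched in a few lines. This is a minimal in silico illustration, not a laboratory method: the coding sequence `wild_type` is an invented five-codon example, the function names are hypothetical, and the codon table is the standard genetic code built from its usual compact string form.

```python
from itertools import product

# Standard genetic code; codons are enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate complete codons from the first base; '*' marks a stop codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def mutate_codon(dna: str, residue_number: int, new_codon: str) -> str:
    """Replace the codon for the 1-based residue_number with new_codon."""
    if new_codon not in CODON_TABLE:
        raise ValueError(f"not a valid codon: {new_codon!r}")
    start = 3 * (residue_number - 1)
    if not 0 <= start <= len(dna) - 3:
        raise IndexError("residue_number is outside the coding sequence")
    return dna[:start] + new_codon + dna[start + 3:]


wild_type = "ATGGCTAACTGTGAA"                # invented example: Met-Ala-Asn-Cys-Glu
mutant = mutate_codon(wild_type, 3, "CAA")   # Asn (AAC) -> Gln (CAA) at residue 3

wild_type_protein = translate(wild_type)
mutant_protein = translate(mutant)
```

The sketch only shows that a defined nucleotide change yields a defined amino acid change. As the text notes, whether that change actually produces the desired property still has to be tested on the expressed protein.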