Unit I Protein Structure
[Figure: levels of protein structure, showing an amino acid chain (Lys, His, Val, Arg, Ala, ending in COOH) alongside tertiary and quaternary structure]
The covalent backbone of a typical protein contains hundreds of individual bonds.
Because free rotation is possible around many of these bonds, the protein can assume a virtually uncountable number of conformations.
However, each protein has a specific chemical or structural function, suggesting that each has a unique three-
dimensional structure.
By the late 1920s, several proteins had been crystallized, including hemoglobin (Mr 64,500) and the enzyme urease
(Mr 483,000).
The discussion of protein structure emphasizes six themes.
1. Three-dimensional structure or structures taken up by a protein are determined by its amino acid sequence.
2. The function of a typical protein depends on its structure.
3. Most isolated proteins exist in one or a small number of stable structural forms.
4. The most important forces stabilizing the specific structures maintained by a given protein are noncovalent; the hydrophobic effect is particularly important.
5. Amid the huge number of unique protein structures, we can recognize some common structural patterns that
help to organize our understanding of protein architecture.
6. Finally, protein structures are not static. All proteins undergo changes in conformation ranging from subtle to dramatic.
Parts of many proteins have no discernible structure. For some proteins or parts of proteins, this lack of definable structure is critical to their function.
The spatial arrangement of atoms in a protein or any part of a protein is called its conformation.
Proteins in any of their functional, folded conformations are called native proteins.
Protein conformation is stabilized largely by weak interactions. Stability is the tendency to maintain a native conformation.
Native proteins are only marginally stable; the ΔG separating the folded and unfolded states is 20 to 65 kJ/mol.
A given polypeptide chain can theoretically assume countless conformations, and the unfolded state of a protein is
characterized by a high degree of conformational entropy.
Entropy, along with the hydrogen-bonding interactions of many groups in the polypeptide chain with the solvent
(water), tends to maintain the unfolded state.
The chemical interactions that counteract these effects and stabilize the native conformation include disulfide
(covalent) bonds and the weak (noncovalent) interactions and forces: hydrogen bonds, the hydrophobic effect, and
ionic interactions.
Weak interactions are especially important in the folding of polypeptide chains into their secondary and tertiary
structures.
The association of multiple polypeptides to form quaternary structures also relies on these weak interactions.
About 200 to 460 kJ/mol are required to break a single covalent bond, whereas weak interactions can be disrupted
by a mere 0.4 to 30 kJ/mol.
In general, the protein conformation with the lowest free energy (that is, the most stable conformation) is the one
with the maximum number of weak interactions.
For every hydrogen bond formed in a protein during folding, a hydrogen bond (of similar strength) between the same
group and water was broken.
The net stability contributed by a given hydrogen bond, or the difference in free energies of the folded and unfolded
states, may be close to zero.
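The Peptide Bond Is Rigid and Planar
Covalent bonds also constrain the conformation of a polypeptide.
In the late 1930s, Linus Pauling and Robert Corey began a series of studies that laid the foundation for our current understanding of protein structure.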
The α carbons of adjacent amino acid residues are separated by three covalent bonds, arranged as Cα—C—N—Cα.
X-ray diffraction studies of crystals of amino acids and of simple dipeptides and tripeptides showed that the peptide
C—N bond is somewhat shorter than the C—N bond in a simple amine and that the atoms associated with the
peptide bond are coplanar.
This indicates resonance, or partial sharing of two pairs of electrons, between the carbonyl oxygen and the amide nitrogen.
The oxygen has a partial negative charge and the hydrogen bonded to the nitrogen has a net partial positive
charge, setting up a small electric dipole.
The six atoms of the peptide group lie in a single plane, with the oxygen atom of the carbonyl group trans to the
hydrogen atom of the amide nitrogen.
Peptide C—N bonds, because of their partial double-bond character, cannot rotate freely.
Rotation is permitted about the N—Cα and the Cα—C bonds.
The backbone of a polypeptide chain can thus be pictured as a series of rigid planes, with consecutive planes sharing
a common point of rotation at Cα.
Peptide conformation is defined by three dihedral angles (also known as torsion angles) called ϕ (phi), ψ (psi), and ω (omega), reflecting rotation about each of the three repeating bonds in the peptide backbone.
A dihedral angle is the angle at the intersection of two planes.
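As a rough illustration of this definition, the sketch below computes a signed dihedral angle from four points: the first three points define one plane and the last three define the other. For ϕ the four backbone atoms would be C(i−1), N, Cα, C; for ψ they would be N, Cα, C, N(i+1); for ω they would be Cα, C, N, Cα(i+1). The function name `dihedral_deg` and the test coordinates are illustrative assumptions, not part of these notes.

```python
import math

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond.

    0 means p0 and p3 are eclipsed (cis); 180 means trans. The angle is
    positive when, looking from p1 toward p2, the far bond is rotated
    clockwise relative to the near bond.
    """
    def sub(a, b):
        return tuple(x - y for x, y in zip(a, b))

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    b0 = sub(p1, p0)          # bond p0 -> p1
    b1 = sub(p2, p1)          # central bond p1 -> p2 (the rotation axis)
    b2 = sub(p3, p2)          # bond p2 -> p3
    n1 = cross(b0, b1)        # normal to the plane through p0, p1, p2
    n2 = cross(b1, b2)        # normal to the plane through p1, p2, p3
    x = dot(n1, n2)
    y = math.sqrt(dot(b1, b1)) * dot(b0, n2)
    return math.degrees(math.atan2(y, x))

# Made-up coordinates with the central bond along the z axis.
print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))   # cis arrangement
print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))  # trans arrangement
```

In a planar trans peptide group, applying the same function to Cα, C, N, Cα would give ω close to 180°.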
• Overview of Protein Structure
A typical protein usually has one or more stable three-dimensional structures, or conformations, that reflect its function.
Protein structure is stabilized largely by multiple weak interactions.
Nonpeptide covalent bonds, particularly disulfide bonds, play a role in the stabilization of structure in some proteins.
The nature of the covalent bonds in the polypeptide backbone places constraints on structure.
The peptide bond has a partial double-bond character that keeps the entire six-atom peptide group in a rigid planar
configuration.
The N—Cα and Cα—C bonds can rotate to define the dihedral angles ϕ and ψ, respectively, although permitted values of ϕ
and ψ are limited by steric and other constraints.
The Ramachandran plot is a visual description of the combinations of ϕ and ψ dihedral angles that are permitted in a peptide
backbone and those that are not permitted due to steric constraints.
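To make the idea of permitted (ϕ, ψ) combinations concrete, here is a minimal sketch that sorts an angle pair into one of three coarse regions. The rectangular boundaries are hypothetical simplifications chosen for illustration only; real allowed regions have irregular outlines and depend on the residue type. The function name `ramachandran_region` is likewise an assumption.

```python
def ramachandran_region(phi, psi):
    """Classify a (phi, psi) pair in degrees using coarse, hypothetical boxes."""
    # Right-handed alpha-helical region (ideal helix is near phi=-57, psi=-47).
    if -160 <= phi <= -20 and -120 <= psi <= 50:
        return "alpha"
    # Beta region (antiparallel sheet is near phi=-139, psi=+135); the box
    # wraps around psi = +/-180.
    if -180 <= phi <= -45 and (psi >= 90 or psi <= -150):
        return "beta"
    # Left-handed helical region (near phi=+57, psi=+47).
    if 20 <= phi <= 120 and -20 <= psi <= 100:
        return "L"
    return "other"  # outside these coarse boxes

for angles in [(-57, -47), (-139, 135), (57, 47), (60, -150)]:
    print(angles, ramachandran_region(*angles))
```

A pair that falls in "other" here is only outside these illustrative boxes; whether it is truly disallowed would have to be read from a real Ramachandran plot.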
Protein Secondary Structure:
Secondary structure refers to any chosen segment of a polypeptide chain and describes the local spatial arrangement of its main-chain atoms, without regard to the positioning of its side chains or its relationship to other segments.
A regular secondary structure occurs when each dihedral angle, ϕ and ψ, remains the same or
nearly the same throughout the segment.
The most prominent are the α-helix and β conformations.
α Helix: a secondary structure stabilized by H bonds between the polar C=O and N—H groups of the peptide bonds.
Pauling and Corey called it the α helix for its helical structure.
• Maximizes the use of internal hydrogen bonding.
• The repeating unit is a single turn of the helix, which extends about 5.4 Å along the long axis.
• The backbone atoms of the amino acid residues have a characteristic set of dihedral angles.
• Each helical turn includes 3.6 amino acid residues.
• The subsequent elucidation of the three dimensional structure of myoglobin and other
proteins showed that the right handed α helix is the common form.
Extended left-handed α helices are theoretically less stable and have not been observed in
proteins.
The α helix proved to be the predominant structure in α-keratins.
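The two helix parameters quoted above (5.4 Å per turn, 3.6 residues per turn) fix the rest of the ideal helix geometry. The short sketch below derives the rise and rotation per residue and the angular separation of side chains viewed down the helix axis. The function names are illustrative assumptions, and the numbers describe an ideal α helix only.

```python
PITCH_ANGSTROM = 5.4       # length of one turn along the helix axis
RESIDUES_PER_TURN = 3.6

rise_per_residue = PITCH_ANGSTROM / RESIDUES_PER_TURN   # about 1.5 Å
turn_per_residue = 360.0 / RESIDUES_PER_TURN            # about 100 degrees

def helix_length(n_residues):
    """Approximate length in Å of an ideal alpha helix of n residues."""
    return n_residues * rise_per_residue

def angular_offset(k):
    """Angle (0-180 degrees) between side chains k residues apart,
    viewed down the helix axis."""
    a = (k * turn_per_residue) % 360.0
    return min(a, 360.0 - a)

for k in (1, 2, 3, 4):
    print(k, round(angular_offset(k), 1))
```

Residues three or four positions apart come out only about 60° and 40° apart around the axis, so they sit on roughly the same face of the helix, while residues two apart sit on nearly opposite faces.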
The interior of proteins is hydrophobic
Kendrew: the interior of a protein consists almost exclusively of hydrophobic side chains.
The main driving force for folding water-soluble globular protein molecules is to pack hydrophobic side chains into the interior of the molecule, creating a hydrophobic core and a hydrophilic surface.
The hydrophobic core is densely packed with side chains.
Any holes in the interior are occupied by one or more water molecules that are H-bonded to internal polar groups.
Bound internal water molecules are integral parts of the protein structure.
Problem in hydrophobic core formation:
To bring the side chains into the core, the main chain must also fold into the interior.
The main chain is highly polar and hydrophilic, with one H bond donor (NH) and one H bond acceptor (C=O) for each peptide unit.
In a hydrophobic environment, the main-chain polar groups must be neutralized by the formation of H bonds.
The formation of regular secondary structures within the interior of the protein molecule solves this problem.
Why does the α-helix form more readily than many other possible conformations?
It makes optimal use of internal hydrogen bonds.
The structure is stabilized by a hydrogen bond between the hydrogen atom attached to the
electronegative nitrogen atom of a peptide linkage and the electronegative carbonyl oxygen
atom of the fourth amino acid on the amino-terminal side of that peptide bond.
At the ends of an α-helical segment, there are always three or four amide carbonyl or amino
groups that cannot participate in this helical pattern of hydrogen bonding.
Interactions between amino acid side chains can stabilize or destabilize the α-helical structure.
For example, if a polypeptide chain has a long block of Glu residues, the negatively charged
carboxyl groups of adjacent Glu residues repel each other so strongly that they prevent
formation of the α helix.
For the same reason, if there are many adjacent Lys and/or Arg residues, with positively
charged R groups at pH 7.0, they also repel each other and prevent formation of the α helix.
The bulk and shape of Asn, Ser, Thr, and Cys residues can also destabilize an α helix if they are
close together in the chain. Positively charged amino acids are often found three residues away
from negatively charged amino acids, permitting the formation of an ion pair.
Two aromatic amino acid residues are often similarly spaced, resulting in a juxtaposition stabilized by the hydrophobic effect.
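The spacing rule for ion pairs can be sketched as a simple scan over a one-letter sequence, looking for oppositely charged residues three or four positions apart. This is only a hedged illustration: the function name `candidate_ion_pairs` and the example sequence are hypothetical, Asp is included alongside Glu as a negatively charged residue, His is treated as uncharged at pH 7 as a simplifying assumption, and a matching spacing does not prove that an ion pair actually forms.

```python
POSITIVE = set("KR")   # Lys, Arg: positively charged at pH 7
NEGATIVE = set("DE")   # Asp, Glu: negatively charged at pH 7

def candidate_ion_pairs(seq, spacings=(3, 4)):
    """Return (i, j, residues) for oppositely charged residues at i and i+3 or i+4."""
    pairs = []
    for i, a in enumerate(seq):
        for k in spacings:
            j = i + k
            if j >= len(seq):
                continue
            b = seq[j]
            opposite = (a in POSITIVE and b in NEGATIVE) or \
                       (a in NEGATIVE and b in POSITIVE)
            if opposite:
                pairs.append((i, j, a + b))
    return pairs

# Hypothetical alanine-rich peptide with Glu and Lys three residues apart.
print(candidate_ion_pairs("AEAAKAAEAAKA"))
# -> [(1, 4, 'EK'), (4, 7, 'KE'), (7, 10, 'EK')]
```

The same loop could be pointed at two aromatic residues (F, W, Y) to look for the similarly spaced aromatic pairs mentioned above.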
A constraint on the formation of the α helix is the presence of Pro or Gly residues, which have the least proclivity to form α helices.
In proline, the nitrogen atom is part of a rigid ring, and rotation about the N—Cα bond is not possible.
Thus, a Pro residue introduces a destabilizing kink in an α helix.
In addition, the nitrogen atom of a Pro residue in a peptide linkage has no substituent hydrogen to participate in hydrogen bonds with other residues.
For these reasons, proline is only rarely found in an α helix.
Glycine occurs infrequently in α helices for a different reason:
It has more conformational flexibility than the other amino acid residues.
Polymers of glycine tend to take up coiled structures quite different from an α helix.
The identity of the amino acid residues near the ends of an α-helical segment also affects its stability.
A small electric dipole exists in each peptide bond.
These dipoles are aligned through the hydrogen bonds of the helix, giving a net dipole along the helical axis.
The partial positive and negative charges of the helix dipole reside on the peptide amino and carbonyl groups near the amino-terminal and carboxyl-terminal ends, respectively.
In summary, five types of constraints affect the stability of an α helix:
(1) the intrinsic propensity of an amino acid residue to form an α helix;
(2) the interactions between R groups, particularly those spaced three (or four) residues apart;
(3) the bulkiness of adjacent R groups;
(4) the occurrence of Pro and Gly residues;
(5) interactions between amino acid residues at the ends of the helical segment and the electric dipole inherent to the α helix.
Greek Key motifs
• Four adjacent antiparallel β strands are arranged in a pattern similar to the repeating unit of an ornamental pattern used in ancient Greece.
• Ex: Staphylococcus nuclease, an enzyme that degrades DNA.
• The motif is not associated with any specific function.
• But it occurs frequently in protein structures.
• Its frequent occurrence may be explained by the initial formation of one long antiparallel structure with loops in the middle of both β strands.
• Structural changes in the loops between β strands 1 and 2 and between β strands 3 and 4 then fold the top of the structure over.
• H bonds then form between the strands, and the Greek key motif is formed.
Loop regions are at the surface of protein molecules
• Most protein structures are combinations of α helices and β strands, connected by loop regions.
• Loop regions frequently participate in forming binding sites and enzyme active sites.
• Loop regions are at the surface of molecules and have various lengths and irregular shapes.
• The main-chain C=O and NH groups of loops generally do not form H bonds with each other.
• They are exposed to solvent and form H bonds with water molecules.
• Loop regions exposed to solvent are rich in polar, charged, hydrophilic amino acids. What are the possible amino acids here?
• Insertions and deletions of a few amino acid residues occur almost exclusively in the loop regions. Why?
• During evolution, cores are much more stable than loops.
• Intron positions are often found at sites in structural genes that correspond to loops in the protein structure.
• Loop regions that connect adjacent antiparallel β strands are called hairpin loops.
• Short hairpin loops are usually called reverse turns, or simply turns.
Ramachandran Plot
Protein Folding
A protein is a polypeptide chain folded into one or more domains made up of α helices, β sheets and loops.
The process by which a polypeptide chain acquires its correct 3D structure, and so achieves the biologically active native state, is called protein folding.
Some proteins fold spontaneously; some require the assistance of enzymes.
Ex: enzymes that catalyse the formation and exchange of disulphide bonds.
Many proteins require the assistance of a class of proteins called chaperones.
A chaperone binds to a partly folded polypeptide chain and prevents it from making illicit associations with other folded or partly folded proteins.
It also promotes folding of the polypeptide chain it holds.
A state in which secondary structure has formed but the tertiary structure is looser than in the native state is called the molten globule state.
The transition from the molten globule state to the final native state occurs spontaneously.
Protein folding generates a 3D structure from a linear, one-dimensional structure.
How to predict the 3D structure of a protein from its amino acid sequence is a major unsolved problem in structural molecular biology.
If we had a general solution, it would be possible to write a computer program to simulate folding and generate the 3D structure from the amino acid sequence.
However, a solution to the folding problem is still not in sight, even though the database of known protein structures is doubling every 2 years.
A protein in native state is not static.
The functional activities of many proteins depend upon large conformational changes triggered by ligand binding.
Protein Denaturation (I)
• The conformation of a native protein is only marginally stable.
• In the denatured state, the conformation of the protein need not be
completely randomized.
• A number of physical and chemical agents can cause protein
denaturation.
• A classic agent is heat, which has complex effects on many weak
interactions in a protein (primarily on the hydrogen bonds).
• On heating, a protein’s conformation generally remains intact
until an abrupt loss of structure occurs over a relatively narrow
temperature range (the Tm).
• The abruptness of the loss of structure suggests a cooperative
process in which loss of structure in one or more parts of the
protein rapidly destabilizes the structure of other parts.
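The abrupt, cooperative transition described in this list can be sketched with the standard two-state (native ⇌ unfolded) model. With ΔH of unfolding assumed independent of temperature (that is, ΔCp = 0, a simplification), ΔG(T) = ΔH(1 − T/Tm), the equilibrium constant is K = exp(−ΔG/RT), and the fraction unfolded is K/(1 + K). The Tm of 333 K and ΔH of 400 kJ/mol below are hypothetical values chosen only to show the shape of the curve.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ/(mol*K)

def fraction_unfolded(temp_k, tm_k, dh_kj):
    """Two-state model with temperature-independent enthalpy of unfolding.

    temp_k: temperature in K; tm_k: melting temperature in K;
    dh_kj: enthalpy of unfolding in kJ/mol (hypothetical input).
    """
    dg = dh_kj * (1.0 - temp_k / tm_k)     # free energy of unfolding, zero at Tm
    k = math.exp(-dg / (R_KJ * temp_k))    # equilibrium constant [U]/[N]
    return k / (1.0 + k)

# Hypothetical protein: Tm = 333 K (60 °C), unfolding enthalpy 400 kJ/mol.
for t in (313, 323, 333, 343, 353):
    print(t, round(fraction_unfolded(t, 333.0, 400.0), 3))
```

With these assumed numbers the protein goes from almost fully folded to almost fully unfolded within about 20 K, and a larger ΔH would make the transition narrower still.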
Proteins also can be denatured by extremes in pH, by miscible organic solvents
such as alcohol and acetone, by certain solutes such as urea and guanidine
hydrochloride, or by detergents.
None of these agents breaks covalent bonds.
Organic solvents, urea and detergents act primarily by disrupting the hydrophobic
interactions that produce the stable core of globular proteins.
Urea and guanidine hydrochloride also disrupt hydrogen bonds.
Extremes of pH alter the net charge on the protein causing electrostatic repulsion
and the disruption of some hydrogen bonding.
The denatured structures resulting from these various treatments are not
necessarily the same.
Lastly, denaturation often results in aggregation and precipitation of the unfolded
protein.
The protein precipitate that is seen after boiling an egg white is a well known
example.
• Globular proteins are only marginally stable
• Proteins are unstable. Slight changes in pH or temperature can convert biologically active protein molecules in their native state to an inactive denatured state.
• The energy difference between these two states is 5-15 kcal/mol.
• The major contributors to this energy difference are enthalpy and entropy.
• Enthalpy derives from the energy of noncovalent interactions within the polypeptide chain. What are the noncovalent interactions?
• The covalent bonds within and between the amino acid residues in the polypeptide chain are the same in the native and denatured states. What are the covalent bonds?
• Noncovalent interactions differ significantly between the two states. How?
• Noncovalent interactions are stronger and more frequent in the native state, hence the enthalpy difference is large: several hundred kcal/mol.
• Entropy: by the second law of thermodynamics, energy is required to create order.
• Proteins in the native state are highly ordered; in the denatured state they are highly disordered.
• An experimental preparation of unfolded protein (a solution in 6 M guanidinium chloride or 8 M urea) contains 10^15 to 10^20 molecules, each with a unique conformation.
• In the absence of compensating factors, it is entropically more favourable for the protein to be in the disordered denatured state.
• The entropic energy difference between the native and denatured states also reaches several hundred kcal/mol, but it acts in opposition to the enthalpy difference.
• The net energy difference of 5-15 kcal/mol is called the free energy difference.
• Predicting stability requires precise information about the energy contributions to the stability of the native state from close packing of hydrophobic side chains in the interior of the protein, disulphide bridges, salt bridges, the dipole moments of α helices and interior H bonds.
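The bookkeeping in this list is ΔG = ΔH − TΔS: two large terms of opposite effect leave a small net free energy. The sketch below uses made-up enthalpy and entropy terms of a couple of hundred kcal/mol to show the cancellation, and converts the 5-15 kcal/mol range to kJ/mol for comparison with the 20 to 65 kJ/mol quoted earlier in these notes. The function names and the specific ΔH and TΔS values are illustrative assumptions.

```python
KJ_PER_KCAL = 4.184

def kcal_to_kj(value_kcal):
    """Convert an energy from kcal/mol to kJ/mol."""
    return value_kcal * KJ_PER_KCAL

def folding_free_energy(dh_kcal, t_ds_kcal):
    """Delta G = Delta H - T*Delta S for folding (unfolded -> native), in kcal/mol."""
    return dh_kcal - t_ds_kcal

# Hypothetical values: folding releases 210 kcal/mol of enthalpy but costs
# 200 kcal/mol in the entropy term (T*dS is negative because order increases).
dg = folding_free_energy(-210.0, -200.0)
print(dg, "kcal/mol =", round(kcal_to_kj(dg), 1), "kJ/mol")

# The 5-15 kcal/mol range expressed in kJ/mol.
print(round(kcal_to_kj(5), 1), "to", round(kcal_to_kj(15), 1), "kJ/mol")
```

Because the net value is a small difference between large numbers, modest errors in either term swamp the result, which is one way to see why precise energy contributions matter.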
Kinetic factors are important for Folding
A specific polypeptide sequence yields a single, compact, biologically active fold in the native state.
Under physiological conditions there appears to be one conformation for a given amino acid sequence that has significantly lower free energy than any other.
How is this folded state reached? One possibility is that protein molecules search through all possible conformations randomly until they are frozen at the lowest-energy conformation, the native state.
In 1968 Cyrus Levinthal showed with a simple calculation that such a random search is impossible.
Assume that each peptide unit has only 3 possible conformations, corresponding to the allowed regions α, β and L in the Ramachandran plot.
Assume also that it converts from one conformation into another in the shortest possible time, one picosecond (10^-12 seconds).
A polypeptide chain of 150 residues would then have 3^150 ≈ 4 × 10^71 possible conformations.
Searching all these conformations would take on the order of 10^52 years, compared with actual folding times of between 0.1 and 1000 seconds both in vivo and in vitro.
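Levinthal's estimate is easy to reproduce. The sketch below uses the same assumptions as the text (3 conformations per residue, 150 residues, 1 picosecond per conformational change); the variable names are illustrative. Evaluated directly, 3^150 is roughly 4 × 10^71, and an exhaustive search would take on the order of 10^52 years. Whatever the exact exponent, the conclusion is the same: the time is astronomically longer than observed folding times.

```python
import math

STATES_PER_RESIDUE = 3        # alpha, beta and L regions
RESIDUES = 150
SECONDS_PER_STEP = 1e-12      # one picosecond per conformational change
SECONDS_PER_YEAR = 365.25 * 24 * 3600

conformations = STATES_PER_RESIDUE ** RESIDUES            # exact integer
log10_conformations = RESIDUES * math.log10(STATES_PER_RESIDUE)
search_years = conformations * SECONDS_PER_STEP / SECONDS_PER_YEAR

print(f"conformations ~ 10^{log10_conformations:.1f}")
print(f"exhaustive search ~ {search_years:.1e} years")
```

Changing `RESIDUES` shows how quickly the search time grows: every additional residue multiplies it by three.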
Molten globule state:
Less compact than the native structure.
Proper packing interactions in the interior of the protein have not been formed.
Interior side chains may be mobile.
Resembles a liquid more than the solid-like interior of the native state.
Loops and other elements of the surface remain largely unfolded.
The molten globule should not be viewed as a single structural entity but as an ensemble of related structures that are rapidly interconverting.
Second step and final stage of folding:
The second step lasts up to 1 second; persistent native-like elements of tertiary structure begin to develop.
Subdomains are not yet properly docked.
The conformational ensemble is reduced compared to the molten globule.
A single native form is reached in the final stage of folding.
This involves the formation of native interactions throughout the protein, including hydrophobic packing in the interior as well as loop formation.
• Formation of correct disulphide bonds during folding poses special problems for cells.
• Proteins in the denatured, unfolded state have no disulphide bridges.
• Disulphide bonds form by oxidation of Cys residues.
• In bacteria, this occurs in the periplasmic space and is catalysed by disulphide bridge-forming enzymes, Dsb.
• In eukaryotes, disulphide bonds form in the endoplasmic reticulum with the help of protein disulphide isomerase, PDI.
• PDI catalyzes internal disulphide exchange to remove folding intermediates with incorrectly formed disulphide bridges.
Trans-peptide (C=O and N-H groups point in opposite directions): the most stable form of the peptide group.
Cis-peptide (C=O and N-H groups point in the same direction): the less stable form, rarely found in native proteins.
When the second residue is proline, the cis form is only about 4 times less stable than the trans form.
Some cis-prolines occur in many proteins.
They are found in tight bends of the polypeptide chain.
They are sometimes essential for activity and conformational flexibility.
Most proline residues are in the trans configuration, which has fewer steric collisions.
In native proteins, the less stable cis-proline peptides are stabilized by the tertiary structure.
In the unfolded state these constraints are relaxed and there is an equilibrium between cis and trans isomers at each peptide bond.
When the protein is refolded, a fraction of the molecules have one or more proline residues in the wrong isomeric form.
Cis-trans isomerization of proline peptides is a slow process.
In vivo, the rate of this process is enhanced by enzymes called peptidyl prolyl isomerases, found in both prokaryotes and eukaryotes.
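The "about 4 times less stable" figure can be turned into numbers if it is read as a 4:1 trans:cis population ratio at equilibrium in the unfolded chain (an interpretive assumption). The sketch below gives the cis fraction per proline, the corresponding free energy difference ΔG = RT ln(ratio), and the chance that every proline in an unfolded chain is trans at the same moment, assuming the prolines isomerize independently. All function names are illustrative.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ/(mol*K)

def cis_fraction(trans_to_cis=4.0):
    """Equilibrium fraction of a prolyl peptide bond in the cis form."""
    return 1.0 / (1.0 + trans_to_cis)

def delta_g_cis_kj(trans_to_cis=4.0, temp_k=298.15):
    """Free energy of cis relative to trans in kJ/mol, from RT ln(ratio)."""
    return R_KJ * temp_k * math.log(trans_to_cis)

def fraction_all_trans(n_prolines, trans_to_cis=4.0):
    """Fraction of unfolded chains with every proline trans (independent bonds)."""
    return (trans_to_cis / (1.0 + trans_to_cis)) ** n_prolines

print(cis_fraction())                       # fraction cis per proline
print(round(delta_g_cis_kj(), 2), "kJ/mol")
print(round(fraction_all_trans(5), 3))      # chain with five prolines
```

Under these assumptions only about a third of unfolded chains with five prolines would have all of them trans at once, which illustrates why slow proline isomerization can limit refolding and why isomerases help.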
• Before attaining the native state, protein molecules may expose hydrophobic patches to the solvent.
• Isolated purified proteins aggregate during folding even at relatively low concentrations.
• Inside cells, the high concentration of different proteins favours aggregation during the folding process.
• Aggregation is prevented by molecular chaperones, most of which are induced by heat shock and are hence called heat shock proteins (Hsps).
• Protein unfolding and aggregation increase at elevated temperatures.
• The Hsp70 polypeptide is divided into two functional regions:
• One binds and hydrolyses ATP.
• The other binds hydrophobic segments of unfolded polypeptide chains.
• The N-terminal ATP-binding domain has a four-domain structure.
• The C-terminal polypeptide-binding domain is an antiparallel β sandwich.
Hsp60 and Hsp10, with molecular weights of 60 kDa and 10 kDa, have been studied in E. coli.
They are called GroEL and GroES. The proteins function together as a large complex called a chaperonin, which consists of 14 subunits of GroEL and 7 subunits of GroES and requires ATP for function.
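As a quick consistency check on these sizes, subunit masses can be estimated from chain length. The sketch below uses subunit lengths of 547 residues for GroEL and 97 residues for GroES (the values quoted for the two proteins in these notes) and an average residue mass of 110 Da, which is a common rule of thumb and an assumption here, not a measured value. The function name `subunit_mass_kda` is illustrative.

```python
AVG_RESIDUE_MASS_DA = 110.0   # rule-of-thumb average; an assumption

def subunit_mass_kda(n_residues):
    """Rough subunit mass in kDa estimated from the number of residues."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0

groel_subunit = subunit_mass_kda(547)
groes_subunit = subunit_mass_kda(97)
complex_mass = 14 * groel_subunit + 7 * groes_subunit   # 14 GroEL + 7 GroES

print(round(groel_subunit, 1), "kDa per GroEL subunit")
print(round(groes_subunit, 1), "kDa per GroES subunit")
print(round(complex_mass), "kDa for the whole complex")
```

The estimates land close to the nominal 60 kDa and 10 kDa that give Hsp60 and Hsp10 their names, and put the full complex at roughly 900 kDa under this crude approximation.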
• Chaperonins bind unfolded, partly folded and incorrectly folded protein molecules but not proteins in their native state.
• They assist in folding a large number of different proteins.
• How do chaperonins distinguish between correctly and incorrectly folded polypeptide chains?
• How do they mediate the conversion of unfolded or misfolded proteins to their native form?
• GroEL is a cylindrical structure with a central channel in which newly synthesized polypeptides bind.
• The X-ray structure of GroEL was determined by Paul Sigler.
• It has 14 subunits, each comprising 547 amino acids, which form two rings.
• The 7 subunits of each ring are arranged with nearly 7-fold rotational symmetry.
• The rings form an extensive interface with one another across a flat equatorial plane.
• It resembles a thick-walled cylinder about 150 Å high with a large central cavity or channel.
• Each subunit has 3 distinct domains: equatorial, intermediate and apical.
• The equatorial domain is the largest; it comprises 243 residues and is mainly α-helical.
• It serves as the foundation of the GroEL structure, providing all contacts between the two 7-membered rings across the equatorial plane.
• It also provides contacts between subunits within each ring, and the ATP-binding site essential for function.
• It contains both the N-terminus and the C-terminus of the polypeptide chain.
• About 30 residues are not visible: they are disordered or occupy differently ordered positions.
• The apical domain (residues 191-376) is a 4-layer structure comprising 2 β sheets sandwiched between α helices.
• One β sheet has antiparallel β strands; the other is part of an α/β domain with 4 parallel β strands.
• It forms the opening of the central channel to the solvent.
• Segments of this domain are flexible and rich in hydrophobic residues.
• They are involved in binding to hydrophobic areas exposed by non-native folds of the polypeptide chain.
• Mutating these hydrophobic residues to charged ones prevents binding of both polypeptide and GroES.
• Mutating Phe to Val has no functional effect.
• The equatorial and apical domains are linked by a small intermediate domain.
• The internal cavity is wider.
• 7 holes are large enough to permit ATP and ADP to diffuse in and out.
• The intermediate domain is connected to the other two domains by short antiparallel segments.
• Structures of GroEL with different ligands have shown that substantial changes in the orientation of the domains and in the size of the central cavity occur during the functional cycle of the chaperonin.
• GroES binds to the apical domain of GroEL, closing off the central cavity.
• Once GroES is bound to one of the rings in the GroEL molecule, a conformational change occurs which decreases the affinity of the second ring for GroES.
• The GroEL-GroES complex is therefore asymmetric, with GroES bound to only one end of the GroEL cylinder.
• The GroES molecule comprises 7 subunits, each of 97 amino acids.
• Its X-ray structure was determined by Johann Deisenhofer.
• It is dome shaped, about 75 Å in diameter and 30 Å high, with a small hole in the middle.
• The core of the subunit structure is a β barrel comprising two antiparallel β sheets packed against each other.
• Two large loop regions protrude from this core; one extends above the plane of the ring, creating the loosely packed top of the dome that covers the central hole.
• The other loop region is rich in hydrophobic residues, extends below the dome and interacts with the apical domain in the GroES-GroEL complex.
• This loop is disordered in the X-ray structure of GroES, but NMR studies show that it becomes ordered when GroES binds to GroEL.
• Mutational studies have shown that hydrophobic residues in this loop are important for chaperonin function.
• The GroEL-GroES complex binds and releases newly synthesized polypeptides in an ATP-dependent cycle.
• How does the GroEL-GroES complex function as a chaperone to assist protein folding?
• Several aspects of the mechanism are not clear.
• The main features of the functional cycle are known.
• A GroEL-ATP complex forms, and one end binds one molecule of GroES with hydrolysis of ATP.
• GroEL-ADP-GroES is a stable complex.
• The GroEL ring to which GroES is bound is in the cis position.
• A large structural change occurs in this ring, forming a wide internal cavity.
• Its walls are formed from both the apical and intermediate domains.
• The GroES dome partly closes off the cavity from the solvent.
• The other ring, in the trans position, has a smaller cavity that is open to the solvent.
• Unfolded proteins can bind in both the cis and the trans positions.
• Those that are bound in the cis position undergo subsequent folding.
• Release of the bound polypeptide from the closed cavity in the cis position requires release of GroES.
• GroES release requires hydrolysis of ATP bound to the distant GroEL ring, in the trans position.
• Once GroES is released, the polypeptide chain is released and can bind to another GroEL-GroES complex to repeat the cycle.
• The native state is reached after multiple rounds of binding to and release from GroEL-GroES complexes.
• What happens to the polypeptide chain inside the closed cis cavity?
• Two models have been proposed.
• Unfolded or incorrectly folded proteins are recognized by their exposed hydrophobic areas.
• They bind to hydrophobic regions inside the GroEL cavity.
• In the first model, the function of the cavity is to unfold unproductive intermediates and eject them in an unfolded state into solution for spontaneous folding.
• This provides another chance to reach the folded state.
• Folding occurs in the solvent during jumps of the polypeptide between GroEL-GroES complexes.
• In the second model, folding occurs inside the cis cavity of the GroEL-GroES complex, either to the native state or to an intermediate state.
• Folding occurs in a closed environment.
• This prevents aggregation with other unfolded proteins during the folding process.
• GroEL-assisted folding of large proteins is different.
• Large proteins are too large to fit inside the cavity.
• Assisted folding by Hsp70 does not occur inside a closed cavity.