Biochemistry, 4th Edition

Chapter 6 Proteins: Secondary, Tertiary, and Quaternary Structure
6.4 How Do Polypeptides Fold into Three-Dimensional Protein Structures?
of these themes is more important for some proteins than for others. The process
of folding is clearly complex, but sophisticated simulations have already provided
reasonable models of folding (and unfolding) pathways for many proteins (Figure
6.33). One school of thought suggests that for any given protein there may be
multiple folding pathways. For these cases, Ken Dill has suggested that the folding
process can be pictured as a funnel of free energies, that is, an energy landscape
(Figure 6.34). The rim at the top of the funnel represents the many possible
unfolded states for a polypeptide chain, each characterized by high free energy and
significant conformational entropy. Polypeptides fall down the wall of the funnel as
contacts made between residues establish different folding possibilities. The
narrowing of the funnel reflects the smaller number of available states as the protein
approaches its final state, and bumps or pockets on the funnel walls represent
partially stable intermediates in the folding pathway. The most stable (native) folded
state of the protein lies at the bottom of the funnel.
What Is the Thermodynamic Driving Force for Folding of Globular Proteins?
The free energy change for the folding of a globular protein must be negative if the
folded state is more stable than the unfolded state. The free energy change for
folding depends, in turn, on changes in enthalpy and entropy for the process:
$$\Delta G = \Delta H - T\Delta S$$
When $\Delta H$, $-T\Delta S$, and $\Delta G$ are measured separately for the polar side chains and for
the nonpolar side chains of the protein, an important insight is apparent. The
enthalpy and entropy changes for polar residues largely cancel each other out, and the
$\Delta G$ of folding for the polar residues is approximately zero.
FIGURE 6.33 Computer simulations of folding and unfolding of proteins can reveal possible folding pathways.
Molecular dynamics simulations of the unfolding of small proteins such as chymotrypsin inhibitor 2 (CI2) and
barnase are presented here on a reversed time scale, to show how folding may occur. D = denatured, I =
intermediate, TS = transition state, N = native. Snapshots shown: CI2 at D (4 ns), D (2 ns), I (0.7 ns),
TS (0.15 ns), and N (0 ns); barnase at D (94 ns), D (70 ns), I (30 ns), TS (0.26 ns), and N (0 ns).
(Adapted from Daggett, V., and Fersht, A. R., 2003. Is there a unifying mechanism for protein folding?
Trends in Biochemical Sciences 28:18-25. Figures provided by Alan Fersht and Valerie Daggett.)

To understand the behavior of the nonpolar residues, it is helpful to distinguish
the $\Delta H$ and $-T\Delta S$ contributions for the polypeptide chain and for the water
solvent. Both $\Delta H$ and $-T\Delta S$ for the nonpolar residues of the peptide chain are
positive and thus make unfavorable contributions to the folding free energy. However,
large numbers of water molecules restricted and immobilized around nonpolar
residues in the unfolded protein are liberated in the folding process. The burying
of nonpolar residues in the folded protein’s core produces a dramatic entropy
increase for these liberated water molecules. This is just enough to make the
overall $\Delta G$ for folding negative (and thus favorable). The crucial results:
• The largest contribution to the stability of a folded protein is the entropy change
for the water molecules associated with the nonpolar residues.
• The overall free energy change for the folding process is not large, typically
$-20$ to $-40$ kJ/mol.
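To get a feel for what a folding free energy of $-20$ to $-40$ kJ/mol means, the sketch below converts $\Delta G$ into an equilibrium constant using $\Delta G = -RT \ln K$. It assumes a simple two-state equilibrium (unfolded ⇌ folded) at 25 °C, which is an illustrative simplification rather than something stated in the text, and the function names are hypothetical.

```python
import math

GAS_CONSTANT_KJ = 8.314e-3  # gas constant R in kJ/(mol*K)


def equilibrium_constant(delta_g_kj_per_mol, temperature_k=298.15):
    """K = [folded]/[unfolded] from delta_G = -RT ln K (two-state assumption)."""
    return math.exp(-delta_g_kj_per_mol / (GAS_CONSTANT_KJ * temperature_k))


def fraction_folded(delta_g_kj_per_mol, temperature_k=298.15):
    """Fraction of molecules in the folded state for a two-state equilibrium."""
    k = equilibrium_constant(delta_g_kj_per_mol, temperature_k)
    return k / (1.0 + k)


# The two ends of the range quoted in the text.
for delta_g in (-20.0, -40.0):
    k = equilibrium_constant(delta_g)
    print(f"dG = {delta_g:6.1f} kJ/mol  K = {k:.2e}  folded = {fraction_folded(delta_g):.6f}")
```

Under these assumptions, $-20$ kJ/mol corresponds to an equilibrium constant of roughly $3 \times 10^{3}$ and $-40$ kJ/mol to roughly $10^{7}$.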
Marginal Stability of the Tertiary Structure Makes Proteins Flexible
A typical folded protein is only marginally stable. The hundreds of van der Waals
interactions and hydrogen bonds in a folded structure are compensated and balanced
by a dramatic loss of entropy suffered by the polypeptide as it assumes a compact
folded structure. Because stability seems important to protein and cellular function,
it is tempting to ask what the advantage of marginal stability might be. The answer
appears to lie in flexibility and motion. All chemical bonds undergo a variety of
motions, including vibrations and (for single bonds) rotations. This propensity to
move, together with the marginal stability of protein structures, means that the
many noncovalent interactions within a protein can be interrupted, broken, and
rearranged rapidly.
FIGURE 6.34 A model for the steps involved in the folding of globular proteins. The funnel represents a free
energy surface or energy landscape for the folding process. The protein folding process is highly
cooperative. Rapid and reversible formation of local secondary structures is followed by a slower phase in which
establishment of partially folded intermediates leads to the final tertiary structure. Substantial exclusion of
water occurs very early in the folding process.
Motion in Globular Proteins
Proteins are best viewed as dynamic structures. Most globular proteins oscillate and
fluctuate continuously about their average or equilibrium structures (Figure 6.35). This
flexibility is essential for a variety of protein functions, including ligand binding,
enzyme catalysis, and enzyme regulation, as shown throughout the remainder of this
text.
The motions of proteins may be motions of individual atoms, groups of atoms,
or even whole sections of the protein. Furthermore, they may arise either from
thermal energy or from specific, triggered conformational changes in the protein.
Atomic fluctuations such as vibrations typically are random, are very fast, and
usually occur over small distances, as shown in Table 6.2. These motions arise from the
kinetic energy within the protein and are a function of temperature. In the tightly
packed interior of the typical protein, atomic movements of an angstrom or less are
typical. The closer to the surface of the protein, the more movement can occur, and
on the surface atomic movements of several angstroms are possible.
A class of slower motions, which may extend over larger distances, is collective
motions. These are movements of a group of atoms covalently linked in such a way
that the group moves as a unit. Such a group can range from a few atoms to
hundreds of atoms. These motions are of two types: (1) those that occur quickly but
infrequently, such as tyrosine ring flips, and (2) those that occur slowly, such as the
hinge-bending movement between protein domains. For example, the two
antigen-binding domains of immunoglobulins move as relatively rigid units to selectively
bind separate antigen molecules. These collective motions also arise from thermal
energies in the protein and operate on a time scale of $10^{-12}$ to $10^{-3}$ sec. It is often
important to distinguish the time scale of the motion itself from the frequency of
its occurrence. A tyrosine ring flip takes only a picosecond ($1 \times 10^{-12}$ sec), but such
flips occur only about once every millisecond ($1 \times 10^{-3}$ sec).
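The distinction between how long a motion takes and how often it happens can be made concrete with the tyrosine ring flip numbers quoted above. The short sketch below computes the fraction of time a ring spends in the act of flipping; the function and variable names are illustrative and not from the text.

```python
def duty_cycle(event_duration_s, mean_interval_s):
    """Fraction of time spent in an event, given the event's duration and the
    average time between successive events (both in seconds)."""
    if event_duration_s <= 0 or mean_interval_s <= 0:
        raise ValueError("durations must be positive")
    return event_duration_s / mean_interval_s


# Values quoted in the text for a tyrosine ring flip.
FLIP_DURATION_S = 1e-12  # one flip takes about a picosecond
FLIP_INTERVAL_S = 1e-3   # flips occur about once every millisecond

fraction_flipping = duty_cycle(FLIP_DURATION_S, FLIP_INTERVAL_S)
flips_per_second = 1.0 / FLIP_INTERVAL_S

print(f"fraction of time spent flipping: {fraction_flipping:.1e}")
print(f"flips per second: {flips_per_second:.0f}")
```

With these numbers the ring is in mid-flip for only about one part in a billion of the time, even though it flips about a thousand times per second.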
Conformational changes involve motions of groups of atoms (individual side
chains, for example) or even whole sections of proteins. These motions occur on a
time scale of $10^{-9}$ to $10^{3}$ sec, and the distances covered can be as large as 1 nm.
These motions may occur in response to specific stimuli or arise from specific
interactions within the protein (hydrogen bonding, electrostatic interactions, or
ligand binding; see Chapters 14 and 15).
The cis–trans isomerization of proline residues in proteins (Figure 6.36) occurs
over an even longer time scale, typically $10^{1}$ to $10^{4}$ sec. Conversion of even a single
proline from its cis to its trans configuration can alter a protein structure dramatically.

FIGURE 6.35 Proteins are dynamic structures. The marginal stability of a tertiary structure leads to flexibility
and motion in the protein. Determination of structures of proteins (such as the SH3 domain of the α-chain of
spectrin, shown here) by nuclear magnetic resonance produces a variety of stable tertiary structures that fit
the data. Such structural ensembles provide a glimpse into the range of structures that may be accessible to a
flexible, dynamic protein (pdb id = 1M8M).
TABLE 6.2 Motion and Fluctuations in Proteins

Type of Motion                     Spatial Displacement (Å)   Characteristic Time (sec)   Source of Energy
Atomic vibrations                  0.01–1                     $10^{-15}$–$10^{-11}$       Kinetic energy
Collective motions                 0.01–5 or more             $10^{-12}$–$10^{-3}$        Kinetic energy
  1. Fast: Tyr ring flips;
     methyl group rotations
  2. Slow: hinge bending
     between domains
Triggered conformation changes     0.5–10 or more             $10^{-9}$–$10^{3}$          Interactions with triggering agent
Proline cis–trans isomerization    3–10                       $10^{1}$–$10^{4}$           Kinetic energy or enzyme driven

Adapted from Petsko, G. A., and Ringe, D., 1984. Fluctuations in protein structure from X-ray diffraction. Annual Review
of Biophysics and Bioengineering 13:331–371.
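Because the time ranges in Table 6.2 overlap, a given observed time scale can be consistent with more than one kind of motion. The sketch below transcribes the characteristic-time column into a small lookup and lists the motion types compatible with a given time; the data structure and function name are illustrative choices, not part of the source.

```python
# Characteristic time ranges in seconds, transcribed from Table 6.2.
MOTION_TIME_RANGES = {
    "Atomic vibrations": (1e-15, 1e-11),
    "Collective motions": (1e-12, 1e-3),
    "Triggered conformation changes": (1e-9, 1e3),
    "Proline cis-trans isomerization": (1e1, 1e4),
}


def motions_for_timescale(time_s):
    """Return the motion types whose time range in Table 6.2 includes time_s."""
    return [
        name
        for name, (shortest, longest) in MOTION_TIME_RANGES.items()
        if shortest <= time_s <= longest
    ]


print(motions_for_timescale(1e-13))  # only atomic vibrations are this fast
print(motions_for_timescale(1e-6))   # collective motions and triggered changes overlap here
print(motions_for_timescale(100.0))  # triggered changes and proline isomerization overlap here
```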
Proline cis–trans isomerizations sometimes act as switches to activate a protein or open
a channel across a membrane (see Chapter 9).
The Folding Tendencies and Patterns of Globular Proteins
Globular proteins adopt the most stable tertiary structure possible. To do this, the
peptide chain must both (1) satisfy the constraints inherent in its own structure and
(2) fold so as to “bury” the hydrophobic side chains, minimizing their contact with
solvent. The polypeptide itself does not usually form simple straight chains. Even in
chain segments where helices and sheets are not formed, an extended peptide
chain, being composed of L-amino acids, has a tendency to twist slightly in a
right-handed direction. As shown in Figure 6.37, this tendency is apparently the basis for
the formation of a variety of tertiary structures having a right-handed sense.
Principal among these are the right-handed twists in β-sheets and right-handed
cross-overs in parallel β-sheets. Right-handed twisted β-sheets are found at the center of
a number of proteins (Figure 6.38) and provide an extended, highly stable
structural core.

Connections between β-strands are of two types: hairpins and cross-overs.
Hairpins, as shown in Figure 6.37, connect adjacent antiparallel β-strands.
Cross-overs are necessary to connect adjacent (or nearly adjacent) parallel β-strands.
FIGURE 6.36 The cis and trans configurations of proline residues in peptide chains are almost equally stable.
Proline cis-trans isomerizations, often occurring over relatively long time scales, can alter protein structure
significantly. [Panels: structural formulas of the trans and cis configurations.]
FIGURE 6.37 (a) The natural right-handed twist exhibited by polypeptide chains, and (b) the types of
connections between β-strands: the antiparallel hairpin and parallel cross-overs (right-handed and left-handed).
Nearly all cross-over structures are right-handed. In many cross-over structures, the
cross-over connection itself contains an α-helical segment. This creates a βαβ-loop.
As shown in Figure 6.37, the strong tendency in nature to form right-handed
cross-overs, the wide occurrence of α-helices in the cross-over connection, and the
right-handed twists of β-sheets can all be understood as arising from the tendency of an
extended polypeptide chain of L-amino acids to adopt a right-handed twist
structure. This is a chiral effect. Proteins composed of D-amino acids would tend to adopt
left-handed twist structures.
The second driving force that affects the folding of polypeptide chains is the need
to bury the hydrophobic residues of the chain, protecting them from solvent water.
From a topological viewpoint, then, all globular proteins must have an “inside”
where the hydrophobic core can be arranged and an “outside” toward which the
hydrophilic groups must be directed. The sequestration of hydrophobic residues away
from water is the dominant force in the arrangement of secondary structures and
nonrepetitive peptide segments to form a given tertiary structure. Globular proteins
can be classified mainly on the basis of the particular kind of core or backbone
structure they use to accomplish this goal. The term hydrophobic core, as used here, refers
to a region in which hydrophobic side chains cluster together, away from the solvent.
Backbone refers to the polypeptide backbone itself, excluding the particular side
chains. Globular proteins can be pictured as consisting of “layers” of backbone, with
hydrophobic core regions between them. More than half the known globular
protein structures have two layers of backbone (separated by one hydrophobic core).
Roughly one-third of the known structures are composed of three backbone layers
and two hydrophobic cores. There are also a few known four-layer structures and at
least one five-layer structure. A few structures are not easily classified in this way, but
it is remarkable that most proteins fit into one of these classes. Examples of each are
presented in Figure 6.38.
FIGURE 6.38 Examples of protein domains with different numbers of layers of backbone
structure. (a) Cytochrome c′ with two layers of α-helix. (b) Domain 2 of phosphoglycerate kinase,
composed of a β-sheet layer between two layers of helix, three layers overall. (c) An unusual five-layer structure,
domain 2 of glycogen phosphorylase, a β-sheet layer sandwiched between four layers of α-helix. (d) The
concentric “layers” of β-sheet (inside) and α-helix (outside) in triose phosphate isomerase. Hydrophobic residues are
buried between these concentric layers in the same manner as in the planar layers of the other proteins.
The hydrophobic layers are shaded yellow. (Original art courtesy of Jane Richardson.)
Most Globular Proteins Belong to One of Four Structural Classes
In addition to classification based on layer structure, proteins can be grouped
according to the type and arrangement of secondary structure (Figure 6.39). There
are four such broad groups: all α proteins and all β proteins (in which the
structures are dominated by α-helices and β-sheets, respectively), α/β proteins (in which
helices and sheets are intermingled), and α+β proteins (in which α-helical and
β-sheet domains are separated for the most part).
It is important to note that the similarities of tertiary structure within these
groups do not necessarily reflect similar or even related functions. Instead,
functional homology usually depends on structural similarities on a smaller and more
intimate scale.
Molecular Chaperones Are Proteins That Help Other Proteins to Fold
To a first approximation, all the information necessary to direct the folding of a
polypeptide is contained in its primary structure. On the other hand, the high
protein concentration inside cells may adversely affect the folding process
because hydrophobic interactions may lead to aggregation of some unfolded or
partially folded proteins. Also, it may be necessary to suppress or reverse incorrect or
premature folding. A family of proteins known as molecular chaperones is
essential for the correct folding of certain polypeptide chains in vivo; for their
assembly into oligomers; and for preventing inappropriate liaisons with other
proteins during their synthesis, folding, and transport. Many of these proteins were
first identified as heat shock proteins, which are induced in cells by elevated
temperature or other stress. The most thoroughly studied proteins are Hsp70, a
70-kD heat shock protein, and the so-called chaperonins, also known as Cpn60s or
Hsp60s, a class of 60-kD heat shock proteins. A well-characterized Hsp60
chaperonin is GroEL, an E. coli protein that has been shown to affect the folding of
several proteins. The mechanism of action of chaperones is discussed in Chapter 31.
Some Proteins Are Intrinsically Unstructured
Remarkably, it is now becoming clear that many proteins exist and function
normally in a partially unfolded state. Such proteins, termed intrinsically unstructured
proteins (IUPs) or natively unfolded proteins, do not possess uniform structural
properties but are nonetheless essential for basic cellular functions. These proteins
are characterized by an almost complete lack of folded structure and an extended
conformation with high intramolecular flexibility.
Intrinsically unstructured proteins contact their targets over a large surface area
(Figure 6.40). The structure of the p27 protein complexed with cyclin-dependent protein
kinase 2 (Cdk2) and cyclin A shows that p27 is in contact with its binding partners across its
entire length. It binds in a groove consisting of conserved residues on cyclin A. On
Cdk2, it binds to the N-terminal domain and also to the catalytic cleft. One of the
most appropriate roles for such long-range interactions is assembly of complexes
involved in the transcription of DNA into RNA, where large numbers of proteins
must be recruited in macromolecular complexes. Thus, the catenin-binding domain
(CBD), the transactivator domain of Tcf3, is bound to several functional domains of
β-catenin (Figure 6.40).
FIGURE 6.39 Four major classes of protein structure (as defined in the SCOP database). (a) All α proteins,
where α-helices dominate the structure; (b) All β proteins, in which β-sheets are the primary feature; (c) α/β
proteins, where α-helices and β-sheets are mixed within a domain; (d) α+β proteins, in which α-helical and
β-sheet domains are separated to at least some extent.
Structures shown (pdb ids in parentheses): leucine-rich repeat variant (1LRV); peridinin-chlorophyll protein
(a “solenoid”, 1PPR); endoglucanase A (an α-helical barrel, 1CEM); cat allergen (1PUO); human growth hormone
(1HGU); Rieske iron protein (a 3-layer β-sandwich, 1RIE); hemopexin C-terminal domain (a 4-bladed propeller,
1HXN); pleckstrin domain of protein kinase B/AKT (1UNQ); lectin from R. solanacearum (a 6-bladed propeller,
1BT9); mannose-specific agglutinin (a prism, 1JPC); hevamine (a “TIM barrel”, 2HVM); hepatocyte growth factor
N-terminal domain (2HGF); human bactericidal permeability-increasing protein (1BP1); prokaryotic ribosomal
protein L9 (1DIV); MurA (an α–β prism, 1EYN); porcine ribonuclease inhibitor (a “horseshoe”, 2BNH); RuvA
protein (1CUK); ribonuclease H (1RNH); L-arginine:glycine amidinotransferase (a metabolic enzyme, 4JDW);
thymidylate synthase (3TMS); equine leucocyte elastase inhibitor (1HLE).

Can amino acid sequence information predict the existence of intrinsically
unstructured regions on proteins? Intrinsically unstructured proteins are
characterized by a unique combination of high net charge and low overall hydrophobicity.
Compared with ordered proteins, IUPs have higher levels of E, K, R, G, Q, S, and P,
and low amounts of I, L, V, W, F, Y, C, and N. These features provide a rationale for
prediction of regions of disorder from amino acid sequence information, and
experimental evidence shows that such predictions are better than 80% accurate.
Genomic analysis of disordered proteins indicates that the proportion of the
genome encoding IUPs and proteins with substantial regions of disorder tends to
increase with the complexity of organisms. Thus, predictive analysis of whole
genomes indicates that 2% of archaeal, 4.2% of bacterial, and 33% of eukaryotic
proteins probably contain long regions of disorder.
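The composition rule quoted above (more E, K, R, G, Q, S, and P; less I, L, V, W, F, Y, C, and N) can be turned into a crude sequence score. The sketch below is only an illustration of that idea: the scoring formula, the sliding-window scan, and the default window length are assumptions made here, not a published predictor, and the better-than-80% accuracy mentioned in the text does not apply to it.

```python
# Residue classes quoted in the text (one-letter amino acid codes).
DISORDER_ENRICHED = frozenset("EKRGQSP")
ORDER_ENRICHED = frozenset("ILVWFYCN")


def composition_bias(sequence):
    """(disorder-enriched count - order-enriched count) / length, in [-1, 1].

    Positive values mean the composition resembles the IUP pattern in the text.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    enriched = sum(residue in DISORDER_ENRICHED for residue in seq)
    depleted = sum(residue in ORDER_ENRICHED for residue in seq)
    return (enriched - depleted) / len(seq)


def windowed_bias(sequence, window=30):
    """Composition bias for every full window along the sequence.

    The default window length is an arbitrary illustrative choice.
    """
    if window <= 0 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    return [
        composition_bias(sequence[start:start + window])
        for start in range(len(sequence) - window + 1)
    ]


# Toy sequence (hypothetical): a disorder-like half followed by an order-like half.
print(windowed_bias("EEEEIIII", window=4))
```

Scanning with a window rather than scoring the whole chain makes it possible to flag a disordered stretch inside an otherwise ordered sequence.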
Some proteins are disordered throughout their length, whereas others may contain
stretches of 30 to 40 residues or more that are disordered and embedded in
an otherwise folded protein. The prevalence of disordered segments in proteins
may reflect two different cellular needs. (1) Disordered proteins are more
malleable and thus can adapt their structures to bind to multiple ligands, including
other proteins. Each such interaction could provide a different function in the
cell. (2) Compared with compact, folded proteins, disordered segments in
proteins appear to be able to form larger intermolecular interfaces to which ligands,
such as other proteins, could bind (Figure 6.40). Folded proteins might have to be
two to three times larger to produce the binding surface possible with a disordered
protein. Larger proteins would increase cellular crowding or could increase cell
size by 15% to 30%. The flexibility of disordered proteins may thus reduce protein,
genome, and cell sizes.

FIGURE 6.40 Intrinsically unstructured proteins (IUPs) contact their target proteins over a large surface area.
(a) p27Kip1 (yellow) complexed with cyclin-dependent kinase 2 (Cdk2, blue) and cyclin A (CycA, green). (b) The
transactivator domain CBD of Tcf3 (yellow) bound to β-catenin (blue). Note: Part of the β-catenin has been
removed for a clear view of the CBD. (c) Bob 1 transcriptional coactivator (yellow) in contact with its four
partners: TAFII105 (green oval), the Oct 1 domains POU SD and POU HD (green), and the Igκ promoter (blue).
(From Tompa, P., 2002. Intrinsically unstructured proteins. Trends in Biochemical Sciences 27:527–533.)
(d–g) Some intrinsically unstructured proteins (in red and yellow) bind to their targets by wrapping around
them. Shown here are (d) SNAP-25 bound to BoNT/A, (e) SARA SBD bound to Smad 2 MH2, (f) HIF-1α interaction
domain bound to the TAZ1 domain of CBP, and (g) HIF-1α interaction domain bound to asparagine hydroxylase FIH.
(From Trends in Biochemical Sciences, Vol. 27, No. 10, page 530. October 2002.)

HUMAN BIOCHEMISTRY
α1-Antitrypsin: A Tale of Molecular Mousetraps and a Folding Disease
In the human lung, oxygen and CO$_2$ are exchanged across the walls
of alveoli, air sacs surrounded by capillaries that connect the
pulmonary veins and arteries. The walls of alveoli consist of the elastic
protein elastin. Inhalation expands the alveoli, and exhalation
compresses them. A pair of human lungs contains 300 million alveoli,
and the total area of the alveolar walls in contact with capillaries is
about 70 m$^2$, an area about the size of a tennis court! In the lungs,
neutrophils (a type of white blood cell) naturally secrete elastase, a
protein-cleaving enzyme essential to tissue repair. However, elastase
also can attack and break down the elastin of the alveolar walls
if it spreads from the site of inflammation repair. To prevent this,
the liver secretes into the blood α1-antitrypsin, a 52-kD protein
belonging to the serpin (serine protease inhibitor) family, which
blocks elastase action, preventing alveolar damage.
α1-Antitrypsin is a molecular mousetrap, with a flexible peptide
loop (blue in the figure) that contains a Met residue as “bait” for the
elastase and that can swing like the arm of a mousetrap. When
elastase binds to the loop at the Met residue, it cuts the peptide loop.
Now free to move, the loop slides into the middle of a large β-sheet
(green), at the same time dragging elastase to the opposite side
of the α1-antitrypsin structure. At this new binding site, the elastase
structure is distorted, and it cannot complete its reaction and free
itself from the α1-antitrypsin. Cellular scavenger enzymes then attack
the elastase–antitrypsin complex and destroy it. By sacrificing itself in
this way, the α1-antitrypsin has prevented damage to the alveolar
elastin.
Defects in α1-antitrypsin can cause serious lung and liver damage.
The gene for α1-antitrypsin is polymorphic (that is, it occurs as many
different sequence variants) and many variants of α1-antitrypsin are
either poorly secreted by the liver or function poorly in the lungs.
Even worse, tobacco smoke oxidizes the critical Met residue in the
flexible loop of α1-antitrypsin, and smokers, especially those who
carry mutants of this protein, often develop emphysema, the
destruction of the elastin connective tissue in the lungs.
The flexible loop of α1-antitrypsin, its mousetrap spring, is
also its Achilles’ heel. Mutations in this loop make the protein
vulnerable to aberrant conformational changes. The Z-mutation of
α1-antitrypsin is an interesting case, with a Lys in place of Glu at
residue 342 (indicated by the arrow in M) at the base of the
flexible loop. This causes partial loop insertion in the large β-sheet
(M*). This induces the modified β-sheet to accept the flexible
loop of another α1-antitrypsin, forming a dimer. Repetition of
these events forms polymers, which are trapped in the liver (often
leading to cirrhosis and death). Z variants that manage to make it
to the lungs associate so slowly with elastase that they are
ineffective in preventing lung damage.

[Figure panel labels: (a) elastase, α1-AT, Met-containing loop; (b) Z, M, M*, D, P.]
(a) Elastase (dark gray) is inactivated by binding to α1-antitrypsin.
When elastase binds, cleaving the flexible loop at a Met residue,
the rest of the loop (the red β-strand) rotates more than 180° and
inserts into the green β-sheet, swinging the elastase to the other
end of the molecule. At this new location, the elastase is distorted
and inactivated. (b) In the Z-mutant of α1-antitrypsin, the flexible
loop is only partially inserted in the large β-sheet, promoting
polymer formation and trapping α1-antitrypsin at its site of synthesis in
the liver. The consequences of this are cirrhosis of the liver, as well
as lung damage, since the small amount of α1-antitrypsin that
reaches the lungs is ineffective in preventing lung damage.
Individual monomers in the α1-antitrypsin polymer are colored red, blue,
and gold (far right). (From Lomas, D. A., et al., 2005. Molecular mousetraps
and serpinopathies. Biochem Soc. Transactions 33:321-330. Figure provided by
David Lomas.)

HUMAN BIOCHEMISTRY
Diseases of Protein Folding
A number of human diseases are linked to abnormalities of
protein folding. Protein misfolding may cause disease by a variety of
mechanisms. For example, misfolding may result in loss of
function and the onset of disease. The following table summarizes
several other mechanisms and provides an example of each.
Disease: Alzheimer’s disease
Affected Protein: β-Amyloid peptide (derived from amyloid precursor protein)
Mechanism: Misfolded β-amyloid peptide accumulates in human neural tissue, forming deposits
known as neuritic plaques.

Disease: Familial amyloidotic polyneuropathy
Affected Protein: Transthyretin
Mechanism: Aggregation of unfolded proteins. Nerves and other organs are damaged by deposits
of insoluble protein products.

Disease: Cancer
Affected Protein: p53
Mechanism: p53 prevents cells with damaged DNA from dividing. One class of p53 mutations leads
to misfolding; the misfolded protein is unstable and is destroyed.

Disease: Creutzfeldt-Jakob disease (human equivalent of mad cow disease)
Affected Protein: Prion
Mechanism: Prion protein with an altered conformation (PrP$^{SC}$) may seed conformational
transitions in normal PrP (PrP$^{C}$) molecules.

Disease: Hereditary emphysema
Affected Protein: α1-Antitrypsin
Mechanism: Mutated forms of this protein fold slowly, allowing its target, elastase, to destroy
lung tissue.

Disease: Cystic fibrosis
Affected Protein: CFTR (cystic fibrosis transmembrane conductance regulator)
Mechanism: Folding intermediates of mutant CFTR forms don’t dissociate freely from chaperones,
preventing the CFTR from reaching its destination in the membrane.
HUMAN BIOCHEMISTRY
Structural Genomics
The prodigious advances in genome sequencing in recent years,
together with advances in techniques for protein structure
determination, have not only provided much new information for
biochemists but have also spawned a new field of investigation:
structural genomics, the large-scale analysis of protein structures
and functions based on gene sequences. The scale of this new
endeavor is daunting: hundreds of thousands of gene sequences are
rapidly being determined, and current estimates suggest that
there are probably fewer than 10,000 distinct and stable
polypeptide folding patterns in nature. The feasibility of large-scale,
high-throughput structure determination programs is being explored
in a variety of pilot studies in Europe, Asia, and North America.
These efforts seek to add 20,000 or more new protein structures
to our collected knowledge in the near future; from this wealth of
new information, it should be possible to predict and determine
new structures from sequence information alone. This effort will
be vastly more complex and more expensive than the Human
Genome Project. It presently costs about $100,000 to determine
the structure of the typical globular protein, and one of the goals
of structural genomics is to reduce this number to $20,000 or less.
Advances in techniques for protein crystallization, X-ray
diffraction, and NMR spectroscopy, the three techniques essential to
protein structure determination, will be needed to reach this goal
in the near future.
The payoffs anticipated from structural genomics are substantial.
Access to large amounts of new three-dimensional structural
information should accelerate the development of new families of drugs.
The ability to scan databases of chemical entities for activities
against drug targets will be enhanced if large numbers of new
protein structures are available, especially if complexes of drugs and
target proteins can be obtained or predicted. The impact of structural
genomics will also extend, however, to functional genomics (the
study of the functional relationships of genomic content), which
will enable the comparison of the composite functions of whole
genomes, leading eventually to a complete biochemical and
mechanistic understanding of all organisms, including humans.
