REVIEW ARTICLE
The evolution of monomeric and oligomeric βγ-type crystallins
Facts and hypotheses
Giuseppe D’Alessio
Dipartimento di Chimica Biologica, Università di Napoli Federico II, Naples, Italy
The case of homologous monomeric γ-type and oligomeric
β-type crystallins has been described and analyzed in
evolutionary terms. Data and hypotheses from molecular
genetics and structural investigations converge and suggest a
novel three-phase model for the evolutionary history of
crystallin-type proteins. In the divergent cascades of
monomeric and oligomeric crystallins, a pivotal role was played by
alterations in the gene segments encoding the C-terminal
extensions and the intermotif or interdomain linker peptides.
These were genomic hot spots where evolution experimented
to produce the modern variety of βγ-crystallin-type
quaternary structures.
Keywords: crystallins; evolution; quaternary structure;
introns late; introns early.
The question of how oligomeric proteins evolved has gained
renewed interest in the last few years [1–9]. Although the
possibility cannot be excluded that some proteins emerged
first as functional aggregates and later dissociated into
functional monomers, the available evidence suggests that
divergent evolution more often used the association of
protein protomers into oligomers to vary and enrich the cell
repertoire of structures and functions. Evidence for this
evolutionary path can be seen in the ‘hydrophilic effect’
recorded at intersubunit interfaces [3], i.e. a surprising,
significant presence of polar and charged residues at
oligomeric interfaces. This can be readily interpreted as
the result of the association of previously exposed,
hydrophilic surfaces (from a monomer) into solvent-excluded
interfaces (in an oligomer).
It has been argued that the alteration of a protein surface
to render it adhesive for the generation of oligomers would
be too long an evolutionary process, as it would require
multiple mutations in the gene encoding the ancestral
monomer [9]. It was therefore proposed that evolution used
pre-existing interdomain interfaces that, after a ‘swap’ of
domains between monomers, would be readily reconstituted
as intersubunit interfaces. This would induce the association
of monomers into oligomers without the need for a lengthy
process of substituting one residue after another to build an
adhesive interface. However, it has been noted that a swap
of domains between monomeric ancestors is not an
evolutionary event per se, but rather the outcome of one
or more mutational events in the monomeric ancestor: these
events could then prime a swap of domains [3]. Monomeric
proteins have been transformed artificially into dimers by
inducing the displacement of terminal helices, which led to
the exchange of helix segments between two monomers [10–12].
However, to make the swap permanent and the dimers
stable, mutations had to be engineered into the cDNAs
encoding the proteins [11,12].
These were, naturally, experiments of in vitro evolution, in
which a single genetic alteration was sufficient to induce
oligomerization. When we compare the amino-acid sequences of
a present-day pair of homologous proteins, one monomeric and
the other oligomeric, what we see are merely amino-acid
substitutions. Some of these may not be related at all to the
monomer-to-oligomer transition, and it is difficult and risky
to discern the changes
presumed to be significant for the transition. However, if we
could have observed the entire process of evolution of a
monomeric protein into a dimer, we would have assigned to
each gene alteration responsible for the evolutionary
transition a different status in the evolutionary mechanism.
A ‘primary’ mutation would top the hierarchy, as the single
event responsible for the step of no return towards the new,
oligomeric structural organization. Although such a
primary event would have been essential, it may not have
been sufficient to engender oligomerization. On the other
hand, it may not be easy, or even possible, to decipher in the
structure of a present-day oligomer what was the primary
mutation originally responsible for oligomerization.
Besides investigations of mutational events as revealed by
amino-acid substitutions in homologous proteins, another
tool might be useful to shed light on putative ancestors of
present-day protein oligomers. It has been surmised [3,8]
that the analysis of the refolding mechanism by which
denatured, unfolded polypeptide chains fold back into
oligomers may shed light on the evolutionary history of the
oligomers, as this might be recapitulated in the pathway of
oligomer refolding.
The monomeric γ-crystallins and the evolutionarily
related dimeric β-crystallins provide an interesting case
study in the discussion of the evolutionary transition from
monomeric to oligomeric proteins. They are one of the
present-day sets of monomeric and dimeric homologous
proteins on which the ‘3D domain swapping model’ has
been based [9], and special attention has been given to their
evolutionary history [13–15].

Correspondence to G. D’Alessio, Dipartimento di Chimica Biologica,
Via Mezzocannone, 16, 80134 Napoli, Italy.
Fax: + 39 081 5521217, Tel.: + 39 081 2534731,
E-mail:
Abbreviations: EDSP, epidermis differentiation-specific protein; TKR,
tyrosine kinase receptor.
(Received 17 December 2001, revised 8 April 2002,
accepted 17 May 2002)
Eur. J. Biochem. 269, 3122–3130 (2002) © FEBS 2002 doi:10.1046/j.1432-1033.2002.03004.x

Crystallins were so named when they were recognized as
the proteins that provide the crystalline lens of the vertebrate
eye with its indispensable transparency and unique refract-
ive properties (reviewed in [16]). They are long-lived
proteins, as lens cells live as long as their host organisms.
They also have lived a very long evolutionary history, as
their primitive ancestors can be traced back to the
divergence of protozoa. Crystallins are not confined to the lens;
in various taxa, crystallin genetic material was ‘recruited’ in
different tissues to encode proteins serving functions as
diverse as those of enzymes and antistress proteins [17–19].
Three major classes of crystallins are common to the eye
lens throughout the vertebrates: the α-, β- and γ-crystallins.
The latter two classes are made up of homologous proteins,
and constitute the superfamily of βγ-crystallins. For
γ-crystallins, six genes (γA to γF) have been identified,
encoding 21-kDa monomeric proteins, and a gene for a γS
crystallin, previously classified as βS-crystallin. β-crystallins,
encoded by five to seven genes, depending on species, may
form aggregates of up to 200 kDa that consist of acidic type
(βA1 to βA4) and basic type (βB1 to βB3) subunits,
23–33 kDa. The γ-crystallins have short C-terminal peptide
extensions, whereas β-crystallins possess long N-terminal
extensions, and the basic β-type subunits also have
C-terminal extensions.
In this review, only genetic and structural aspects of
monomeric or oligomeric βγ-crystallins likely related to
their evolutionary origin will be discussed. For other
aspects, the reviews cited above should be consulted.
STRUCTURAL FEATURES OF
MONOMERIC AND DIMERIC
CRYSTALLINS
To date, the available 3D structures are those of
βB2-crystallin [20–22], and of γB- [23–25], γE- [26], and
γC-crystallin [27]. Formerly, the latter were called γII-,
γIIIB- and γIV-crystallin, respectively. γB-crystallin is
monomeric, as are all γ-crystallins; βB2-crystallin is a dimer
in solution, although its structural unit in the crystal lattice is
a tetramer, made up of two dimers, and the likely assembly
of this protein in the lens is that of higher hetero-oligomers.
However, monomeric γB-crystallin and dimeric βB2-crystallin
will be considered here as the monomeric and dimeric
prototypes for the respective families of γ- and β-crystallins,
and simply referred to as γ-type or β-type crystallins,
respectively.
Both monomeric γ-type crystallin and the subunit of
dimeric β-type crystallin are composed of two domains, an
N- and a C-terminal domain (termed N- and C-domains).
Each domain is made up of two homologous ‘Greek key’
β-strand motifs; motifs M1 and M2 in the N-domain, and
motifs M3 and M4 in the C-domain. In both the
γ-monomer and the β subunit, the four motifs and the
two domains are organized symmetrically, with local
intermotif and interdomain pseudo-dyads. However, in
dimeric β-crystallin the topological equivalents of the two
domains of monomeric γ-crystallin are domains from
different subunits (see Fig. 1). Two types of dimers may
thus be viewed in a β-type crystallin: the orthodox dimer,
made up of the two subunits with the two N•C and N′•C′
domains covalently linked through linker peptides (|) in
antiparallel fashion:

    N     C′
    |     |
    C     N′

and a pseudo-dimer made up of noncovalently associated
N•C′ and C•N′ domains from the two subunits, which
reproduce the topological association of the N- and
C-domains of γ-crystallin (see Fig. 1). The linker peptide
segments that connect the N- and C-domains have very
different conformations in the γ-monomer and in the
β subunits. In the monomeric γ-type crystallins, the linker
peptide bends to reach from the N-domain through the
C-domain as in N•C. In dimeric β-type crystallin instead,
the two linker peptides have an extended conformation; as
in the scheme above, they run antiparallel on either side of
the pseudo twofold axis relating the two-domain
pseudo-dimeric structure made up of N•C′ and C•N′ (see Fig. 1).
When present, N- and C-terminal extensions are not
entirely defined in the structure of crystallin proteins, as they
are flexible, without unique conformations, with the
exception of the proximal segments of the C-terminal extensions.
Fig. 1. The structures of (A) βB2-crystallin (PDB 1BLB), the
dimeric β-type prototype, and (B) γB-crystallin (PDB 4GCR), the
monomeric γ-type prototype. Fragments of γB-crystallin are shown
to illustrate schematically the structures of (C) two-motif/
one-domain, and (D) one-motif putative crystallin ancestors. The
interdomain linker peptides are colored in red.
MOLECULAR GENETICS STUDIES:
FACTS AND HYPOTHESES
Owing to the stringent necessity to conserve the critical
function of providing the lens, by an appropriate arrange-
ment of protein aggregates, with the precision of an optical
measuring instrument, lens crystallins have been subjected
to severe selective pressure in the course of their evolution.
This is indicated by the very low substitution rates registered
in the vertebrate crystallin genes, especially in those coding
for β-crystallins, and by the unusually similar
substitution rates recorded for internal and surface regions of
these proteins [14]. The latter finding can be interpreted as
indicative of the importance of surface, intermolecular
interactions among the lens proteins.
A striking exception to this general sequence conservation
rule are the high substitution rates that have been recorded
only for the sequences encoding the interdomain linker
peptides and the N- and C-terminal extensions. These
findings certainly have an evolutionary significance.
In both β- and γ-type crystallin genes, the sequence
coding for the interdomain linker peptide is interrupted by
an ‘interdomain’ intron. In the β-type crystallin genes,
‘intermotif’ introns are also present. Thus in β-type genes,
each motif (M1, M2, M3, M4) is encoded by a separate
exon, whereas in the γ-type genes the pairs of adjacent
motifs (M1/M2 and M3/M4) are each encoded by a single
exon (Fig. 2). Sequence similarities are higher when motif
M1 is compared with M3 (termed A type motifs), or M2
with M4 (B type motifs), which results in an ABAB pattern
[35].
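The exon layout just described can be written down as a small data structure. The following is a minimal sketch, not taken from the source: the string encoding (‘|’ for an intermotif intron, ‘||’ for the interdomain intron, ‘-’ for motifs sharing an exon) and the function name `motif_exons` are assumptions made purely for illustration.

```python
# Hypothetical encoding (an assumption, not from the source):
#   "||" = interdomain intron, "|" = intermotif intron,
#   "-"  = two motifs encoded on the same exon.
GENE_LAYOUTS = {
    "beta-type": "M1|M2||M3|M4",   # every motif on its own exon
    "gamma-type": "M1-M2||M3-M4",  # each motif pair on a single exon
}


def motif_exons(layout):
    """Split a layout string into exons, each a list of the motifs it encodes."""
    exons = []
    for domain in layout.split("||"):      # cut at the interdomain intron
        for exon in domain.split("|"):     # cut at intermotif introns, if any
            exons.append(exon.split("-"))  # motifs that share this exon
    return exons


for name, layout in GENE_LAYOUTS.items():
    print(name, motif_exons(layout))
```

With this encoding the β-type layout yields four single-motif exons and the γ-type layout two two-motif exons. The sketch deliberately ignores the fact that the interdomain intron interrupts the linker-coding sequence rather than falling cleanly between exons.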
The structural similarity and topological equivalence
between motifs and between domains, and the significant
degree of sequence identity between domains, even higher
than between motifs, led to the proposal [23] that the
evolutionary path of γ-crystallin started with a one-motif
ancestor. Then, upon gene duplication followed by fusion, a
two-motif/one-domain protein evolved, to be followed,
after a second duplication-fusion step, by the two-domain
γ-type proteins. Subsequent findings from protein sequence,
gene sequence, and structural studies [7,13,14,20] have
strengthened and expanded this view.
This evolutionary path based on primary and tertiary
structure homologies, and consisting of two main events of
gene duplication, each followed by fusion, is supported by
the identification in distant phyla of homologous genes
encoding proteins that can be related to putative crystallin
ancestors. A one-domain crystallin-like fold has been found
in a protein (spherulin 3a) from a slime mould, with a
significant sequence identity and a high structural similarity
with γ-crystallin domains [61,65]. Interestingly, in the
amino-acid sequence of spherulin 3a motif M1 is not
N-terminal as in βγ-crystallin sequences, but C-terminal to
motif M2 (Fig. 2). Another case of a one-domain crystallin
fold has been identified [30] in Streptomyces
metalloproteinase inhibitor (SMPI), with a clear relationship in
three-dimensional structure to βγ-crystallins. In this protein,
a significant, albeit weak, sequence similarity has been
detected between its N-terminal motif and M1 motif of
βγ-crystallins, but no similarities were found between its
C-terminal motif and any other known crystallin-type
motif sequences (in Fig. 2, this motif is marked as MX). A
crystallin-type one-domain fold has also been proposed for
a yeast toxin [31], and for a Streptomyces toxin-like protein
[32]. However, in these cases the possibility of convergent
evolution may not be excluded [33].
Fig. 2. A scheme of the arrangements of motif-encoding gene
sequences in crystallin-type genes. SPHE-, STRE-, S-, C-, β- and
γ-type notations indicate motif arrangements in: spherulin 3a,
Streptomyces protease inhibitor, S-protein, G. cydonium protein,
β- and γ-type crystallin, respectively. Motifs are shown as boxes
and their numbers (M1 through M4) are those typical of both
β-type and γ-type crystallins, assigned to the other genes on the
basis of homologies. Two-motif domains are formed by adjacent
motifs. Thin and thick bars represent intermotif and interdomain
introns, respectively. Dotted line segments between domains or
motifs indicate that it is not known if an intermotif or an
interdomain intron is present in that gene.
Two-domain crystallin-like folds have also been found.
One was identified in protein S from the spore coat of a
bacterium [34], another long-lived protein (like spherulin 3a
and the crystallins). Interestingly, in this two-domain
protein, the four homologous motifs are not arranged as
in βγ-crystallins (M1-M2-M3-M4), but in a reversed pattern
(M2-M1-M4-M3) (Fig. 2). This prompted the suggestion
[13] that the two evolutionary lines of the bacterial
crystallin-like protein and the vertebrate crystallin ancestor
diverged at the one-motif stage.
Another two-domain, evolutionarily related member of
the βγ-crystallin superfamily has been identified in the
epidermis differentiation-specific protein (EDSP) from an
amphibian, Cynops pyrrhogaster [35]. The N-terminal
portion of this protein contains four crystallin-type motifs
that appear to be arranged in the M1-M2-M3-M4 pattern
typical of the β- and γ-type lens crystallins. More recently, a
two-domain crystallin signature has been identified in a
protein sequence from a sponge of the genus Geodia [36]. In
this protein too the four Greek key motifs are arranged
in the same order (M1-M2-M3-M4) as in the vertebrate
βγ-crystallin genes.
Another impressive addition to the βγ-crystallin
superfamily is that proposed for AIM1, a protein encoded in a
human gene whose expression has been related to
melanoma suppression [37]. The 3′-terminal region of this gene
codes for a protein sequence comprising 12 crystallin-type
motifs arranged in the M1-M2-M3-M4 order. Trimeric
protein models have been constructed connecting the six
two-motif domains with either the bent γ-type or the
extended β-type interdomain linkers. The gene, however,
appears to code for protein domains more closely related to
β-type than to γ-type crystallins. This conclusion is based on
the following elements: (a) the linker peptide sequences are
closer to those typical of β-type crystallins; (b) the gene
contains intermotif introns, as do the β-type genes; (c) the
interdomain intron positions are homologous to those of
the β-type crystallin introns.
As for the evolution of dimeric β-type crystallins, the
possibility that a γ-type gene encoding a monomeric
crystallin was the immediate ancestor to a β-type gene
encoding a dimeric crystallin has been excluded [14], based
on the absence of intermotif introns in γ-crystallin genes and
their presence in β-type genes (Fig. 2). The lack of these
introns in γ-crystallins has been attributed to an intron loss
that occurred in a two-motif/one-domain crystallin ancestor.
The loss would have occurred in the γ-type genes only after
the divergence of the evolutionary paths leading to γ-type
and β-type genes, respectively. This is because it was deemed
unlikely that an identical mutational event, the intron loss,
could have occurred twice in the evolution of two
homologous one-domain genes after their divergence and before
their fusion into four-motif/two-domain encoding genes.
In fact, the opposite argument may be valid. The
probability that a certain type of gene alteration occurs
(an insertion, a deletion) depends on extrinsic (e.g. nature of
the mutagen, environmental conditions) and on intrinsic
factors: the base sequence, the consequent secondary and
supersecondary structures, as well as the topology of the
DNA region in which the event takes place. For homologous
genes we may assume that they share most of the
intrinsic and extrinsic elements. Thus to evaluate the
probability that a certain gene alteration occurred in
evolution in homologous genes, we should use very similar
probability factors. In conclusion, the likelihood that a gene
alteration, such as an intron loss or insertion, could occur
twice in the evolution of homologous genes in a certain gene
family is greater the closer these genes are in evolution, i.e. if
their divergence was a recent occurrence.
Furthermore, the proposal that an intermotif intron loss
occurred at the two-motif/one-domain stage would have as
a consequence that a single two-motif/one-domain-encoding
gene, in which the loss would have occurred, was the
common ancestor to all subsequently diverged one-domain
genes. But this does not appear to be the case, as indicated
by the different motif orders in different crystallin-type
genes. As mentioned above, in βγ-type crystallins
homologous M1 motifs are N-terminal to motif M2, whereas in
spherulin 3a and in protein S they are C-terminal to M2
(see Fig. 2). Hence, in the evolutionary path of crystallin-
type proteins it would seem unlikely that a single two-motif/
one-domain ancestor duplicated and diverged while also
undergoing a switch of motif-encoding sequences to gener-
ate different motif arrangements in the various descendant
genes.
A more general argument in favour of a late insertion of
introns in crystallin-type genes, as opposed to a late deletion
of pre-existing introns, may be based on the intron-late
theory, originally proposed to explain the presence of
spliceosomal introns in eukaryotes, and their absence in
archaea and in eubacteria [38–40]. Over recent years, a vast
amount of data has been interpreted as supporting this
theory [41]. In particular, the results of a statistical analysis
[42] of pairs of gene paralogs may only be interpreted to
favour intron gains rather than intron losses in these genes.
Recent data [43] in support of the theory is the finding that
in the sponge Geodia the gene encoding the extracellular and
transmembrane domains of the tyrosine kinase receptor
(TKR) has no introns. In homologous vertebrate TKR
genes instead several introns are present.
As for the late insertion of introns in crystallin-type
proteins, it has been recently found (A. Di Maro, M. V.
Cubellis & G. D’Alessio, unpublished results) that there are
no introns in the gene encoding the crystallin-type protein
from Geodia (see above). It should be noted that Geodia
sponges are very primitive organisms that diverged more
than 500 million years ago (some 300 million years earlier
than mammals, whose crystallin genes have a full complement
of introns). This finding supports late gains of
introns, rather than intron loss, in the evolution of
crystallin-type genes.
An alternative model may therefore be proposed for the
evolution of crystallin-type genes, clearly evolved from the
previous models reported above [13,14,23]. In this model an
early one-motif crystallin ancestor gene duplicated and
diverged into several one-motif genes, whose combinatorial
fusion engendered several two-motif pairs. This would
accommodate all the motif arrangements (M1-M2, M1-M3,
etc.) identified in present-day crystallin-type proteins
(Fig. 3). In this scenario there would be no loss of intermotif
introns in the evolutionary pathway to γ-type genes, but
rather the acquisition of an interdomain intron in the
evolution of γ-crystallins, and of both interdomain and
intermotif introns in the evolution of β-type crystallin genes
(Fig. 3). The likelihood of late-in-evolution intron insertion,
or that identical mutational events could have occurred in
evolutionarily proximal genes has been discussed above.
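The combinatorial fusion step of this model can be illustrated by enumerating the possible two-motif products. This is a minimal sketch under stated assumptions: fusions are treated as ordered pairs of two distinct one-motif genes (no self-fusion), and the `observed` dictionary simply restates the domain arrangements named in the text; neither the code nor its variable names come from the source.

```python
from itertools import permutations

MOTIFS = ["M1", "M2", "M3", "M4"]

# Assumption: a fusion is an ordered pair of two distinct one-motif genes.
two_motif_pairs = ["-".join(pair) for pair in permutations(MOTIFS, 2)]

# Two-motif domain arrangements named in the text.
observed = {
    "beta/gamma-crystallins": ["M1-M2", "M3-M4"],
    "protein S": ["M2-M1", "M4-M3"],
    "spherulin 3a": ["M2-M1"],
}

print(len(two_motif_pairs), "possible ordered pairs")
for protein, domains in observed.items():
    covered = all(d in two_motif_pairs for d in domains)
    print(protein, domains, "covered" if covered else "not covered")
```

Under these assumptions there are 12 ordered pairs, and every arrangement reported for the βγ-crystallins, protein S and spherulin 3a is among them, so no separate motif-switching event needs to be invoked.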
It should be noted that the DNA sequences coding for
C-terminal and N-terminal extensions in one-domain
ancestral crystallin genes would be the most likely candi-
dates for the formation of intron sequences between motifs
or between domains. This can be based not only on their
intermotif and interdomain topologies, but also on their
very high substitution rates (see above).
Thus the available evidence from molecular genetics
studies on crystallin-type genes may be interpreted as
illustrated in Fig. 3, in which three main phases are
summarized. In phase 1, duplications and divergence of
the putative earliest, one-motif M ancestor occurred. In
phase 2, the diverged duplicates fused in different
combinations and underwent further divergence. Upon fusion, an
intron formed between motifs in the evolutionary path of
the ancestors toward the β-type, but not in that toward the
γ-type genes. In phase 3, the two-motif/one-domain-encoding
genes duplicated and fused, with the formation of an
interdomain intron, possibly after the divergence of
vertebrates.
It should be noted that the scheme illustrated in Fig. 3
provides parallel, independent evolutionary paths for γ-type
monomers and for β-type oligomers. Thus, as previously
proposed [14], oligomeric β-type crystallins did not evolve
from monomeric γ-type crystallins, although here this
conclusion is based on different considerations. Naturally,
and in line with previous analyses [2,3], such a conclusion
excludes the possibility that a dimeric β-type crystallin
evolved from a monomeric γ-type crystallin through a 3D
domain swap [9].
The molecular genetics studies described above also
suggest an important evolutionary role of the DNA regions
encoding the interdomain linker peptides and the terminal
extensions, as they are regions: (a) with high substitution
rates; (b) where intron insertions or deletions occurred.
STRUCTURAL STUDIES: FACTS
AND HYPOTHESES
When the question of crystallin evolution is examined from
a structural viewpoint, the most impressive finding is the high
conservation of hydrophobic patches at interdomain
interfaces [20,23]. In γ-crystallin, the hydrophobic residues
Met43, Phe56 and Ile81 from motif M2 interact with the
homologous Val132, Leu145 and Val170 from motif M4.
Identical or analogous interactions occur at the β-crystallin
interface between the triad of Val55, Val68, and Ile92, and
that of Val143, Leu156, and Ile181. The C-terminal
extensions have also been suspected to have a role in the
evolution of domain association, as suggested by the
interdomain hydrophobic interactions observed between
the C-terminal extensions of the β-C-domain and the
surface of the N-domain from the partner subunit [20], and
by the peculiar behaviour [44] of the isolated γ-C-domain
altered at its C-terminal extension (see below). Finally, the
strikingly different conformations of the interdomain
linkers, bent or extended in γ- and β-crystallins, respectively,
could certainly not escape attention.
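The correspondence between the two hydrophobic triads can be checked with a few lines of arithmetic. The sketch below uses only the residue numbers quoted above; pairing the residues in the order in which they are listed, and the names `TRIADS` and `offsets`, are assumptions made for illustration.

```python
# Residue numbers of the interface triads as quoted in the text.
# Assumption: residues pair up in the order in which they are listed.
TRIADS = {
    "gamma": {"first": [43, 56, 81], "second": [132, 145, 170]},
    "beta": {"first": [55, 68, 92], "second": [143, 156, 181]},
}


def offsets(triad):
    """Sequence distance between each residue and its listed partner."""
    return [b - a for a, b in zip(triad["first"], triad["second"])]


for family, triad in TRIADS.items():
    print(family, offsets(triad))
```

Under this pairing the γ-type offsets are all 89 residues and the β-type offsets are 88, 88 and 89, which is at least consistent with the two triads occupying equivalent positions in homologous motifs.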
Thus, the key structural features to focus upon while
analyzing the evolution of two-domain or two-subunit
βγ-crystallins (i.e. the determinants of interdomain association,
intramolecular or intermolecular) are the hydrophobic
interdomain patches, the interdomain linker peptides, and
the terminal extensions. These have been the precise targets
selected by the London and Regensburg research groups in
their investigations on the structural determinants and the
evolution of the β-type and γ-type crystallins [7,25,44–51].
The burying of the hydrophobic patches at the interdomain
interfaces, intramolecular in the γ-type structure,
intermolecular in the β-type, was recognized early as
the apparent driving force for domain association [20,23].
However, an impressive network of H-bonds and ion pairs
between Glu and Arg residues is also evident in these
structures at the interdomain interfaces [52]. It is therefore
tempting to conclude that the polar or charged side-chains
involved in these contacts are remnants of the ancestral,
solvent-exposed surfaces of single-domain crystallins, now
buried at interdomain interfaces of present-day crystallins.
As they contribute to interface stabilization, we can suggest
that a ‘hydrophilic effect’ [3] apparently helped to
stabilize the interfaces of crystallins that evolved into
higher order structures.
Fig. 3. A schematic summary of the main events in the
evolutionary paths leading to present-day crystallin-type genes.
M denotes a monomeric putative ancestor encoding a one-motif
(Greek key) protein, hyphenated to indicate duplication and
divergence of genes, with the numerals 1–4 indicating motif
typologies. Subscripts F, S, C, G, and B denote the respective
evolutionarily committed ancestors of: spherulin 3a, protein S,
the G. cydonium protein, the γ-type, and the β-type crystallin
genes. The segments connecting the boxed M motifs indicate the
presence of intermotif (thin bars) and interdomain (thick bars)
introns; the lack of separation lines between motifs or domains
indicates that in those cases the presence or absence of
introns has not been determined.
As for the hydrophobic patches, many experiments have
been performed to investigate their importance in
determining domain association, some of them with
contradictory results. It has been reported that isolated
γ-crystallin domains, perfectly equipped with their
hydrophobic triad, either obtained through proteolytic cleavage
[53], or as recombinant proteins [54], do not associate
spontaneously into γ-like domain dimers, and behave as
stable monomeric proteins. These results would lead one to
conclude that the hydrophobic effect is not the only
determinant of domain association. Yet, they may simply
suggest that covalent interdomain linkers are essential to
raise the local concentration of interdomain surfaces and
engender the hydrophobic effect [46]. On the other hand, the
substitution of a single residue (Phe56, replaced by Ala, Asp
or Trp) in the triad responsible for the hydrophobic patch
proved sufficient to destabilize γ-crystallin domains to the
point of rendering them incapable of engaging in a stable
association [48].
Different results have been obtained with the isolated
N-domain of rat βB2-crystallin, found to associate in
solution [51], and with the isolated N- and C-domains of
γS-crystallin, for which a tendency to associate into
heterodimers has been reported [55]. It should be noted
that γS-crystallin is very similar to β-crystallin, and that for
a long time it was labelled as a β-crystallin. Recently, the
structure of dimeric N-domains from rat βB2-crystallin has
been solved [52] and shown to be maintained essentially by
the canonical hydrophobic contacts and the polar
interactions described above.
The apparent discrepancy between the two sets of data
may be reconciled by the conclusion that γ-type domains,
once dissociated, cannot re-associate, whereas domains of
β-type and β-like γS-type crystallins do not need a high
local concentration of structural elements to build up the
interface. Hence, in γ-type crystallins the interdomain
hydrophobic patches may not be the only determinant for
domain association, whereas they are determinant and
sufficient in β-crystallins. This conclusion may not be
surprising if we consider the radically different
conformations of the interdomain linker peptides, bent and extended,
respectively, in γ-type and β-type crystallins. In the
former case, the bent linker seems to be essential to drive
the association at the interface, whereas in the latter the
extended, spatially distant linker is not involved in the
association.
The role of the linker peptides in the determination of
monomeric vs. dimeric structures has also been investigated
by protein engineering, with apparently contradictory
results. One early conclusion had been that the linker
peptides have no role in determining domain association.
This was based on the following findings: a γ-type protein
remains monomeric when its γ-type linker is replaced by a
β-type linker [46]. Likewise, a β-type protein remains a
dimer when its original linker is replaced with a γ-type linker
[56], as described previously [49]. In these experiments, the
exchanged sequences comprised residues 82–87, as
underlined in the alignment of γ- and β-type crystallins (Table 1).
However, when the latter experiment was carried out [47]
by replacing the linker of the β-type protein with a longer
γ-type peptide sequence that included two extra residues at the
N-terminus (Pro80 and Ile81 in the alignment of Table 1), the
engineered β-type protein did become monomeric. Thus, if
the linker peptide connecting motifs M2 and M3 of the
protein is defined as the sequence comprising residues 80–87
[20], the linker sequence does appear to have a role as a
determinant of the dimeric structure. It must be noted that
the Pro residue at position 80 is strictly conserved in β-type
crystallins, whereas in γ-type proteins a Leu is found at that
position (with the single exception of a Ser in γA-crystallin).
This suggests that the presence of a Pro at position 80 can
force the linker into an extended conformation, the one typical
of β-type crystallins, which does not allow for a sufficiently
high local concentration of interdomain interacting residues
[23]. In the absence of Pro80, these residues can interact and
the two domains associate into a γ-type monomer. It is
tempting to propose that a key amino-acid substitution (a
‘primary mutation’) in the evolution of γ-type and β-type
crystallins from their common ancestor was the insertion of
a Pro residue at that position in the β-type sequences, and
of a hydrophobic residue in γ-type crystallins.
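The two linker definitions discussed here (residues 82–87 versus 80–87) can be compared directly on the sequences given in Table 1. This is a minimal sketch that only restates those two sequences; the function name `compare` and the variable names are illustrative assumptions.

```python
# Linker sequences from Table 1, covering residues 80-87.
BETA_B2 = "PIKVDSQE"
GAMMA_B = "LIPQHTGT"
FIRST = 80  # residue number of the first aligned position


def compare(a, b, first=FIRST):
    """Return (position, residue_a, residue_b, identical) per aligned column."""
    return [(first + i, x, y, x == y) for i, (x, y) in enumerate(zip(a, b))]


rows = compare(BETA_B2, GAMMA_B)
for pos, x, y, same in rows:
    print(pos, x, y, "identical" if same else "")

# Positions at which the two linkers carry the same residue.
identical = [pos for pos, _, _, same in rows if same]

# The shorter exchanged segment (residues 82-87) in each protein.
short_swap = (BETA_B2[82 - FIRST:], GAMMA_B[82 - FIRST:])
```

In this alignment the only identical position is Ile81, and the shorter 82–87 exchange leaves position 80 (Pro in βB2, Leu in γB) untouched, which is the residue the text singles out.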
Contrasting results were obtained in another laboratory,
showing that a recombinant β-crystallin variant is isolated
as a dimer also when its linker is replaced with a γ-type
linker [57]. Although the β-crystallin used in the latter
experiment was rat βB3, instead of bovine βB2, and the
replaced fragment was two residues longer, the replacing
linker was from the same γB-crystallin as in the experiment
cited above [46]. The insertion of a C-terminal Tyr residue in
the substituting fragment, and the presence of a Ser instead
of a Thr, may explain the contrasting results. If these were
both confirmed, we may only surmise that in these types of
engineering experiments only limited areas of the protein
structure under test are narrowly illuminated, while other
effects of the engineering on other areas of the protein
structure remain in the dark, and may affect the
interpretation.
However, the overall conclusion that the linker peptides
did have a role in the evolution of monomeric vs. oligomeric
crystallins is convincing. In this respect, it would be
interesting to determine the structure of the crystallin-type
protein from the sponge gene [36], in which a short (only
three residues) interdomain linker peptide has been identified,
i.e. with a length typical of γ-type crystallin linkers.

As for the terminal extensions, they are mostly flexible
and mobile [58] and do not seem to play any role in folding
and domain association [44,59,60]. The proximal stretch of
the C-terminal extension in the β-type structure, instead, is
not flexible, and has been suggested to mimic a noncovalent
interdomain linker because it inserts its Trp175 residue
into a hydrophobic pocket on the surface of the N-domain
of the partner subunit [20]. When the whole C-terminal
extension, including Trp175, is removed, β-type crystallin
can still associate into dimers and tetramers [47].

Table 1. Alignment of γ- and β-type crystallins. The exchanged
sequences comprise residues 82–87 (underlined).
                         80     87
βB2 crystallin linker    PIKVDSQE
γB crystallin linker     LIPQHTGT

© FEBS 2002 On the evolution of crystallins (Eur. J. Biochem. 269) 3127
But the terminal extensions, although apparently not a
determinant in the structural chemistry of present-day
crystallins, may instead have had key roles in the
evolutionary modular assembly of these proteins. It has been
found that isolated, recombinant γ-type C-domains cannot
associate into noncovalent structures to mimic a γ-type
crystallin [7], yet they will associate after the removal of
the terminal Tyr residue from their C-terminal extensions.
In the 3D structure of this des-Tyr-γ-C-domain,
the C-terminal extension hinders the association of the two
domains by interacting with the hydrophobic interdomain
interface. This destabilizing effect would not be exerted
when the covalent interdomain linker is in position and
displaces the peptide extension out into the solvent. These
results suggest that the extended form of the linker peptide,
characteristic of β-crystallins, could have evolved directly
from the C-terminal extension of a two-domain ancestor [7].
An independent experimental approach has led to similar
conclusions. The C-terminus of the C-domain extension of
rat βB2 crystallin has been fused by protein engineering with
the N-terminus of the N-domain from the partner subunit
[50]. Because the engineering also discontinued the
interdomain linkers in both subunits, a circularly permuted
structure was obtained. In this structure, the C-terminal
extension was turned into an interdomain linker. The
resulting expressed protein was still a dimer, but differed
from the wild-type βB2 dimer in that its domain pairing
was that typical of γ-crystallin.
These experiments support the proposal that the exten-
sions may have been the evolutionary precursors of
interdomain linkers, but also confirm the crucial role played
in evolution by the linkers themselves. They hint at the
possibility that circular permutation may have been one of
the mechanisms employed in the evolution of new crystallin
structures [15,49]. In modular constructions, structural
variation depends on the different ways modules are
assembled, i.e. on the different types of structural elements
connecting and pairing the modules. Domain extensions
could well have been exploited by evolution to generate a
variety of linkers in order to get the creative advantages
inherent in modular assemblies.
It has been proposed that another experimental approach
to gain insight into the evolutionary history of an
oligomeric protein is to investigate its unfolding/refolding
[3,8]. This is based on the idea that the folding pathway of
an oligomer might recapitulate its evolutionary pathway. Thus,
it may be of interest to analyze the results of
unfolding/refolding experiments carried out on crystallin-type
proteins.
Spherulin 3a [61], the single-domain crystallin-type
protein, unfolds in a highly cooperative fashion with a two-state
transition [2,62,63]. Two-domain proteins, such as protein S
[64] and a γ-type crystallin [45], unfold instead with
three-state transitions, just as a β-type crystallin does [51]. It
should be added that the isolated N- or C-domains,
prepared by recombinant technology, unfold cooperatively
with two-state transitions [54].
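The difference between the two-state and three-state transitions mentioned above can be illustrated with a minimal equilibrium sketch. It assumes the common textbook description in which the unfolding free energy decreases linearly with denaturant concentration; the review does not state which model or denaturant was used in the cited studies, and every numerical parameter below is hypothetical, chosen only to make the intermediate visible.

```python
import math

RT = 8.314e-3 * 298.15  # kJ/mol at 25 degrees C


def equilibrium_constant(dg_water, m_value, denaturant):
    """K for one unfolding step, with dG = dG(water) - m * [denaturant]."""
    dg = dg_water - m_value * denaturant
    return math.exp(-dg / RT)


def two_state(denaturant, dg_water, m_value):
    """Fractions of native (N) and unfolded (U) species for N <-> U."""
    k = equilibrium_constant(dg_water, m_value, denaturant)
    return {"N": 1.0 / (1.0 + k), "U": k / (1.0 + k)}


def three_state(denaturant, dg1, m1, dg2, m2):
    """Fractions of N, intermediate (I) and U for N <-> I <-> U."""
    k1 = equilibrium_constant(dg1, m1, denaturant)
    k2 = equilibrium_constant(dg2, m2, denaturant)
    total = 1.0 + k1 + k1 * k2
    return {"N": 1.0 / total, "I": k1 / total, "U": k1 * k2 / total}


# Hypothetical parameters (kJ/mol and kJ/mol/M), not taken from the cited work.
for conc in (0.0, 2.0, 3.5, 5.0, 7.0):
    single = two_state(conc, dg_water=20.0, m_value=10.0)
    double = three_state(conc, dg1=20.0, m1=10.0, dg2=40.0, m2=8.0)
    print(conc, round(single["U"], 3), {k: round(v, 3) for k, v in double.items()})
```

In the two-state case only N and U are ever populated, whereas in the three-state case a species with one domain unfolded dominates at intermediate denaturant concentrations, which is the behaviour reported for the two-domain proteins.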
The intermediates in the unfolding pathway of both
protein S and γ-type crystallin have been described as
presenting a still folded N-domain and a fully unfolded
C-domain. In contrast, in the unfolding pathway of β-type
crystallin the N-domain unfolds first while the C-domain
remains folded. Interestingly, the isolated β-type C-domains
are monomeric, whereas isolated N-domains associate.
Based on these results, and on the findings described
above, we can envisage that single-domain crystallin-type
proteins, whether natural, such as Spherulin 3a, or artificially
produced, such as the isolated domains from γ- and β-type
crystallins, resemble the evolutionary ancestors of two-domain
crystallins. Hence, we may regard these one-domain proteins as
stable monomers. Once rendered unstable through mutations in
their encoding genes, they could find a new stable conformation
only upon gene fusion leading to domain association. This
evidently happened along distinct, parallel evolutionary
paths for γ-type crystallins and protein S on the one hand, and
for β-type crystallins on the other. Thus, the results of the
unfolding/refolding experiments and their interpretation are
in support of the evolutionary pathway illustrated in Fig. 3.
CONCLUSIONS
It appears that the findings described above, based on
structural and protein engineering studies or on molecular
genetics analyses, lead to the same conclusions. Both sets of
data indicate that a series of gene alterations and fusions led
from crystallin ancestors coding for proteins made up of a
single Greek-key motif to two-motif/one-domain proteins,
to two-domain γ-type crystallin monomers, or two-domain/
two-monomer β-type dimers. A key role in the evolutionary
cascade was apparently played by the gene sequences
encoding the C-terminal extensions downstream of the
motif-encoding exons in one-motif and one-domain ancestors.
These are the sequences involved in the gene fusion
events, and they are especially marked by high substitution
rates. In present-day, postfusion two-domain crystallin
genes, homologous sequences encode the interdomain linker
peptides. These DNA sequences were the hot spots in the
ancestral crystallin genes, where evolution intensely
experimented to generate protein sequences that independently
evolved along two distinct paths, leading to different linker
conformations for γ- and β-crystallins, hence to monomers
and dimers, respectively.
ACKNOWLEDGEMENTS
I am grateful for comments and criticism on the manuscript to J. F.
Riordan (Harvard Medical School), G. Wistow (NIH), M. Riley
(MBL, Woods Hole), and my colleagues in Naples: M. V. Cubellis,
A. Di Maro, T. Giancola, R. Piccoli, and A. Russo. The rendering of
molecular graphics for Fig. 1 was provided by M. V. Cubellis.
Figures 2 and 3 were drawn by A. Di Maro; I am very grateful to both.
REFERENCES
1. Park, C. & Raines, R.T. (2000) Dimer formation by a ‘monomeric’
protein. Protein Sci. 9, 2026–2033.
2. Jaenicke, R. & Lilie, H. (2000) Folding and association of oligo-
meric and multimeric proteins. Adv. Prot. Chem. 53, 329–401.
3. D’Alessio, G. (1999) The evolutionary transition from monomeric
to oligomeric proteins: tools, the environment, hypotheses. Prog.
Biophys. Mol. Biol. 72, 271–298.
4. Xu, D., Tsai, C.-J. & Nussinov, R. (1998) Mechanism and
evolution of protein dimerization. Protein Sci. 7, 533–544.
5. Ciglic, M.I., Jackson, P.J., Raillard, S.A., Haugg, M., Jermann,
T.M., Opitz, J.G., Trabesinger-Ruf, N. & Benner, S.A. (1998)
Origin of dimeric structure in the ribonuclease superfamily.
Biochemistry 37, 4008–4022.
6. Beintema, J.J., Breukelman, H.J., Carsana, A. & Furia, A. (1997)
Evolution of vertebrate ribonucleases: ribonuclease A superfamily.
In Ribonucleases: Structures and Functions (D’Alessio, G. &
Riordan, J.F., eds), pp. 245–269. Academic Press, San Diego.
7. Norledge, B.V., Mayr, E.M., Glockshuber, R., Bateman, O.A.,
Slingsby, C., Jaenicke, R. & Driessen, H.P. (1996) The X-ray
structures of two mutant crystallin domains shed light on the
evolution of multi-domain proteins. Nat. Struct. Biol. 3, 267–274.
8. D’Alessio, G. (1995) Oligomer evolution in action? Nat. Struct.
Biol. 2, 11–13.
9. Bennett, M.J., Schlunegger, M.P. & Eisenberg, D. (1995) 3D
domain swapping: a mechanism for oligomer assembly. Protein
Sci. 4, 2455–2468.
10. Crestfield, A.M., Stein, W.H. & Moore, S. (1962) On the
aggregation of bovine pancreatic ribonuclease. Arch. Biochem.
Biophys. 1S, 217–222.
11. Green, S.M., Gittis, A.G., Meeker, A.K. & Lattman, E.E. (1995)
One step evolution of a dimer from a monomeric protein. Nat.
Struct. Biol. 2, 746–751.
12. Russo, A., Antignani, A. & D’Alessio, G. (2000) In vitro evolution
of a dimeric variant of human pancreatic ribonuclease. Biochem-
istry 39, 3585–3591.
13. Wistow, G.J. & Piatigorsky, J. (1988) Lens crystallins: the
evolution and expression of proteins for a highly specialized tissue.
Annu. Rev. Biochem. 57, 479–504.
14. Lubsen, N.H., Aarts, H.J.M. & Schoenmakers, J.G.G. (1989) The
evolution of lenticular proteins: the β- and γ-crystallin super gene
family. Prog. Biophys. Mol. Biol. 51, 47–76.
15. Slingsby, C. & Clout, N.J. (1999) Structure of the crystallins. Eye
13, 395–402.
16. Augusteyn, R.C. & Stevens, A. (1998) Macromolecular structure
of the eye lens. Prog. Polym. Sci. 23, 375–413.
17. Wistow, G. (1993) Lens crystallins: gene recruitment and evolu-
tionary dynamism. Trends Biochem. Sci. 18, 301–306.
18. Piatigorsky, J. & Wistow, G. (1991) The recruitment of crystallins:
new functions precede gene duplication. Science 252, 1078–1079.
19. Piatigorsky, J. (2000) Review: a case for corneal crystallins.
J. Ocul. Pharmacol. Therap. 16, 173–180.
20. Bax, B., Lapatto, R., Nalini, V., Driessen, H., Lindley, P.F.,
Mahadevan, D., Blundell, T.L. & Slingsby, C. (1990) X-Ray
analysis of βB2-crystallin and evolution of oligomeric lens
proteins. Nature 347, 776–780.
21. Lapatto, R., Nalini, V., Bax, B., Driessen, H., Lindley, P.F.,
Blundell, T.L. & Slingsby, C. (1991) High resolution structure of an
oligomeric eye lens β-crystallin. J. Mol. Biol. 222, 1067–1083.
22. Nalini, V., Bax, B., Driessen, H., Moss, D.S., Lindley, P.F. &
Slingsby, C. (1994) Close packing of an oligomeric eye lens
β-crystallin induces loss of symmetry and ordering of sequence
extensions. J. Mol. Biol. 236, 1250–1258.
23. Blundell, T., Lindley, P., Miller, L., Moss, D., Slingsby, C., Tickle,
J., Turnell, B. & Wistow, G. (1981) The molecular structure and
stability of the eye lens: X-ray analysis of γ-crystallin II. Nature
289, 771–777.
24. Wistow, G., Turnell, B., Summers, L., Slingsby, C., Moss, D.,
Miller, L., Lindley, P. & Blundell, T. (1983) X-ray analysis of the
eye lens protein γ-II crystallin at 1.9 Å resolution. J. Mol. Biol. 170,
175–202.
25. Najmudin, S.N.V., Driessen, H.P.C., Slingsby, C., Blundell, T.L.,
Moss, D.S. & Lindley, P.F. (1993) Structure of the bovine eye
lens protein γB(γII) crystallin at 1.47 Å. Acta Crystallogr. D49,
223–233.
26. White, H.E., Driessen, H.P.C., Slingsby, C., Moss, D.S. &
Lindley, P.F. (1989) Packing interactions in the eye lens. Structural
analysis, internal symmetry and lattice interactions of bovine
γIVa-crystallin. J. Mol. Biol. 207, 217–235.
27. Chirgadze, Y.N., Sergheev, Y.V., Fomenkova, N.P. & Oreshin,
V.D. (1981) Polypeptide chain pathway in γ-crystallin IIIb from
calf lens at 3 Å resolution. FEBS Lett. 131, 81–84.
28. Wistow, G. (1990) Evolution of a protein superfamily: relationship
between vertebrate lens crystallins and microorganism dormancy
proteins. J. Mol. Evol. 30, 140–145.
29. Rosinke, B., Renner, C., Mayr, E., Jaenicke, R. & Holak, T.A.
(1997) Ca2+-loaded spherulin 3a from Physarum polycephalum
adopts the prototype γ-crystallin fold in aqueous solution. J. Mol.
Biol. 271, 645–655.
30. Ohno, A., Tate, S., Seeram, S.S., Hiraga, K., Swindells, M.B.,
Oda, K. & Kainosho, M. (1998) NMR structure of the
Streptomyces metalloproteinase inhibitor, SMPI, isolated from
Streptomyces nigrescens TK-23: another example of an ancestral
βγ-crystallin precursor structure. J. Mol. Biol. 282, 421–433.
31. Antuch, W., Güntert, P. & Wüthrich, K. (1996) Ancestral
βγ-crystallin precursor in a yeast killer toxin. Nat. Struct. Biol. 3,
662–665.
32. Ohki, S., Kariya, E., Hiraga, K., Wakamiya, A., Isobe, T., Oda,
K. & Kainosho, M. (2001) NMR structure of streptomyces killer
toxin-like protein, SKLP: further evidence for the wide distribu-
tion of single-domain betagamma-crystallin superfamily proteins.
J. Mol. Biol. 305, 109–120.
33. Clout, N.J., Slingsby, C. & Wistow, G.J. (1997) An eye on crys-
tallins. Nat. Struct. Biol. 4, 685.

34. Wistow, G., Summers, L. & Blundell, T. (1985) Myxococcus
xanthus spore coat protein S may have a similar structure to
vertebrate lens βγ-crystallins. Nature 315, 771–773.
35. Wistow, G., Jaworski, C. & Vasantha Rao, P. (1995) A non-lens
member of the βγ-crystallin superfamily in a vertebrate, the
amphibian Cynops. Exp. Eye Res. 61, 637–639.
36. Krasko, A., Müller, I.M. & Müller, W.E.G. (1997) Evolutionary
relationships of the metazoan βγ-crystallins, including that from
the marine sponge Geodia cydonium. Proc. R. Soc. Lond. B. 264,
1077–1084.
37. Ray, M.E., Wistow, G., Su, Y.A., Meltzer, P.S. & Trent, J.M.
(1997) AIM1, a novel non-lens member of the βγ-crystallin
superfamily, is associated with the control of tumorigenicity in
human malignant melanoma. Proc. Natl Acad. Sci. USA 94,
3229–3234.
38. Rogers, J.H. (1989) How were introns inserted into nuclear genes?
Trends Genet. 5, 213–216.
39. Palmer, J.D. & Logsdon Jr, J.M. (1991) The recent origin of
introns. Curr. Opin. Genet. Dev. 1, 470–477.
40. Cavalier-Smith, T. (1991) Intron phylogeny: a new hypothesis.
Trends Genet. 7, 145–148.
41. Logsdon Jr, J.M. (1998) The recent origins of spliceosomal introns
revisited. Curr. Opin. Genet. Dev. 8, 637–648.
42. Cho, G. & Doolittle, R.F. (1997) Intron distribution in ancient
paralogs supports random insertion and not random loss. J. Mol.
Evol. 44, 573–584.

43. Gamulin, V., Skorokhod, A., Kavsan, V., Müller, I.M. & Müller,
W.E.G. (1997) Experimental indication in favor of the introns-late
theory: the receptor tyrosine kinase gene from the sponge Geodia
cydonium. J. Mol. Evol. 44, 242–252.
44. Norledge, B.V., Trinkl, S., Jaenicke, R. & Slingsby, C. (1997) The
X-ray structure of a mutant eye lens βB2-crystallin with truncated
sequence extensions. Protein Sci. 6, 1612–1620.
45. Rudolph, R., Siebendritt, R., Nesslauer, G., Sharma, A. &
Jaenicke, R. (1990) Folding of an all-β protein: independent domain
folding in γ-II-crystallin from calf eye lens. Proc. Natl Acad. Sci.
USA 87, 4625–4629.
46. Mayr, E.M., Jaenicke, R. & Glockshuber, R. (1994) Domain
interactions and connecting peptides in lens crystallins. J. Mol.
Biol. 235, 84–88.
47. Trinkl, S., Glockshuber, R. & Jaenicke, R. (1994) Dimerization of
βB2-crystallin: the role of the linker peptide and the N- and
C-terminal extensions. Protein Sci. 3, 1392–1400.
48. Palme, S., Slingsby, C. & Jaenicke, R. (1997) Mutational analysis
of hydrophobic domain interactions in γB-crystallin from bovine
eye lens. Protein Sci. 6, 1529–1536.
49. Wright, G., Basak, A.K., Wieligmann, K., Mayr, E.M. & Sling-
sby, C. (1998) Circular permutation of betaB2-crystallin changes
the hierarchy of domain assembly. Protein Sci. 7, 1280–1285.
50. Wieligmann, K., Norledge, B., Jaenicke, R. & Mayr, E.M. (1998)
Eye lens betaB2-crystallin: circular permutation does not influence
the oligomerization state but enhances the conformational stabi-
lity. J. Mol. Biol. 280, 721–729.
51. Wieligmann, K., Mayr, E.M. & Jaenicke, R. (1999) Folding and
self-assembly of the domains of betaB2-crystallin from rat eye lens.
J. Mol. Biol. 286, 989–994.
52. Clout, N.J., Basak, A., Wieligmann, K., Bateman, O.A., Jaenicke,
R. & Slingsby, C. (2000) The N-terminal domain of βB2-crystallin
resembles the putative ancestral homodimer. J. Mol. Biol. 304,
253–257.
53. Sharma, A.K., Minke-Gogl, V., Gohl, P., Siebendritt, R.,
Jaenicke, R. & Rudolph, R. (1990) Limited proteolysis of gamma
II-crystallin from calf eye lens. Physicochemical studies on the
N-terminal domain and the intact two-domain protein. Eur. J.
Biochem. 194, 603–609.
54. Mayr, E.-M., Jaenicke, R. & Glockshuber, R. (1997) The domains
in γB-crystallin: identical fold-different stabilities. J. Mol. Biol. 269,
260–269.
55. Wenk, M., Herbst, R., Hoeger, D., Kretschmar, M., Lubsen,
N.H. & Jaenicke, R. (2000) Gamma-S-crystallin of bovine and
human eye lens: solution structure, stability and folding of the
intact two-domain protein and its separate domains. Biophys.
Chem. 86, 95–108.
56. Trinkl, S. (1995) Einfluss von Strukturelementen auf Assoziation
und Stabilitaet von βB2-Kristallin. PhD Thesis, University of
Regensburg, Regensburg.
57. Hope, J.N., Chen, H.-C. & Hejtmancik, J.F. (1994) Aggregation
of βA3-crystallin is independent of the specific sequence of the
domain connecting peptide. J. Biol. Chem. 269, 21141–21145.
58. Carver, J.A., Cooper, P.G. & Truscott, R.J.W. (1993) 1H-NMR
spectroscopy of βB2-crystallin from bovine eye lens.
Conformation of the N- and C-terminal extensions. Eur. J. Biochem. 213,
313–320.
59. Coop, A., Goode, D., Sumner, I. & Crabbe, M.J.C. (1998) Effects
of controlled mutations on the N- and C-terminal extensions of
chick lens βB1 crystallin. Graefe’s Arch. Clin. Exp. Ophthalmol. 236,
146–150.
60. Kroone, R.C., Elliot, G.S., Ferszt, A., Slingsby, C., Lubsen, N.H.
& Schoenmakers, J.G.G. (1994) The role of sequence extensions in
β-crystallin assembly. Protein Eng. 7, 1395–1399.
61. Kretschmar, M., Mayr, E. & Jaenicke, R. (1999) Homo-dimeric
spherulin 3a: a single domain member of the βγ-crystallin
superfamily. Biol. Chem. 380, 89–94.
62. Jaenicke, R. (1987) Folding and association of proteins. Prog.
Biophys. Mol. Biol. 49, 117–237.
63. Jaenicke, R. (1999) Folding and stability of domain proteins.
Prog. Biophys. Mol. Biol. 71, 155–241.
64. Wenk, M., Baumgartner, R., Holak, T.A., Huber, R., Jaenicke,
R. & Mayr, E. (1999) The domains of protein S from Myxococcus
xanthus: structure, stability and interactions. J. Mol. Biol. 286,
1533–1545.
65. Clout, N.J., Kretschmar, M., Jaenicke, R. & Slingsby, C. (2001)
Crystal structure of the calcium-loaded spherulin 3a dimer sheds
light on the evolution of the eye lens βγ-crystallin domain fold.
Structure 9, 115–124.