MINIREVIEW
Deciphering enzymes
Genetic selection as a probe of structure and mechanism
Kenneth J. Woycechowsky and Donald Hilvert
Laboratorium für Organische Chemie, Swiss Federal Institute of Technology, ETH-Hönggerberg, Zürich, Switzerland
The efficient engineering of enzymes with novel activities
remains an ongoing challenge. Towards this end, genetic
selection techniques provide a method for finding rare
solutions to catalytic problems that requires only a limited
foreknowledge of structure–function relationships. We
have used genetic selections to extensively probe the
structure and mechanism of chorismate mutases. The
insights gained from these investigations will aid future
enzyme design efforts.
Keywords: chorismate mutase; functional selection; protein
engineering; protein folding.
Introduction
The incredible catalytic power of enzymes is well-documen-
ted [1,2], but its source remains elusive. Enzymes catalyze
a vast array of reactions with high specificity, under mild
conditions [3]. These properties make enzymes potentially
useful for organic synthesis [4,5]. Still, our current under-
standing of protein structure–function relationships remains
insufficient for the de novo design of enzymes with tailored
catalytic activities [6].
Evolution provides, through multiple rounds of random
mutagenesis and selection, a means to circumvent this
problem. This process has produced the vast array of
proteins found in nature. In the laboratory, directed
evolution offers a promising strategy for the thorough
study of protein structure–function relationships and for
producing novel proteins with properties favorable for
diverse applications, including catalysis [7,8].
Principles of genetic selection
Natural evolution selects for the survival and reproduction
of organisms. By introducing DNA libraries encoding
potential enzymes into microorganisms such as bacteria,
this process can be harnessed in the laboratory to concen-
trate the selection process on an individual catalytic activity.
A great advantage of genetic selection systems is the ability
to perform parallel processing of huge libraries (rather than
the serial analysis required by high-throughput screening).
During selection, only sequences that encode functional
enzymes are observed, which enables the efficient detection
of rare solutions to a catalytic problem (with frequencies as
low as one in 10¹⁰). Furthermore, while the only knowledge
of protein structure or function required for this
approach is the DNA sequence encoding the starting
protein, structural and functional information can guide the
choice of residues to be mutated or the content of the amino
acid set to be sampled at these positions [9]. Such choices
may focus the search of sequence space on areas with a
higher frequency of success and thus increase the probability
of their detection. In principle, any enzyme activity can be
selected for in vivo, provided that catalysis of the desired
reaction can be linked to cell growth.
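As a rough illustration of what such frequencies imply, the following minimal Python sketch estimates how many library members must be sampled to have a given chance of encountering at least one functional sequence. It assumes independent, uniformly random sampling, which real libraries only approximate; the function name and the 95% target are illustrative choices, not values taken from the studies discussed here.

```python
import math


def clones_needed(hit_frequency: float, confidence: float = 0.95) -> int:
    """Number of independently sampled clones needed so that at least one
    functional sequence is seen with probability `confidence`."""
    if not 0.0 < hit_frequency < 1.0:
        raise ValueError("hit_frequency must lie strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    # P(no hit in N draws) = (1 - f)^N, so N = ln(1 - P) / ln(1 - f).
    # log1p keeps numerical precision when f is very small (e.g. 1e-10).
    n = math.log(1.0 - confidence) / math.log1p(-hit_frequency)
    return math.ceil(n)


if __name__ == "__main__":
    for f in (1e-4, 1e-6, 1e-10):
        print(f"hit frequency {f:.0e}: about {clones_needed(f):.2e} clones")
```

For a hit frequency of one in 10¹⁰, the estimate is on the order of 3 × 10¹⁰ clones, a number more naturally handled by parallel selection than by serial assays.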
One general strategy for in vivo enzyme selection is the
introduction of a metabolic requirement for the desired
activity. A genetic selection system for chorismate mutase
(CM) activity provides an example of this strategy (Fig. 1)
[10]. CMs catalyze the Claisen rearrangement of chorismate
to prephenate, which is the first committed step in the
biosynthesis of phenylalanine and tyrosine [11]. In this
system, a strain of Escherichia coli was engineered in which
the genes encoding the bifunctional CM–prephenate dehydratase
and CM–prephenate dehydrogenase protein complexes
were replaced by genes encoding monofunctional
versions of the dehydratase and the dehydrogenase. The
growth of this strain on minimal media lacking phenyl-
alanine and tyrosine requires an added source of CM
activity. This source can be provided by transformation
with a plasmid carrying a gene encoding the enzyme. This
selection system has been used to reveal structural and
mechanistic requirements for enzyme catalysis of this
reaction.
Selection for restructured enzymes
Catalysis requires the fulfilment of exacting structural
criteria; only properly folded proteins are active. Protein
folding is dictated by amino acid sequence [12]. In an
ensemble of proteins composed from the standard set of 20
amino acids with completely random sequences, however,
the chance of encountering a significantly structured
Correspondence to D. Hilvert, Laboratorium für Organische Chemie,
Swiss Federal Institute of Technology, ETH-Hönggerberg, CH-8093,
Zürich, Switzerland. Fax: + 41 1632 1486, Tel.: + 41 1632 3176,
E-mail:
Abbreviations: BsCM, Bacillus subtilis CM; CM, chorismate mutase;
EcCM, Escherichia coli CM; MjCM, Methanococcus jannaschii CM;
MLE II, muconate lactonizing enzyme II; OSBS, ortho-succinylbenzoate
synthase; TIM, triose phosphate isomerase.
(Received 5 January 2004, accepted 5 March 2004)
Eur. J. Biochem. 271, 1630–1637 (2004) © FEBS 2004 doi:10.1111/j.1432-1033.2004.04073.x
molecule is minute. For successful enzyme engineering, it
would be extremely useful to bias protein libraries in favor
of foldable sequences. Genetic selection experiments using
CMs have helped to illuminate factors that influence protein
structure and stability.
One view of enzymes holds that the large size of these
molecules is required for the precise positioning of a few
active site residues around a substrate in three dimensions
and enables their tremendous stabilization of transition
states. Most amino acids in an enzyme thus serve both as
spacers between, and as a scaffold for, the critical active site
residues. This reasoning may account for the widely varying
tolerances of protein structures to substitution at different
sequence positions [13]. The uneven distribution of structural
information in amino acid sequences presents both great
opportunities and great challenges for enzyme engineering.
Genetic selections of Methanococcus jannaschii CM
(MjCM) [14] were used to assess the tolerance of a protein
fold to secondary structures of varying sequence [15].
MjCM belongs to the AroQ class of CMs whose members
adopt a homodimeric, α-helical bundle fold (Fig. 2) [16].
Each monomer consists of three α-helices and two turns.
Three libraries were constructed and subjected to in vivo
selection for CM activity (Fig. 3): first, the N-terminal helix
alone was randomized, secondly, the two C-terminal helices
were randomized simultaneously, and finally, positives from
the first two libraries were randomly combined to give
proteins whose sequences had been varied over all three
Fig. 1. Selection system for chorismate mutase activity in Escherichia coli. An E. coli strain (KA12) was engineered in which the genes encoding the
bifunctional enzymes chorismate mutase–prephenate dehydratase and chorismate mutase–prephenate dehydrogenase were deleted. Monofunctional
versions of the dehydratase and the dehydrogenase are provided by plasmid pKIMP-UAUC. Random gene libraries are introduced into this strain
and the ability of a cell harboring an individual library member to form a colony on minimal agar media lacking added phenylalanine and tyrosine
reports on the chorismate mutase activity of the encoded protein [10].
Fig. 2. Structure of AroQ chorismate mutases. E. coli chorismate
mutase, the prototypical AroQ chorismate mutase, is shown in a
ribbon diagram representation [16]. AroQ chorismate mutases form
homodimers of intimately entwined α-helices. The three helices of one
subunit are indicated. A transition state analog inhibitor is bound in
each active site and is represented in ball-and-stick form.
Fig. 3. Design of the three binary-patterned libraries of Methanococcus
jannaschii chorismate mutase. The residues within the secondary
structural elements of M. jannaschii chorismate mutase were changed
to a random distribution of only eight different amino acids: four polar
and four hydrophobic [15]. An individual sequence position was randomized
using the four amino acid set of similar polarity to the wild
type residue. The libraries were constructed in two stages. First, helix 1
(Library 1) and helices 2 and 3 (Library 2) were randomized and
introduced into the chorismate mutase selection system. Second, the
successful clones from the initial libraries were crossed (Library 3) and
subjected to selection. In Library 3, approximately 80% of the protein
sequence was randomized. Binary patterned segments are depicted in
red and blue; the segments of wild type sequence are colorless.
helices. The turn sequences and six active site residues were
held constant in all cases. Libraries were designed using a
restricted set of eight amino acids to incorporate a random
distribution of four polar residues (Asp, Glu, Asn and Lys)
and four nonpolar residues (Phe, Ile, Leu and Met) at
positions of corresponding polarity in the α-helical regions
of the wild type protein [15]. By using binary patterning of
amino acids with high α-helical propensities, the libraries
were designed to favor correct secondary structure forma-
tion [17]. In addition, folded structures may be more
common in proteins built from a small set of appropriately
chosen amino acid building blocks [18].
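The binary patterning idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the example helix is hypothetical (it is not the MjCM sequence), the polar/nonpolar assignment of wild type residues is a simple two-class guess, and the actual libraries were built at the DNA level rather than by sampling amino acids directly.

```python
import random

# The restricted eight-residue alphabet described in the text.
POLAR = "DENK"      # Asp, Glu, Asn, Lys
NONPOLAR = "FILM"   # Phe, Ile, Leu, Met

# ASSUMPTION: a crude two-class polarity assignment for wild type residues;
# the classification used in the original study may differ.
HYDROPHOBIC_WT = set("AVLIMFWC")


def binary_pattern_variant(template, fixed_positions, rng):
    """Return one random variant that preserves the polar/nonpolar pattern
    of `template`. Positions listed in `fixed_positions` (0-based) are kept
    as wild type, mimicking invariant active site or turn residues."""
    variant = []
    for i, residue in enumerate(template):
        if i in fixed_positions:
            variant.append(residue)
        elif residue in HYDROPHOBIC_WT:
            variant.append(rng.choice(NONPOLAR))
        else:
            variant.append(rng.choice(POLAR))
    return "".join(variant)


if __name__ == "__main__":
    rng = random.Random(0)
    helix = "MIEKLAEIRKKIDEID"  # hypothetical helix, for illustration only
    fixed = {7}                 # hypothetical invariant position
    for _ in range(3):
        print(binary_pattern_variant(helix, fixed, rng))
```

Each variant keeps the hydrophobic/hydrophilic periodicity of the template while sampling only the eight permitted residues.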
Proper protein folding requires not only the formation of
secondary structures, but also their packing together to
form appropriate tertiary and quaternary interactions,
particularly a hydrophobic core. For all three libraries,
the complementation rate was about 0.01% [15]. This low
frequency illustrates the challenge of packing elements of
secondary structure to form proper tertiary and quaternary
interactions. The importance of precise templates for proper
protein folding is underlined by the low (one in 10⁴)
complementation rate of library 3. In this library, the
sequences of helix 1 (alone) and helices 2 and 3 (together) were
each functional in a context where C- and N-terminal halves
of the protein, respectively, had the wild type sequence. The
successful packing of these preselected segments against each
other required an equally extensive search of sequence space
as did the selection for proper folding of the initially
randomized helices with the wild type template. A sequential
strategy of randomizing helix 1 first and then randomizing
helices 2 and 3 of an active variant from the first library (or
vice versa), might prove more efficient than the convergent
library approach outlined in Fig. 3.
In this study of MjCM, about 80% of the protein sequence
was subjected to randomization. Functional enzymes were
found with less than 50% sequence identity to the wild type.
While active catalysts were rare in these libraries, their
presence demonstrates the ability of this protein fold to
tolerate extensive substitutions. Harnessing this structural
plasticity should be advantageous for enzyme redesign.
Examination of the selected sequences revealed that some
positions are more important than others and thus showed
stronger preferences for one particular residue. For exam-
ple, Ile14, Asn84 and Lys85 are all highly conserved in
the active variants. This lack of permissiveness is perhaps
unsurprising given that these residues probably contact the
substrate and transition state during catalysis; active site
sequences tend to be highly conserved. Additionally, Asp15
and Asp18 were also relatively nonpermissive. While they
probably do not directly contact the substrate or transition
state, these residues are thought to help orient catalytically
essential residues that were held invariant in these libraries.
Important second sphere interactions can be easily over-
looked, but were readily apparent in these selections. The
high enrichment for Phe at position 77 shows the import-
ance of certain interactions in the hydrophobic core [19].
Phe77 may represent a 'hot spot' for binding energy during
protein folding [20], analogous to those found for receptor–
ligand interactions [21].
If a smaller set of amino acids is structurally and
functionally viable, then complete sampling becomes
feasible for libraries in which a larger number of amino
acids are varied simultaneously. Primordial protein catalysts
may have had to manage with significantly fewer than the
20 amino acids commonly found in modern-day enzymes
[22]. The active MjCM variants identified in this study lend
credence to this evolutionary hypothesis. Furthermore,
proteins built from a smaller set of building blocks should
simplify the computational study and rational design of
enzymes [23].
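The effect of alphabet size on sequence space can be made concrete with a short calculation. The sketch below counts the sequences possible for a given number of randomized positions and asks how many positions could be sampled completely by a library of a given size. The library size of 10⁹ is an arbitrary illustrative figure, and the calculation ignores codon degeneracy and the oversampling needed for full coverage in practice.

```python
def sequence_space(alphabet_size: int, positions: int) -> int:
    """Number of distinct sequences: alphabet_size ** positions."""
    if alphabet_size < 1 or positions < 0:
        raise ValueError("alphabet_size must be >= 1 and positions >= 0")
    return alphabet_size ** positions


def max_fully_sampled_positions(alphabet_size: int, library_size: int) -> int:
    """Largest number of randomized positions whose complete sequence
    space does not exceed `library_size`."""
    if alphabet_size < 2:
        raise ValueError("alphabet_size must be at least 2")
    n = 0
    while sequence_space(alphabet_size, n + 1) <= library_size:
        n += 1
    return n


if __name__ == "__main__":
    library = 10**9  # ASSUMPTION: illustrative library size, not from the text
    for k in (20, 8, 4):
        n = max_fully_sampled_positions(k, library)
        print(f"{k:>2} amino acids: {n} positions ({sequence_space(k, n):,} sequences)")
```

Under these assumptions, a library of 10⁹ members could completely cover 6 positions with 20 amino acids, 9 positions with eight, and 14 positions with four.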
In addition to the packing of secondary structural
elements, protein folding also requires the polypeptide
backbone to turn back on itself. The requirements for the
formation of an interhelical turn were examined by selecting
active sequences from libraries of E. coli CM (EcCM)
variants [24]. The solvent-exposed turn between helices 2
and 3 is composed of three amino acids: Ala65, His66 and
His67 (Fig. 4). When these three residues were simulta-
neously changed to a random distribution of the 20
standard amino acids, almost two-thirds of the resulting
tripeptide sequences were functional. When Lys64, the
solvent-exposed C-terminal residue of helix 2, was included
in the randomization, the fraction of functional sequences
dropped to 50%, but all four residues showed similar, high
tolerances to substitution. Despite this high permissiveness,
and in contrast to a previous study on the sequence
requirements for a turn in cytochrome b₅₆₂ [25], a close
examination of the sequence data showed a subtle, but
strong, bias for hydrophilic amino acids in these positions.
This bias may have gone undetected in cytochrome b₅₆₂
because that study, which found a similar low stringency for
an interhelical turn sequence, relied on an assay for structure
that was probably less sensitive than functional selection.
Hydrophobic residues placed at these solvent-exposed positions
gain a thermodynamic benefit from minimizing their water-accessible
surface area, which may lead to aggregation
or to local conformational disruptions.
Fig. 4. The turn between helices 2 and 3 in E. coli chorismate mutase.
Random mutagenesis of Lys64, Ala65, His66 and His67 followed by
selection for chorismate mutase activity showed that these solvent
exposed positions are highly permissive. In contrast, a similar experi-
ment including Leu68, which is buried, instead of Lys64 produced
far fewer complementing sequences. Apparently, tertiary contacts
necessitate a hydrophobic amino acid at position 68, preferably one
with a branched aliphatic side chain [24].
A markedly different result was obtained when Leu68,
a buried loop residue (Fig. 4), was randomized in tandem
with the three turn residues [24]. This library had a
complementation rate of about 6%, a 10-fold drop from
the library in which the three turn residues were randomized
alone. This drop was largely attributable to the absolute
requirement for a hydrophobic residue at position 68.
Furthermore, these successful clones exhibited a marked
preference for branched, aliphatic residues at this position.
This bias further highlights the functional importance of
proper tertiary packing in the hydrophobic core of proteins.
The secondary structural context may be relatively unim-
portant, but the tertiary structural context and the pattern-
ing of polar and nonpolar residues can greatly restrict
allowable turn sequences.
Engineering a new turn sequence into a protein structure
presents a greater challenge than simply changing the
sequence of a pre-existing turn. AroQ CMs have composite
active sites, consisting of residues from helix 1 of one
monomer and residues from helices 2 and 3 of the other
monomer. It has been proposed that such domain-swapped
dimers might have evolved from active, monomeric precur-
sors [26]. By inserting a 180° turn into the middle of helix 1
of the thermostable MjCM, it was possible to form an active
site with residues from a single polypeptide chain, and thus
to perform domain swapping in reverse [27].
Like the (proposed) natural domain-swapping evolution-
ary process, domain unswapping relied on selection for
catalytic activity. In this case, two amino acids, Lys20 and
Leu21, were duplicated and a random sequence of six
residues was introduced between them. Introduction of this
library into the CM selection system followed by screening of
the positives using size-exclusion chromatography allowed
the identification of a monomeric variant of MjCM that
retained nearly 30% of the wild type activity (Fig. 5) [27].
Statistical analysis indicated that < 0.05% of the sequences
produced well-behaved monomers, a surprisingly small
fraction given the broad sequence tolerance of interhelical
turn sequences noted above. The tertiary structural context
may place imposing constraints on this turn sequence.
Genetic selection has proved useful in generating other
changes in CM quaternary structure. In a similar strategy to
that described above, a randomized sequence of four to
seven residues was inserted between Ala23 and Leu24 in the
N-terminal helix of the mesostable EcCM (Fig. 5) [28].
Selection of these libraries showed that functional turn
sequences were again rare, giving complementation rates of
< 0.5% in all cases.
While EcCM variants with four or seven amino acid
insertions gave unstable monomers that were prone to
precipitation, a five amino acid insertion surprisingly gave
a stable, well-behaved hexamer [28]. The sequence of the
insertion was nonpolar, suggesting that oligomerization
through hydrophobic interactions may be an easy way to
increase enzyme stability. This hexameric variant, however,
suffered a 200-fold decrease in catalytic efficiency. In
contrast, the unstable monomeric variants had near wild
type activity. Over the limited area of sequence space
covered by these libraries, there may be a trade-off between
protein stability and catalytic activity.
The AroQ CMs can retain function despite large changes
in sequence and structure. The studies described above have
helped both to estimate the tolerance of protein structural
elements towards substitution and to identify structural
constraints, such as packing interactions and polar/non-
polar patterning, on functional sequences. Genetic selection
of CM libraries has been an invaluable tool in the
engineering of drastically restructured variants.
Selection for altered active sites
Genetic selection can also be extremely useful for studying
structure–function relationships in enzymes. The simulta-
neous in vivo analysis of variants randomized at one or
several positions allows for a more thorough analysis of
important functional residues than the traditional one-at-a-
time approach of site-directed mutagenesis, protein purifi-
cation and in vitro kinetic analysis. The development of such
structure–function relationships in enzyme active sites is
particularly useful for examining the important and often
overlooked roles played by the multiple, subtle interactions
between active-site residues.
The rearrangement of chorismate to prephenate is
arguably one of the simplest enzyme-catalyzed reactions.
Like its uncatalyzed counterpart, this pericyclic reaction
utilizes a concerted, but asynchronous, transition state, with
C-O bond breakage preceding C-C bond formation [29–31].
Yet, the catalytic mechanism of CMs remains controversial.
Specifically, a topic of current debate is whether transition-
state stabilization by electrostatic interactions [32–34] or the
preferential binding of reactive ground-state conformers
[35–37] is of greater importance. Selection experiments
provide persuasive evidence for the importance of electro-
static interactions in catalysis by Bacillus subtilis CM
(BsCM).
BsCM is a member of the AroH class of CMs. This class
adopts a trimeric, pseudo-α/β barrel fold [38] (Fig. 6). AroH
and AroQ CMs share some common active site features.
For example, in the crystal structures with an oxabicyclic
transition state analog (TSA), both enzymes show multiple
cationic groups (Arg and Lys) interacting with the carb-
oxylates and the ether oxygen [16,38]. Additionally, both
Fig. 5. Topological rearrangement of dimeric AroQ chorismate mutase
into a monomer. Insertion of a flexible loop into helix 1, which spans
the dimer, allows the N-terminal portion of the helix to bend back on
itself and thus form a complete active site within a monomeric four-
helix bundle. The insertion site is indicated by a horizontal red line.
possess a Glu residue that hydrogen bonds to the hydroxyl
group of the TSA. Despite their different folds, both
enzymes are likely to utilize similar catalytic mechanisms.
The transition state for the chorismate mutase reaction is
highly polarized [39]. In the structure of BsCM complexed
with TSA [38], Arg90 seems poised to stabilize developing
negative charge during the C-O bond cleavage (Fig. 7).
An R90A variant exhibits a more than 10⁶-fold decrease in
activity [40].
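To put such a rate effect on an energy scale, transition state theory relates a ratio of rate constants to a difference in activation free energy, ΔΔG‡ = RT ln(k_wt/k_mut). The sketch below applies this relation. It assumes that the reported activity loss reflects a ratio of rate constants and uses 298.15 K, a temperature chosen for illustration rather than taken from the cited work.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def ddg_from_fold_decrease(fold_decrease: float,
                           temperature_k: float = 298.15) -> float:
    """Apparent loss of transition-state stabilization (kcal/mol) that
    corresponds to a given fold decrease in rate: RT * ln(fold)."""
    if fold_decrease <= 0:
        raise ValueError("fold_decrease must be positive")
    if temperature_k <= 0:
        raise ValueError("temperature_k must be positive")
    return R_KCAL * temperature_k * math.log(fold_decrease)


if __name__ == "__main__":
    # Fold decreases of the magnitudes quoted in the text for active site variants.
    for fold in (1e4, 1e6):
        print(f"{fold:.0e}-fold: {ddg_from_fold_decrease(fold):.1f} kcal/mol")
```

With these assumptions, a 10⁶-fold decrease corresponds to roughly 8 kcal/mol of lost transition-state stabilization.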
To further assess the role of this residue in catalysis,
libraries were constructed in which both Arg90 alone and
Arg90 and Cys88 together were randomized [10]. Selection
revealed that, when the rest of the protein sequence is held
constant, no other residue at position 90 is able to
successfully replace Arg in vivo. In contrast, simultaneous
substitution of positions 88 and 90 produces some alternat-
ive, selectable solutions. In particular, a Lys was able to
replace Arg90 if a residue smaller than Cys was present at
position 88, even with the conservative change of Cys to Ser.
Remarkably, it is also possible for a Lys at position 88 to
substitute for Arg90, provided a Gly, Ser, Leu or Met is
present at position 90. The selection of variants with
rearranged active sites shows that, while rare, alternative
active site structures capable of efficient catalysis within a
given enzyme fold are experimentally accessible. Crystal
structures of the R90K/C88S and R90S/C88K variants
reveal the small but significant structural rearrangements
within the active site caused by these mutations that
probably allow the introduced ammonium group to interact
with the developing negative charge on the ether oxygen in
the transition state [41]. Apparently, subtle packing inter-
actions are crucial for proper active site structure, and
(similar to the requirements for proper protein folding
discussed above) the local structural context imposes strict
criteria for efficient function.
During C-O bond breaking, a positive charge develops
within the cyclohexadiene ring of chorismate. Although not
as obvious as the interaction of Arg90 with the oxyanion in
the transition state, the BsCM structure suggests that Glu78
could be important for carbocation stabilization in the
transition state [38]. Glu78 is certainly important for
catalysis; the E78A variant of BsCM is 10⁴-fold less active
than wild type [40]. To examine its role in catalysis, Glu78
was randomized alone and together with Cys75 [42]. Unlike
the strict requirement for Arg90, several other residues were
able to directly replace Glu78, although the selection
produced a bias for residues capable of hydrogen bonding.
Interestingly, Asp was unable to substitute for Glu78,
providing a further indication of the subtle interactions that
dictate active site structure and function. When positions 75
and 78 were varied in tandem, however, an Asp at position
75 was able to substitute for Glu78, provided Ala, Ser, Met
or Val was present at position 78. As functional solutions
lacking an anion were found, the interaction of Glu78 with
the transition state carbocation is not clear-cut. The crucial
role of Glu78 may be to orient the substrate through a
hydrogen bond with the hydroxyl group of chorismate
[42a,42b].
Enzyme catalysis is a dynamic process. Yet, the import-
ance of highly mobile, crystallographically unresolved
residues is often overlooked. At the C-terminus of BsCM,
residues 111–115 adopt a 3₁₀ helix and the following
11 residues have poorly defined structure (Fig. 6). This
C-terminal tail lies close to the entrance of the substrate
binding pocket and therefore may be important for
catalysis. In the absence of structural information, however,
it is difficult to postulate functional roles for individual
residues. To help provide a functional definition for these
residues, libraries of BsCM variants were constructed using
a random protein truncation mutagenesis strategy and these
libraries were subjected to selection [43]. Individually, none
of the original 17 C-terminal residues are essential for
complementation. Moreover, a truncated variant lacking
the last 11 residues is still active in vivo, despite a 250-fold
Fig. 6. Structure of Bacillus subtilis chorismate mutase. The mono-
functional chorismate mutase from B. subtilis is a homotrimer and
adopts a pseudo-α/β barrel fold [38]. A transition state analog, shown
in a ball-and-stick representation, is bound in each of the active sites,
which are located at the trimer interfaces. The locations of the
crystallographically unresolved residues at the C-termini are indicated by
dashed lines.
Fig. 7. Important interactions in the B. subtilis chorismate mutase active
site. Electrostatic interactions are used to bind the transition state
analog in the active site of B. subtilis chorismate mutase. The guanidinium
group of Arg90 is poised to stabilize the developing oxyanion.
Glu78 is positioned to hydrogen bond with the substrate hydroxyl
group, and may also stabilize the developing carbocation in the
cyclohexadiene ring.
decrease in catalytic efficiency relative to the wild type. The
3₁₀ helix (residues 111–115) is permissive but shows a
modest preference for the wild type residues. In particular,
Ala112 and Leu115 are the most highly conserved residues.
These residues pack against the hydrophobic interior of
BsCM and so are probably more important for structural
stability than catalytic activity. The selected enzyme variants
all showed little change in kcat, but significant increases in
Km, which precludes their direct participation in catalysis.
Instead, these residues probably contribute to catalytic
efficiency via uniform binding of the substrate and trans-
ition state.
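The distinction between effects on kcat and on Km can be illustrated numerically. In the sketch below, all kinetic constants are hypothetical placeholders (they are not measured values for BsCM or its variants). The point is only that a variant with unchanged kcat and a raised Km loses catalytic efficiency, kcat/Km, in direct proportion to the increase in Km.

```python
def catalytic_efficiency(kcat: float, km: float) -> float:
    """Return kcat/Km (M^-1 s^-1 when kcat is in s^-1 and Km in M)."""
    if kcat <= 0 or km <= 0:
        raise ValueError("kcat and Km must both be positive")
    return kcat / km


def fold_loss(wild_type, variant) -> float:
    """Fold decrease in kcat/Km of a variant relative to wild type.
    Each argument is a (kcat, Km) tuple."""
    return catalytic_efficiency(*wild_type) / catalytic_efficiency(*variant)


if __name__ == "__main__":
    # HYPOTHETICAL constants for illustration only: (kcat in s^-1, Km in M).
    wild_type = (50.0, 100e-6)
    variant = (50.0, 2500e-6)  # same kcat, 25-fold higher Km
    print(f"fold loss in kcat/Km: {fold_loss(wild_type, variant):.1f}")
```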
The versatility of functional selection
So far, we have focused on genetic selection of CMs. Other
selection systems have also proved useful for investigating
enzyme structure and mechanism, and have been recently
reviewed elsewhere [8]. Expanding the lessons learned with
CMs, a few recent examples of selections with eight-stranded
β/α-barrel [or triose phosphate isomerase (TIM)
barrel] enzymes have examined the structural requirements
for this fold and the active site differences that separate
members of an enzyme superfamily.
The TIM barrel is the most frequently encountered
enzyme fold [44], and its natural catalytic versatility demon-
strates its enormous potential for enzyme engineering. The
robustness of triose phosphate isomerase (the prototypical
TIM barrel enzyme) to substitutions was examined by
combinatorial mutagenesis and selection for activity using a
TIM-deficient strain of E. coli [45]. In this experiment, 182
residues outside of the TIM active site were mutated to one of
seven amino acids (using binary polar/nonpolar patterning
similar to that described above for MjCM) and introduced
into the selection system. Analysis of complementing
sequences shows that, while most individual sequence
positions were tolerant to substitution by at least one
member of the restricted amino acid set, only about one in
10¹⁰ sequences randomized over the full length of the protein
should be able to complement in vivo. Structural elements
such as the α/β interface, loops connecting secondary
structures and α-helix caps were found to be permissive. In
contrast, β-strand stop signals (particularly Gly), the central
core of the barrel and a buried salt bridge were highly
conserved. These results provide a more detailed view of how
TIM barrel enzymes decouple catalytic activity and struc-
tural stability [46] and should facilitate the de novo design
of novel TIM barrel proteins [47].
Selections with another TIM barrel enzyme have been
used to evaluate the plasticity of enzyme active sites.
Variants of muconate lactonizing enzyme II (MLE II) with
ortho-succinylbenzoate synthase (OSBS) activity have been identified
using random mutagenesis and genetic selection [48]. This
selection system utilizes a mutant strain of E. coli that
requires an added source of OSBS activity for anaerobic
growth. OSBS and MLE II catalyze different overall
reactions, but both catalytic mechanisms begin with the
formation of an enolate intermediate (Fig. 8). Wild type
MLE II, however, lacks detectable OSBS activity despite
24% sequence identity with E. coli OSBS and a similar TIM
barrel fold. Three MLE II variants, each containing an
E223G mutation, were identified from the selection experiment.
Indeed, this single mutation alone is sufficient to allow
complementation of the mutant strain, despite a 10³-fold
lower catalytic efficiency compared with E. coli OSBS.
Interestingly, this variant retains residual activity for the
MLE II reaction and may therefore resemble a catalytically
promiscuous intermediate of a natural divergent evolution-
ary process.
Both MLE II and OSBS are members of the TIM barrel-
containing enolase superfamily, and therefore both enzymes
catalyze a common chemical step during catalysis of their
respective reactions (Fig. 8) [49]. The gain of function seen
for the MLE II variants, which can be considered as an
extreme case of changing substrate specificity, still represents
the first successful interconversion of catalytic activities
within the well-characterized enolase superfamily. This
result extends prior work that used random mutagenesis
and selection to change substrate specificity without chan-
ging the overall reaction [50].
A rationally designed variant of L-Ala-D/L-Glu epimerase
(a third member of the enolase superfamily, Fig. 8),
containing a mutation (D297G) analogous to that of the
E223G MLE II, also exhibited measurable OSBS activity,
albeit 100-fold lower than that of the selected MLE II
variant [48]. The generality of this single mutation in
conferring OSBS activity on enolases shows the potential
utility of selection experiments in aiding rational design of
enzymes. Apparently, the active sites of enolases may
require only minor restructuring to accommodate the
substrates of other superfamily members.
As an alternative to metabolic requirements, in vivo
selection systems can also be designed that couple enzyme
Fig. 8. Reactions catalyzed by enolase superfamily members. Enolase
superfamily members catalyze different overall reactions using a
common mechanistic step: formation of an enolate intermediate. Some
examples of the different reactions catalyzed by this superfamily are
shown: (A) muconate lactonizing enzyme II, (B) ortho-succinylbenzoate
synthase and (C) L-Ala-D/L-Glu epimerase. The enolate intermediate
for each reaction is enclosed in brackets.
function to antibiotic resistance. In one interesting system, a
rationally designed DNA polymerase was used for the introduction
in vivo of mutations in the TEM-1 β-lactamase gene
at a desired frequency [51]. The cells containing this
engineered DNA polymerase and mutated TEM-1 β-lactamase
were grown in the presence of the antibiotic
aztreonam. This system combines library generation and
selection for enzyme activity (in this case, antibiotic
detoxification) into one step. In this selection, three different
mutations were identified that led to a 150-fold increase in
aztreonam resistance. Two of these mutations matched
those found in clinical isolates. Such rapid laboratory
evolution may be useful to better anticipate the natural
evolution of bacterial antibiotic resistance. Hydrolysis of
aztreonam requires a change in the substrate specificity of
TEM-1 β-lactamase. The chance of finding the three
mutations that effected this change was estimated at one
in 10¹⁰. Genetic selection was used to beat these odds and
find a functional active site with altered structure.
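To put an estimate of one in 10¹⁰ in perspective, the following back-of-the-envelope sketch (not part of the cited study) estimates how many independent clones would have to be examined individually to have a good chance of encountering such a rare variant. It assumes that each clone independently carries the desired combination of mutations with probability f, so that the chance of at least one hit among N clones is 1 − (1 − f)^N. The function names and the 95% confidence target are illustrative assumptions.

```python
import math


def clones_needed(hit_frequency: float, confidence: float = 0.95) -> int:
    """Smallest N such that 1 - (1 - f)**N >= confidence.

    Assumes every clone is an independent draw that carries the desired
    variant with probability `hit_frequency` (a simplifying assumption).
    """
    if not (0.0 < hit_frequency < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("hit_frequency and confidence must lie in (0, 1)")
    # log1p(-f) stays accurate for very small f, unlike log(1 - f).
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-hit_frequency))


def hit_probability(hit_frequency: float, n_clones: int) -> float:
    """Probability of at least one hit among n_clones independent clones."""
    # expm1 avoids cancellation when the probability is close to 0.
    return -math.expm1(n_clones * math.log1p(-hit_frequency))


if __name__ == "__main__":
    n = clones_needed(1e-10)
    print(f"{n:.2e} clones for a 95% chance of at least one hit")
```

Under these assumptions the answer is roughly 3 × 10¹⁰ clones. This illustrates why a growth-based selection, in which only resistant cells survive, is better suited to finding such rare solutions than clone-by-clone examination.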
The power of selection is undeniable. Depending on the
application, however, screening can also be useful for
identifying enzymes with novel activities. High-throughput
technology is increasing the size of libraries that can be
thoroughly screened, providing an ever more appealing
alternative to selection [52,53].
Conclusions and outlook
Enzyme function is the product of multiple, subtle inter-
actions within the protein structure. Functional selection of
randomized libraries provides a general, sensitive and
efficient probe of these interactions. The use of selection
techniques with CMs has allowed us to explore the limits of
structure and function for these enzymes.
Although we have learned much from the studies
described above, questions remain. CM, TIM and OSBS
all catalyze reactions with fairly high background rates [2].
Would the complementation rates found in the studies with
these enzymes be lower if the uncatalyzed reactions were
energetically more demanding [54]? Are many different
protein folds capable of catalyzing a given chemical
reaction? Conversely, what is the potential for catalytic
diversity within a given protein fold? The studies described
above have shown that it is feasible to change the
arrangement of functional groups within an active site.
Can we harness divergent evolution to endow an existing
enzyme scaffold with a completely new activity, changing
both substrate specificity and chemical mechanism? Binary
patterning [17] and restricted amino acid sets [18,22] can
produce proteins capable of folding. What is the smallest
set of amino acids that can still produce a catalytically
competent protein? How can the knowledge gained from
these studies be applied to the computational design of
proteins with novel activities [55]? These questions are
important for enzyme engineering. Genetic selection will
probably play a key role in finding the answers.
Acknowledgements
This paper is dedicated to Professor Duilio Arigoni on the occasion of
his 75th birthday. We thank Katherina Vamvaca for helpful comments
on the manuscript. Financial support for the work on chorismate
mutases was provided by the ETH-Zürich and the Schweizer
Nationalfonds.
References
1. Knowles, J.R. & Albery, W.J. (1977) Perfection in enzyme
catalysis – energetics of triose phosphate isomerase. Acc. Chem.
Res. 10, 105–111.
2. Wolfenden, R. & Snider, M.J. (2001) The depth of chemical time
and the power of enzymes as catalysts. Acc. Chem. Res. 34,
938–945.
3. Walsh, C. (2001) Enabling the chemistry of life. Nature 409,
226–231.
4. Koeller, K.M. & Wong, C.-H. (2001) Enzymes for chemical
synthesis. Nature 409, 232–240.
5. Schmid, A., Dordick, J.S., Hauer, B., Kiener, A., Wubbolts, M.
& Witholt, B. (2001) Industrial biocatalysts for today and
tomorrow. Nature 409, 258–268.
6. DeGrado, W.F., Summa, C.M., Pavone, V., Nastri, F. & Lom-
bardi, A. (1999) De novo design and structural characterization of
proteins and metalloproteins. Annu. Rev. Biochem. 68, 779–819.
7. Arnold, F.H. (1998) Design by directed evolution. Acc. Chem.
Res. 31, 125–131.
8. Taylor, S.V., Kast, P. & Hilvert, D. (2001) Investigating and
engineering enzymes by genetic selection. Angew. Chem. Int. Ed.
40, 3311–3335.
9. Kast, P. & Hilvert, D. (1997) 3D structural information as a guide
to protein engineering using genetic selection. Curr. Opin. Struct.
Biol. 7, 470–479.
10. Kast, P., Asif-Ullah, M., Jiang, N. & Hilvert, D. (1996) Exploring
the active site of chorismate mutase by combinatorial mutagenesis
and selection: the importance of electrostatic catalysis. Proc.
Natl Acad. Sci. USA 93, 5043–5048.
11. Haslam, E. (1993) Shikimic Acid: Metabolism and Metabolites.
Wiley, New York.
12. Anfinsen, C.B. (1973) Principles that govern the folding of pro-
tein chains. Science 181, 223–230.
13. Bowie, J.U., Reidhaar-Olson, J.F., Lim, W.A. & Sauer, R.T.
(1990) Deciphering the message in protein sequences: tolerance to
amino acid substitutions. Science 247, 1306–1310.
14. Macbeath, G., Kast, P. & Hilvert, D. (1998) A small, thermo-
stable, and monofunctional chorismate mutase from the archeon
Methanococcus jannaschii. Biochemistry 37, 10062–10073.
15. Taylor, S.V., Walter, K.U., Kast, P. & Hilvert, D. (2001)
Searching sequence space for protein catalysts. Proc. Natl Acad.
Sci. USA 98, 10596–10601.
16. Lee, A.Y., Karplus, P.A., Ganem, B. & Clardy, J. (1995) Atomic
structure of the buried catalytic pocket of Escherichia coli chor-
ismate mutase. J. Am. Chem. Soc. 117, 3627–3628.
17. Kamtekar, S., Schiffer, J.M., Xiong, H., Babik, J.M. & Hecht,
M.H. (1993) Protein design by binary patterning of polar and
nonpolar amino acids. Science 262, 1680–1685.
18. Davidson, A.R. & Sauer, R.T. (1994) Folded proteins occur
frequently in libraries of random amino acid sequences. Proc.
Natl Acad. Sci. USA 91, 2146–2150.
19. Lim, W.A. & Sauer, R.T. (1989) Alternative packing arrange-
ments in the hydrophobic core of λ repressor. Nature 339, 31–36.
20. Fersht, A.R., Matouschek, A. & Serrano, L. (1992) The folding
of an enzyme. I. Theory of protein engineering analysis of sta-
bility and pathway of protein folding. J. Mol. Biol. 224, 771–782.
21. Clackson, T. & Wells, J.A. (1995) A hot spot of binding energy in
a hormone–receptor interface. Science 267, 383–386.
22. Riddle, D.S., Santiago, J.V., Bray-Hall, S.T., Doshi, N., Grant-
charova, V.P., Yi, Q. & Baker, D. (1997) Functional rapidly
folding proteins from simplified amino acid sequences. Nat.
Struct. Biol. 4, 805–809.
23. Chan, H.S. (1999) Folding alphabets. Nat. Struct. Biol. 6, 994–996.
24. Macbeath, G., Kast, P. & Hilvert, D. (1998) Exploring sequence
constraints on an interhelical turn using in vivo selection for
catalytic activity. Protein Sci. 7, 325–335.
25. Brunet, A.P., Huang, E.S., Huffine, M.E., Loeb, J.E., Weltman,
R.J. & Hecht, M.H. (1993) The role of turns in the structure of an
α-helical protein. Nature 364, 355–358.
26. Bennett, M.J., Schlunegger, M.P. & Eisenberg, D. (1995) 3D
domain swapping: a mechanism for oligomer assembly. Protein
Sci. 4, 2455–2468.
27. Macbeath, G., Kast, P. & Hilvert, D. (1998) Redesigning enzyme
topology by directed evolution. Science 279, 1958–1961.
28. Macbeath, G., Kast, P. & Hilvert, D. (1998) Probing enzyme
quaternary structure by mutagenesis and selection. Protein Sci. 7,
1757–1767.
29. Addadi, L., Jaffe, E.K. & Knowles, J.R. (1983) Secondary tritium
isotope effects as probes of the enzymic and nonenzymic
conversion of chorismate to prephenate. Biochemistry 22, 4494–
4501.
30. Copley, S.D. & Knowles, J.R. (1985) The uncatalyzed Claisen
rearrangement of chorismate to prephenate prefers a transition
state of chairlike geometry. J. Am. Chem. Soc. 107, 5306–5308.
31. Guilford, W.J., Copley, S.D. & Knowles, J.R. (1987) On the
mechanism of the chorismate mutase reaction. J. Am. Chem. Soc.
109, 5013–5019.
32. Lyne, P.D., Mulholland, A.J. & Richards, W.G. (1995) Insights
into chorismate mutase catalysis from a combined QM/MM
simulation of the enzyme reaction. J. Am. Chem. Soc. 117, 11345–
11350.
33. Kienhöfer, A., Kast, P. & Hilvert, D. (2003) Selective stabilization
of the chorismate mutase transition state by a positively
charged hydrogen bond donor. J. Am. Chem. Soc. 125, 3206–
3207.
34. Strajbl, M., Shurki, A., Kato, M. & Warshel, A. (2003) Apparent
NAC effect in chorismate mutase reflects electrostatic transition
state stabilization. J. Am. Chem. Soc. 125, 10228–10237.
35. Khanjin, N.A., Snyder, J.P. & Menger, F.M. (1999) Mechanism
of chorismate mutase: contribution of conformational restriction
to catalysis in the Claisen rearrangement. J. Am. Chem. Soc. 121,
11831–11846.
36. Guo, H., Cui, Q., Lipscomb, W.N. & Karplus, P.A. (2003)
Understanding the role of active-site residues in chorismate
mutase catalysis from molecular dynamics simulations. Angew.
Chem. Int. Ed. 42, 1508–1511.
37. Hur, S. & Bruice, T.C. (2003) Just a near attack conformer for
catalysis (chorismate to prephenate rearrangements in water,
antibody, enzymes, and their mutants). J. Am. Chem. Soc. 125,
10540–10542.
38. Chook, Y.M., Ke, H. & Lipscomb, W.N. (1993) Crystal struc-
tures of the monofunctional chorismate mutase from Bacillus
subtilis and its complex with a transition state analog. Proc. Natl
Acad. Sci. USA 90, 8600–8603.
39. Gustin, D.J., Mattei, P., Kast, P., Wiest, O., Lee, L., Cleland,
W.W. & Hilvert, D. (1999) Heavy atom isotope effects reveal a
highly polarized transition state for chorismate mutase. J. Am.
Chem. Soc. 121, 1756–1757.
40. Cload, S.T., Liu, D.R., Pastor, R.M. & Schultz, P.G. (1996)
Mutagenesis study of active site residues in chorismate mutase
from Bacillus subtilis. J. Am. Chem. Soc. 118, 1787–1788.
41. Kast, P., Grisostomi, C., Chen, I.A., Li, S., Krengel, U., Xue, Y.
& Hilvert, D. (2000) A strategically positioned cation is crucial
for efficient catalysis by chorismate mutase. J. Biol. Chem. 275,
36832–36838.
42. Kast, P., Hartgerink, M., Asif-Ullah, M. & Hilvert, D. (1996)
Electrostatic catalysis of the Claisen rearrangement: probing the
role of Glu78 in Bacillus subtilis chorismate mutase by genetic
selection. J. Am. Chem. Soc. 118, 3069–3070.
42a. Worthington, S.E., Roitberg, A.E. & Krauss, M. (2001) An
MD/QM study of the chorismate mutase-catalyzed Claisen
rearrangement reaction. J. Phys. Chem. B 105, 7087–7095.
42b. Ranaghan, K.E., Ridder, L., Szefczyk, B., Sokalski, W.A.,
Hermann, J.C. & Mulholland, A.J. (2004) Transition state sta-
bilization and substrate strain in enzyme catalysis: ab initio
QM/MM modelling of the chorismate mutase reaction. Org.
Biomol. Chem. 2, 968–980.
43. Gamper, M., Hilvert, D. & Kast, P. (2000) Probing the role of the
C-terminus of Bacillus subtilis chorismate mutase by a novel
random protein termination strategy. Biochemistry 39, 14087–
14094.
44. Reardon, D. & Farber, G.K. (1995) The structure and evolution
of α/β barrel proteins. FASEB J. 9, 497–503.
45. Silverman, J.A., Balakrishnan, R. & Harbury, P.B. (2001)
Reverse engineering the (β/α)₈ barrel fold. Proc. Natl Acad. Sci.
USA 98, 3092–3097.
46. Höcker, B., Jürgens, C., Wilmanns, M. & Sterner, R. (2001)
Stability, catalytic versatility and evolution of the (β/α)₈-barrel
fold. Curr. Opin. Biotechnol. 12, 376–381.
47. Offredi, F., Dubail, F., Kischel, P., Sarinski, K., Stern, A.S., Van
de Weerdt, C., Hoch, J.C., Prosperi, C., François, J.M., Mayo,
S.L. & Martial, J.A. (2003) De novo backbone and sequence
design of an idealized α/β-barrel protein: evidence of stable
tertiary structure. J. Mol. Biol. 325, 163–174.
48. Schmidt, D.M.Z., Mundorff, E.C., Dojka, M., Bermudez, E.,
Ness, J.E., Govindarajan, S., Babbitt, P.C., Minshull, J. & Gerlt,
J.A. (2003) Evolutionary potential of (β/α)₈-barrels: functional
promiscuity produced by single substitutions in the enolase
superfamily. Biochemistry 42, 8387–8393.
49. Babbitt, P.C. & Gerlt, J.A. (1997) Understanding enzyme
superfamilies. J. Biol. Chem. 272, 30591–30594.
50. Jürgens, C., Strom, A., Wegener, D., Hettwer, S., Wilmanns, M.
& Sterner, R. (2000) Directed evolution of a (βα)₈-barrel enzyme
to catalyze related reactions in two different metabolic pathways.
Proc. Natl Acad. Sci. USA 97, 9925–9930.
51. Camps, M., Naukkarinen, J., Johnson, B.P. & Loeb, L.A. (2003)
Targeted gene evolution in Escherichia coli using a highly
error-prone DNA polymerase I. Proc. Natl Acad. Sci. USA 100,
9727–9732.
52. Olsen, M., Iverson, B. & Georgiou, G. (2000) High-throughput
screening of enzyme libraries. Curr. Opin. Biotechnol. 11, 331–
337.
53. Arnold, F.H. (2001) Combinatorial and computational chal-
lenges for biocatalyst design. Nature 409, 253–257.
54. Taylor, E.A., Palmer, D.R.J. & Gerlt, J.A. (2001) The lesser
‘burden borne’ by o-succinylbenzoate synthase: an ‘easy’ reaction
involving a carboxylate carbon acid. J. Am. Chem. Soc. 123,
5824–5825.
55. Looger, L.L., Dwyer, M.A., Smith, J.J. & Hellinga, H.W. (2003)
Computational design of receptor and sensor proteins with novel
functions. Nature 423, 185–190.