
Origin and evolution of viruses e domingo (AP, 1999)


ORIGIN AND EVOLUTION
OF VIRUSES



Edited by

ESTEBAN DOMINGO
Centro de Biología Molecular
"Severo Ochoa"
Universidad Autónoma de Madrid
28049 Madrid
Spain

ROBERT WEBSTER
St Jude's Children's Research Hospital
Memphis, TN 38105-2794
USA

JOHN HOLLAND
University of California
San Diego, CA 92093-0116
USA

ACADEMIC PRESS
San Diego London Boston New York Sydney Tokyo Toronto


This book is printed on acid-free paper.
Copyright © 1999 by ACADEMIC PRESS
All Rights Reserved.
No part of this publication may be reproduced or transmitted in any form or by any
means, electronic or mechanical, including photocopying, recording, or any
information storage and retrieval system, without permission in writing from the
publisher.
Academic Press
24-28 Oval Road, London NW1 7DX, UK
http://www.hbuk.co.uk/ap/
Academic Press
a division of Harcourt Brace & Company
525 B Street, Suite 1900, San Diego, California 92101-4495, USA

ISBN 0-12-220360-7
A catalogue for this book is available from the British Library
Library of Congress Catalog Card Number: 99-62165

Typeset by Phoenix Photosetting, Chatham, Kent
Printed in Great Britain by The Bath Press, Bath
99 00 01 02 03 04 BP9 8 7 6 5 4 3 2 1



Contents

List of Contributors
Preface

1. Nature and Evolution of Early Replicons
Peter Schuster and Peter F. Stadler

2. Virus Origins: Conjoined RNA Genomes as Precursors to DNA Genomes
Hugh D. Robertson and Olivia D. Neel

3. Viroids in Plants: Shadows and Footprints of a Primitive RNA
J. S. Semancik and N. Duran-Vila

4. Mutation, Competition and Selection as Measured with Small RNA Molecules
Christof K. Biebricher

5. The Fidelity of Cellular and Viral Polymerases and its Manipulation for Hypermutagenesis
Andreas Meyerhans and Jean-Pierre Vartanian

6. Drift and Conservatism in RNA Virus Evolution: Are They Adapting or Merely Changing?
Monica Sala and Simon Wain-Hobson

7. Viral Quasispecies and Fitness Variations
Esteban Domingo, Cristina Escarmís, Luis Menéndez-Arias and John J. Holland

8. The Retroid Agents: Disease, Function and Evolution
Marcella A. McClure

9. Dynamics of HIV Pathogenesis and Treatment
Dominik Wodarz and Martin A. Nowak

10. Interplay Between Experiment and Theory in Development of a Working Model for HIV-1 Population Dynamics
I. M. Rouzine and J. M. Coffin

11. Plant Virus Evolution: Past, Present and Future
A. J. Gibbs, P. L. Keese, M. J. Gibbs and F. García-Arenal

12. Genetics, Pathogenesis and Evolution of Picornaviruses
Matthias Gromeier, Eckard Wimmer and Alexander E. Gorbalenya

13. The Impact of Rapid Evolution of the Hepatitis Viruses
Juan I. Esteban, Maria Martell, William F. Carman and Jordi Gómez

14. Antigenic Variation in Influenza Viruses
Robert G. Webster

15. DNA Virus Contribution to Host Evolution
Luis P. Villarreal

16. Parvovirus Variation and Evolution
Colin R. Parrish and Uwe Truyen

17. The Molecular Evolutionary History of the Herpesviruses
Duncan J. McGeoch and Andrew J. Davison

18. African Swine Fever Virus: A Missing Link Between Poxviruses and Iridoviruses?
José Salas, María L. Salas and Eladio Viñuela

Index





Contributors

Christof K. Biebricher
Max-Planck-Institut für Biophysikalische Chemie, Karl-Friedrich-Bonhoeffer-Institut, Am Fassberg 11, D-37077 Göttingen, Germany

William F. Carman
Institute of Virology, University of Glasgow, Glasgow G11 5GR, UK

John M. Coffin
Department of Molecular Biology and Microbiology, Tufts University School of Medicine, 136 Harrison Avenue, Boston, MA 02111, USA

Andrew J. Davison
MRC Virology Unit, Church Street, Glasgow G11 5JR, UK

Esteban Domingo
Centro de Biología Molecular 'Severo Ochoa', Consejo Superior de Investigaciones Científicas, Universidad Autónoma de Madrid, 28049 Madrid, Spain

N. Duran-Vila
Instituto Valenciano de Investigaciones Agrarias, Moncada (Valencia), Spain

Cristina Escarmís
Centro de Biología Molecular 'Severo Ochoa', Consejo Superior de Investigaciones Científicas, Universidad Autónoma de Madrid, 28049 Madrid, Spain

Juan I. Esteban
Area d'Investigació Basica, Hospital General Vall d'Hebron, Passeig Vall d'Hebron, 119-129, 08035 Barcelona, Spain

F. García-Arenal
Departamento de Biotecnología, E.T.S.I. Agrónomos, Universidad Politécnica de Madrid, 28040 Madrid, Spain

Adrian J. Gibbs
Research School of Biological Sciences, Australian National University, PO Box 475, Canberra, ACT 2601, Australia

M. J. Gibbs
Research School of Biological Sciences, Australian National University, PO Box 475, Canberra, ACT 2601, Australia

Jordi Gómez
Area d'Investigació Basica, Hospital General Vall d'Hebron, Passeig Vall d'Hebron, 119-129, 08035 Barcelona, Spain

Alexander E. Gorbalenya
Advanced Biomedical Computing Center, 430 Miller Drive, Room 235, SAIC/NCI-FCRDC, PO Box B, Frederick, MD 21702-1201, USA

Matthias Gromeier
Department of Molecular Genetics and Microbiology, School of Medicine, State University of New York at Stony Brook, Stony Brook, NY 11794-5222, USA

John J. Holland
Department of Biology and Center for Molecular Genetics, University of California, San Diego, La Jolla, CA 92093-0116, USA

P. L. Keese
CAMBIA, PO Box 3200 Canberra, ACT 2601, Australia

Marcella A. McClure
Department of Biological Sciences, University of Nevada, 4505 Maryland Parkway, Box 454004, Las Vegas, NV 89145-4004, USA

Duncan J. McGeoch
MRC Virology Unit, Church Street, Glasgow G11 5JR, UK

Maria Martell
Area d'Investigació Basica, Hospital General Vall d'Hebron, Passeig Vall d'Hebron, 119-129, 08035 Barcelona, Spain

Luis Menéndez-Arias
Centro de Biología Molecular 'Severo Ochoa', Consejo Superior de Investigaciones Científicas, Universidad Autónoma de Madrid, 28049 Madrid, Spain

Andreas Meyerhans
Abteilung Virologie, Institut für Medizinische Mikrobiologie und Hygiene, Klinikum Homburg, Universität des Saarlandes, 66421 Homburg/Saar, Germany

Olivia D. Neel
Department of Biochemistry, Weill Medical College of Cornell University, 1300 York Avenue, New York, NY 10021, USA

Martin A. Nowak
Institute for Advanced Study, Olden Lane, Princeton, NJ 08540, USA

Colin R. Parrish
James A. Baker Institute, College of Veterinary Medicine, Cornell University, Ithaca, NY 14853, USA

Hugh D. Robertson
Department of Biochemistry, Weill Medical College of Cornell University, 1300 York Avenue, New York, NY 10021, USA

Igor M. Rouzine
Department of Molecular Biology and Microbiology, Tufts University, 136 Harrison Avenue, Boston, MA 02111, USA

Monica Sala
Unité de Rétrovirologie Moléculaire, Institut Pasteur, 28 rue du Dr Roux, 75724 Paris cedex 15, France

José Salas
Centro de Biología Molecular 'Severo Ochoa', Consejo Superior de Investigaciones Científicas, Universidad Autónoma de Madrid, 28049 Madrid, Spain

María L. Salas
Centro de Biología Molecular 'Severo Ochoa', Consejo Superior de Investigaciones Científicas, Universidad Autónoma de Madrid, 28049 Madrid, Spain

Peter Schuster
Institut für Theoretische Chemie und Molekulare Strukturbiologie, Universität Wien, Währingerstraße 17, A-1090 Vienna, Austria and Santa Fe Institute, Santa Fe, NM 87501, USA

J. S. Semancik
Department of Plant Pathology, University of California, Riverside, CA 92521-0122, USA

Peter F. Stadler
Institut für Theoretische Chemie und Molekulare Strukturbiologie, Universität Wien, Währingerstraße 17, A-1090 Vienna, Austria and Santa Fe Institute, Santa Fe, NM 87501, USA

Uwe Truyen
Institut für Medizinische Mikrobiologie, Ludwig-Maximilians-Universität, Veterinärstr. 13, 80539 Munich, Germany

Jean-Pierre Vartanian
Unité de Rétrovirologie Moléculaire, Institut Pasteur, 28 rue du Dr Roux, 75725 Paris cedex 15, France

Luis P. Villarreal
Irvine Research Unit on Animal Viruses, Department of Molecular Biology and Biochemistry, 3232 Biological Science 2, University of California, Irvine, CA 92697, USA

Eladio Viñuela
Centro de Biología Molecular 'Severo Ochoa', Consejo Superior de Investigaciones Científicas, Universidad Autónoma de Madrid, 28049 Madrid, Spain

Simon Wain-Hobson
Unité de Rétrovirologie Moléculaire, Institut Pasteur, 28 rue du Dr Roux, 75724 Paris cedex 15, France

Robert G. Webster
Department of Virology and Molecular Biology, St Jude Children's Research Hospital, PO Box 318, 332 North Lauderdale, Memphis, TN 38105-2794, USA

Eckard Wimmer
Department of Molecular Genetics and Microbiology, School of Medicine, State University of New York at Stony Brook, 280 Life Sciences Building, Stony Brook, NY 11794-5222, USA

Dominik Wodarz
Institute for Advanced Study, Olden Lane, Princeton, NJ 08540, USA


Preface

Viruses differ greatly in their molecular strategies of adaptation to the organisms they infect.
RNA viruses utilize continuous genetic change
as they explore sequence space to improve their
fitness, and thereby to adapt to the changing
environments of their hosts. Variation is intimately linked to their disease-causing potential.
Paramount to the understanding of RNA viruses is the concept of quasispecies, first developed
to describe the early replicons thought to be
components of a primitive RNA world devoid
of DNA or proteins. The first chapters of the
book deal with theoretical concepts of self-organization, RNA-mediated catalysis and the adaptive exploration of sequence space by RNA
replicons. Likely descendants of the RNA world
that we can study today are the plant-infecting
viroids, and the δ agent (hepatitis D), a unique
RNA genome associated with some cases of
hepatitis B infection. δ provides an example of a
simple, bifunctional molecule that contains a
viroid-like replication domain, and a minimal
protein-coding domain. It may be a relic of the
type of recombinant molecules that may have
participated in the transition to the DNA world
from the RNA world. The impact of genetic
variability of pathogenic RNA viruses is
addressed in several chapters that cover specific
viruses of animals and plants.
Retroid agents probably had an essential role
in early evolution. Not only are they widely distributed and capable of copying RNA into DNA,
but they may also have provided regulatory elements, and promoted genetic modifications for
adaptation of DNA genomes. Among the
retroelements, retroviruses are transmitted as
RNA-containing particles, prior to intracellular
copying of their RNA genomes into DNA,
which can be stably maintained as an insert into
the DNA of their hosts. The book discusses
retroid agents and retroviruses, with emphasis
on human immunodeficiency virus, the most
thoroughly scrutinized retrovirus of all.
Experiments and modeling meet to try to understand how variation and adaptation of this
dreaded pathogen lead to a collapse of the
human immune system.
DNA viruses are likely to have coevolved
with their hosts while the DNA world was
developing. The last chapters of the book deal
with the interplay between host evolution and
DNA virus evolution, including chapters on the
simplest and the most complex of the DNA viral
genomes known. This broad coverage of topics
would not have been possible without the contributions of many experts. We express our most
sincere gratitude to all of these authors for having joined in the effort. The strong interdisciplinary flavor of the book is due to their different
points of view. We expect the book to take the
reader on a long journey (in time and in concepts) from the primitive and basic to the modern and complex.
While this book was in press, Professor
Eladio Viñuela passed away on March 9, 1999.
Eladio was an outstanding scientist, a pioneer of
Virology in Spain, and a friend. The editors dedicate this volume to his memory.

E. Domingo, R.G. Webster, J.J. Holland




CHAPTER 1
Nature and Evolution of Early Replicons
Peter Schuster and Peter F. Stadler

SIMPLE REPLICONS AND THE
ORIGIN OF REPLICATION

Many successful experimental studies have tried to work out plausible chemical scenarios for the origin of early replicons, that is,
molecules capable of replication (Mason, 1991). A sketch of
such a possible sequence of events in prebiotic
evolution is shown in Figure 1.1. Most of the
building blocks of present-day biomolecules are
available from different prebiotic sources, from
extraterrestrial origins as well as from processes
taking place in the primordial atmosphere or
near hot vents in deep oceans. Condensation
reactions and polymerization reactions formed
non-instructed polymers, for example random
oligopeptides of the proteinoid type (Fox and
Dose, 1977).
Template catalysis opens the door to molecular copying and self-replication. Several small
templates were designed by Julius Rebek and
co-workers: these molecules indeed show complementarity and undergo self-replication (see,
for example, Tjivikua et al., 1990; Nowick et al.,
1991). Like nucleic acids they consist of a backbone whose role is to bring "molecular digits" in
sterically appropriate positions, so that they can
be read by their complements. Complementarity is also based on essentially the same principle as in nucleic acids: specific patterns of
hydrogen bonds allow recognition of complementary digits and discrimination between "letters" of an alphabet. The hydrogen bonding pattern in these model replicons may be assisted by
opposite electric charges carried by the complements. We shall encounter the same principle
later in the discussion of Ghadiri's replicons
based on stable coiled coils of oligopeptide α-helices (Lee et al., 1996). Autocatalysis in small
model systems is certainly interesting because it
reveals some mechanistic details of molecular
recognition. These systems are, however, highly
unlikely to be the basis of biologically significant replicons because they cannot be extended
to large polymers in a simple manner and hence
they are unsuitable for storing a sizeable
amount of (sequence) information. Ligation of
small pieces to larger units, on the other hand, is
a source of combinatorial complexity providing
sufficient capacity for information storage and,
hence, evolution. Heteropolymer formation
thus seems inevitable and we shall therefore
focus on replicons that have this property:
nucleic acids and proteins.
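
The combinatorial argument can be made concrete with a small calculation. The following Python sketch is an illustration added here, not part of the original text. It assumes a linear heteropolymer built from an alphabet of k monomer types (4 for nucleic acids, 20 for proteins), counts the k^n distinct sequences of length n, and gives n·log2(k) bits as an upper bound on the information one sequence can store. The function names are hypothetical.

```python
import math


def sequence_count(alphabet_size: int, length: int) -> int:
    """Number of distinct linear heteropolymers of the given length."""
    return alphabet_size ** length


def information_bits(alphabet_size: int, length: int) -> float:
    """Upper bound, in bits, on the information stored in one sequence."""
    return length * math.log2(alphabet_size)


# Compare a short ligation product with longer chains.
for name, k in (("nucleic acid", 4), ("protein", 20)):
    for n in (6, 32, 100):
        count = sequence_count(k, n)
        bits = information_bits(k, n)
        print(f"{name:12s} n={n:3d}  sequences={count:.3e}  bits={bits:.1f}")
```

Under these assumptions the sequence count grows exponentially with chain length, which is why ligating small pieces into longer units opens up capacity that small fixed-size autocatalysts lack.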
A first major transition leads from a world of
simple chemical reaction networks to autocatalytic processes that are able to form self-organized systems, which are capable of replication
and mutation as required for darwinian evolution. This transition can be seen as the interface
between chemistry and biology since an early
darwinian scenario is tantamount to the onset of
biological evolution. Two suggestions were
made in this context: (1) autocatalysis arose in a
network of reactions catalyzed by oligopeptides
(Kauffman, 1993); and (2) the first autocatalyst
was a representative of a class of molecules with
"obligatory" template function (Eigen, 1971;
Orgel, 1987). The first suggestion works with
molecules that are easily available under prebiotic conditions but lacks plausibility because the
desired properties, conservation and propagation of mutants, are unlikely to occur with
oligopeptides. The second concept suffers from
the opposite problem: it is very hard to obtain the
first nucleic-acid-like molecules but they would
fulfill all functional requirements.

[Figure 1.1: flow diagram. Stages: Extraterrestrial Organic Molecules; Simulation Experiments; Non-instructed Polymers; Template Chemistry; RNA World; First Fossils of Living Organisms (Western Australia, about 3.4 × 10⁹ years old, photosynthetic (?) bacteria). Open questions marked in the diagram: heating during condensation; primordial atmosphere; clay world; solid-state catalysis; volcanic hot vents; condensing agent; nature of template molecules; RNA precursors; origin of the first RNA molecule; stereochemical purity and chirality.]

FIGURE 1.1 The RNA world. The concept of a precursor world preceding present-day
genetics based on DNA, RNA and protein is based on the idea that RNA can act as both
a means of storage of genetic information and a specific catalyst for biochemical reactions.
An RNA world is the first scenario on the route from prebiotic chemistry to present-day
organisms that allows for darwinian selection and evolution. Problems and open questions are indicated by question marks. Little is known about further steps (not shown here
explicitly) from early replicons to the first cells (Eigen and Schuster, 1982; Maynard Smith
and Szathmáry, 1995).

Until the 1980s, biochemists had an empirically well-established but nevertheless prejudiced
view on the natural and artificial functions of
proteins and nucleic acids. Proteins were
thought to be nature's unbeatable universal catalysts, highly efficient as well as ultimately specific and, as in the case of immunoglobulins,
even tunable to recognize previously unseen
molecules. After Watson and Crick's famous
discovery of the double helix, DNA was considered to be the molecule of inheritance, capable
of encoding genetic information and sufficiently
stable to allow for essential conservation of
nucleotide sequences over many replication
rounds. RNA's role in the molecular concert of
nature was reduced to the transfer of sequence
information from DNA to protein, be it as
mRNA or as tRNA. Ribosomal RNA and some
rare RNA molecules did not fit well into this picture: some sort of scaffolding functions were
attributed to them, such as holding supramolecular complexes together or bringing protein
molecules into the correct spatial positions
required for their functions.
This conventional picture was based on the
idea of a complete "division of labor". Nucleic
acids, DNA as well as RNA, were the templates,
ready for replication and read-out of genetic
information, but not to do catalysis. Proteins
were the catalysts and thus not capable of template function. In both cases these rather dogmatic views turned out to be wrong. Tom Cech
and Sidney Altman discovered RNA molecules
with catalytic functions (Guerrier-Takada et al.,
1983; Cech 1983, 1986, 1990). The name ribozyme
was created for this new class of biocatalysts
because they combine properties of ribonucleotides and enzymes (see next section). Their
examples dealt with RNA cleavage reactions catalyzed by RNA. In the first, a non-coding region of an RNA
transcript, a group I intron, cuts itself out during
mRNA maturation without the help of a protein catalyst. The second example concerns the enzymatic reaction of RNase P, which
catalyzes tRNA formation from the precursor
poly-tRNA. For a long time biochemists had
known that this enzyme consists of a protein
and an RNA moiety. It was tacitly assumed that
the protein was the catalyst while the RNA component had only a backbone function. The converse, however, is true: the RNA acts as catalyst
and the protein is merely a scaffold required to
enhance efficiency.
The second prejudice was disproved only
about 2 years ago by the demonstration that
oligopeptides can act as templates for their own
synthesis and thus show autocatalysis (Lee et al.,
1996, 1997; Severin et al., 1997). In this very elegant work, Reza Ghadiri and his co-workers
have demonstrated that template action does
not necessarily require hydrogen bond formation. Two smaller oligopeptides of chain lengths
17 (E) and 15 (N) are aligned on the template (T)
by means of the hydrophobic interaction in a
coiled coil of the leucine zipper type and the 32-mer is produced by spontaneous peptide bond
formation between the activated carboxy group
and the free amino residue (Figure 1.2). The
hydrophobic cores of template and ligands consist of alternating valine and leucine residues
and show a kind of knobs-into-holes packing in
the complex. The capability for template action
of proteins is a consequence of the three-dimensional structure of the protein α-helix, which
allows the formation of coiled coils. It requires
that the residues making the contacts between
the helices fulfill the condition of space filling
and thus stable packing. Modification of the
oligopeptide sequences allows alteration of the
interaction in the complex and thereby modifies
the specificity and efficiency of catalysis. A highly relevant feature of oligopeptide self-replication concerns easy formation of higher replication complexes: coiled-coil formation is not
restricted to two interacting helices; triple
helices and higher complexes are known to be
very stable too. Autocatalytic oligopeptide formation may thus involve not only a template
and two substrates but, for example, a template
and a catalyst that form a triple helix together
with the substrates (Severin et al., 1997). Only a
very small fraction of all possible peptide
sequences fold into three-dimensional structures that are suitable for leucine zipper formation and hence a given autocatalytic oligopeptide is very unlikely to retain the capability of
template action upon mutation. Peptides thus
are occasional templates and replicons based
upon peptides are rare.

FIGURE 1.2 Oligopeptide and oligonucleotide replicons. A. An autocatalytic oligopeptide that
makes use of the leucine zipper for template action. The upper part illustrates the stereochemistry of
oligopeptide template-substrate interaction by means of the helix-wheel. The ligation site is indicated by arrows. The lower part shows the mechanism (Lee et al., 1996; Severin et al., 1997). B. Template-induced self-replication of oligonucleotides (von Kiedrowski, 1986) follows essentially the same
reaction mechanism. The critical step is the dissociation of the dimer after bond formation, which
commonly prevents these systems from exponential growth and darwinian behavior (see below).

In contrast to the volume filling principle of
protein packing, specificity of catalytic RNAs is
provided by base pairing and to a lesser extent
by tertiary interactions. Both are the results of
hydrogen bond specificity. Metal ions, in particular Mg²⁺, are often involved in RNA structure
formation and catalysis too. Catalytic action of
RNA on RNA is exercised in the cofolded complexes of ribozyme and substrate. Since the formation of a ribozyme's catalytic center, which
operates on another RNA molecule, requires
sequence complementarity in parts of the substrate, ribozyme specificity is thus predominantly reflected by the sequence and not by the
three-dimensional structure of the isolated substrate. Template action of nucleic acid molecules, being the basis for replication, results
directly from the structure of the double helix. It
requires an appropriate backbone provided by
the antiparallel ribose-phosphate or 2'-deoxyribose-phosphate chains and a suitable geometry
of the complementary purine-pyrimidine pairs.
All RNA (and DNA) molecules, however, share
these features, which, accordingly, are independent of sequence. Every RNA molecule has a
uniquely defined complement. Nucleic acid
molecules, in contrast to proteins, are therefore
obligatory templates. This implies that mutations
are conserved and readily propagated into
future generations.
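
The notion of an obligatory template can be illustrated with a few lines of code. The sketch below is an addition for illustration and assumes plain Watson-Crick pairing (A with U, G with C) on antiparallel strands; wobble pairs and tertiary interactions are ignored, and the example sequences are arbitrary. It shows that every sequence has exactly one complement, that copying the complement restores the original, and that a point mutation is carried through both copying steps.

```python
# Watson-Crick pairing rules for RNA (assumption: no wobble pairs).
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the antiparallel complement, written 5' to 3'."""
    return "".join(RNA_PAIRS[base] for base in reversed(strand))


plus = "GGCAUC"      # arbitrary example sequence
mutant = "GGCAAC"    # same sequence with one point mutation (U to A)

for label, strand in (("wild type", plus), ("mutant", mutant)):
    minus = complement(strand)    # first copying step: plus to minus
    copy = complement(minus)      # second copying step: minus to plus
    print(f"{label:9s} plus={strand} minus={minus} copy={copy}")
```

Because the pairing rule does not depend on the sequence, the mutant is copied as faithfully as the wild type, which is the property the text contrasts with peptide templates.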
Enzyme-free template-induced synthesis of
longer RNA molecules from monomers, however, has not been successfully achieved so far
(see, for example, Orgel, 1986). A major problem, among others, is the dissociation of double-stranded molecules at the temperature of efficient replication. If monomers bind with sufficiently high binding constants to the template in
order to guarantee the desired accuracy of replication, then the new molecules are too sticky to
dissociate after the synthesis has been completed. Autocatalytic template-induced synthesis of
oligonucleotides from smaller oligonucleotide
precursors was nevertheless successful: the synthesis of a hexanucleotide through ligation of two trideoxynucleotide precursors was carried out by Günter
von Kiedrowski (1986). His system is the
oligonucleotide analog of the autocatalytic template-induced ligation of oligopeptides discussed above (Figure 1.2). In contrast to the latter system the oligonucleotides do not form
triple-helical complexes. Isothermal autocatalytic template-induced synthesis, however, cannot
be used to prepare longer oligonucleotides
because of the same duplex dissociation problem as mentioned for the template-induced
polymerization of monomers (see also Parabolic
and exponential growth, below).
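
The duplex dissociation problem can be illustrated numerically. The sketch below is an addition for illustration, not the chapter's own derivation. It assumes the simplest possible model: a self-complementary template A in equilibrium with its duplex, 2 A ⇌ A2, with dissociation constant K = [A]²/[A2] and total template c = [A] + 2[A2]. Solving the resulting quadratic gives the free (active) template concentration. Units are arbitrary and the function name is hypothetical.

```python
import math


def free_template(total: float, k_diss: float) -> float:
    """Free single-strand concentration for 2 A <-> A2 at equilibrium.

    total  = [A] + 2 [A2]   (total template, counted as strands)
    k_diss = [A]^2 / [A2]   (duplex dissociation constant)
    Solves 2 [A]^2 + k_diss [A] - k_diss * total = 0 for [A] >= 0.
    """
    return (-k_diss + math.sqrt(k_diss * k_diss + 8.0 * k_diss * total)) / 4.0


k_diss = 1.0  # arbitrary units; small values mean "sticky" duplexes
for total in (1e-2, 1.0, 1e2, 1e4, 1e6):
    free = free_template(total, k_diss)
    print(f"total={total:10.2e}  free={free:10.3e}  fraction free={free / total:.3e}")
```

In this model the free template grows only as the square root of the total once the total greatly exceeds K, so most newly made strands stay locked in duplexes. That is one way to see why tight binding, which favors accurate copying, works against sustained isothermal replication.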

RNA CATALYSIS AND THE RNA
WORLD

The natural ribozymes discovered early were all
RNA cleaving molecules: the RNA moiety of
RNase P (Guerrier-Takada et al., 1983), the group
I introns (Cech, 1983) as well as the first small
ribozyme called "hammerhead" (Figure 1.3)
because of its characteristic secondary structure
shape (Uhlenbeck, 1987). Three-dimensional
structures are now available for three classes of
RNA-cleaving ribozymes (Pley et al., 1994; Scott
et al., 1995; Cate et al., 1996; Ferré-D'Amaré et al.,
1998) and these data revealed the mechanism of
RNA-catalyzed cleavage reactions in full molecular detail. Additional catalytic RNA molecules
were obtained through selection from random
or partially random RNA libraries and subsequent evolutionary optimization (see Evolution
of phenotypes, below). RNA catalysis in non-natural ribozymes is not restricted only to RNA
cleavage: some ribozymes show ligase activity
(Bartel and Szostak, 1993; Ekland et al., 1995)
and many efforts were undertaken to prepare a
ribozyme with full RNA replicase activity. The
attempt that comes closest to the goal yielded a
ribozyme that catalyzes RNA polymerization in
short stretches (Ekland and Bartel, 1996). RNA
catalysis is not restricted to operating on RNA,
nor do nucleic acid catalysts require the ribose
backbone: ribozymes were trained by evolutionary techniques to process DNA rather than their
natural RNA substrate (Beaudry and Joyce,
1992), and catalytically active DNA molecules
were evolved as well (Breaker and Joyce, 1994;
Cuenoud and Szostak, 1995). Polynucleotide
kinase activity has been reported (Lorsch and
Szostak, 1994, 1995) as well as self-alkylation of
RNA on base nitrogens (Wilson and Szostak,
1995).

FIGURE 1.3 The hammerhead ribozyme. The substrate is a tridecanucleotide forming two
double-helical stacks together with the ribozyme (n = 34) in the co-folded complex (Pley et al.,
1994). Some tertiary interactions indicated by broken lines in the drawing determine the detailed
structure of the hammerhead ribozyme complex and are important for the enzymatic reaction
cleaving one of the two linkages between the two stacks. Substrate specificity of ribozyme catalysis is caused by the secondary structure in the co-folded complex between substrate and catalyst.

Systematic studies also revealed examples of
RNA catalysis on non-nucleic acid substrates.
RNA catalyzes ester, amino acid and peptidyl
transferase reactions (Lohse and Szostak, 1996;
Zhang and Cech, 1997; Jenne and Famulok,
1998). The latter examples are particularly interesting because they revealed close similarities
between the RNA catalysis of peptide bond formation and ribosomal peptidyl transfer (Zhang
and Cech, 1998). A spectacular finding in this
respect was that oligopeptide bond cleavage
and formation is catalyzed by ribosomal RNA
and not by protein: more than 90% of the protein fraction can be removed from ribosomes
without losing the catalytic effect on peptide
bond formation (Noller et al., 1992; Green and
Noller, 1997). In addition, ribozymes were prepared that catalyze alkylation on sulfur atoms
(Wecker et al., 1996) and, finally, RNA molecules
were designed that are catalysts for typical reactions of organic chemistry, for example an isomerization of biphenyl derivatives (Prudent et
al., 1994).
For two obvious reasons RNA was chosen as
candidate for the leading molecule in a simple
scenario at the interface between chemistry and
biology: (1) RNA is thought to be capable of
storing retrievable information because it is an
obligatory template; and (2) it has catalytic
properties. Although the catalytic properties of
RNA are less universal than those of proteins,
they are apparently sufficient for processing
RNA. RNA molecules operating on RNA molecules form a self-organizing system that can
develop a form of molecular organization with
emerging properties and functions. This scenario has been termed the RNA world (see, for
example, Gilbert, 1986, Joyce, 1991, as well as
the collective volume by Gesteland and Atkins,
1993). The idea of an RNA world turned out to
be fruitful in a different aspect too: it initiated
the search for molecular templates and created
an entirely new field, which may be characterized as template chemistry (Orgel, 1992). Series of
systematic studies were performed, for example, on the properties of nucleic acids with modified sugar moieties (Eschenmoser, 1993). These
studies revealed the special role of ribose and
provided explanations why this molecule is
basic to all life processes.
Chemists working on the origin of life see a
number of difficulties for an RNA world being a
plausible direct successor of the functionally
unorganized prebiotic chemistry (see Figure 1.1
and the reviews by Orgel, 1987, 1992; Joyce, 1991;
Schwartz, 1997): (1) no convincing prebiotic synthesis has been demonstrated for all RNA building blocks; (2) materials for successful RNA synthesis require a high degree of purity that can
hardly be achieved under prebiotic conditions;
(3) RNA is a highly complex molecule whose
stereochemically correct synthesis (3'-5' linkage) requires an elaborate chemical machinery;
and (4) enzyme-free template-induced synthesis
of RNA molecules from monomers has not been
achieved so far. In particular, the dissociation of
duplexes into single strands and the optical
asymmetry problem are of major concern.
Template-induced synthesis of RNA molecules
requires pure optical antipodes. Enantiomeric
monomers (containing L-ribose instead of the
natural D-ribose) are "poisons" for the polycondensation reaction on the template since their
incorporation causes termination of the polymerization process. Several suggestions postulating more "intermediate worlds" between
chemistry and biology were made. Most of the
intermediate information carriers were thought
to be more primitive and easier to synthesize
than RNA but nevertheless still having the capability of template action (Schwartz, 1997).
Glycerol, for example, was suggested as a substitute for ribose because it is structurally simpler and it lacks chirality. However, no successful attempts to use such less sophisticated backbone molecules together with the natural purine
and pyrimidine bases for template reactions
have been reported so far.
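
The severity of the enantiomer "poisoning" described above can be gauged with a toy calculation. The sketch below is an addition for illustration and rests on a deliberately crude assumption that is not taken from the text: at every monomer addition the wrong enantiomer is incorporated with a probability equal to its fraction f in the monomer pool, independently of all other steps, and each such incorporation terminates the chain. Under that assumption the fraction of chains surviving n additions is (1 - f)^n. The function name is hypothetical.

```python
def full_length_fraction(additions: int, wrong_fraction: float) -> float:
    """Fraction of chains that survive all monomer additions.

    Toy model (assumption): each addition independently picks the wrong
    enantiomer with probability wrong_fraction, which ends the chain.
    """
    return (1.0 - wrong_fraction) ** additions


# Racemic pool (f = 0.5) versus increasingly pure monomer pools.
for wrong_fraction in (0.5, 0.1, 0.01):
    for additions in (5, 30, 100):
        survived = full_length_fraction(additions, wrong_fraction)
        print(f"f={wrong_fraction:4.2f}  additions={additions:3d}  full length={survived:.3e}")
```

Even in this simplified picture a racemic pool yields practically no long products, which is consistent with the text's statement that template-induced synthesis requires optically pure monomers.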
Starting from a world of replicating molecules, it took a series of many not yet well-

7

understood steps (Eigen and Schuster, 1982) to
arrive at the first organisms that formed the earliest identified fossils (Warrawoona, Western
Australia, $3.4 \times 10^{9}$ years old; Schopf, 1993) and
possibly the even older kerogen found in the
Isua formation (Greenland, $3.8 \times 10^{9}$ years old;
Pflug and Jaeschke-Boyer, 1979; Schidlowski,
1988; Figure 1.1). It has been speculated that
functionally correlated RNA molecules have
developed a primitive translation machinery
based on an early genetic code. After such a
relation between RNA and proteins had been
established the stage was set for concerted evolution of proteins and RNA. Proteins may
induce vesicle formation into lipid-like materials and eventually lead to the formation of compartments. After a number of steps such an
ensemble might have developed a primitive
metabolism and thus led to the first protocells
(Eigen and Schuster, 1982). DNA, being now the
backup copy of genetic information, is seen as a
latecomer in prebiotic evolution.

A successful experimental approach to self-reproduction of micelles and vesicles highlights one of the many steps enumerated
above: prebiotic formation of vesicle structures
(Bachmann et al., 1992). The basic reaction leading to autocatalytic production of amphiphilic
materials is the hydrolysis of ethyl caprylate. The
combination of vesicle formation with RNA
replication represents a particularly important
step towards the construction of a kind of minimal synthetic cell (Luisi et al., 1994). Despite
these elegant experimental studies and the
attempts to build comprehensive models, satisfactory answers to the problems of compartment formation and cell division are not at hand
yet.

PARABOLIC AND EXPONENTIAL GROWTH
It is relatively easy to derive a kinetic rate equation displaying the elementary behavior of
replicons if one assumes that catalysis proceeds
through the complementary binding of reactant(s) to free template and that autocatalysis is
limited by the tendency of the template to bind



PETER SCHUSTER AND PETER E STADLER

to itself as an inactive "product inhibited" dimer
(Von Kiedrowski, 1993). However, in order to
achieve an understanding of what is likely to
happen in systems where there is a diverse mixture of reactants and catalytic templates, it is
desirable to develop a comprehensive kinetic
description of as many individual steps in the
reaction mechanism of template synthesis as is

feasible and tractable from the mathematical
point of view.
Szathmáry and Gladkih (1989) oversimplified
the resulting dynamics to a simple parabolic
growth law $\dot{x}_k \propto x_k^{p}$, $0 < p < 1$, for the concentrations
of the interacting template species. Their model
suffers from a conceptual and a technical problem: (1) under no circumstances does one
observe extinction of a species in any parabolic
growth model; and (2) the vector fields are not
Lipschitz-continuous on the boundary of the
concentration simplex, indicating that we cannot expect a physically reasonable behavior in
this area.
In a recent paper (Wills et al., 1998) we have
derived the kinetic equations of a system of coupled template-instructed ligation reactions of
the form
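$$A_i + B_j + C_{kl} \;\underset{\bar{a}_{ijkl}}{\overset{a_{ijkl}}{\rightleftharpoons}}\; A_iB_jC_{kl} \;\overset{b_{ijkl}}{\longrightarrow}\; C_{ij}C_{kl} \;\underset{\bar{d}_{ijkl}}{\overset{d_{ijkl}}{\rightleftharpoons}}\; C_{ij} + C_{kl} \qquad (1)$$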

Here $A_i$ and $B_j$ denote the two substrate
molecules which are ligated on the template
$C_{kl}$, for example, the electrophilic, E, and the
nucleophilic, N, oligopeptide in peptide template reactions or the two different trinucleotides, GGC and GCC, in the autocatalytic

hexanucleotide formation (Figure 1.2). This
scheme thus encapsulates the experimental
results on both peptide and nucleic acid
replicons (Von Kiedrowski, 1986; Lee et al.,
1996).
The following assumptions are straightforward and allow for a detailed mathematical
analysis:
1. The concentrations of the intermediates are
stationary in agreement with the "quasisteady-state" approximation (Segel and
Slemrod, 1989).
2. The total concentration $c_0$ of all replicating species is constant in the sense of constant organization (Eigen, 1971).
3. The formation of heteroduplexes of the form $C_{ij}C_{kl}$, $ij \neq kl$, is neglected.
4. Only reaction complexes of the form $A_kB_lC_{kl}$ lead to ligation.
Assumptions 3 and 4 are closely related. They
make immediate sense for hypothetical macromolecules for which the template instruction is
direct instead of complementary. It has been
shown, however, that the dynamics of complementary replicating polymers is very similar to
direct replication dynamics if one considers the
two complementary strands as "single species"
by simply adding their concentrations (Stadler,
1991).
Assumptions 3 and 4 suggest a simplified
notation of the reaction scheme:
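$$A_k + B_k + C_k \;\underset{\bar{a}_k}{\overset{a_k}{\rightleftharpoons}}\; A_kB_kC_k \;\overset{b_k}{\longrightarrow}\; C_kC_k \;\underset{\bar{d}_k}{\overset{d_k}{\rightleftharpoons}}\; 2\,C_k \qquad (2)$$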
It can be shown that equation (2), together with
assumptions 1 and 2, leads to the following
system of differential equations for the frequencies or relative total concentrations $x_k$ (i.e. $\sum_k x_k = 1$)
of the template molecules $C_k$ in the system (note
that $x_k$ accounts not only for the free template
molecules but also for those bound in the complexes $C_kC_k$ and $A_kB_kC_k$):

$$\dot{x}_k = x_k \Big( \alpha_k \varphi(\beta_k x_k) - \sum_{j=1}^{M} \alpha_j x_j \varphi(\beta_j x_j) \Big), \quad k = 1, \ldots, M, \qquad (3)$$

where

$$\varphi(z) = \frac{1}{z} \left( \sqrt{z + 1} - 1 \right), \qquad \varphi(0) = \frac{1}{2}, \qquad (3')$$

and the effective kinetic constants $\alpha_k$ and $\beta_k$ can
be expressed in terms of the physical parameters $a_k$, $\bar{a}_k$, etc. It will turn out that survival of
replicon species is determined by the constants
$\alpha_k$, which we therefore characterize as darwinian fitness parameters.



1.NATURE AND EVOLUTION OF EARLY REPLICONS

Equation (3) is a special form of a replicator
equation with the non-linear response functions
$f_k(x) := \alpha_k \varphi(\beta_k x_k)$. Its behavior depends strongly
on the values of $\beta_k$: for large values of $z$ we have
$\varphi(z) \sim 1/\sqrt{z}$. Hence equation (3) approaches
Szathmáry's expression (Szathmáry and
Gladkih, 1989):

$$\dot{x}_k = h_k \sqrt{x_k} - x_k \sum_{j=1}^{M} h_j \sqrt{x_j}$$

with suitable constants $h_k$. This equation exhibits
a very simple dynamics: the mean fitness
$\Phi(x) = \sum_{j=1}^{M} h_j \sqrt{x_j}$ is a Ljapunov function, i.e. it increases
along all trajectories, and the system approaches a globally stable equilibrium at which all
species are present (Varga and Szathmáry, 1997;
Wills et al., 1998). Szathmáry's parabolic growth
model thus does not lead to selection.
On the other hand, if $z$ remains small, that is,
if $\beta_k$ is small, then $\varphi(\beta_k x_k)$ is almost constant at 1/2
(since the relative concentration $x_k$ is of course a
number between 0 and 1). Thus we obtain:

$$\dot{x}_k = \frac{1}{2}\, x_k \Big( \alpha_k - \sum_{j=1}^{M} \alpha_j x_j \Big) \qquad (4)$$

which is the "no-mutation" limit of Eigen's
kinetic equation for replication (Eigen, 1971). (If
assumption 4 above is relaxed, we in fact arrive
at Eigen's model with a mutation term.)
Equation (4) leads to survival of the fittest: the
species with the largest value of $\alpha_k$ will eventually be the only survivor in the system. It is
worth noting that the mean fitness also increases along all orbits of equation (4) in agreement
with the no-mutation case (Schuster and
Swetina, 1988).
The constants $\beta_k$ that determine whether the
system shows darwinian selection or unconditional coexistence are proportional to the total
concentration $c_0$ of the templates. For small total
concentration we obtain equation (4), while for
large concentrations, when the formation of the
dimers $C_kC_k$ becomes dominant, we enter the
regime of parabolic growth.
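The two regimes can be illustrated numerically. The following is a minimal sketch, not the authors' computation: it integrates equation (3) with a forward Euler scheme for three hypothetical replicons, once with small and once with large values of $\beta_k$. The fitness values, the $\beta$ values, the step size and the helper names `phi` and `simulate` are assumptions chosen for illustration. The function $\varphi$ is evaluated in the algebraically equivalent form $1/(\sqrt{z+1}+1)$, which avoids cancellation at small $z$.

```python
import math


def phi(z):
    """phi(z) = (sqrt(z + 1) - 1) / z, rewritten as 1 / (sqrt(z + 1) + 1)."""
    return 1.0 / (math.sqrt(z + 1.0) + 1.0)


def simulate(alpha, beta, x0, t_end, dt=0.05):
    """Forward Euler integration of equation (3) for relative concentrations."""
    x = list(x0)
    for _ in range(round(t_end / dt)):
        # Response functions f_k = alpha_k * phi(beta_k * x_k).
        f = [a * phi(b * xk) for a, b, xk in zip(alpha, beta, x)]
        # Flux term that keeps the total concentration constant.
        flux = sum(xk * fk for xk, fk in zip(x, f))
        x = [xk + dt * xk * (fk - flux) for xk, fk in zip(x, f)]
        # Renormalize to remove the small drift introduced by the Euler step.
        total = sum(x)
        x = [xk / total for xk in x]
    return x


# Hypothetical fitness parameters, largest first (assumed values).
alpha = [2.0, 1.5, 1.0]
x0 = [1.0 / 3.0] * 3

# Small beta (low total concentration): exponential competition.
low = simulate(alpha, [1e-3] * 3, x0, t_end=200.0)
# Large beta (high total concentration): parabolic regime.
high = simulate(alpha, [1e4] * 3, x0, t_end=2000.0)

print("small beta:", [round(v, 4) for v in low])
print("large beta:", [round(v, 4) for v in high])
```

With these assumed parameters the first run ends with essentially only the fittest species present, while in the second run all three species remain at appreciable frequency.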
Equation (3) is a special case of a class of replicator equations studied in Hofbauer et al. (1981).
Restating the previously given result yields the
following. All orbits or trajectories starting from
physically meaningful points (these are points
in the interior of the simplex $S_M$ with $x_j > 0$ for all
$j = 1, 2, \ldots, M$) converge to a unique equilibrium
point $\bar{x} = (\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_M)$ with $\bar{x}_j \geq 0$, which is called
the $\omega$-limit of the orbits. This means that species
may go extinct in the limit $t \to \infty$. If $\bar{x}$ lies on the
surface of $S_M$ (which is tantamount to saying
that at least one component $\bar{x}_j = 0$) then it is also
the $\omega$-limit for all orbits on this surface. If we
label the replicon species according to decreasing values of the darwinian fitness parameters,
$\alpha_1 \geq \alpha_2 \geq \ldots \geq \alpha_M$, then there is an index $l \geq 1$ such
that $\bar{x}$ is of the form $\bar{x}_i > 0$ if $i \leq l$ and $\bar{x}_i = 0$ for
$i > l$. In other words, $l$ replicon species survive
and the $M - l$ least efficient replicators die out.
This behavior is in complete analogy to the
reversible exponential competition case
(Schuster and Sigmund, 1985) where the
darwinian fitness parameters $\alpha_k$ are simply the
rate constants $a_k$. If the smallest concentration-dependent value $\beta_s(c_0) = \min\{\beta_j(c_0)\}$ is sufficiently large, we find $l = M$ and no replicon goes
extinct ($\bar{x}$ is an interior equilibrium point).
The condition for survival of species $k$ is
explicitly given by:

$$\alpha_k > 2\,\Phi(\bar{x}).$$

It is interesting to note that the darwinian fitness
parameters $\alpha_k$ determine the order in which
species go extinct whereas the concentration-dependent values $\beta_k(c_0)$ collectively influence
the flux term and hence set the "extinction
threshold". In contrast to Szathmáry's model
equation the extended replicon kinetics leads to
both competitive selection and coexistence of
replicons depending on total concentration and
kinetic constants.

MOLECULAR EVOLUTION
EXPERIMENTS
In the first half of this century it was apparently
out of the question to do conclusive and interpretable experiments on evolving populations
because of two severe problems: (1) Time scales
of evolutionary processes are prohibitive for
laboratory investigations; and (2) the numbers
of possible genotypes are outrageously large
and thus only a negligibly small fraction of all
possible sequences can be realized and evaluated by selection. If generation times could be
reduced to a minute or less, thousands of generations, numbers sufficient for the observation of
optimization and adaptation, could be recorded
in the laboratory. Experiments with RNA molecules in the test-tube indeed fulfill this timescale criterion for observability. With respect to
the "combinatorial explosion" of the numbers of
possible genotypes the situation is less clear.
Population sizes of nucleic acid molecules of
$10^{15}$-$10^{16}$ individuals can be produced by random synthesis in conventional automata. These
numbers cover roughly all sequences up to
chain lengths of n = 27 nucleotides. These are
only short RNA molecules but their length is
already sufficient for specific binding to predefined target molecules, for example antibiotics
(Jiang et al., 1997). In addition, sequence-to-structure-to-function mappings of RNA are
highly redundant and thus only a small fraction
of all sequences has to be searched in order to
find solutions to given evolutionary optimization problems (Fontana et al., 1993; Schuster et
al., 1994).
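The arithmetic behind the chain-length estimate is easy to check. The sketch below simply counts the $4^n$ possible sequences of length $n$ and compares that count with the quoted population sizes; the function names are illustrative and not taken from the source. Since $4^{27} \approx 1.8 \times 10^{16}$, the count at n = 27 is of the same order as the larger population size, which is presumably the sense in which the coverage is "rough".

```python
def sequence_count(n, alphabet_size=4):
    """Number of distinct sequences of chain length n over the given alphabet."""
    return alphabet_size ** n


def longest_fully_covered(population, alphabet_size=4):
    """Largest n for which the population could contain every sequence once."""
    n = 0
    while alphabet_size ** (n + 1) <= population:
        n += 1
    return n


for population in (10**15, 10**16):
    n = longest_fully_covered(population)
    print(f"population {population:.0e}: every sequence up to n = {n} fits")

print("4^27 =", sequence_count(27))
```

Each additional nucleotide multiplies the number of sequences by four, so a tenfold larger library gains less than two positions of full coverage.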
The first successful attempts to study RNA
evolution in vitro were carried out in the late
1960s by Sol Spiegelman and his group (Mills et
al., 1967; Spiegelman, 1971). They created a
"protein-assisted RNA replication medium" by
adding an RNA replicase isolated from
Escherichia coli cells infected by the RNA bacteriophage Qβ to a medium for replication that
also contains the four ribonucleoside triphosphates (GTP, ATP, CTP and UTP) in a suitable
buffer solution. Qβ RNA and some of its smaller variants start instantaneously to replicate
when transferred into this medium. Evolution
experiments were carried out by means of the
serial transfer technique: materials consumed in
RNA replication are replenished by transfer of
small samples of the current solution into fresh
stock medium. The transfers were made after
equal time steps. In series of up to 100 transfers
the rate of RNA synthesis increased by orders of
magnitude. The increase in the replication rate

occurs in steps and not continuously as one
might have expected. Analysis of the molecular
weights of the replicating species showed a
drastic reduction of the RNA chain lengths during the series of transfers: the initially applied
Qβ RNA was 4220 nucleotides long and the
finally isolated species contained little more
than 200 bases. What happened during the serial transfer experiments was a kind of degradation due to suspended constraints on the RNA
molecule. In addition to performing well in replication, the viral RNA has to code for four different proteins in the host cell and also needs a
proper structure in order to enable packing into
the virion. In test-tube evolution these constraints are released and the only remaining
requirement is recognition of the RNA by Qβ
replicase and fast replication.
Evidence for a non-trivial evolutionary
process came a few years later when the
Spiegelman group published the results of
another serial transfer experiment that gave
evidence for adaptation of an RNA population
to environmental change. The replication of an
optimized RNA population was challenged by
the addition of ethidium bromide to the replication medium (Kramer et al., 1974). This dye

intercalates into DNA and RNA double helices
and thus reduces replication rates. Further serial transfers in the presence of the intercalating
substance led to an increase in the replication
rate until an optimum was reached. A mutant
was isolated from the optimized population
that differed from the original variant by three
point mutations. Extensive studies on the reaction kinetics of RNA replication in the Qβ
replication assay were performed by Christof
Biebricher in Göttingen (Biebricher and Eigen,
1988). These studies revealed consistency of
the kinetic data with a many-step reaction
mechanism. Depending on concentration, the
growth of template molecules allows one to
distinguish three phases of the replication
process.
1. At low concentration all free template molecules are instantaneously bound by the replicase, which is present in excess, and therefore
the template concentration grows exponentially.
2. Excess of template molecules leads to saturation of enzyme molecules, then the rate of
RNA synthesis becomes constant and the
concentration of the template grows linearly.
3. Very high template concentrations impede
dissociation of the complexes between template and replicase, and the template concentration approaches a constant in the sense of
product inhibition.
We neglect plus-minus complementarity in replication by assuming constancy in relative concentrations of plus and minus strands (Eigen,
1971) and consider the plus-minus ensemble as
a single species. Then, RNA replication may be
described by the overall mechanism:

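$$A + I_i + E \;\underset{\bar{k}_i}{\overset{k_i}{\rightleftharpoons}}\; A + I_i \cdot E \;\overset{a_i}{\longrightarrow}\; I_i \cdot E \cdot I_i \;\underset{\bar{k}'_i}{\overset{k'_i}{\rightleftharpoons}}\; I_i \cdot E + I_i \qquad (5)$$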
Here E represents the replicase and A stands for
the low-molecular-weight material consumed in
the replication process. This simplified reaction
scheme reproduces all three characteristic phases of the detailed mechanism (Figure 1.4) and
can be readily extended to replication and mutation.
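A mass-action reading of such a scheme can be explored numerically. The sketch below is an illustration under stated assumptions rather than the published model: template binds replicase reversibly, the complex synthesizes a copy at a rate that does not depend on the monomers (taken to be in excess), and the product complex releases a template in a reversible step, which is what produces inhibition at high template concentration. All rate constants, the initial concentrations and the function name `replicate` are hypothetical.

```python
def replicate(i0=1e-4, e0=1.0, k_on=5.0, k_off=0.1, a=1.0,
              k_dis=10.0, k_ass=1.0, t_end=60.0, dt=1e-3):
    """Forward Euler integration of a three-step replication scheme.

    free_i: free template, free_e: free replicase,
    c: template-replicase complex, d: complex carrying template and copy.
    Returns (time, total template) samples and the final state.
    """
    free_i, free_e, c, d = i0, e0, 0.0, 0.0
    steps_per_unit = round(1.0 / dt)
    samples = [(0.0, free_i)]
    for step in range(1, round(t_end / dt) + 1):
        r_bind = k_on * free_i * free_e - k_off * c    # I + E <-> I.E
        r_syn = a * c                                  # I.E -> I.E.I
        r_rel = k_dis * d - k_ass * c * free_i         # I.E.I <-> I.E + I
        free_i += dt * (r_rel - r_bind)
        free_e -= dt * r_bind
        c += dt * (r_bind - r_syn + r_rel)
        d += dt * (r_syn - r_rel)
        if step % steps_per_unit == 0:
            # Total template counts both strands held in the product complex.
            samples.append((step * dt, free_i + c + 2.0 * d))
    return samples, (free_i, free_e, c, d)


samples, final_state = replicate()
for t, total in samples[::10]:
    print(f"t = {t:5.1f}   total template = {total:.4g}")
```

With these assumed constants the total template concentration first rises steeply from its small initial value and then grows ever more slowly as free template accumulates; how clearly the three phases separate depends on the constants chosen.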

FIGURE 1.4 Replication kinetics of RNA with Qβ replicase (growth curve plotted against time t, with segments labeled exponential, linear and saturation). In essence, three different phases of growth are distinguished: (1) exponential growth under conditions with
excess of replicase; (2) linear growth when all enzyme molecules are loaded with RNA; and (3) a saturation phase that
is caused by product inhibition.

Despite the apparent complexity of RNA
replication kinetics the mechanism at the same
time fulfills an even simpler overall rate law
provided the activated monomers, ATP, UTP,
GTP, and CTP, as well as Qβ replicase are present in excess. In that case, the rate of increase
for the concentration $x_i$ of RNA species $I_i$ follows the simple relation $\dot{x}_i \propto x_i$, which in the absence of constraints ($\Phi = 0$) leads to exponential growth. This growth law is identical to
that found for asexually reproducing organisms and hence replication of molecules in the
test-tube leads to the same principal phenomena that are found with evolution proper.
RNA replication in the Qβ system requires
specific recognition by the enzyme, which
implies sequence and structure restrictions.
Accordingly only RNA sequences that fulfill
these criteria can be replicated. In order to be
able to amplify RNA free of such constraints
many-step replication assays have been developed. The discovery of the DNA polymerase
chain reaction (PCR; Mullis, 1990) was a milestone towards sequence-independent amplification of DNA sequences. It has one
limitation: double helix separation requires
higher temperatures and conventional PCR
therefore works with a temperature program.
PCR is combined with reverse transcription
and transcription by means of bacteriophage
T7 RNA polymerase in order to yield a
sequence-independent amplification procedure
for RNA. This assay contains two possible
amplification steps: PCR and transcription.

Another frequently used assay makes use of
the isothermal self-sustained sequence replication reaction of RNA (3SR; Fahy et al., 1991).
In this system the RNA-DNA hybrid obtained
through reverse transcription is converted into
single-stranded DNA by RNAse digestion of
the RNA strand, instead of melting the double
strand. DNA double-strand synthesis and
transcription complete the cycle. Here, transcription by T7 polymerase represents the
amplification step. Artificially enhanced error
rates needed for the creation of sequence
diversity in populations can be achieved readily with PCR. Reverse transcription and transcription are also susceptible to increase in
mutation rates. These two and other new techniques for RNA amplification provided
universal and efficient tools for the study of
molecular evolution under laboratory conditions and made the usage of viral replicases
with their undesirable sequence specificities
obsolete.

ERROR PROPAGATION AND
QUASISPECIES
Evolution of molecules based on replication and
mutation exposed to selection at constant population size has been formulated and analyzed in
terms of chemical reaction kinetics (Eigen, 1971;
Eigen and Schuster, 1977; Eigen et al., 1989).
Error-free replication and mutation are parallel
chemical reactions,

$$A + I_i \;\xrightarrow{\;a_i Q_{ij}\;}\; I_j + I_i, \qquad (6)$$

and form a network that in principle allows formation of every RNA genotype as a mutant of
any other genotype. The materials required for,
or consumed by, RNA synthesis, again denoted
by A, are replenished by continuous flow in a
reactor resembling a chemostat for bacterial
cultures (Figure 1.5). The object of interest is
now the distribution of genotypes in the population and its time-dependence. We present here
a short account of the most relevant features of
such replication-mutation assays, in particular
the existence of thresholds in error propagation.

FIGURE 1.5 A flow reactor for the evolution of RNA
molecules (schematic: a stock solution flowing into the reaction mixture). A stock solution containing all materials for RNA
replication including an RNA polymerase flows continuously into a well-stirred tank reactor and an equal volume containing a fraction of the reaction mixture leaves the reactor.
The population in the reactor fluctuates around a mean
value, $N \pm \sqrt{N}$. RNA molecules replicate and mutate in the
reactor, and the fastest replicators are selected. The RNA
flow reactor has been used also as an appropriate setup for
computer simulations (Fontana and Schuster, 1987, 1998;
Huynen et al., 1996). There, other criteria than fast replication can be used for selection. For example, fitness functions
are defined that measure the distance to a predefined target
structure and fitness increases during the approach towards
the target (Huynen et al., 1996; Fontana and Schuster, 1998).

Selection in populations is described by
ordinary differential equations. It has been
shown for systems of type (6) that the outcome of selection is independent of the selection constraint applied. In particular, the flow
reactor and constant organization yield essentially the same results (Schuster and Sigmund,
1985; Happel and Stadler, 1999) and thus we
use the latter, simpler condition without losing generality. Variables are again the frequencies of individual genotypes, $x_i$ measuring that
of genotype or RNA sequence $I_i$. The frequencies are normalized, $\sum_i x_i = 1$ (due to constant
organization), the population size is denoted
by N and the number of different genotypes
by M. The time-dependence of the sequence
distribution is described by the kinetic equation:

$$\dot{x}_i = x_i \big( a_i Q_{ii} - \bar{E}(t) \big) + \sum_{j=1,\, j \neq i}^{M} a_j Q_{ji}\, x_j, \quad i = 1, \ldots, M. \qquad (7)$$

The rate constants for replication of the molecular species are $a_i$. Once a reaction has been initiated it can lead to a correct copy, $I_i \to I_i$, or to a
mutant, $I_i \to I_j$. The frequencies of the individual
reaction channels are contained in the mutation
matrix $Q = \{Q_{ij};\ i, j = 1, \ldots, M\}$; in particular the
fraction of error copies of genotype $I_i$ falling into
species $I_j$ is given by $Q_{ij}$ and thus we have
$\sum_j Q_{ij} = 1$. The diagonal elements of Q are the
replication accuracies, i.e. the fractions of correct
replicas produced on the corresponding templates. The time-dependent excess productivity,
which is compensated by the flow in the reactor,
is the mean value $\bar{E}(t) = \sum_i a_i x_i(t)$. The quantities
that then determine the outcome of selection are
the products of replication rate constants and
mutation frequencies subsumed in the value
matrix $W = \{w_{ij} = a_i Q_{ij};\ i, j = 1, \ldots, M\}$; its diagonal elements, $w_{ii}$, were called the selective values
of the individual genotypes.
The selective value of a genotype is tantamount to its fitness in the case of vanishing
mutational backflow and hence the genotype
with maximal selective value, $I_m$,

$$w_{mm} = \max\{ w_{ii} \mid i = 1, \ldots, M \}, \qquad (8)$$

dominates a population after it has reached the
selection equilibrium and hence it is called the
master sequence. The notion quasispecies was
introduced for the stationary genotype distribution in order to point at its role as the genetic
reservoir of the population.
A simple expression for the stationary frequency can be found if the master sequence is
derived from the single-peak model landscape
that assigns a higher replication rate to the master and identical values to all others, for example $a_m = \sigma_m \cdot a$ and $a_i = a$ for all $i \neq m$ (Swetina and
Schuster, 1982; Tarazona, 1992; Alves and
Fontanari, 1996). The (dimensionless) factor $\sigma_m$
is called the superiority of the master sequence.
The assumption of a single-peak landscape is
tantamount to lumping all mutants together
into a mutant cloud with average fitness. The
probability of being in the cloud is simply
$x_c = \sum_{j \neq m} x_j = 1 - x_m$ and the replication-mutation
problem boils down to an exercise in a single
variable, $x_m$, the frequency of the master. The
single-peak model can be interpreted as a kind
of mean field approximation since the mutant
cloud is characterizable by "mean-except-the-master" properties, for example by the mean-except-the-master replication rate constant
$\bar{a}_{-m} = \sum_{j \neq m} a_j x_j / (1 - x_m)$. The superiority then reads:
$\sigma_m = a_m / \bar{a}_{-m}$. Neglecting mutational backflow we can
readily compute the stationary frequency of the
master sequence:

$$\bar{x}_m = \frac{a_m Q_{mm} - \bar{a}_{-m}}{a_m - \bar{a}_{-m}} = \frac{\sigma_m Q_{mm} - 1}{\sigma_m - 1}. \qquad (9)$$

In this expression the master sequence vanishes
at some finite replication accuracy, $Q_{mm}|_{\bar{x}_m = 0} = Q_{\min} = \sigma_m^{-1}$. Non-zero frequency of the master
thus requires $Q_{mm} > Q_{\min}$. We introduce the uniform error rate model, which assumes that the
mutation rate is p per site and replication event
independently of the nature of the nucleotide to
be copied and the position in the sequence
(Eigen and Schuster, 1977). Then, the single-digit
accuracy $q = 1 - p$ is the mean fraction of correctly incorporated nucleotides and the elements of the mutation matrix for a polynucleotide of chain length n are of the form:

$$Q_{ij} = q^{n} \left( \frac{1 - q}{q} \right)^{d_{ij}},$$

with $d_{ij}$ being the Hamming distance between
two sequences $I_i$ and $I_j$. The critical condition
occurs at the minimum accuracy:

$$q_{\min} = 1 - p_{\max} = \sqrt[n]{Q_{\min}} = \sigma_m^{-1/n}, \qquad (10)$$

which was called the error threshold. Above
threshold no stationary distribution of
sequences is formed. Instead, the population
drifts randomly through sequence space. This
implies that all genotypes have only finite
lifetimes, inheritance breaks down and evolution becomes impossible.
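The threshold is easy to evaluate for concrete numbers. The sketch below assumes the single-peak result without mutational backflow, in which the stationary master frequency is $(\sigma_m Q_{mm} - 1)/(\sigma_m - 1)$ with $Q_{mm} = q^n$, and the minimum accuracy is $\sigma_m^{-1/n}$. The superiority of 10 and the chain length of 50 are hypothetical values chosen for illustration, and the function names are not taken from the source.

```python
def q_min(sigma, n):
    """Minimum single-digit accuracy for a master of superiority sigma."""
    return sigma ** (-1.0 / n)


def master_frequency(sigma, q, n):
    """Stationary master frequency without mutational backflow (0 if negative)."""
    q_mm = q ** n  # probability of copying all n digits correctly
    return max(0.0, (sigma * q_mm - 1.0) / (sigma - 1.0))


sigma, n = 10.0, 50          # assumed superiority and chain length
p_max = 1.0 - q_min(sigma, n)

for p in (0.0, 0.01, 0.02, 0.03, 0.04, p_max):
    x_m = master_frequency(sigma, 1.0 - p, n)
    print(f"p = {p:.4f}   master frequency = {x_m:.4f}")

print(f"error threshold p_max = {p_max:.4f}")
```

The master frequency falls from 1 at error-free replication to zero at the threshold. Because the threshold enters through $\sigma_m^{-1/n}$, lengthening the sequence tightens the tolerable error rate far more than an equal relative change in superiority loosens it.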
Figure 1.6 shows the stationary frequency of
the master sequence as a function of the error
rate. Variations in the accuracy of in-vitro replication can indeed be easily achieved because
error rates can be tuned over many orders of
magnitude (Leung et al., 1989; Martinez et al.,
1994). The range of replication accuracies that
are suitable for evolution is limited by the maximum accuracy that can be achieved by the
replication machinery and the minimum accuracy determined by the error threshold.
Populations in constant environments have an
advantage when they operate near the maximum accuracy because then they lose as few
copies as possible through mutation. In highly
variable environments the opposite is true: it
pays to produce as many mutants as possible
because then the chance is largest to cope successfully with change.

FIGURE 1.6 The genotypic error threshold (plot against the error rate p, with curves labeled frequency of master sequence and frequency of mutants, and with the accuracy limit of replication, the stationary mutant distribution and the error threshold marked). The fraction of mutants in stationary populations
increases with the error rate p. Stable stationary mutant distributions called quasispecies require
sufficient accuracy of replication: the single-digit accuracy has to exceed a minimal value known
as error threshold, $1 - p = q > q_{\min}$. Above threshold populations migrate through sequence space
in random walk-like manner (Huynen et al., 1996). There is also a lower limit to the error rate, which is given by the maximum accuracy of the replication machinery.

In order to be able to study stochastic features
of population dynamics around the error
threshold, the replication-mutation system was
modeled by a multitype branching process
(Demetrius et al., 1985). The main result of this
study is the derivation of an expression for the
probability of survival to infinite time for the
master sequence and its mutants. In the regime
of sufficiently accurate replication the survival
probability is non-zero and decreases with
increasing error rate. At the critical accuracy $q_{\min}$
this probability becomes zero. This implies that
all molecular species that are currently in the
population, master and mutants, will die out in
finite times and new variants will appear. This
scenario is tantamount to migration of the population through sequence space. The critical
accuracy $q_{\min}$, commonly seen as an error threshold for replication, can also be understood as the
localization threshold of the population in
sequence space (McCaskill, 1984). Later investigations aimed directly at a derivation of the
error threshold in finite populations (Nowak
and Schuster, 1989; Alves and Fontanari, 1998).
In order to check the relevance of the error
threshold for the replication of RNA viruses the
minimum accuracy of replication can be transformed into a maximum chain length $n_{\max}$ for a
given error rate p. The condition for stationarity
of the quasispecies then reads:

$$n < n_{\max} = -\frac{\ln \sigma_m}{\ln q} \approx \frac{\ln \sigma_m}{1 - q}. \qquad (10a)$$
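As a quick numerical check of this relation, the sketch below computes the maximum chain length both exactly, as $-\ln \sigma_m / \ln q$, and in the small-error approximation $\ln \sigma_m / p$. The superiority of 10 and the error rate of $10^{-4}$ per site are hypothetical inputs chosen for illustration, not values taken from the cited virus studies, and the function names are illustrative.

```python
import math


def n_max(sigma, p):
    """Maximum chain length -ln(sigma) / ln(1 - p) for error rate p per site."""
    return -math.log(sigma) / math.log1p(-p)


def n_max_approx(sigma, p):
    """Small-p approximation ln(sigma) / p."""
    return math.log(sigma) / p


sigma, p = 10.0, 1e-4        # assumed superiority and per-site error rate
exact = n_max(sigma, p)
approx = n_max_approx(sigma, p)
print(f"n_max (exact)  = {exact:.1f}")
print(f"n_max (approx) = {approx:.1f}")
```

Because the superiority enters only logarithmically, the maximum chain length is set almost entirely by the inverse of the error rate.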

The populations of most RNA viruses were
shown to live indeed near the above-mentioned
critical value of replication accuracy (Domingo,
1996; Domingo and Holland, 1997). In particular, the chain length n was found to be roughly

