Computational Biochemistry and Biophysics - Oren M. Becker

ISBN: 0-8247-0455-X
This book is printed on acid-free paper.
Headquarters
Marcel Dekker, Inc.
270 Madison Avenue, New York, NY 10016
tel: 212-696-9000; fax: 212-685-4540
Eastern Hemisphere Distribution
Marcel Dekker AG
Hutgasse 4, Postfach 812, CH-4001 Basel, Switzerland
tel: 41-61-261-8482; fax: 41-61-261-8896
World Wide Web

The publisher offers discounts on this book when ordered in bulk quantities. For more information,
write to Special Sales/Professional Marketing at the headquarters address above.
Copyright © 2001 by Marcel Dekker, Inc. All Rights Reserved.
Neither this book nor any part may be reproduced or transmitted in any form or by any means,
electronic or mechanical, including photocopying, microfilming, and recording, or by any
information storage and retrieval system, without permission in writing from the publisher.
Current printing (last digit):
10 9 8 7 6 5 4 3 2 1
PRINTED IN THE UNITED STATES OF AMERICA
Foreword
The long-range goal of molecular approaches to biology is to describe living systems in
terms of chemistry and physics. Over the last 70 years great progress has been made in
applying the quantum mechanical equations representing the underlying physical laws to
chemical problems involving the structures and reactions of small molecules. This work
was recognized in the awarding of the Nobel Prize in Chemistry to Walter Kohn and John
Pople in 1998. Computational studies of mesoscopic systems of biological interest have
been attempted only more recently. Classical mechanics is adequate for describing most
of the properties of these systems, and the molecular dynamics simulation method is the
most important theoretical approach used in such studies. The first molecular dynamics
simulation of a protein, the bovine pancreatic trypsin inhibitor (BPTI), was published
more than 20 years ago [1]. Although the simulation was “crude” by present standards,
it was significant because it introduced an important conceptual change in our view of
biomolecules. The classic view of biopolymers, like proteins and nucleic acids, had been
static in character. The remarkable detail evident in the protein crystal structures available
at that time led to an image of “rigid” biomolecules with every atom fixed in place [2].
The molecular dynamics simulation of BPTI was instrumental in changing the static view
of the structure of biomolecules to a dynamic picture. It is now recognized that the atoms
of which biopolymers are composed are in a state of constant motion at ordinary
temperatures. The X-ray structure of a protein provides the average atomic positions, but the
atoms exhibit fluidlike motions of sizable amplitudes about these averages. The new
understanding of protein dynamics subsumed the static picture in that the average positions are still
useful for the discussion of many aspects of biomolecule function in the language of
structural chemistry. The recognition of the importance of fluctuations opened the way
for more sophisticated and accurate interpretations of functional properties.
In the intervening years, molecular dynamics simulations of biomolecules have
undergone an explosive development and been applied to a wide range of problems [3,4].
Two attributes of molecular dynamics simulations have played an essential role in their
increasing use. The first is that simulations provide individual particle motions as a
function of time so they can answer detailed questions about the properties of a system, often
more easily than experiments. For many aspects of biomolecule function, it is these details
that are of interest (e.g., by what pathways does oxygen get into and exit the heme pocket
in myoglobin? How does the conformational change that triggers activity of ras p21 take
place?). The second attribute is that, although the potential used in the simulations is
approximate, it is completely under the user’s control, so that by removing or altering
specific contributions to the potential, their role in determining a given property can be
examined. This is most graphically demonstrated in the calculation of free energy
differences by “computer alchemy” in which the potential is transmuted reversibly from that
representing one system to another during a simulation [5].
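As a hedged illustration of the “computer alchemy” idea, the following Python sketch changes one harmonic potential into another in small steps and accumulates the free energy difference by exponential averaging over each step. The one-dimensional model, the function names, the number of windows, and the sample sizes are all assumptions made for this example; none of them is taken from reference [5] or from any simulation program. The harmonic case is chosen only because its exact answer, (kT/2) ln(k1/k0), is available for comparison.

```python
import math
import random


def exact_free_energy(k0, k1, kT=1.0):
    """Analytical F1 - F0 for classical 1D harmonic wells U = 0.5*k*x**2."""
    return 0.5 * kT * math.log(k1 / k0)


def alchemical_free_energy(k0, k1, kT=1.0, n_windows=10,
                           n_samples=20_000, seed=0):
    """Estimate F1 - F0 while the force constant is changed from k0 to k1.

    The change is split into n_windows small steps (hypothetical choice).
    In each window, configurations are drawn from the Boltzmann
    distribution of the current potential (a Gaussian for a harmonic
    well) and exp(-dU/kT) is averaged for the step to the next potential.
    """
    rng = random.Random(seed)
    delta_f = 0.0
    for w in range(n_windows):
        k_a = k0 + (k1 - k0) * w / n_windows          # current potential
        k_b = k0 + (k1 - k0) * (w + 1) / n_windows    # next potential
        sigma = math.sqrt(kT / k_a)                   # Boltzmann width for k_a
        acc = 0.0
        for _ in range(n_samples):
            x = rng.gauss(0.0, sigma)
            d_u = 0.5 * (k_b - k_a) * x * x           # U_b(x) - U_a(x)
            acc += math.exp(-d_u / kT)
        delta_f += -kT * math.log(acc / n_samples)    # free energy of this step
    return delta_f


if __name__ == "__main__":
    print("estimated:", alchemical_free_energy(1.0, 4.0))
    print("exact:    ", exact_free_energy(1.0, 4.0))
```

Splitting the change into several windows keeps each step small, so the configurations sampled in one window still overlap well with the next potential. In a real biomolecular calculation the Gaussian sampling would be replaced by a simulation at each intermediate potential.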
There are three types of applications of molecular dynamics simulation methods in
the study of macromolecules of biological interest, as in other areas that use such
simulations. The first uses the simulation simply as a means of sampling configuration space.
This is involved in the utilization of molecular dynamics, often with simulated annealing
protocols, to determine or refine structures with data obtained from experiments, such as
X-ray diffraction. The second uses simulations to determine equilibrium averages,
including structural and motional properties (e.g., atomic mean-square fluctuation amplitudes)
and the thermodynamics of the system. For such applications, it is necessary that the
simulations adequately sample configuration space, as in the first application, with the
additional condition that each point be weighted by the appropriate Boltzmann factor. The
third area employs simulations to examine the actual dynamics. Here not only is adequate
sampling of configuration space with appropriate Boltzmann weighting required, but it
must be done so as to properly represent the time development of the system. For the first
two areas, Monte Carlo simulations, as well as molecular dynamics, can be utilized. By
contrast, in the third area where the motions and their development are of interest, only
molecular dynamics can provide the necessary information. The three types of
applications, all of which are considered in the present volume, make increasing demands on the
simulation methodology in terms of the accuracy that is required.
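The second type of application, equilibrium averages with Boltzmann weighting, can be sketched with a few lines of Metropolis Monte Carlo. The example below is a toy illustration under stated assumptions: a single coordinate in a harmonic well, with hypothetical function and parameter names chosen for this sketch. It estimates the mean-square fluctuation of x, which for this potential should approach kT/k. Because Monte Carlo moves carry no physical time, a run of this kind yields averages but says nothing about the time development of the system.

```python
import math
import random


def metropolis_mean_square(k=1.0, kT=1.0, n_steps=200_000, step=2.0, seed=1):
    """Estimate <x**2> for U(x) = 0.5*k*x**2 by Metropolis Monte Carlo.

    Trial moves are uniform displacements in [-step, step]. A move is
    accepted with probability min(1, exp(-dU/kT)), so that visited
    points are distributed according to the Boltzmann factor.
    """
    rng = random.Random(seed)
    x = 0.0
    energy = 0.5 * k * x * x
    total = 0.0
    for _ in range(n_steps):
        trial = x + rng.uniform(-step, step)
        trial_energy = 0.5 * k * trial * trial
        d_u = trial_energy - energy
        # Metropolis acceptance test
        if d_u <= 0.0 or rng.random() < math.exp(-d_u / kT):
            x, energy = trial, trial_energy
        total += x * x        # rejected moves count the old point again
    return total / n_steps


if __name__ == "__main__":
    print("Monte Carlo <x^2>:", metropolis_mean_square())
    print("expected kT/k:    ", 1.0)
```

Counting the current point again after a rejected move is part of the method; leaving those repeats out would bias the average.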
In the early years of molecular dynamics simulations of biomolecules, almost all
scientists working in the field received specialized training (as graduate students and/or
postdoctoral fellows) that provided a detailed understanding of the power and limitations
of the approach. Now that the methodology is becoming more accessible (in terms of
ease of application of generally distributed programs and the availability of the required
computational resources) and better validated (in terms of published results), many people
are beginning to use simulation technology without training in the area. Molecular
dynamics simulations are becoming part of the “tool kit” used by everyone, even
experimentalists, who wish to obtain an understanding of the structure and function of biomolecules.
To be able to do this effectively, a person must have access to sources from which he or
she can obtain the background required for meaningful applications of the simulation
methodology. This volume has an important role to play in the transition of the field
from one limited to specialists (although they will continue to be needed to improve the
methodology and extend its applicability) to the mainstream of molecular biology. The
emphasis on an in-depth description of the computational methodology will make the
volume useful as an introduction to the field for many people who are doing simulations
for the first time. They will find it helpful also to look at two earlier volumes on
macromolecular simulations [3,4], as well as the classic general text on molecular dynamics
[6]. Equally important in the volume is the connection made with X-ray, neutron
scattering, and nuclear magnetic resonance experiments, areas in which molecular dynamics
simulations are playing an essential role. A number of well-chosen “special topics”
involving applications of simulation methods are described. Also, several chapters broaden
the perspective of the book by introducing approaches other than molecular dynamics for
modeling proteins and their interactions. They make the connection with what many
people regard, mistakenly in my view, as “computational biology.” Certainly with the
announced completion of a description of the human genome in a coarse-grained sense,
the part of computational biology concerned with the prediction of the structure and
function of gene products from a knowledge of the polypeptide sequence is an important
endeavor. However, equally important, and probably more so in the long run, is the
biophysical aspect of computational biology. The first set of Investigators in Computational
Biology chosen this year demonstrates that the Howard Hughes Foundation recognized
the importance of such biophysical studies to which this volume serves as an excellent
introduction.
I am very pleased to have been given the opportunity to contribute a Foreword to
this very useful book. It is a particular pleasure for me to do so because all the editors
and fifteen of the authors are alumni of my research group at Harvard, where molecular
dynamics simulations of biomolecules originated.
REFERENCES

1. JA McCammon, BR Gelin, and M Karplus. Nature 267:585, 1977.
2. DC Phillips. In: RH Sarma, ed. Biomolecular Stereodynamics, II. Guilderland, New York:
Adenine Press, 1981, p 497.
3. JA McCammon and S Harvey. Dynamics of Proteins and Nucleic Acids. Cambridge: Cambridge
University Press, 1987.
4. CL Brooks III, M Karplus, and BM Pettitt. Proteins: A Theoretical Perspective of Dynamics,
Structure, and Thermodynamics. New York: John Wiley & Sons, 1988.
5. For an early example, see J Gao, K Kuczera, B Tidor, and M Karplus. Science 244:1069–1072,
1989.
6. MP Allen and DJ Tildesley. Computer Simulation of Liquids. Oxford: Clarendon Press, 1987.
Martin Karplus
Laboratoire de chimie Biophysique, ISIS
Université Louis Pasteur
Strasbourg, France
and
Department of Chemistry and Chemical Biology
Harvard University
Cambridge, Massachusetts

Preface
The first dynamical simulation of a protein based on a detailed atomic model was reported
in 1977. Since then, the use of various theoretical and computational approaches has
contributed tremendously to our understanding of complex biomolecular systems such
as proteins, nucleic acids, and bilayer membranes. By providing detailed information on
biomolecular systems that is often experimentally inaccessible, computational approaches
based on detailed atomic models can help in the current efforts to understand the
relationship of the structure of biomolecules to their function. For that reason, they are now
considered to be an integral and essential component of research in modern biology,
biochemistry, and biophysics.
A number of books and journal articles reviewing computational methods relevant
to biophysical problems have been published in the last decade. Two of the most popular
texts, however, were published more than ten years ago: those of McCammon and Harvey
in 1987 and Brooks, Karplus, and Pettitt in 1988. There has been significant progress in
theoretical and computational methodologies since the publication of these books.
Therefore, we feel that there is a need for an updated, comprehensive text including the most
recent developments and applications in the field.
In recent years the significant increase in computer power along with the
implementation of a wide range of theoretical methods into sophisticated simulation programs has
greatly expanded the applicability of computational approaches to biological systems. The
expansion is such that interesting applications to important and complex biomolecular
systems are now often carried out by researchers with no special training in computational
methodologies. To successfully apply computational approaches to their systems of
interest, these “nonspecialists” must make several important choices about the proper methods
and techniques for the particular question that they are trying to address. We believe that
a good understanding of the theory behind the myriad of computational methods and
techniques can help in this process. Therefore, one of this book’s aims is to provide readers
with the required background to properly design and implement computational
investigations of biomolecular systems. In addition, the book provides the needed information for
calculating and interpreting experimentally observed properties on the basis of the results
generated by computer simulations.
This book is organized so that nonspecialists as well as more advanced users can
benefit. It can serve as both an introductory text to computational biology, making it useful
for students, and a reference source for active researchers in the field. We have tried
to compile a comprehensive but reasonably concise review of relevant theoretical and
computational methods that is self-contained. Therefore, the chapters, particularly in Part
I, are ordered so that the reader can easily follow from one topic to the next and be
systematically introduced to the theoretical methods used in computational studies of
biomolecular systems. The remainder of the book is designed so that the individual parts as
well as their chapters can be read independently. Additional technical details can be found
in the references listed in each chapter. Thus the book may also serve as a useful reference
for both theoreticians and experimentalists in all areas of biophysics and biochemical
research.
This volume thus presents a current and comprehensive account of computational
methods and their application to biological macromolecules. We hope that it will serve
as a useful tool to guide future investigations of proteins, nucleic acids, and biological
membranes, so that the mysteries of biological molecules can continue to be revealed.
We are grateful to the many colleagues we have worked with, collaborated with,
and grown with over the course of our research careers. The multidimensionality of those
interactions has allowed us to grow in many facets of our lives. Special thanks to Professor
Martin Karplus for contributing the Foreword of this book and, most important, for
supplying the insights, knowledge, and environment that laid the foundation for our scientific
pursuits in computational biochemistry and biophysics and led directly to the creation of
this book. Finally, we wish to acknowledge the support of all our friends and family.
Oren M. Becker
Alexander D. MacKerell, Jr.
Benoît Roux
Masakatsu Watanabe
Contents
Foreword Martin Karplus
Preface
Contributors
Part I Computational Methods
1. Introduction
Oren M. Becker, Alexander D. MacKerell, Jr., Benoît Roux, and Masakatsu Watanabe
2. Atomistic Models and Force Fields
Alexander D. MacKerell, Jr.
3. Dynamics Methods
Oren M. Becker and Masakatsu Watanabe
4. Conformational Analysis
Oren M. Becker
5. Treatment of Long-Range Forces and Potential
Thomas A. Darden
6. Internal Coordinate Simulation Method
Alexey K. Mazur
7. Implicit Solvent Models
Benoît Roux
8. Normal Mode Analysis of Biological Molecules
Steven Hayward
9. Free Energy Calculations
Thomas Simonson
10. Reaction Rates and Transition Pathways
John E. Straub
11. Computer Simulation of Biochemical Reactions with QM–MM Methods
Paul D. Lyne and Owen A. Walsh
Part II Experimental Data Analysis
12. X-Ray and Neutron Scattering as Probes of the Dynamics of Biological Molecules
Jeremy C. Smith
13. Applications of Molecular Modeling in NMR Structure Determination
Michael Nilges
Part III Modeling and Design
14. Comparative Protein Structure Modeling
András Fiser, Roberto Sánchez, Francisco Melo, and Andrej Šali
15. Bayesian Statistics in Molecular and Structural Biology
Roland L. Dunbrack, Jr.
16. Computer Aided Drug Design
Alexander Tropsha and Weifan Zheng
Part IV Advanced Applications
17. Protein Folding: Computational Approaches
Oren M. Becker
18. Simulations of Electron Transfer Proteins
Toshiko Ichiye
19. The RISM-SCF/MCSCF Approach for Chemical Processes in Solutions
Fumio Hirata, Hirofumi Sato, Seiichiro Ten-no, and Shigeki Kato
20. Nucleic Acid Simulations
Alexander D. MacKerell, Jr. and Lennart Nilsson
21. Membrane Simulations
Douglas J. Tobias
Appendix: Useful Internet Resources
Index
Contributors

Oren M. Becker, Department of Chemical Physics, School of Chemistry, Tel Aviv University, Tel Aviv, Israel
Thomas A. Darden, Laboratory of Structural Biology, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, North Carolina
Roland L. Dunbrack, Jr., Institute for Cancer Research, Fox Chase Cancer Center, Philadelphia, Pennsylvania
András Fiser, Laboratories of Molecular Biophysics, The Rockefeller University, New York, New York
Steven Hayward, School of Information Systems, University of East Anglia, Norwich, England
Fumio Hirata, Department of Theoretical Study, Institute for Molecular Science, Okazaki National Research Institutes, Okazaki, Japan
Toshiko Ichiye, School of Molecular Biosciences, Washington State University, Pullman, Washington
Shigeki Kato, Department of Chemistry, Kyoto University, Kyoto, Japan
Paul D. Lyne, Computer Aided Drug Design, Biogen, Inc., Cambridge, Massachusetts
Alexander D. MacKerell, Jr., School of Pharmacy, University of Maryland, Baltimore, Maryland
Alexey K. Mazur, Institut de Biologie Physico-Chimique, CNRS, Paris, France
Francisco Melo, Laboratories of Molecular Biophysics, The Rockefeller University, New York, New York
Michael Nilges, Structural and Computational Biology Program, European Molecular Biology Laboratory, Heidelberg, Germany
Lennart Nilsson, Department of Biosciences at NOVUM, Karolinska Institutet, Huddinge, Sweden
Benoît Roux, Department of Biochemistry and Structural Biology, Weill Medical College of Cornell University, New York, New York
Andrej Šali, Laboratories of Molecular Biophysics, The Rockefeller University, New York, New York
Roberto Sánchez, Laboratories of Molecular Biophysics, The Rockefeller University, New York, New York
Hirofumi Sato, Department of Theoretical Study, Institute for Molecular Science, Okazaki National Research Institutes, Okazaki, Japan
Thomas Simonson, Laboratory for Structural Biology and Genomics, Centre National de la Recherche Scientifique, Strasbourg, France
Jeremy C. Smith, Lehrstuhl für Biocomputing, Interdisziplinäres Zentrum für Wissenschaftliches Rechnen der Universität Heidelberg, Heidelberg, Germany
John E. Straub, Department of Chemistry, Boston University, Boston, Massachusetts
Seiichiro Ten-no, Graduate School of Information Science, Nagoya University, Nagoya, Japan
Douglas J. Tobias, Department of Chemistry, University of California at Irvine, Irvine, California
Alexander Tropsha, Laboratory for Molecular Modeling, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
Owen A. Walsh, Physical and Theoretical Chemistry Laboratory, Oxford University, Oxford, England
Masakatsu Watanabe*, Moldyn, Inc., Cambridge, Massachusetts
Weifan Zheng, Laboratory for Molecular Modeling, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
* Current affiliation: Wavefunction, Inc., Irvine, California.
1
Introduction
Oren M. Becker
Tel Aviv University, Tel Aviv, Israel
Alexander D. MacKerell, Jr.
University of Maryland, Baltimore, Maryland
Benoît Roux
Weill Medical College of Cornell University, New York, New York
Masakatsu Watanabe*
Moldyn, Inc., Cambridge, Massachusetts
I. INTRODUCTION
The first hints of the chemical basis of life were noted approximately 150 years ago.
Leading up to this initial awareness was a series of insights that living organisms comprise
a hierarchy of structures: organs, which are composed of individual cells, which are
themselves formed of organelles of different chemical compositions, and so on. From this
realization and the observation that nonviable extracts from organisms such as yeast could
by themselves catalyze chemical reactions, it became clear that life itself was the result
of a complex combination of individual chemicals and chemical reactions. These advances
stimulated investigations into the nature of the molecules responsible for biochemical
reactions, culminating in the discovery of the genetic code and the molecular structure of
deoxyribonucleic acid (DNA) in the early 1950s by Watson and Crick [1]. One of the
most fascinating aspects of their discovery was that an understanding of the mechanism
by which the genetic code functioned could not be achieved until knowledge of the
three-dimensional (3D) structure of DNA was attained. The discovery of the structure of DNA
and its relationship to DNA function had a tremendous impact on all subsequent
biochemical investigations, basically defining the paradigm of modern biochemistry and molecular
biology. This established the primary importance of molecular structure for an
understanding of the function of biological molecules and the need to investigate the relationship
between structure and function in order to advance our understanding of the fundamental
processes of life.
As the molecular structure of DNA was being elucidated, scientists made significant
contributions to revealing the structures of proteins and enzymes. Sanger [2] resolved the
primary sequence of insulin in 1953, followed by that of an enzyme, ribonuclease A, 10
years later. The late 1950s saw the first high resolution 3D structures of proteins,
myoglobin and hemoglobin, as determined by Kendrew et al. [3] and Perutz et al. [4], respectively,
followed by the first 3D structure of an enzyme, lysozyme, by Phillips and coworkers [5]
in 1965. Since then, the structures of a very large number of proteins and other biological
molecules have been determined. There are currently over 10,000 3D structures of proteins
available [6] along with several hundred DNA and RNA structures [7] and a number of
protein–nucleic acid complexes.
Prior to the elucidation of the 3D structure of proteins via experimental methods,
theoretical approaches made significant inroads toward understanding protein structure. One
of the most significant contributions was made by Pauling and Corey [8] in 1951, when
they predicted the existence of the main elements of secondary structure in proteins, the
α-helix and β-sheet. Their prediction was soon confirmed by Perutz [9], who obtained the
first glimpse of the secondary structure at low resolution. This landmark work by Pauling
and Corey marked the dawn of theoretical studies of biomolecules. It was followed by
prediction of the allowed conformations of amino acids, the basic building blocks of proteins,
in 1963 by Ramachandran et al. [10]. This work, which was based on simple hard-sphere
models, indicated the potential of computational approaches as tools for understanding the
atomic details of biomolecules. Energy minimization algorithms with an explicit potential
energy function soon followed to assist in the refinement of model structures of peptides
by Scheraga [11] and of crystal structures of proteins by Levitt and Lifson [12].
The availability of the first protein structures determined by X-ray crystallography
led to the initial view that these molecules were very rigid, an idea consistent with the
lock-and-key model of enzyme catalysis. Detailed analysis of protein structures, however,
indicated that proteins had to be flexible in order to perform their biological functions.
For example, in the case of myoglobin and hemoglobin, there is no path for the escape
of O₂ from the heme-binding pocket in the crystal structure; the protein must change
structure in order for the O₂ to be released. This and other realizations led to a rethinking
of the properties of proteins, which resulted in a more dynamic picture of protein structure.
Experimental methods have been developed to investigate the dynamic properties of
proteins; however, the information content from these studies is generally isotropic in nature,
affording little insight into the atomic details of these fluctuations [13]. Atomic resolution
information on the dynamics of proteins as well as other biomolecules and the relationship
of dynamics to function is an area where computational studies can extend our knowledge
beyond what is accessible to experimentalists.
The first detailed microscopic view of atomic motions in a protein was provided in
1977 via a molecular dynamics (MD) simulation of bovine pancreatic trypsin inhibitor
by McCammon et al. [14]. This work, marking the beginning of modern computational
biochemistry and biophysics, has been followed by a large number of theoretical
investigations of many complex biomolecular systems. It is this large body of work, including the
numerous methodological advances in computational studies of biomolecules over the last
decade, that largely motivated the production of the present book.
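To make the idea of following atomic motions in time concrete, here is a minimal sketch of a molecular dynamics integrator, the velocity Verlet scheme, applied to a single particle in a harmonic well. It is an illustrative assumption, not the algorithm or code used by McCammon et al. [14]. The function name, the reduced units, and the toy force are hypothetical choices made for this example.

```python
import math


def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate Newton's equations for one coordinate with velocity Verlet.

    Returns the trajectory as a list of (position, velocity) pairs,
    including the starting point. `force` is any callable returning
    the force at a given position.
    """
    trajectory = [(x, v)]
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append((x, v))
    return trajectory


if __name__ == "__main__":
    # Harmonic oscillator with k = m = 1, started at x = 1 with v = 0;
    # the exact solution is x(t) = cos(t).
    traj = velocity_verlet(1.0, 0.0, lambda pos: -pos, 1.0, 0.01, 1000)
    x_end, v_end = traj[-1]
    print("x(10) numerical:", x_end)
    print("x(10) exact:    ", math.cos(10.0))
```

In a biomolecular simulation the same update is applied to every atom, with the force obtained from the gradient of an empirical potential energy function. Monitoring the total energy along the trajectory is a common check that the time step is small enough.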
II. OVERVIEW OF COMPUTATIONAL BIOCHEMISTRY AND BIOPHYSICS
Although the dynamic nature of biological molecules has been well accepted for over
20 years, the extent of that flexibility, as manifested in the large structural changes that
biomolecules can undergo, has recently become clearer due to the availability of
experimentally determined structures of the same biological molecules in different environments.
For example, the enzyme triosephosphate isomerase contains an 11 amino acid residue
loop that moves by more than 7 Å following the binding of substrate, leading to a
catalytically competent structure [15,16]. In the enzyme cytosine-5-methyltransferase, a loop
containing one of the catalytically essential residues undergoes a large conformational change
upon formation of the DNA–coenzyme–protein complex, leading to some residues
changing position by over 20 Å [17]. DNA, typically envisioned in the canonical B form [18],
has been shown to undergo significant distortions upon binding to proteins. Bending of
90° has been seen in the CAP–DNA complex [19], and binding of the TATA box binding
protein to the TATAAAA consensus sequence leads to the DNA assuming a unique
conformation referred to as the TA form [20]. Even though experimental studies can reveal
the end points associated with these conformational transitions, these methods typically
cannot access structural details of the pathway between the end points. Such information
is directly accessible via computational approaches.
Computational approaches can be used to investigate the energetics associated with
changes in both conformation and chemical structure. An example is afforded by the
conformational transitions discussed in the preceding paragraph. Conformational free
energy differences and barriers can be calculated and then directly compared with
experimental results. Overviews of these methods are included in Chapters 9 and 10. Recent advances
in techniques that combine quantum mechanical (QM) approaches with molecular
mechanics (MM) now allow for a detailed understanding of processes involving bond
breaking and bond making and how enzymes can accelerate those reactions. Chapter 11 gives
a detailed overview of the implementation and current status of QM/MM methods. The
ability of computational biochemistry to reveal the microscopic events controlling reaction
rates and equilibrium at the atomic level is one of its greatest strengths.
Biological membranes provide the essential barrier between cells and the organelles
of which cells are composed. Cellular membranes are complicated extensive biomolecular
sheetlike structures, mostly formed by lipid molecules held together by cooperative
noncovalent interactions. A membrane is not a static structure, but rather a complex dynamical
two-dimensional liquid crystalline fluid mosaic of oriented proteins and lipids. A number
of experimental approaches can be used to investigate and characterize biological
membranes. However, the complexity of membranes is such that experimental data remain
very difficult to interpret at the microscopic level. In recent years, computational studies
of membranes based on detailed atomic models, as summarized in Chapter 21, have greatly
increased the ability to interpret experimental data, yielding a much-improved picture of
the structure and dynamics of lipid bilayers and the relationship of those properties to
membrane function [21].
Computational approaches are now being used to facilitate the experimental
determination of macromolecular structures by aiding in structural refinement based on either
nuclear magnetic resonance (NMR) or X-ray data. The current status of the application
of computational methods to the determination of biomolecular structure and dynamics
is presented in Chapters 12 and 13. Computational approaches can also be applied in
situations where experimentally determined structures are not available. With the rapid
advances in gene technology, including the human genome project, the ability of
computational approaches to accurately predict 3D structures based on primary sequence represents
an area that is expected to have a significant impact. Prediction of the 3D structures of
proteins can be performed via homology modeling or threading methods; various
approaches to this problem are presented in Chapters 14 and 15. Related to this is the area
of protein folding. As has been known since the seminal experimental refolding studies
of ribonuclease A in the 1950s, the primary structure of many proteins dictates their 3D
structure [22]. Accordingly, it should be possible “in principle” to compute the 3D
structure of many proteins based on knowledge of just their primary sequences. Although this
has yet to be achieved on a wide scale, considerable efforts are being made to attain this
goal, as overviewed in Chapter 17.
Drug design and development is another area of research where computational
biochemistry and biophysics are having an ever-increasing impact. Computational approaches
can be used to aid in the refinement of drug candidates, systematically changing a drug’s
structure to improve its pharmacological properties, as well as in the identification of novel
lead compounds. The latter can be performed via the identification of compounds with a
high potential for activity from available databases of chemical compounds or via de novo
drug design approaches, which build totally novel ligands into the binding sites of target
molecules. Techniques used for these types of studies are presented in Chapter 16. In
addition to aiding in the design of compounds that target specific molecules, computational
approaches offer the possibility of being able to improve the ability of drugs to access their
targets in the body. These gains will be made through an understanding of the energetics
associated with the crossing of lipid membranes and using the information to rationally
enhance drug absorption rates. As evidenced by the recent contribution of computational
approaches in the development of inhibitors of the HIV protease, many of which are
currently on the market, it can be expected that these methods will continue to have an
increasing role in drug design and development.
Clearly, computational and theoretical studies of biological molecules have ad-
vanced significantly in recent years and will progress rapidly in the future. These advances
have been partially fueled by the ever-increasing number of available structures of pro-
teins, nucleic acids, and carbohydrates, but at the same time significant methodological
improvements have been made in the area of physics relevant to biological molecules.
These advances have allowed for computational studies of biochemical processes to be
performed with greater accuracy and under conditions that allow for direct comparison
with experimental studies. Examples include improved force fields, treatment of long-
range atom–atom interactions, and a variety of algorithmic advances, as covered in Chap-
ters 2 through 8. The combination of these advances with the exponential increases in
computational resources has greatly extended and will continue to expand the applicability
of computational approaches to biomolecules.
III. SCOPE OF THE BOOK
The overall scope of this book is the implementation and application of available theoreti-
cal and computational methods toward understanding the structure, dynamics, and function
of biological molecules, namely proteins, nucleic acids, carbohydrates, and membranes.
The large number of computational tools already available in computational chemistry
precludes covering all topics, as Schleyer et al. are doing in The Encyclopedia of Computational
Chemistry [23]. Instead, we have attempted to create a book that covers currently
available theoretical methods applicable to biomolecular research along with the appro-
priate computational applications. We have designed it to focus on the area of biomolecu-
lar computations with emphasis on the special requirements associated with the treatment
of macromolecules.
Part I provides an introduction to the field of computational biochemistry and bio-
physics for nonspecialists, with the later chapters in Part I presenting more advanced
techniques that will be of interest to both the nonspecialist and the more advanced reader.
Part II presents approaches to extract information from computational studies for the inter-
pretation of experimental data. Part III focuses on methods for modeling and designing
molecules. Chapters 14 and 15 are devoted to the determination and modeling of protein
structures based on limited available experimental information such as primary sequence.
Chapter 16 discusses recent developments in computer-aided drug design. The algorithms
presented in Part III will see expanding use as the fields of genomics and bioinformatics
continue to evolve. The final section, Part IV, presents a collection of overviews
of various state-of-the-art theoretical methods and applications in specific areas relevant
to biomolecules: protein folding (Chapter 17), protein simulation (Chapter 18), chemical
process in solution (Chapter 19), nucleic acids simulation (Chapter 20), and membrane
simulation (Chapter 21).
In combination, the book should serve as a useful reference for both theoreticians
and experimentalists in all areas of biophysical and biochemical research. Its content repre-
sents progress made over the last decade in the area of computational biochemistry and
biophysics. Books by Brooks et al. [24] and McCammon and Harvey [25] are recom-
mended for an overview of earlier developments in the field. Although efforts have been
made to include the most recent advances in the field along with the underlying fundamen-
tal concepts, it is to be expected that further advances will be made even as this book is
being published. To help the reader keep abreast of these advances, we present a list of
useful WWW sites in the Appendix.
IV. TOWARD A NEW ERA
The 1998 Nobel Prize in Chemistry was given to John A. Pople and Walter Kohn for
their work in the area of quantum chemistry, signifying the widespread acceptance of
computation as a valid tool for investigating chemical phenomena. With its extension to
biomolecular systems, the range of possible applications of computational chemistry was
greatly expanded. Though still a relatively young field, computational biochemistry and
biophysics is now pervasive in all aspects of the biological sciences. These methods have
aided in the interpretation of experimental data, and will continue to do so, allowing for
the more rational design of new experiments, thereby facilitating investigations in the
biological sciences. Computational methods will also allow access to information beyond
that obtainable via experimental techniques. Indeed, computer-based approaches for the
study of virtually any chemical or biological phenomenon may represent the most powerful
tool now available to scientists, allowing for studies at an unprecedented level of detail.
It is our hope that the present book will help expand the accessibility of computational
approaches to the vast community of scientists investigating biological systems.
REFERENCES
1. JD Watson, FHC Crick. Nature 171:737, 1953.
2. F Sanger. Annu Rev Biochem 57:1, 1988.
3. JC Kendrew, G Bodo, MH Dintzis, RG Parrish, H Wyckoff, DC Phillips. Nature 181:622, 1958.
4. MF Perutz, MG Rossmann, AF Cullis, H Muirhead, G Will, ACT North. Nature 185:416,
1960.
5. CCF Blake, DF Koenig, GA Mair, ACT North, DC Phillips, VR Sarma. Nature 206:757, 1965.
6. FC Bernstein, TF Koetzle, GJB Williams, DF Meyer Jr, MD Brice, JR Rodgers, O Kennard,
T Shimanouchi, M Tasumi. J Mol Biol 112:535, 1977.
7. HM Berman, WK Olson, DL Beveridge, J Westbrook, A Gelbin, T Demeny, S-H Hsieh, AR
Srinivasan, B Schneider. Biophys J 63:751, 1992.
8. L Pauling, RB Corey. Proc Roy Soc Lond B141:10, 1953.
9. MF Perutz. Nature 167:1053, 1951.
10. GN Ramachandran, C Ramakrishnan, V Sasisekharan. J Mol Biol 7:95, 1963.
11. HA Scheraga. Adv Phys Org Chem 6:103, 1968.
12. M Levitt, S Lifson. J Mol Biol 46:269, 1969.
13. M Karplus, GA Petsko. Nature 347:631, 1990.
14. JA McCammon, BR Gelin, M Karplus. Nature 267:585, 1977.
15. D Joseph, GA Petsko, M Karplus. Science 249:1425, 1990.
16. DL Pompliano, A Peyman, JR Knowles. Biochemistry 29:3186, 1990.
17. S Klimasauskas, S Kumar, RJ Roberts, X Cheng. Cell 76:357, 1994.
18. W Saenger. Principles of Nucleic Acid Structure. New York: Springer-Verlag, 1984.
19. SC Schultz, GC Shields, TA Steitz. Science 253:1001, 1991.
20. G Guzikevich-Guerstein, Z Shakked. Nature Struct Biol 3:32, 1996.
21. KM Merz Jr, B Roux, eds. Biological Membranes: A Molecular Perspective from Computation
and Experiment. Boston: Birkhauser, 1996.
22. CB Anfinsen. Science 181:223, 1973.
23. PvR Schleyer, NL Allinger, T Clark, J Gasteiger, PA Kollman, HF Schaefer III, PR Schreiner,
eds. The Encyclopedia of Computational Chemistry. Chichester: Wiley, 1998.
24. CL Brooks III, M Karplus, BM Pettitt. Proteins, A Theoretical Perspective: Dynamics, Struc-
ture, and Thermodynamics, Vol 71. New York: Wiley, 1988.
25. JA McCammon, SC Harvey. Dynamics of Proteins and Nucleic Acids. New York: Cambridge
University Press, 1987.

2
Atomistic Models and Force Fields
Alexander D. MacKerell, Jr.
University of Maryland, Baltimore, Maryland
I. INTRODUCTION
Central to the success of any computational approach to the study of chemical systems
is the quality of the mathematical model used to calculate the energy of the system as a
function of its structure. For smaller chemical systems studied in the gas phase, quantum
mechanical (QM) approaches are appropriate. The success of these methods was empha-
sized by the selection of John A. Pople and Walter Kohn as winners of the 1998 Nobel
prize in chemistry. These methods, however, are typically limited to systems of approximately
100 atoms or fewer, although approaches to treat large systems are under development [1].
Systems of biochemical or biophysical interest typically involve macromolecules
that contain 1000–5000 or more atoms plus their condensed phase environment. This can
lead to biochemical systems containing 20,000 atoms or more. In addition, the inherent
dynamical nature of biochemicals and the mobility of their environments [2,3] require
that a large number of conformations, generated via various methods (see Chapters 3, 4, 6,
and 10), be subjected to energy calculations. Thus, an energy function is required that
allows for $10^{6}$ or more energy calculations on systems containing on the order of $10^{5}$
atoms.
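To give a sense of scale, the following rough count is an illustrative estimate rather than a figure from the chapter. It assumes that every pair of atoms contributes one nonbonded term, with $N$ denoting the number of atoms:

$$
N_{\text{pairs}} = \frac{N(N-1)}{2} \approx \frac{(10^{5})^{2}}{2} = 5 \times 10^{9}
$$

Under that assumption, $10^{6}$ energy evaluations would involve on the order of $5 \times 10^{15}$ pair terms, which suggests why each individual term must be inexpensive to compute.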
Empirical energy functions can fulfill the demands required by computational stud-
ies of biochemical and biophysical systems. The mathematical equations in empirical en-
ergy functions include relatively simple terms to describe the physical interactions that
dictate the structure and dynamic properties of biological molecules. In addition, empirical
force fields use atomistic models, in which atoms are the smallest particles in the system
rather than the electrons and nuclei used in quantum mechanics. These two simplifications
provide the computational speed needed to perform the required number of energy
calculations on biomolecules in their environments. More important, the use of properly
optimized parameters in the mathematical models allows the required chemical accuracy
to be achieved. Empirical energy functions were initially applied to small organic
molecules, where the approach was referred to as molecular mechanics [4], and more
recently to biological systems [2,3].
II. POTENTIAL ENERGY FUNCTIONS
A. Potential Energy Functions for the Treatment of
Biological Molecules
A potential energy function is a mathematical equation that allows for the potential energy,
V, of a chemical system to be calculated as a function of its three-dimensional (3D) struc-
ture, R. The equation includes terms describing the various physical interactions that dic-
tate the structure and properties of a chemical system. The total potential energy of a
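chemical system with a defined 3D structure, $V(R)_{\text{total}}$, can be separated into terms for the
internal, $V(R)_{\text{internal}}$, and external, $V(R)_{\text{external}}$, potential energy as described in the following
equations.

$$
V(R)_{\text{total}} = V(R)_{\text{internal}} + V(R)_{\text{external}} \tag{1}
$$

$$
V(R)_{\text{internal}} = \sum_{\text{bonds}} K_b (b - b_0)^2 + \sum_{\text{angles}} K_\theta (\theta - \theta_0)^2 + \sum_{\text{dihedrals}} K_\chi \left[ 1 + \cos(n\chi - \delta) \right] \tag{2}
$$

and

$$
V(R)_{\text{external}} = \sum_{\substack{\text{nonbonded} \\ \text{atom pairs}}} \left( \varepsilon_{ij} \left[ \left( \frac{R_{\min,ij}}{r_{ij}} \right)^{12} - \left( \frac{R_{\min,ij}}{r_{ij}} \right)^{6} \right] + \frac{q_i q_j}{\varepsilon_D r_{ij}} \right) \tag{3}
$$
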
The internal terms are associated with covalently connected atoms, and the external terms
represent the noncovalent or nonbonded interactions between atoms. The external terms
are also referred to as interaction, nonbonded, or intermolecular terms.
Beyond the form of Eqs. (1)–(3), which is discussed below, it is important to emphasize
the difference between the terms associated with the 3D structure, R, being subjected
to the energy calculation and the parameters in the equations. The terms obtained from
the 3D structure are the bond lengths, b; the valence angles, θ; the dihedral or torsion
angles, χ; and the distances between the atoms, $r_{ij}$. A diagrammatic representation of two
hypothetical molecules in Figure 1 allows for visualization of these terms. The values
of these terms are typically obtained from experimental structures generated from X-ray
crystallography or NMR experiments (see Chapter 13), from modeled structures (e.g.,
from homology modeling of a protein; see Chapters 14 and 15), or from a structure generated
during a molecular dynamics (MD) or Monte Carlo (MC) simulation. The remaining terms
in Eqs. (2) and (3) are referred to as the parameters. These terms are associated with the
particular type of atom and the types of atoms covalently bound to it. For example, the
parameter q, the partial atomic charge, of a sodium cation is typically set to +1, while
that of a chloride anion is set to −1. Another example is a C–C single bond versus a
C=C double bond, where the former may have bond parameters of $b_0$ = 1.53 Å, $K_b$ =
225 kcal/(mol·Å²) and the latter $b_0$ = 1.33 Å, $K_b$ = 500 kcal/(mol·Å²). Thus, different
parameters allow for different types of atoms and different molecular connectivities to be
treated using the same form of Eqs. (2) and (3). Indeed, it is the quality of the parameters,
as judged by their ability to reproduce experimentally and quantum-mechanically determined
target data (i.e., information on selected molecules that the parameters are adjusted
to reproduce) that ultimately determines the accuracy of the results obtained from computational
studies of biological molecules. Details of the parameter optimization process are
discussed below.
Figure 1 Hypothetical molecules to illustrate the energetic terms included in Eqs. (1)–(3). Molecule
A comprises atoms 1–4, and molecule B comprises atom 5. Internal terms that occur in molecule
A are the bonds, b, between atoms 1 and 2, 2 and 3, and 3 and 4; angles θ, involving atoms 1–2–3
and atoms 2–3–4, and a dihedral or torsional angle, χ, described by atoms 1–2–3–4. Bonds can
also be referred to as 1,2 atom pairs or 1,2 interactions; angles as 1,3 atom pairs or 1,3 interactions;
and dihedrals as 1,4 atom pairs or 1,4 interactions. Molecule B is involved in external interactions
with all four atoms in molecule A, where the different interatomic distances, $r_{ij}$, must be known.
Note that external interactions (both van der Waals and Coulombic) can occur between the 1,2, 1,3,
and 1,4 pairs in molecule A. However, external interactions involving 1,2 and 1,3 interactions are
generally not included as part of the external energy (i.e., 1,2 and 1,3 exclusions), but 1,4 interactions
are. Often the 1,4 external interaction energies are scaled (i.e., 1,4 scaling) to diminish the influence
of these external interactions on geometries, vibrations, and conformational energetics. It should
also be noted that additional atoms that could be present in molecule A would represent 1,5 interactions,
1,6 interactions, and so on, and would also interact with each other via the external terms.
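The exclusion and scaling conventions described in the Figure 1 caption can be sketched in a few lines of Python. This is a minimal illustration, not the bookkeeping of any particular program: the function names, the zero-based atom numbering, and the 1,4 scaling factor of 0.5 are all assumptions made for the example.

```python
from collections import deque

def topological_distances(n_atoms, bonds):
    """Return {(i, j): number of bonds separating i and j} for connected pairs, i < j."""
    neighbors = {i: set() for i in range(n_atoms)}
    for a, b in bonds:
        neighbors[a].add(b)
        neighbors[b].add(a)
    dist = {}
    for start in range(n_atoms):
        # Breadth-first search over the covalent bond graph
        seen = {start: 0}
        queue = deque([start])
        while queue:
            cur = queue.popleft()
            for nxt in neighbors[cur]:
                if nxt not in seen:
                    seen[nxt] = seen[cur] + 1
                    queue.append(nxt)
        for other, d in seen.items():
            if start < other:
                dist[(start, other)] = d
    return dist

def nonbonded_pairs(n_atoms, bonds, scale_14=0.5):
    """List (i, j, scale) for atom pairs kept in the external energy.

    1,2 and 1,3 pairs are excluded, 1,4 pairs are scaled by scale_14
    (a hypothetical value), and all other pairs get a scale of 1.0.
    """
    dist = topological_distances(n_atoms, bonds)
    pairs = []
    for i in range(n_atoms):
        for j in range(i + 1, n_atoms):
            d = dist.get((i, j))  # None when i and j are in different molecules
            if d in (1, 2):
                continue
            pairs.append((i, j, scale_14 if d == 3 else 1.0))
    return pairs

# Figure 1 with zero-based indices: molecule A is the chain 0-1-2-3, molecule B is atom 4
pairs = nonbonded_pairs(5, [(0, 1), (1, 2), (2, 3)])
for i, j, scale in pairs:
    print(i, j, scale)
```

For the Figure 1 system, only the 1,4 pair of molecule A and the four A–B pairs survive, which matches the description in the caption.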
The mathematical form of Eqs. (2) and (3) represents a compromise between sim-
plicity and chemical accuracy. Both the bond-stretching and angle-bending terms are
treated harmonically, which effectively keeps the bonds and angles near their equilibrium
values. Bond and angle parameters include $b_0$ and $\theta_0$, the equilibrium bond length and
equilibrium angle, respectively. $K_b$ and $K_\theta$ are the force constants associated with the bond
and angle terms, respectively. The use of harmonic terms is sufficient for the conditions
under which biological computations are performed. Typically MD or MC simulations
are performed in the vicinity of room temperature and in the absence of bond-breaking
or bond-making events; because the bonds and angles stay close to their equilibrium values
at room temperature, the harmonic energy surfaces accurately represent the local bond
and angle distortions. It should be noted that the absence of bond breaking is essential
for simulated annealing calculations performed at elevated temperatures (see Chapter 13).
Dihedral or torsion angles represent the rotations that occur about a bond, leading to
changes in the relative positions of atoms 1 and 4 as described in Figure 1. These terms
are oscillatory in nature (e.g., rotation about the C–C bond in ethane changes the structure
from a low energy staggered conformation to a high energy eclipsed conformation, then
back to a low energy staggered conformation, and so on), requiring the use of a sinusoidal
function to accurately model them.
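The harmonic bond and angle terms of Eq. (2) can be illustrated with a short Python sketch. The bond parameters are the C–C single bond values quoted earlier in the text; the angle parameters and the chosen distortions are hypothetical values picked only for illustration.

```python
import math

def harmonic_energy(value, equilibrium, force_constant):
    """K * (x - x0)**2, the form used for bonds and angles in Eq. (2) (no factor of 1/2)."""
    return force_constant * (value - equilibrium) ** 2

# C-C single bond parameters quoted in the text: b0 = 1.53 Å, Kb = 225 kcal/(mol·Å²)
e_bond = harmonic_energy(1.58, 1.53, 225.0)  # bond stretched by 0.05 Å

# Hypothetical angle parameters: theta0 = 109.5°, K_theta = 50 kcal/(mol·rad²)
theta0 = math.radians(109.5)
e_angle = harmonic_energy(math.radians(114.5), theta0, 50.0)  # angle opened by 5°

print(f"bond: {e_bond:.4f} kcal/mol, angle: {e_angle:.4f} kcal/mol")
```

With these numbers a 0.05 Å stretch costs roughly 0.56 kcal/mol, which is comparable to the thermal energy at room temperature (about 0.6 kcal/mol) and is consistent with the statement that bonds stay close to their equilibrium values.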
In Eq. (2), the dihedral term includes parameters for the force constant, $K_\chi$; the
periodicity or multiplicity, n; and the phase, δ. The magnitude of $K_\chi$ dictates the height
of the barrier to rotation, such that $K_\chi$ associated with a double bond would be significantly
larger than that for a single bond. The periodicity, n, indicates the number of cycles per
360° rotation about the dihedral. In the case of an sp³–sp³ bond, as in ethane, n would
equal 3, while the sp²–sp² C=C bond in ethylene would have n = 2. The phase, δ,
dictates the location of the maxima in the dihedral energy surface, allowing for the location
of the minima for a dihedral with n = 2 to be shifted from 0° to 90° and so on. Typically,
δ is equal to 0° or 180°, although recent extensions allow any value from 0° to 360° to be
assigned to δ [5]. Finally, each torsion angle in a molecule may be treated with a sum of
dihedral terms that have different multiplicities, as well as force constants and phases [e.g.,
the peptide bond can be treated by a summation of 1-fold (n = 1) and 2-fold (n = 2)
dihedral terms, with the 2-fold term used to model the double-bonded character of the
C–N bond and the 1-fold term used to model the energy difference between the cis and
trans conformations]. The use of a summation of dihedral terms for a single torsion angle,
a Fourier series, greatly enhances the flexibility of the dihedral term, allowing for more
accurate reproduction of experimental and QM energetic target data.
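The dihedral term and its extension to a sum over several multiplicities can be sketched as follows. The force constants and phases below are hypothetical values chosen to show the qualitative behavior described in the text; they are not parameters from any published force field.

```python
import math

def dihedral_energy(chi_deg, terms):
    """Sum of K * [1 + cos(n*chi - delta)] over (K, n, delta_deg) terms, as in Eq. (2)."""
    chi = math.radians(chi_deg)
    return sum(k * (1.0 + math.cos(n * chi - math.radians(delta)))
               for k, n, delta in terms)

# Hypothetical 3-fold term for an sp3-sp3 bond (K in kcal/mol)
ethane_like = [(1.4, 3, 0.0)]

# Hypothetical 1-fold + 2-fold series for a peptide-like bond:
# the 2-fold term (delta = 180°) places minima at 0° and 180°,
# and the 1-fold term raises the 0° (cis) minimum relative to 180° (trans).
peptide_like = [(1.0, 1, 0.0), (2.5, 2, 180.0)]

for chi in (0, 60, 90, 120, 180):
    print(chi,
          round(dihedral_energy(chi, ethane_like), 3),
          round(dihedral_energy(chi, peptide_like), 3))
```

In this sketch the 3-fold term has maxima at 0° and 120° and minima at 60° and 180°, while the two-term series gives a barrier near 90° and a cis minimum that lies above the trans minimum.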
Equation (3) describes the external or nonbond interaction terms. These terms may
be considered the most important of the energy terms for computational studies of biologi-
cal systems. This is because of the strong influence of the environment on the properties
of macromolecules as well as the large number of nonbond interactions that occur in
biological molecules themselves (e.g., hydrogen bonds between Watson–Crick base pairs
in DNA, peptide bond–peptide bond hydrogen bonds involved in the secondary structures
of proteins, and dispersion interactions between the aliphatic portions of lipids that occur
in membranes). Interestingly, although the proper treatment of nonbond interactions is
essential for successful biomolecular computations, it has been shown that the mathematical
model required to treat these terms accurately can be relatively simple. Parameters
associated with the external terms are the well depth, $\varepsilon_{ij}$, between atoms i and j; the
minimum interaction radius, $R_{\min,ij}$; and the partial atomic charge, $q_i$. Also included is the
dielectric constant, $\varepsilon_D$, which is generally treated as equal to 1, the permittivity of vacuum,
although exceptions do exist (see below).
The term in square brackets in Eq. (3) is used to treat the van der Waals (VDW)
interactions. The particular form in Eq. (3) is referred to as the Lennard-Jones (LJ) 6–12
term. The $1/r^{12}$ term represents the exchange repulsion between atoms associated with
overlap of the electron clouds of the individual atoms (i.e., the Pauli exclusion principle).
The strong distance dependence of the repulsion is indicated by the 12th power of this
term. Representing London’s dispersion interactions or instantaneous dipole–induced dipole
interactions is the $1/r^{6}$ term, which is negative, indicating its favorable nature. In the
LJ 6–12 equation there are two parameters: the well depth, $\varepsilon_{ij}$, indicates the magnitude
of the favorable London’s dispersion interactions between two atoms i and j, and $R_{\min,ij}$ is
the distance between atoms i and j at which the minimum LJ interaction energy occurs
and is related to the VDW radius of an atom. Typically, $\varepsilon_{ij}$ and $R_{\min,ij}$ are not determined
for every possible interaction pair, i, j; rather, $\varepsilon_i$ and $R_{\min,i}$ parameters are determined
for the individual atom types (e.g., sp² carbon versus sp³ carbon) and then combining
rules are used to create the ij cross terms. These combining rules are generally quite
simple, being either the arithmetic mean [i.e., $R_{\min,ij} = (R_{\min,i} + R_{\min,j})/2$] or the geometric
mean [i.e., $\varepsilon_{ij} = (\varepsilon_i \varepsilon_j)^{1/2}$]. The use of combining rules greatly simplifies the determination
of the $\varepsilon_i$ and $R_{\min,i}$ parameters.
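The combining rules and the LJ 6–12 term can be sketched in Python as shown below. Two assumptions should be noted. First, the per-atom-type parameters are hypothetical values. Second, the attractive term carries a factor of 2, as in the convention used by CHARMM, so that the minimum falls at r = R_min,ij with depth ε_ij as the text describes; Eq. (3) as printed does not show that factor.

```python
import math

def combine(eps_i, rmin_i, eps_j, rmin_j):
    """Geometric mean for well depths, arithmetic mean for minimum radii."""
    return math.sqrt(eps_i * eps_j), 0.5 * (rmin_i + rmin_j)

def lj_energy(r, eps_ij, rmin_ij):
    """LJ 6-12 energy with minimum -eps_ij at r = rmin_ij.

    The factor of 2 on the attractive term is an assumption (see the lead-in).
    """
    x = (rmin_ij / r) ** 6
    return eps_ij * (x * x - 2.0 * x)

# Hypothetical per-atom-type parameters (eps in kcal/mol, Rmin in Å)
eps_ij, rmin_ij = combine(0.10, 4.0, 0.05, 3.6)

for r in (3.4, rmin_ij, 5.0):
    print(f"r = {r:.2f} Å  E = {lj_energy(r, eps_ij, rmin_ij):+.4f} kcal/mol")
```

Because only one pair of parameters is stored per atom type, the number of parameters grows linearly with the number of atom types rather than with the number of type pairs.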
In special cases the use of combining rules can be supplemented by specific i,j LJ
parameters, referred to as off-diagonal terms, to treat interactions between specific atom
types that are poorly modeled by the use of combining rules. The final term contributing
to the external interactions is the electrostatic or Coulombic term. This term involves the
interaction between partial atomic charges, $q_i$ and $q_j$, on atoms i and j divided by the
distance, $r_{ij}$, between those atoms with the appropriate dielectric constant taken into account.
The use of a charge representation for the individual atoms, or monopoles, effectively
includes all higher order electronic interactions, such as those between dipoles and
quadrupoles. Combined, the Lennard-Jones and Coulombic interactions have been shown
to produce a very accurate representation of the interaction between molecules, including
both the distance and angle dependencies of hydrogen bonds [6].
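The Coulombic term of Eq. (3) can be sketched as follows. Eq. (3) as printed leaves the unit conversion implicit; with charges in units of the elementary charge, distances in Å, and energies in kcal/mol, a conversion factor of approximately 332 kcal·Å/(mol·e²) is needed. The exact digits differ slightly between programs, so the constant below is an approximation, and the ion separation is a hypothetical value.

```python
# Approximate conversion factor in kcal·Å/(mol·e²); exact digits vary between programs
COULOMB_CONSTANT = 332.06

def coulomb_energy(q_i, q_j, r_ij, dielectric=1.0):
    """Electrostatic energy q_i*q_j / (eps_D * r_ij), converted to kcal/mol."""
    return COULOMB_CONSTANT * q_i * q_j / (dielectric * r_ij)

# Sodium cation (+1) and chloride anion (-1) at a hypothetical separation of 2.8 Å
e_pair = coulomb_energy(+1.0, -1.0, 2.8)
e_screened = coulomb_energy(+1.0, -1.0, 2.8, dielectric=80.0)
print(f"eps_D = 1: {e_pair:.2f} kcal/mol, eps_D = 80: {e_screened:.2f} kcal/mol")
```

The second call shows how a dielectric constant larger than 1 uniformly weakens the interaction, which is one of the exceptions to ε_D = 1 alluded to in the text.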
Once the 3D structure of a molecule and all the parameters required for the atomic
and molecular connectivities are known, the energy of the system can be calculated via
Eqs. (1)–(3). First derivatives of the energy with respect to position allow for determina-
tion of the forces acting on the atoms, information that is used in the energy minimization
(see Chapter 4) or MD simulations (see Chapter 3). Second derivatives of the energy with
respect to position can be used to calculate force constants acting on atoms, allowing the
determination of vibrational spectra via normal mode analysis (see Chapter 8).
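The relationship between energy, force, and force constant can be checked numerically for a single harmonic bond. The sketch below uses the C–C single bond parameters quoted earlier; the helper names and finite-difference step sizes are choices made for this example.

```python
def bond_energy(b, b0=1.53, kb=225.0):
    """Harmonic bond energy Kb * (b - b0)**2 from Eq. (2)."""
    return kb * (b - b0) ** 2

def bond_force(b, b0=1.53, kb=225.0):
    """Analytic force along the bond: F = -dV/db = -2 * Kb * (b - b0)."""
    return -2.0 * kb * (b - b0)

def numerical_derivative(f, x, h=1e-4):
    """Central finite-difference estimate of df/dx."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def second_derivative(f, x, h=1e-3):
    """Central finite-difference estimate of the second derivative."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

b = 1.60  # a stretched bond length in Å
analytic = bond_force(b)
numeric = -numerical_derivative(bond_energy, b)
curvature = second_derivative(bond_energy, b)  # equals 2 * Kb for a harmonic term

print(f"force: analytic {analytic:.3f}, numeric {numeric:.3f} kcal/(mol·Å)")
print(f"second derivative: {curvature:.1f} kcal/(mol·Å²)")
```

The negative force indicates that a stretched bond is pulled back toward its equilibrium length, and the second derivative recovers twice the force constant because Eq. (2) omits the factor of 1/2.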
B. All-Atom Versus Extended-Atom Models
Always a limiting factor in computational studies of biological molecules is the ability
to treat systems of adequate size for the required amount of simulation time or number
of conformations to be sampled. One method to minimize the size of the system is to use
extended-atom models versus all-atom models. In extended-atom models the hydrogens
are not explicitly represented but rather are treated as part of the nonhydrogen atom to
which they are covalently bound. For example, an all-atom model would treat a methyl
group as four individual atoms (a carbon and three hydrogens), whereas in an extended-
atom model the methyl group would be treated as a single atom, with the LJ parameters
and charges adjusted to account for the omission of the hydrogens. Although this approach
could be applied for all hydrogens it was typically used only for nonpolar (aliphatic and
aromatic) hydrogens; polar hydrogens important for hydrogen bonding interactions were
treated explicitly. Extended-atom models were most widely applied for the simulation of
proteins in vacuum, where the large number of nonpolar hydrogens yields a significant
decrease in the number of atoms compared to all-atom models. However, as more simulations
came to be performed with explicit solvent, in which nonpolar hydrogens make up a much
smaller proportion of the system, and as computer resources have increased, the
use of extended-atom models in simulations has decreased. Extended-atom models, however,
are still useful for applications where a large sampling of conformational space is
required [7].
C. Extensions of the Potential Energy Function
The potential energy function presented in Eqs. (2) and (3) represents the minimal mathe-
matical model that can be used for computational studies of biological systems. Currently,
the most widely used energy functions are those included with the CHARMM [8,9],
AMBER [10], and GROMOS [11] programs. Two extensions beyond the terms in Eqs.
(2) and (3) are often included in biomolecular force fields. A harmonic term for improper
dihedrals is often used to treat out-of-plane distortions, such as those that occur with
aromatic hydrogens (i.e., Wilson wags). Historically, the improper term was also used to
maintain the proper chirality in extended-atom models of proteins (e.g., without the Hα
hydrogen, the chirality of amino acids is undefined). Some force fields also contain a
Urey–Bradley term that treats 1,3 atoms (the two terminal atoms in an angle; see Fig. 1)
with a harmonic bond-stretching term in order to more accurately model vibrational
spectra.
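Both extensions are harmonic terms and can be sketched in Python as follows. The function names, force constants, and reference values are hypothetical; the 1,3 distance is obtained from the two bond lengths and the included angle by the law of cosines.

```python
import math

def distance_13(b12, b23, theta_deg):
    """1,3 distance from two bond lengths and the angle between them (law of cosines)."""
    theta = math.radians(theta_deg)
    return math.sqrt(b12 ** 2 + b23 ** 2 - 2.0 * b12 * b23 * math.cos(theta))

def urey_bradley_energy(s, s0, k_ub):
    """Harmonic term on the 1,3 distance s with reference distance s0."""
    return k_ub * (s - s0) ** 2

def improper_energy(phi_deg, phi0_deg, k_imp):
    """Harmonic term on an improper dihedral, with the displacement taken in radians."""
    d = math.radians(phi_deg - phi0_deg)
    return k_imp * d ** 2

# Hypothetical C-C-C fragment: both bonds 1.53 Å, angle 109.5° at reference, 114.5° distorted
s0 = distance_13(1.53, 1.53, 109.5)
s = distance_13(1.53, 1.53, 114.5)
print(f"1,3 distance: {s0:.3f} Å -> {s:.3f} Å")
print(f"Urey-Bradley energy: {urey_bradley_energy(s, s0, 10.0):.4f} kcal/mol")
print(f"Improper energy at 5° out of plane: {improper_energy(5.0, 0.0, 30.0):.4f} kcal/mol")
```

Because the 1,3 distance depends on both bond lengths and the angle, a Urey–Bradley term couples bond stretching to angle bending, which is what makes it useful for fitting vibrational spectra.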
Beyond the extensions mentioned in the previous paragraph, a variety of terms are
included in force fields used for the modeling of small molecules that can also be applied
to biological systems. These types of force fields are often referred to as Class II force
fields, to distinguish them from the Class I force fields such as AMBER, CHARMM, and
GROMOS discussed above. For example, the bond term in Eq. (2) can be expanded to
include cubic and quartic terms, which will more accurately treat the anharmonicity associ-
ated with bond stretching. Another extension is the addition of cross terms that express
the influence that stretching of a bond has on the stretching of an adjacent bond. Cross
terms may also be used between the different types of terms such as bond angle or dihedral
angle terms, allowing for the influence of bond length on angle bending or of angle bending
on dihedral rotations, respectively, to be more accurately modeled [12]. Extensions may
also be made to the interaction portion of the force field [Eq. (3)]. These may include
terms for electronic polarizability (see below) or the use of $1/r^{4}$ terms to treat ion–dipole
interactions associated with interactions between, for example, ions and the peptide backbone
[13]. In all cases the extension of a potential energy function should, in principle,
allow for the system of interest to be modeled with more accuracy. The gains associated
with the additional terms, however, are often significant only in specific cases (e.g., the
use of a $1/r^{4}$ term in the study of specific cation–peptide interactions), making their inclusion
for the majority of calculations on biochemical systems unwarranted, especially when
those terms increase the demand on computational resources.
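As an illustration of the Class II extensions described above, an anharmonic bond term and a bond–bond cross term might be written as shown below. The symbols $K_2$, $K_3$, $K_4$, and $K_{bb'}$, and the primed quantities for the adjacent bond, are notation assumed here for illustration and are not taken from a specific force field.

$$
V_{\text{bond}} = K_2 (b - b_0)^2 + K_3 (b - b_0)^3 + K_4 (b - b_0)^4
$$

$$
V_{\text{bond--bond}} = K_{bb'} (b - b_0)(b' - b'_0)
$$

Setting $K_3 = K_4 = 0$ and $K_{bb'} = 0$ recovers the harmonic bond term of Eq. (2), which makes explicit that each extension adds parameters that must be optimized.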
D. Alternatives to the Potential Energy Function
The form of the potential energy function in Eqs. (1)–(3) was developed based on a combi-
nation of simplicity with required accuracy. However, a number of other forms can be
used to treat the different terms in Eqs. (2) and (3). One alternative form used to treat the
bond is referred to as the Morse potential. This term allows for bond-breaking events to
occur and includes anharmonicity in the bond-stretching surface near the equilibrium
value. The ability to break bonds, however, leads to forces close to zero at large bond
distances, which may present a problem when crude modeling techniques are used to
generate structures [14]. A number of variations in the form of the equation to treat the
VDW interactions have been applied. The $1/r^{12}$ term used for modeling exchange repulsion
overestimates the distance dependence of the repulsive wall, leading to the use of a
$1/r^{9}$ term [15] or exponential repulsive terms [16]. A more recent variation is the buffered
14-7 form, which was selected because of its ability to reproduce interactions between
rare gas atoms [17]. Concerning electrostatic interactions, the majority of potential energy
functions employ the standard Coulombic term shown in Eq. (3), with one variation being
the use of bond dipoles rather than atom-centered partial atomic charges [16]. As with
