barrett - amino acids and peptides (cambridge, 2004)

The authors’ objective has been to concentrate on amino acids and peptides
without detailed discussions of proteins, although the book gives all
the essential background chemistry, including sequence determination,
synthesis and spectroscopic methods, to allow the reader to appreciate
protein behaviour at the molecular level. The approach is intended to
encourage the reader to cross classical boundaries, such as in the later
chapter on the biological roles of amino acids and the design of peptide-based
drugs. For example, there is a section on enzyme-catalysed synthesis
of peptides, an area often neglected in texts describing peptide synthesis.
This modern text will be of value to advanced undergraduates, graduate
students and research workers in the amino acid, peptide and protein field.

Amino Acids and Peptides

G. C. BARRETT

D. T. ELMORE
         
The Pitt Building, Trumpington Street, Cambridge, United Kingdom
  
The Edinburgh Building, Cambridge CB2 2RU, UK
40 West 20th Street, New York, NY 10011-4211, USA
477 Williamstown Road, Port Melbourne, VIC 3207, Australia
Ruiz de Alarcón 13, 28014 Madrid, Spain
Dock House, The Waterfront, Cape Town 8001, South Africa

First published in printed format 1998
ISBN 0-521-46292-4 hardback
ISBN 0-521-46827-2 paperback
ISBN 0-511-03952-2 eBook (netLibrary)
© Cambridge University Press 2004
Contents
Foreword
1 Introduction
1.1 Sources and roles of amino acids and peptides
1.2 Definitions
1.3 ‘Protein amino acids’, alias ‘the coded amino acids’
1.4 Nomenclature for ‘the protein amino acids’, alias ‘the coded amino acids’
1.5 Abbreviations for names of amino acids and the use of these abbreviations to give names to polypeptides
1.6 Post-translational processing: modification of amino-acid residues within polypeptides
1.7 Post-translational processing: in vivo cleavages of the amide backbone of polypeptides
1.8 ‘Non-protein amino acids’, alias ‘non-proteinogenic amino acids’ or ‘non-coded amino acids’
1.9 Coded amino acids, non-natural amino acids and peptides in nutrition and food science and in human physiology
1.10 The geological and extra-terrestrial distribution of amino acids
1.11 Amino acids in archaeology and in forensic science
1.12 Roles for amino acids in chemistry and in the life sciences
1.12.1 Amino acids in chemistry
1.12.2 Amino acids in the life sciences
1.13 β- and higher amino acids
1.14 References
2 Conformations of amino acids and peptides
2.1 Introduction: the main conformational features of amino acids and peptides
2.2 Configurational isomerism within the peptide bond
2.3 Dipeptides
2.4 Cyclic oligopeptides
2.5 Acyclic oligopeptides
2.6 Longer oligopeptides: primary, secondary and tertiary structure
2.7 Polypeptides and proteins: quaternary structure and aggregation
2.8 Examples of conformational behaviour; ordered and disordered states and transitions between them
2.8.1 The main categories of polypeptide conformation
2.8.1.1 One extreme situation
2.8.1.2 The other extreme situation
2.8.1.3 The general case
2.9 Conformational transitions for amino acids and peptides
2.10 References
3 Physicochemical properties of amino acids and peptides
3.1 Acid–base properties
3.2 Metal-binding properties of amino acids and peptides
3.3 An introduction to the routine aspects and the specialised aspects of the spectra of amino acids and peptides
3.4 Infrared (IR) spectrometry
3.5 General aspects of ultraviolet (UV) spectrometry, circular dichroism (CD) and UV fluorescence spectrometry
3.6 Circular dichroism
3.7 Nuclear magnetic resonance (NMR) spectroscopy
3.8 Examples of assignments of structures to peptides from NMR spectra and other data
3.9 References
4 Reactions and analytical methods for amino acids and peptides
Part 1 Reactions of amino acids and peptides
4.1 Introduction
4.2 General survey
4.2.1 Pyrolysis of amino acids and peptides
4.2.2 Reactions of the amino group
4.2.3 Reactions of the carboxy group
4.2.4 Reactions involving both amino and carboxy groups
4.3 A more detailed survey of reactions of the amino group
4.3.1 N-Acylation
4.3.2 Reactions with aldehydes
4.3.3 N-Alkylation
4.4 A survey of reactions of the carboxy group
4.4.1 Esterification
4.4.2 Oxidative decarboxylation
4.4.3 Reduction
4.4.4 Halogenation
4.4.5 Reactions involving amino and carboxy groups of α-amino acids and their N-acyl derivatives
4.4.6 Reactions at the α-carbon atom and racemisation of α-amino acids
4.4.7 Reactions of the amide group in acylamino acids and peptides
4.5 Derivatisation of amino acids for analysis
4.5.1 Preparation of N-acylamino acid esters and similar derivatives for analysis
4.6 References
Part 2 Mass spectrometry in amino-acid and peptide analysis and in peptide-sequence determination
4.7 General considerations
4.7.1 Mass spectra of free amino acids
4.7.2 Mass spectra of free peptides
4.7.3 Negative-ion mass spectrometry
4.8 Examples of mass spectra of peptides
4.8.1 Electron-impact mass spectra (EIMS) of peptide derivatives
4.8.2 Finer details of mass spectra of peptides
4.8.3 Difficulties and ambiguities
4.9 The general status of mass spectrometry in peptide analysis
4.9.1 Specific advantages of mass spectrometry in peptide sequencing
4.10 Early methodology: peptide derivatisation
4.10.1 N-Terminal acylation and C-terminal esterification
4.10.2 N-Acylation and N-alkylation of the peptide bond
4.10.3 Reduction of peptides to ‘polyamino-polyalcohols’
4.11 Current methodology: sequencing by partial acid hydrolysis, followed by direct MS analysis of peptide hydrolysates
4.11.1 Current methodology: instrumental variations
4.12 Conclusions
4.13 References
Part 3 Chromatographic and related methods for the separation of mixtures of amino acids, mixtures of peptides and mixtures of amino acids and peptides
4.14 Separation of amino-acid and peptide mixtures
4.14.1 Separation principles
4.15 Partition chromatography; HPLC and GLC
4.16 Molecular exclusion chromatography (gel chromatography)
4.17 Electrophoretic separation and ion-exchange chromatography
4.17.1 Capillary zone electrophoresis (CZE)
4.18 Detection of separated amino acids and peptides
4.18.1 Detection of amino acids and peptides separated by HPLC and by other liquid-based techniques
4.18.2 Detection of amino acids and peptides separated by GLC
4.19 Thin-layer chromatography (planar chromatography; HPTLC)
4.20 Quantitative amino-acid analysis
4.21 References
Part 4 Immunoassays for peptides
4.22 Radioimmunoassays
4.23 Enzyme-linked immunosorbent assays (ELISAs)
4.24 References
Part 5 Enzyme-based methods for amino acids
4.25 Biosensors
4.26 References
5 Determination of the primary structure of peptides and proteins
5.1 Introduction
5.2 Strategy
5.3 Cleavage of disulphide bonds
5.4 Identification of the N-terminus and stepwise degradation
5.5 Enzymic methods for determining N-terminal sequences
5.6 Identification of C-terminal sequences
5.7 Enzymic determination of C-terminal sequences
5.8 Selective chemical methods for cleaving peptide bonds
5.9 Selective enzymic methods for cleaving peptide bonds
5.10 Determination of the positions of disulphide bonds
5.11 Location of post-translational modifications and prosthetic groups
5.12 Determination of the sequence of DNA
5.13 References
6 Synthesis of amino acids
6.1 General
6.2 Commercial and research uses for amino acids
6.3 Biosynthesis: isolation of amino acids from natural sources
6.3.1 Isolation of amino acids from proteins
6.3.2 Biotechnological and industrial synthesis of coded amino acids
6.4 Synthesis of amino acids starting from coded amino acids other than glycine
6.5 General methods of synthesis of amino acids starting with a glycine derivative
6.6 Other general methods of amino acid synthesis
6.7 Resolution of DL-amino acids
6.8 Asymmetric synthesis of amino acids
6.9 References

7 Methods for the synthesis of peptides
7.1 Basic principles of peptide synthesis and strategy
7.2 Chemical synthesis and genetic engineering
7.3 Protection of α-amino groups
7.4 Protection of carboxy groups
7.5 Protection of functional side-chains
7.5.1 Protection of ε-amino groups
7.5.2 Protection of thiol groups
7.5.3 Protection of hydroxy groups
7.5.4 Protection of the guanidino group of arginine
7.5.5 Protection of the imidazole ring of histidine
7.5.6 Protection of amide groups
7.5.7 Protection of the thioether side-chain of methionine
7.5.8 Protection of the indole ring of tryptophan
7.6 Deprotection procedures
7.7 Enantiomerisation during peptide synthesis
7.8 Methods for forming peptide bonds
7.8.1 The acyl azide method
7.8.2 The use of acid chlorides and acid fluorides
7.8.3 The use of acid anhydrides
7.8.4 The use of carbodiimides
7.8.5 The use of reactive esters
7.8.6 The use of phosphonium and isouronium derivatives
7.9 Solid-phase peptide synthesis (SPPS)
7.10 Soluble-handle techniques
7.11 Enzyme-catalysed peptide synthesis and partial synthesis
7.12 Cyclic peptides
7.12.1 Homodetic cyclic peptides
7.12.2 Heterodetic cyclic peptides
7.13 The formation of disulphide bonds
7.14 References
7.14.1 References cited in the text
7.14.2 References for background reading
8 Biological roles of amino acids and peptides
8.1 Introduction
8.2 The role of amino acids in protein biosynthesis
8.3 Post-translational modification of protein structures
8.4 Conjugation of amino acids with other compounds
8.5 Other examples of synthetic uses of amino acids
8.6 Important products of amino-acid metabolism
8.7 Glutathione
8.8 The biosynthesis of penicillins and cephalosporins
8.9 References
8.9.1 References cited in the text
8.9.2 References for background reading
9 Some aspects of amino-acid and peptide drug design
9.1 Amino-acid antimetabolites
9.2 Fundamental aspects of peptide drug design
9.3 The need for peptide-based drugs
9.4 The mechanism of action of proteinases and design of inhibitors
9.5 Some biologically active analogues of peptide hormones
9.6 The production of antibodies and vaccines
9.7 The combinatorial synthesis of peptides
9.8 The design of pro-drugs based on peptides
9.9 Peptide antibiotics
9.10 References
9.10.1 References cited in the text
9.10.2 References for background reading
Subject index
Foreword
This is an undergraduate and introductory postgraduate textbook that gives
information on amino acids and peptides, and is intended to be self-sufficient in all
the organic and analytical chemistry fundamentals. It is aimed at students of
chemistry and allied areas. Suggestions for supplementary reading are provided, so that
topic areas that are not covered in depth in this book may be followed up by readers
with particular study interests.
A particular objective has been to concentrate on amino acids and peptides, as
the title of the book implies; the exclusion of detailed discussion of proteins is
deliberate, but the book gives all the essential background chemistry so that protein
behaviour at the molecular level can be appreciated.
There is an emphasis on the uses of amino acids and peptides, and on their
biological roles and, while Chapter 8 concentrates on this, a scattering of items of
information of this type will be found throughout the book. Important pharmaceutical
developments in recent years underline the continuing importance and
potency of amino acids and peptides in medicine, and the flavour of current research
themes in this area can be gained from Chapter 9.
Supplementary reading
(see also lists at the end of each Chapter)
Standard Student Texts
Standard undergraduate Biochemistry textbooks relate the general field to the
coverage of this book. Several such topic areas are covered in
Zubay, G. (1993) Biochemistry, Third Edition, Wm. C. Brown Communications
Inc, Dubuque, IA
and
Voet, D. and Voet, J. G. (1995) Biochemistry, Second edition, Wiley, New York

Typically, these topic areas as covered by Zubay are
Chapter 3: ‘The building blocks of proteins: amino acids, peptides and proteins’
Chapter 4: ‘The three-dimensional structure of proteins’
Chapter 5: ‘Functional diversity of proteins’
Removed more towards biochemical themes, are
Chapter 18: ‘Biosynthesis of amino acids’
Chapter 19: ‘The metabolic fate of amino acids’
Chapter 29: ‘Protein synthesis, targeting, and turnover’
Voet and Voet give similar coverage in
Chapter 24: ‘Amino acid metabolism’
Chapter 30: ‘Translation’ (i.e. protein biosynthesis)
Chapter 34: ‘Molecular physiology’ (of particular relevance to coverage in this book of
blood clotting, peptide hormones and neurotransmitters)
Supplementary reading:
suggestions for further reading
(a) Protein structure
Branden, C., and Tooze, J. (1991) Introduction to Protein Structure, Garland Publishing
Inc., New York
(b) Protein chemistry
Hugli, T. E. (1989) Techniques of Protein Chemistry, Academic Press, San Diego, California
Cherry, J. P. and Barford, R. A. (1988) Methods for Protein Analysis, American Oil
Chemists’ Society, Champaign, Illinois
(c) Amino acids
Barrett, G. C., Ed. (1985) Chemistry and Biochemistry of the Amino Acids, Chapman and
Hall, London
Barrett, G. C. (1993) in Second Supplements to the 2nd Edition of Rodd’s Chemistry of
Carbon Compounds, Volume 1, Part D: Dihydric alcohols, their oxidation products and
derivatives, Ed. Sainsbury, M., Elsevier, Amsterdam, pp. 117–66
Barrett, G. C. (1995) in Amino Acids, Peptides, and Proteins, A Specialist Periodical Report
of The Royal Society of Chemistry, Vol. 26, Ed. Davies, J. S., Royal Society of Chemistry,
London (preceding volumes cover the literature on amino acids, back to 1969 (Volume 1))
Coppola, G. M. and Schuster, H. F. (1987) Asymmetric Synthesis: Construction of Chiral
Molecules using Amino Acids, Wiley, New York
Dawson, R. M. C., Elliott, D. C., Elliott, W. H., and Jones, K. M. (1986) Data for
Biochemical Research, Oxford University Press, Oxford
Greenstein, J. P., and Winitz, M. (1961) Chemistry of the Amino Acids, Wiley, New York (a
facsimile version (1986) of this three-volume set has been made available by Robert E.
Krieger Publishing Inc., Malabar, Florida)
Williams, R. M. (1989) Synthesis of Optically Active α-Amino Acids, Pergamon Press,
Oxford
(d) Peptides
Bailey, P. D. (1990) An Introduction to Peptide Chemistry, Wiley, Chichester
Bodanszky, M. (1988) Peptide Chemistry: A Practical Handbook. Springer-Verlag, Berlin
Bodanszky, M. (1993) Principles of Peptide Synthesis, Second Edition, Springer-Verlag,
Heidelberg
Elmore, D. T. (1993) in Second Supplements to the 2nd Edition of Rodd’s Chemistry of
Carbon Compounds, Volume 1, Part D: Dihydric alcohols, their oxidation products and
derivatives, Ed. Sainsbury, M., Elsevier, Amsterdam, pp. 167–211
Elmore, D. T. (1995) in Amino Acids, Peptides, and Proteins, A Specialist Periodical Report of
The Royal Society of Chemistry, Vol. 26, Ed. Davies, J. S., Royal Society of Chemistry,
London (preceding volumes cover the literature of peptide chemistry back to 1969
(Volume 1))
Jones, J. H. (1991) The Chemical Synthesis of Peptides, Clarendon Press, Oxford


1 Introduction
1.1 Sources and roles of amino acids and peptides
More than 700 amino acids have been discovered in Nature and most of them are
α-amino acids. Bacteria, fungi and algae and other plants provide nearly all these,
which exist either in the free form or bound up into larger molecules (as constituents
of peptides and proteins and other types of amide, and of alkylated and esterified
structures).
The twenty amino acids (actually, nineteen α-amino acids and one α-imino acid)
that are utilised in living cells for protein synthesis under the control of genes are in
a special category since they are fundamental to all life forms as building blocks for
peptides and proteins. However, the reasons why all the other natural amino acids
are located where they are, are rarely known, although this is an area of much
speculation. For example, some unusual amino acids are present in many seeds and
are not needed by the mature plant. They deter predators through their toxic or
otherwise unpleasant characteristics and in this way are thought to provide a defence
strategy to improve the chances of survival for the seed and therefore help to ensure
the survival of the plant species.
Peptides and proteins play a wide variety of roles in living organisms and display
a range of properties (from the potent hormonal activity of some small peptides to
the structural support and protection for the organism shown by insoluble proteins).
Some of these roles are illustrated in this book.
1.2 Definitions
The term ‘amino acids’ is generally understood to refer to the aminoalkanoic acids,
H₃N⁺—(CR¹R²)ₙ—CO₂⁻, with n = 1 for the series of α-amino acids, n = 2 for β-amino
acids, etc. The term ‘dehydro-amino acids’ specifically describes 2,3-unsaturated (or
‘αβ-unsaturated’)-2-aminoalkanoic acids, H₃N⁺—(C=CR¹R²)—CO₂⁻.
However, the term ‘amino acids’ would include all structures carrying amine and
acid functional groups, including simple aromatic compounds, e.g. anthranilic acid,
o-H₃N⁺—C₆H₄—CO₂⁻, and would also cover other types of acidic functional
groups (such as phosphorus and sulphur oxy-acids, H₃N⁺—(R¹R²C—)ₙHPO₃⁻ and
R₃N⁺—(R¹R²C—)ₙSO₃⁻, etc.). The family of boron analogues R₃N·BHR¹—CO₂R²
(· denotes a dative bond) has recently been opened up through the synthesis of
some examples (Sutton et al., 1993); it would take only the substitution of the
carboxy group in these ‘organoboron amino acids’ (R = R¹ = R² = H) by phosphorus
or sulphur equivalents to obtain an amino acid that contains no carbon!
However, unlike the amino acids containing sulphonic and phosphonic acid groupings,
naturally occurring examples of organoboron-based amino acids are not
known.
The term ‘peptides’ has a more restricted meaning and is therefore a less ambiguous
term, since it covers polymers formed by the condensation of the respective
amino and carboxy groups of α, β, γ . . . -amino acids. For the structure with m = 2
in Figure 1.1 (i.e., for a dipeptide) up to values of m ≈ 20 (an eicosapeptide), the term
‘oligopeptide’ is used and a prefix di-, tri-, tetra-, penta- (see Leu-enkephalin, a linear
pentapeptide, in Figure 1.1), . . . undeca- (see cyclosporin A, a cyclic undecapeptide,
in Figure 1.4 later), dodeca-, etc. is used to indicate the number of amino-acid
residues contained in the compound. Homodetic and heterodetic peptides are
illustrated in Chapter 7.
Isopeptides are isomers in which amide bonds are present that involve the side-chain
amino group of an αω-di-amino acid (e.g. lysine) or of a poly-amino acid
and/or the side-chain carboxy-group of an α-amino-di- or -poly-acid (e.g. aspartic
acid or glutamic acid). Glutathione (Chapter 8) is a simple example. Longer polymers
are termed ‘polypeptides’ or ‘proteins’ and the term ‘polypeptides’ is becoming
the most commonly used general family name (though proteins remains the preferred
term for particular examples of large polypeptides located in precise biological
contexts). Nonetheless, the relationship between these terms is a little more
contentious, since the change-over from polypeptide to protein needs definition.
The figure ‘roughly fifty amino acid residues’ is widely accepted for this. Insulin (a
polymer of fifty-one α-amino acids but consisting of two crosslinked oligopeptide
chains; see Figure 1.4 later) is on the borderline and has been referred to both as a
small protein and as a large polypeptide.

Figure 1.1. Peptides as condensation polymers of α-amino acids.
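
The size terms above can be sketched in a few lines of Python. This is only an illustration: the cut-offs (about 20 residues for an oligopeptide, roughly fifty for the change-over to protein) are the approximate figures quoted in the text, the prefixes between penta- and undeca- are the standard numerical ones rather than ones listed here, and the names `size_class` and `prefixed_name` are hypothetical.

```python
# Numerical prefixes: di- to penta-, undeca-, dodeca- and eicosa- are quoted
# in the text; hexa- to deca- are the standard prefixes filled in between.
PREFIXES = {
    2: "di", 3: "tri", 4: "tetra", 5: "penta", 6: "hexa", 7: "hepta",
    8: "octa", 9: "nona", 10: "deca", 11: "undeca", 12: "dodeca",
    20: "eicosa",
}

OLIGO_MAX = 20    # assumption: 'm of about 20' treated as a hard limit
PROTEIN_MIN = 50  # assumption: 'roughly fifty residues' treated as a hard limit


def size_class(m: int) -> str:
    """Return the family name used for a chain of m amino-acid residues."""
    if m < 1:
        raise ValueError("a peptide needs at least one residue")
    if m == 1:
        return "amino acid"
    if m <= OLIGO_MAX:
        return "oligopeptide"
    if m < PROTEIN_MIN:
        return "polypeptide"
    # Above the threshold both terms are in use, as the insulin example shows.
    return "polypeptide or protein"


def prefixed_name(m: int) -> str:
    """Return e.g. 'pentapeptide'; fall back to a plain count if no prefix is stored."""
    prefix = PREFIXES.get(m)
    return f"{prefix}peptide" if prefix else f"{m}-residue peptide"


print(prefixed_name(5), "->", size_class(5))    # Leu-enkephalin
print(prefixed_name(11), "->", size_class(11))  # cyclosporin A
print(prefixed_name(51), "->", size_class(51))  # insulin
```

The hard limits are a simplification; as the insulin example shows, usage near the fifty-residue boundary is a matter of convention rather than of definition.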
Poly(α-amino acid)s is a better term for peptides formed by the self-condensation
of one amino acid; natural examples exist, such as poly(D-glutamic acid), the protein
coat of the anthrax spore (Hanby and Rydon, 1946). In early research in the textile
industry, poly(α-amino acid)s showed promise as synthetic fibres, but the synthesis
methodology required for the polymerisation of amino acids was complex and
uneconomic.

Polymers of controlled structures made from N-alkyl-α-amino acids (Figure 1.1;
—NRⁿ— instead of —NH—, R¹ = R² = H; n = 1), i.e.
H₂⁺NRⁿ—CH₂CO—[NRⁿ—CH₂—CO—]ₘNRⁿ—CH₂—CO₂⁻, which are poly(N-alkylglycine)s of defined
sequence (various Rⁿ at chosen points along the chain), have been synthesised as
peptide mimetics (see Chapter 9) and have been given the name peptoids. These can
be viewed as peptides with side-chains shifted from carbon to nitrogen; they will
therefore have a very different conformational flexibility (see Chapter 2) from that
of peptides and, because the backbone nitrogen atoms carry no hydrogen, will also be
incapable of the N—H hydrogen bonding shown by peptides. This is a simple enough
way of providing all the correct side-chains on a flexible chain of atoms, in order to
mimic a biologically active peptide, but the mimic can avoid enzymic breakdown
before it reaches the site in the body where it is needed.
Using the language of polymer chemistry, polypeptides made from two or more
different α-amino acids are copolymers or irregular poly(amide)s, whereas
poly(α-amino acid)s, H—[NH—CR¹R²—CO—]ₘOH, are homopolymers that could be
described as members of the nylon[2] family.
Depsipeptides are near-relatives of peptides, with one or more amide bonds
replaced by ester bonds; in other words, they are formed by condensing α-amino
acids with α-hydroxy-acids in various proportions. There are several important
natural examples of these, of defined sequence; for example the antibiotic
valinomycin and the family of enniatin antibiotics. Structures of other examples of
depsipeptides are given in Section 4.8.
Nomenclature for conformational features of peptide structure is covered in
Chapter 2.
1.3 ‘Protein amino acids’, alias ‘the coded amino acids’
The twenty -amino acids (actually, nineteen ␣-amino acids and one ␣-imino acid
(Table 1.1)) which, in preparation for their role in protein synthesis, are joined in vivo

through their carboxy group to tRNA to form ␣-aminoacyl-tRNAs, are organised
by ribosomal action into specific sequences in accordance with the genetic code
(Chapter 8).
‘Coded amino acids’ is a better name for these twenty amino acids, rather than
‘protein amino acids’ or ‘primary protein amino acids’ (the term ‘coded amino
acids’ is increasingly used), because changes can occur to amino-acid residues after
they have been laid in place in a polypeptide by ribosomal synthesis. Greenstein and
Winitz, in their 1961 book, listed ‘the 26 protein amino acids’, six of which were later
found to be formed from among the other twenty ‘protein amino acids’ in the list of
Greenstein and Winitz, after the protein had left the gene (‘post-translational
(sometimes called post-ribosomal) modification’ or ‘post-translational processing’).
Because of these changes made to the polypeptide after ribosomal synthesis, amino
acids that are not capable of being incorporated into proteins by genes (‘secondary
protein amino acids’, Table 1.2) can, nevertheless, be found in proteins.

Table 1.1. The twenty ‘coded’ amino acids (nineteen ‘coded’ L-α-amino acids, and one
‘coded’ L-α-imino acid): structures and definitions (see footnote a)

Structure conventions for the L-α-amino acids:
[Structure diagrams: the Fischer projection of an L-α-amino acid (requiring the carbon
chain to be arranged vertically, with the carboxy group at the top), one of the
commonly-used three-dimensional representations of an L-α-amino acid, and the Barrett
representation of an L-α-amino acid, each shown as equivalent to the next.]

Name of           Three-letter    Single-letter    Amino acid side-chain, R =
amino acid        abbreviation    abbreviation

One with no side-chain (i.e. with a hydrogen atom)
Glycine           Gly             G                H

Four with saturated aliphatic side-chains (hydrophobic side-chains)
Alanine           Ala             A                CH₃
Leucine           Leu             L                CH₂CH(CH₃)₂
Valine            Val             V                CH(CH₃)₂
Isoleucine        Ile             I                (S)-CH(CH₃)C₂H₅

Ten with functionalised aliphatic side-chains (mostly hydrophilic side-chains)
Arginine          Arg             R                CH₂CH₂CH₂NHC(=NH)NH₂
Aspartic acid     Asp             D                CH₂CO₂H
Asparagine        Asn             N                CH₂CONH₂
Glutamic acid     Glu             E                CH₂CH₂CO₂H
Glutamine         Gln             Q                CH₂CH₂CONH₂
Lysine            Lys             K                CH₂CH₂CH₂CH₂NH₂
Methionine        Met             M                CH₂CH₂SCH₃
Cysteine          Cys             C                CH₂SH
Serine            Ser             S                CH₂OH
Threonine         Thr             T                (R)-CH(CH₃)OH

Four with aromatic or heteroaromatic side-chains (most of these side-chains are hydrophobic)
Phenylalanine     Phe             F                CH₂C₆H₅
Tyrosine          Tyr             Y                CH₂-(p-OH-C₆H₄)
Histidine         His             H                CH₂-(imidazol-4-yl)
Tryptophan        Trp             W                CH₂-(indol-3-yl)

The ‘coded’ α-imino acid
Proline           Pro             P                [full structure drawn in the printed table]

[Hydrophobicity/hydrophilicity columns: the printed table marks the relative position
of each amino acid on a scale running from high hydrophobicity to high hydrophilicity;
those positions are not reproduced here.]

Notes:
1. The structure of each side-chain, R, is given for the 19 ‘coded α-amino acids’, after
each name. The full structure of the ‘coded α-imino acid’ proline is given. ‘Three-letter’
and ‘one-letter’ abbreviations are given for the 20. The three-letter abbreviation is the
first three letters of the name for all twenty, except for asparagine (Asn), glutamine
(Gln), isoleucine (Ile) and tryptophan (Trp). The single-letter abbreviated name is the
first letter of their full name for eleven of them. Different letters are needed for the
other nine, to avoid ambiguity: arginine (R), asparagine (N), aspartic acid (D), glutamic
acid (E), glutamine (Q), lysine (K), phenylalanine (F), tryptophan (W) and tyrosine (Y).
2. All full names end in ‘ine’ except aspartic acid, glutamic acid and tryptophan.
Adjectives are derived from the names by dropping the ‘ine’ or its equivalent ending and
adding ‘yl’; thus, alanyl, glutamyl, prolyl, tryptophyl, etc.
3. Configurations. The ‘R/S’ convention can easily be transferred to replace the Fischer
‘D/L’ system, while retaining the trivial names: L-enantiomers of all the coded amino
acids are members of the S series except L-cysteine, which becomes R-cysteine through
proper application of the R/S rules. Diastereoisomers (the isoleucine/allo-isoleucine and
threonine/allothreonine pairs, ‘allo’ indicating inversion of the side-chain configuration
of the coded amino acid) are less ambiguously named through the ‘R/S’ system, although
the side-chain configuration can be indicated; for example, natural L-isoleucine is
(2S,3S)-isoleucine, whereas L-alloisoleucine is (2S,3R)-isoleucine.
[Structure diagrams: (2S,3S)-isoleucine and (2S,3R)-isoleucine, each drawn in the three
equivalent representations described above.]
For the structures of natural L-threonine ((2S,3R)-threonine) and L-allothreonine
((2S,3S)-threonine), replace the side-chain ethyl group (C₂H₅) in isoleucine and
alloisoleucine by OH.
4. IUPAC–IUB nomenclature recommendations (1983), reproduced in full in Amino Acids,
Peptides, and Proteins, 1985, Vol. 16, The Royal Society of Chemistry, p. 387; and in
Eur. J. Biochem., 1984, 138, 9, encourage the retention of trivial names for the common
α-amino acids, but systematic names are relatively straightforward; thus, L-alanine is
2S-aminopropanoic acid and L-histidine is 2S-amino-3-(imidazol-4-yl)-propanoic acid (the
name for the predominant tautomer).
5. ‘Hydrophilic’ and ‘hydrophobic’ are terms used to denote the relative water-attracting
and water-repelling property, respectively, of the side-chain when the amino acid is
condensed into a polypeptide (see Chapter 5). The term ‘hydropathy index’ may be used to
place the amino acids in order of their ‘hydrophilicity’ (Kyte and Doolittle, 1985), and
their relative positions are shown in the printed table on an arbitrary scale.

a Selenocysteine (i.e. cysteine with the sulphur atom replaced by a selenium atom) has
been found in certain proteins, e.g. formate dehydrogenase, an enzyme from Escherichia
coli, and it has very recently been shown to be placed there through normal ribosomal
synthesis (Stadtman, 1996). Thus selenocysteine can now be accepted as the ‘twenty-first
coded amino acid’.
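
Note 5 to Table 1.1 mentions the hydropathy index as a way of ordering the coded amino acids. The sketch below shows how such an ordering might be computed. The numerical values are the Kyte–Doolittle hydropathy values as they are commonly tabulated; they are an assumption brought in for illustration and are not taken from this book, and the function names are hypothetical. The Leu-enkephalin sequence used in the example (Tyr-Gly-Gly-Phe-Leu) is the generally accepted one.

```python
# Kyte-Doolittle hydropathy values keyed by single-letter abbreviation.
# Assumption: values as commonly tabulated, not copied from Table 1.1.
# Positive values are hydrophobic, negative values hydrophilic.
HYDROPATHY = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9,
    "R": -4.5,
}


def ranked_by_hydropathy():
    """Single-letter codes ordered from most hydrophobic to most hydrophilic."""
    return sorted(HYDROPATHY, key=HYDROPATHY.get, reverse=True)


def mean_hydropathy(sequence):
    """Average hydropathy of a peptide written in single-letter code."""
    if not sequence:
        raise ValueError("empty sequence")
    return sum(HYDROPATHY[residue] for residue in sequence) / len(sequence)


print(ranked_by_hydropathy())
print(round(mean_hydropathy("YGGFL"), 2))  # Leu-enkephalin
```

An average over the whole chain is the crudest possible summary; hydropathy values are normally averaged over a short moving window along a sequence, which this sketch does not attempt.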
1.4 Nomenclature for ‘the protein amino acids’, alias ‘the coded amino acids’
The common amino acids are referred to through trivial names (for example, glycine
would not be named either 2-aminoethanoic acid or amino-acetic acid in the amino
acid and peptide literature). Table 1.1 summarises conventions and gives structures.
The rarer natural amino acids are usually named as derivatives of the common
amino acids, if they do not have their own trivial names related to their natural
source (Table 1.2), but apart from these, there are occasional examples of the use of
systematic names for natural amino acids.
1.5 Abbreviations for names of amino acids and the use of
these abbreviations to give names to polypeptides
To keep names of amino acids and peptides to manageable proportions, there are
agreed conventions for nomenclature (see the footnotes to Table 1.1). The simplest
α-amino acid, glycine, would be depicted H—Gly—OH in the standard ‘three-letter’
system, the H— and —OH representing the ‘H₂O’ that is expelled when this
amino acid undergoes condensation to form a peptide (Figure 1.2). The three-letter
abbreviations therefore represent the ‘amino-acid residues’ that make up peptides
and proteins.
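
The bookkeeping behind this convention (each amide bond formed expels one H₂O) can be checked with a short calculation. The following is a minimal sketch, assuming the standard molecular formulae of the free amino acids and the generally accepted Leu-enkephalin sequence, Tyr-Gly-Gly-Phe-Leu; the function and variable names are hypothetical.

```python
from collections import Counter

# Molecular formulae of a few free amino acids (standard values).
FREE_AMINO_ACID = {
    "Gly": Counter(C=2, H=5, N=1, O=2),
    "Tyr": Counter(C=9, H=11, N=1, O=3),
    "Phe": Counter(C=9, H=11, N=1, O=2),
    "Leu": Counter(C=6, H=13, N=1, O=2),
}
WATER = Counter(H=2, O=1)


def linear_peptide_formula(residues):
    """Sum the free amino acids, then remove one H2O per amide bond formed."""
    if not residues:
        raise ValueError("at least one residue is required")
    total = Counter()
    for name in residues:
        total.update(FREE_AMINO_ACID[name])
    for _ in range(len(residues) - 1):  # m residues give m - 1 amide bonds
        total.subtract(WATER)
    return total


def as_text(formula):
    """Write a formula in C, H, N, O order, omitting a count of 1."""
    parts = []
    for element in "CHNO":
        count = formula[element]
        if count:
            parts.append(element + (str(count) if count > 1 else ""))
    return "".join(parts)


print(as_text(linear_peptide_formula(["Gly", "Gly"])))
print(as_text(linear_peptide_formula(["Tyr", "Gly", "Gly", "Phe", "Leu"])))
```

Only linear peptides are covered; a homodetic cyclic peptide would lose one further H₂O on ring closure.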
So this ‘three-letter system’ was introduced, more with the purpose of space-
saving nomenclature for peptides than to simplify the names of the amino acids. A
‘one-letter system’ (thus, glycine is G) is more widely used now for peptides (but is
never used to refer to individual amino acids in other contexts) and is restricted to
naming peptides synthesised from the coded amino acids (Figure 1.3).
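
As an illustration of how the two systems relate, the sketch below expands a peptide written in the one-letter system into the three-letter form with explicit H- and -OH termini. The abbreviations are those of Table 1.1; the function name is hypothetical, plain hyphens stand in for the bond dashes, and the example sequence is the generally accepted one for Leu-enkephalin.

```python
# One-letter to three-letter abbreviations for the twenty coded amino acids
# (Table 1.1).
ONE_TO_THREE = {
    "G": "Gly", "A": "Ala", "L": "Leu", "V": "Val", "I": "Ile",
    "R": "Arg", "D": "Asp", "N": "Asn", "E": "Glu", "Q": "Gln",
    "K": "Lys", "M": "Met", "C": "Cys", "S": "Ser", "T": "Thr",
    "F": "Phe", "Y": "Tyr", "H": "His", "W": "Trp", "P": "Pro",
}


def three_letter_name(sequence):
    """Expand a one-letter sequence, written N-terminus first, to H-Xaa-...-OH."""
    if not sequence:
        raise ValueError("empty sequence")
    residues = [ONE_TO_THREE[letter] for letter in sequence.upper()]
    return "H-" + "-".join(residues) + "-OH"


print(three_letter_name("G"))      # glycine itself
print(three_letter_name("YGGFL"))  # Leu-enkephalin
```

A letter outside the twenty coded amino acids raises a `KeyError`, which matches the restriction of the one-letter system to peptides made from the coded amino acids.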

Figure 1.2. Polymerisation of glycine.
Table 1.2. Post-translational changes to proteins: the modified coded amino acids
present in proteins, including crosslinking amino acids (secondary amino acids)

Modifications to side-chain functional groups of coded amino acids
1. The aliphatic and aromatic coded amino acids may exist in αβ-dehydrogenated forms
and the β-hydroxy-α-amino acids may undergo post-translational dehydration, so as to
introduce αβ-dehydroamino acid residues, —NH—(C=CR¹R²)—CO—, into polypeptides.
2. Side-chain OH, NH or NH₂ proton(s) may be substituted by glycosyl, phosphate or
sulphate. These substituent groups are ‘lost’ during hydrolysis preceding analysis and during
laboratory treatment of proteins by hydrolysis prior to chemical sequencing, which creates a
problem that is usually solved through spectroscopic and other analytical techniques.
3. Side-chain NH₂ of lysine may be methylated or acylated (Nε-methylalanyl,
Nε-diaminopimelyl).
4. Side-chain NH₂ of glutamine may be methylated, giving N⁵-methylglutamine, and the
side-chain NH₂ of asparagine may be glycosylated.
5. Side-chain CH₂ may be hydroxylated, e.g. hydroxylysine, hydroxyprolines
(trans-4-hydroxyproline in particular), or carboxylated, e.g. to give α-aminomalonic acid,
β-carboxyaspartic acid, γ-carboxyglutamic acid, β-hydroxyaspartic acid, etc.
6. Side-chain aromatic or heteroaromatic moieties may be hydroxylated, halogenated or
N-methylated.
7. The side-chain of arginine may be modified (e.g. to give ornithine (Orn),
R = CH₂CH₂CH₂NH₂, or citrulline (Cit), R = CH₂CH₂CH₂NHCONH₂).
8. The side-chain of cysteine may be modified, as in 1 above, also selenocysteine (CH₂SeH
instead of CH₂SH; see footnote a to Table 1.1), lanthionine (see 10 below).
9. The side-chain of methionine may be S-alkylated (see Table 1.3) or oxidised at S to give
methionine sulphoxide.
10. Crosslinks in proteins may be formed by condensation between nearby side-chains.
(a) From lysine: e.g. lysinoalanine as if from [lysine + serine − H₂O].
[Scheme: via dehydroalanine to lysinoalanine, drawn as H-Lys-OH linked through its
side-chain to H-Ala-OH.]
(b) From tyrosine: 3,3′-dityrosine, 3,3′,5′,3″-tertyrosine, etc.
(c) From cysteine: oxidation of the thiol grouping (HS— + —SH → —S—S—) to give the
disulphide or to give cysteic acid (Cya): —SH → —SO₃H and alkylation leading to
sulphide formation (e.g. alkylation as if by dehydroalanine to give lanthionine).
[Scheme: 2 H-Cys-OH → lanthionine, drawn as two H-Ala-OH residues joined through a
single sulphur atom.]
(Further examples of crosslinking amino acids in peptides and proteins are given in
Section 5.11.)

Nomenclature of post-translationally modified amino acids
Abbreviated names for close relatives of the ‘coded amino acids’ can be based on the
‘three-letter’ names when appropriate; thus, L-Pro after post-translational hydroxylation
gives L-Hypro (trans-4-hydroxyproline, or (2S,4R)-hydroxyproline).
Current nomenclature recommendations (see footnote to Table 1.1) allow a number of
abbreviations to be used for some non-coded amino acids possessing trivial names (some of
which are used above and elsewhere in this book): Dopa, β-Ala, Glp, Sar, Cya, Hcy
(homocysteine) and Hse (homoserine) are among the more common.