Crystal structures of the human SUMO-2 protein at 1.6 Å and 1.2 Å resolution
Implication on the functional differences of SUMO proteins
Wen-Chen Huang¹,², Tzu-Ping Ko¹, Steven S.-L. Li³ and Andrew H.-J. Wang¹
¹Institute of Biological Chemistry, Academia Sinica, Taipei, Taiwan; ²Institute of Biomedical Sciences, National Sun Yat-Sen University, Kaoshiung, Taiwan; ³Department of Biotechnology, College of Life Sciences, Kaoshiung Medical University, Taiwan
The SUMO proteins are a class of small ubiquitin-like modifiers. SUMO is attached to a specific lysine side chain on the target protein via an isopeptide bond with its C-terminal glycine. There are at least four SUMO proteins in humans, which are involved in protein trafficking and targeting. A truncated human SUMO-2 protein that contains residues 9–93 was expressed in Escherichia coli and crystallized in two different unit cells, with dimensions of a = b = 75.25 Å, c = 29.17 Å and a = b = 74.96 Å, c = 33.23 Å, both belonging to the rhombohedral space group R3. They diffracted X-rays to 1.6 Å and 1.2 Å resolution, respectively. The structures were determined by molecular replacement using the yeast SMT3 protein as a search model. Subsequent refinements yielded R/Rfree values of 0.169/0.190 and 0.119/0.185, at 1.6 Å and 1.2 Å, respectively. The peptide folding of SUMO-2 consists of a half-open β-barrel and two flanking α-helices with secondary structural elements arranged as ββαββαβ in the sequence, identical to those of ubiquitin, SMT3 and SUMO-1. Comparison of SUMO-2 with SUMO-1 showed a surface region near the C terminus with significantly different charge distributions. This may explain their distinct intracellular locations. In addition, crystal-packing analysis suggests a possible trimeric assembly of the SUMO-2 protein, of which the biological significance remains to be determined.
Keywords: homology modeling; molecular interactions; protein modification; surface charge distributions; synchrotron radiations.
Control of protein expression and regulation of protein activities are central to the cellular processes in an organism. Many proteins are rather short lived, and are eventually targeted to proteasomes for degradation via conjugation with ubiquitin [1]. However, the functions of various proteins are not only a matter of time but also a matter of place. Thus, newly synthesized proteins must be directed toward specific subcellular compartments. SUMO is the acronym for small ubiquitin-like modifier and is named after its three-dimensional structural similarity to ubiquitin. Both SUMO and ubiquitin are attached to target proteins by forming an isopeptide bond between the C-terminal glycine and a specific lysine side chain on the target [2]. The extra amino acids beyond the last glycine–glycine motif of native SUMO proteins are proteolytically removed in vivo. In mammals, there are at least four different SUMO proteins, SUMO-1, -2, -3 and -4. The human hSMT3 cDNA encoding the SUMO-2 protein was first reported by Mannen et al. [3]. SUMO-2 and SUMO-3 share 87% sequence identity with each other, but they have only 47% identity with SUMO-1 [4]. The novel SUMO-4 associated with diabetes is also more similar in sequence to SUMO-2 than to SUMO-1 [5].
The first three-dimensional structure of SUMO-1 determined by NMR showed that the SUMO proteins are remarkably similar in protein fold to ubiquitin despite the amino acid sequence identity of only 18% [6]. Recently, a high-resolution NMR structure of SUMO-1 was determined by using heteronuclear resonance [7], in which the overall conformation was slightly different from the previous model. On the other hand, the yeast SMT3 (SUMO) protein is 40–45% identical to human SUMO proteins in amino acid sequence and the human SUMO proteins have an insertion between the strands β1 and β2, as shown in Fig. 1. The crystal structure of yeast SMT3 was determined in complex with Ulp1 protease [8]. Significant deviation between the crystal structure and solution structure of yeast SMT3 was also observed using high-resolution heteronuclear NMR spectroscopy [9].
In addition to the N-terminal extensions, the most significant difference between SUMO and ubiquitin is their surface charge distributions [6]. SUMO-1, -2 and -3 proteins were shown to localize on the nuclear membrane, in nuclear bodies and in the cytoplasm, respectively [10]. Presumably, the different locations are due to their functions in protein targeting. Arrangement of side chains confers unique surface properties on the protein. Thus, comparison of SUMO-1, -2 and -3 surface properties by modelling provides an approach to understanding the relationship between structure and function. To date, no crystal structure of mammalian SUMO proteins has been determined. In order to obtain more structural information, especially about the protein side chains, we tried to determine a three-dimensional structure of human SUMO at high resolution by X-ray crystallography.
Correspondence to S. S.-L. Li, Department of Biotechnology, College of Life Sciences, Kaoshiung Medical University, Kaoshiung 807, Taiwan. Fax: +886 7 312 5339, Tel.: +886 7 313 5162, and A. H.-J. Wang, Institute of Biological Chemistry, Academia Sinica, Taipei 115, Taiwan. Fax: +886 2 2788 2043, Tel.: +886 2 2788 1981.
Abbreviations: CHES, 2-(cyclohexylamino)ethanesulfonic acid; IPTG, isopropyl thio-β-D-galactoside.
(Received 21 May 2004, revised 14 July 2004, accepted 31 August 2004)
Eur. J. Biochem. 271, 4114–4122 (2004) © FEBS 2004 doi:10.1111/j.1432-1033.2004.04349.x
In this paper we present the crystal structure of a truncated SUMO-2. To facilitate crystallization, our strategy was to reduce the length of the N-terminal arm while preserving the sequence of Val10–Lys11–Thr12–Glu13, as well as the C-terminal Gly92–Gly93 for conjugation via an isopeptide bond. The VKTE sequence in SUMO-2 is consistent with the SUMOylation consensus ΨKXE, where Ψ represents a hydrophobic amino acid and X means any amino acid in target proteins, and this consensus sequence is functional for possible polymerization [11]. Furthermore, the truncated SUMO-2 cDNA encoding sequence 9–93 was fused to a His10 tag at the N terminus with a Factor Xa cleavage site for efficient purification.
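
The consensus described above can be searched for with a short pattern match. The following is a minimal sketch, not a validated predictor: the set of residues accepted at the hydrophobic position (`HYDROPHOBIC`) is an assumption, the function name is illustrative, and the test fragment is the peptide encoded by the forward PCR primer given in Materials and methods, numbered on the assumption that its glycine is residue 9.

```python
import re

# Assumption: residues accepted at the hydrophobic position of the consensus.
HYDROPHOBIC = "AVILMF"
# A lookahead is used so that overlapping motifs would also be reported.
CONSENSUS = re.compile(rf"(?=([{HYDROPHOBIC}]K[A-Z]E))")


def find_sumoylation_motifs(sequence, first_residue_number=1):
    """Return (residue number of the lysine, matched motif) for each hit."""
    hits = []
    for match in CONSENSUS.finditer(sequence.upper()):
        # The lysine is the second residue of the four-residue motif.
        lys_number = first_residue_number + match.start() + 1
        hits.append((lys_number, match.group(1)))
    return hits


# Fragment encoded by the forward primer; numbering from residue 9 is assumed.
fragment = "GVKTENN"
hits = find_sumoylation_motifs(fragment, first_residue_number=9)
```

Under these assumptions the only hit is the VKTE motif with its lysine at position 11, which agrees with the Val10–Lys11–Thr12–Glu13 numbering used in the text.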
Materials and methods
Cloning, expression and purification
The full-length cDNA encoding human SUMO-2 protein [3] was first cloned into the pET28a expression vector, and the cDNA sequence of truncated SUMO-2 was amplified by PCR using the SUMO-2/pET28a as a template. The PCR was carried out for 25 cycles of 30 s at 95 °C, 30 s at 55 °C and 30 s at 72 °C, using two primers 5′-GGAATTCCATATGGGAGTCAAGACTGAGAACAAC-3′ and 5′-CCGCTCGAGTCAACCTCCCGTCTG-3′. The DNA products were checked on 1.5% agarose gels stained with ethidium bromide and then digested with restriction enzymes. The truncated SUMO-2 with an N-terminal His10 tag was then expressed using pET16b (Novagen) in Escherichia coli BL21 (DE3) at 37 °C, induced by adding 1 mM isopropyl thio-β-D-galactoside (IPTG) at D600 = 0.8. Bacterial cells were harvested after 4 h of induction by centrifuging at 8983 g for 30 min using Avanti® J-20XP (Beckman). Cells were lysed in a buffer containing 25 mM Tris-base and 150 mM NaCl (pH 8.0) with a French Press (Cell Disruption, Constant-systems) at 206 843 kPa twice and centrifuged (18 592 g, 20 min) for supernatant collection.
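
As a consistency check on the two primers, the sketch below translates the coding portion of the forward primer and the reverse complement of the reverse primer using the standard genetic code. It is illustrative only. The text does not name the restriction enzymes, so reading CATATG and CTCGAG as NdeI and XhoI recognition sites is an assumption based on the commonly known sequences for those enzymes.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


FORWARD = "GGAATTCCATATGGGAGTCAAGACTGAGAACAAC"
REVERSE = "CCGCTCGAGTCAACCTCCCGTCTG"

# Reading frame of the forward primer starts at its first ATG.
forward_orf = translate(FORWARD[FORWARD.index("ATG"):])
# The reverse primer anneals to the sense strand, so read its reverse complement.
c_terminal = translate(reverse_complement(REVERSE))

# Assumed restriction sites (enzyme names are not given in the text).
has_catatg = "CATATG" in FORWARD  # NdeI recognition sequence
has_ctcgag = "CTCGAG" in REVERSE  # XhoI recognition sequence
```

The forward primer reads Met-Gly-Val-Lys-Thr-Glu-Asn-Asn after the start codon, and the reverse primer encodes Gln-Thr-Gly-Gly followed by a stop codon, consistent with a construct that keeps the VKTE sequence and ends at the Gly-Gly motif.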
The SUMO-2 protein was purified using a column packed with Ni–NTA HisBind® resin (Novagen) in two steps. In the first purification, major protein was eluted using an imidazole gradient of 0–250 mM and the collected fractions were analysed by SDS/PAGE. The SUMO-2 protein in peak fractions was pooled and dialysed three times against 25 mM Tris-base, 150 mM NaCl (pH 8.0) and incubated for 26 h at room temperature in the presence of Factor Xa (Novagen). This step removes the His10 tag to generate the truncated SUMO-2 protein (9–93 amino acids). The protein solution was then purified a second time, in which the flow-through was collected using a wash buffer that contained 20 mM imidazole, and dialysed three times in 25 mM Tris-base, 20 mM NaCl, 1 mM dithiothreitol (pH 8.0). Molecular mass of the truncated SUMO-2 was determined to be 9950 Da by ESI-MS, exactly as calculated from the amino acid sequence. The purified protein was concentrated to 60 mg·mL⁻¹ by ultrafiltration using 3 kDa Jumbosep™ membrane (Pall Corporation, MI).
Fig. 1. Structure-based sequence alignment of SUMO proteins from human (Homo sapiens; h_SUMO-2/-3/-4/-1) and yeast (Saccharomyces cerevisiae; y_SMT3) SUMO, and human ubiquitin (h_Ubiquitin). Secondary structure elements of SUMO-2 are shown above the sequences with α-helices and β-strands depicted as red cylinders and green arrows, respectively, and the N-terminal arm as a line. Identical residues conserved in five or more sequences are shaded in yellow and gaps are denoted by dots. The residues of human SUMO-1 that interact with Ubc9 are coloured orange, those of yeast SMT3 that interact with Ulp1 are in cyan, and the overlapping regions are shown in magenta. The target proteins are attached directly to the C-terminal glycine of ubiquitin, whereas SUMO requires additional processing to remove the C-terminal tail. The C-terminal Gly-Gly motifs in the mature proteins are shown in green.
Crystallization and data collection
Crystallization was achieved by the hanging-drop vapour diffusion method at room temperature using the CryoII screen kits (Emerald Biostructures). After optimization, two different crystal forms of the truncated SUMO-2 protein (9–93 amino acids) were obtained. One crystal form having a triangular plate shape (type I, Fig. 2A) grew in 40% (w/v) PEG-600, 0.1 M 2-(cyclohexylamino)ethanesulfonic acid (CHES) and 0.1 M Tris/HCl pH 8.0, and diffracted to 1.6 Å. The other one, of rectangular polyhedron shape (type II, Fig. 2B), grew in 40% (w/v) PEG-600, 0.1 M CHES, 0.1 M sodium HEPES pH 8.0, and diffracted well to a resolution of 1.2 Å.
Two data sets were collected using MSC R-AXIS IV++ image plate detectors and processed using the software package HKL [12]. The first one was collected from the triangular plate crystal form (type I) at the Institute of Biological Chemistry, Academia Sinica, using an MSC MicroMax 002 X-ray generator. The second data set, from the polyhedral crystal form (type II), was collected at the National Synchrotron Radiation Research Center, Hsinchu, Taiwan, using beam line 17B2 as an X-ray source.
Crystallographic computing and modelling
Most calculations for molecular replacement, electron density maps and structural refinements were carried out using the program CNS [13]. For type II crystal, refinements and map calculations also used SHELX-97 [14]. Substitution of side chains, addition of water molecules, manual adjustment of the protein models and rebuilding of the N- and C-terminal segments were performed using the program O [15].
For homology modelling of SUMO-1 and -3, the refined SUMO-2 model at 1.6 Å resolution of type I crystal was used as a template. After substituting the side chains, their conformations were adjusted with reference to the NMR structure of SUMO-1 and the crystal structure of yeast SMT3. The models were then subjected to molecular dynamics and energy minimization using CNS, while the backbone atoms were restrained to the original model coordinates. For structural comparisons with ubiquitin, yeast SMT3 and human SUMO-1, models directly from the Protein Data Bank (PDB) entries 1UBQ, 1EUV (chain B) and 1A5R (model 1), respectively, were used.
Fig. 2. Photographs and electron density maps of the SUMO-2 crystals. Shown in (A) and (B) are two different crystal forms I and II obtained under slightly different conditions. The sizes of crystals are 0.25 × 0.25 × 0.05 mm³ in (A) and 0.35 × 0.15 × 0.1 mm³ in (B). In (C) and (D) are representative electron density maps superimposed on the refined models of the two crystal forms I and II, respectively. Both were contoured at 2.0 σ levels using 2Fo–Fc maps phased by the refined models. The side chain of Lys21 lacks well-defined density, presumably because it is flexible.
Figure 1 was produced using the program ALSCRIPT [16]. The ribbon diagrams and the electron density maps in Figs 2, 3 and 5 were drawn using MOLSCRIPT [17], BOBSCRIPT [18] and RASTER3D [19]. The molecular surface properties were examined using GRASP [20], which was also used to generate Fig. 4. Model geometry and crystal contacts were analysed using the programs PROCHECK and AREAIMOL of the CCP4 package [21].
Results and Discussion
Structure determination and refinement
Fig. 3. Tertiary structure of SUMO-2 and comparison with other proteins. (A) A ribbon representation of the protein fold. (B) A topology diagram with well-defined backbone hydrogen bonds. The helices (α1, α2) and strands (β1–β5) are coloured in magenta, blue, green, yellow and red from N to C terminus. The hydrogen bond distances, with a criterion of less than 3.2 Å, are observed in the refined model at 1.2 Å, with one exception between Asp16 and Arg36, which is seen in the 1.6 Å model. The amino acids are shaded in red, green and blue for acidic, neutral and basic polar residues, and in yellow for prolines and glycines. In (C) the polypeptide tracings of two SUMO-2 models from type I (12–89) and type II (17–88) crystals, shown in green and red, are superimposed with that of human ubiquitin (1–76), shown in blue. In (D) the yeast SMT3 crystal structure (20–98) and human SUMO-1 NMR structure (−2–101), coloured yellow and cyan, respectively, are compared with the SUMO-2 structure (type I crystal), shown in red.
Analysis of the diffraction patterns suggested that both type I and type II SUMO-2 crystals belong to the rhombohedral space group R3. Statistics for the two data sets are shown in Table 1. Although the unit cell dimensions are similar in the a- and b-axes, the significant difference in the c-axes implies that the crystals are not entirely isomorphous. Using synchrotron radiation, type I crystals also diffracted to a higher resolution than 1.6 Å, but not as well as type II crystals. With one SUMO-2 molecule in an asymmetric unit, the specific volumes (or Matthews coefficients [22]) are 1.60 and 1.81 Å³·Da⁻¹, suggesting solvent contents of 23.0% and 31.9% for the type I and type II crystal forms, respectively.
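
The Matthews coefficients and solvent contents quoted for the two crystal forms can be reproduced from the unit cell dimensions and the measured molecular mass. The sketch below is a minimal illustration. It assumes nine molecules per hexagonal cell (space group R3 in the hexagonal setting with one molecule per asymmetric unit) and uses the common approximation that the protein fraction is 1.23/V_M; the function names are illustrative.

```python
import math

# Widely used approximation: solvent fraction = 1 - 1.23 / V_M (V_M in Å^3/Da).
PROTEIN_VM_CONSTANT = 1.23


def hexagonal_cell_volume(a, c):
    """Volume (Å^3) of a hexagonal cell with a = b, alpha = beta = 90°, gamma = 120°."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews(a, c, molecular_mass_da, molecules_per_cell):
    """Return (Matthews coefficient in Å^3/Da, estimated solvent fraction)."""
    vm = hexagonal_cell_volume(a, c) / (molecules_per_cell * molecular_mass_da)
    solvent_fraction = 1.0 - PROTEIN_VM_CONSTANT / vm
    return vm, solvent_fraction


MASS_DA = 9950.0  # molecular mass from ESI-MS, as reported in the text
Z = 9             # assumed: R3, hexagonal setting, one molecule per asymmetric unit

vm_type1, solvent_type1 = matthews(75.25, 29.17, MASS_DA, Z)
vm_type2, solvent_type2 = matthews(74.96, 33.23, MASS_DA, Z)

print(f"type I : V_M = {vm_type1:.2f} Å^3/Da, solvent = {100 * solvent_type1:.1f}%")
print(f"type II: V_M = {vm_type2:.2f} Å^3/Da, solvent = {100 * solvent_type2:.1f}%")
```

With these inputs the calculation gives values that round to the figures reported in the text for both crystal forms.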
The NMR model of human SUMO-1 (PDB code 1A5R) contains full-length protein, whereas the N- and C-terminal regions are flexible. Molecular replacement search using the NMR model did not yield a correct solution for the crystal structure of SUMO-2, even with omission of the terminal segments. Instead, it was solved using yeast SMT3 (PDB code 1EUV) as a search model. The initial R value for the type I crystal was 0.465 after rigid-body refinement at 3.0 Å resolution. The final model contains amino acid residues 12–89 and 67 water molecules, with R and Rfree values of 0.169 and 0.190, respectively. The R value for the type II crystal based on the refined type I model was 0.409 at 1.5 Å. After refinement, the model contains amino acid residues 17–88 and 127 water molecules, with R and Rfree of 0.119 and 0.185, respectively. Statistics are shown in Table 1. Details of the refinement procedures are summarized in Table 2. The atomic coordinates and structure factors of type I and type II crystals have been deposited in the RCSB Protein Data Bank, with accession codes 1WM2 and 1WM3, respectively.
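
For readers less familiar with the R and Rfree values quoted here, the sketch below shows the conventional definition, R = Σ| |Fo| − |Fc| | / Σ|Fo|, evaluated separately over the working and test reflections. It is a simplified illustration and not the procedure implemented in CNS or SHELX: the calculated amplitudes are assumed to be already on the scale of the observed ones, and all amplitudes and test-set flags are hypothetical values chosen for the example.

```python
def r_factor(f_obs, f_calc):
    """Conventional R factor over paired observed and calculated amplitudes."""
    if len(f_obs) != len(f_calc) or len(f_obs) == 0:
        raise ValueError("need two non-empty lists of equal length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


def r_work_and_free(f_obs, f_calc, is_test):
    """Split reflections by the test-set flag and return (R_work, R_free)."""
    work_obs = [fo for fo, t in zip(f_obs, is_test) if not t]
    work_calc = [fc for fc, t in zip(f_calc, is_test) if not t]
    test_obs = [fo for fo, t in zip(f_obs, is_test) if t]
    test_calc = [fc for fc, t in zip(f_calc, is_test) if t]
    return r_factor(work_obs, work_calc), r_factor(test_obs, test_calc)


# Hypothetical amplitudes and flags, for illustration only.
f_obs = [100.0, 50.0, 25.0, 80.0]
f_calc = [90.0, 55.0, 25.0, 72.0]
is_test = [False, False, False, True]

r_work, r_free = r_work_and_free(f_obs, f_calc, is_test)
```

Because the test reflections are excluded from refinement, Rfree is normally somewhat higher than R, as is the case for both SUMO-2 models.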
Quality of the model and structure comparison
The coordinate errors in the refined SUMO-2 models are between 0.15 Å and 0.20 Å as estimated by Luzzati plots [23]. The electron density maps in a representative region are shown in Fig. 2C and D. At 1.2 Å resolution, individual atoms begin to appear as discrete spheres. An overall ribbon diagram is shown in Fig. 3A. The peptide folding of SUMO-2 protein consists of a half-open β-barrel and two flanking α-helices, with secondary structure elements arranged as ββαββαβ in the sequence (Fig. 1), identical to those of ubiquitin, SMT3 and SUMO-1. Fig. 3B shows a topology diagram of SUMO-2. The 39 well-defined backbone hydrogen bonds include not only those for the β-strands and α-helices, but also three bonds for turns and two for tertiary interactions.
The protein models of SUMO-2 type I and type II crystals superimpose with an r.m.s.d. of 0.544 Å for 288 backbone atoms and 1.201 Å for all 584 atoms. Deviations of Cα coordinates larger than 1.0 Å occur at residues 17, 26, 27, 56 and 88. Although type II crystal diffracts to higher resolution, its visible N terminus is shorter than that of type I crystal by five residues. As shown in Fig. 3A, this segment extends away from the protein core and should be flexible because of exposure to the bulk solvent. The smaller unit-cell dimension of type I crystal allows the N terminus to be docked onto a neighbouring molecule, specifically, near the region of Phe60–Thr70, and thus stabilizes the extended conformation.
Fig. 4. Surface properties of SUMO proteins. The molecular surface of SUMO-2 (type I crystal) is shown in (A) and (C); that of the SUMO-1 model is shown in (B) and (D). The charge potentials in (A), (C) and (D) are calculated using GRASP with a range of −10 to +10 kBT, in which kB is the Boltzmann constant and T is the temperature in kelvin, and coloured from red to blue. Neutral areas are shown in white. In (B) the conserved regions that interact with Ubc9 and Ulp1 are highlighted and coloured in orange, cyan and magenta, as in Fig. 1. In (E) and (F) the corresponding amino acids for different surface charges on SUMO-2 and SUMO-1 are shown. Positively charged, negatively charged and neutral polar residues are coloured blue, red and magenta, respectively, and nonpolar residues are shown in green. The views in (C–F) are similar to that of Fig. 3A and those of (A) and (B) are rotated 180° about the horizontal axis.
As also shown in Fig. 3C, the model of human ubiquitin (PDB code 1UBQ) is superimposed on the SUMO-2 models of type I and II crystals, with an r.m.s.d. of 0.952 Å and 1.135 Å for 55 and 65 Cα atoms, respectively. This is based on a distance criterion of less than 2.0 Å, which excluded the residues 45–58 in the former model and 40, 49 and 55–58 in the latter model of SUMO-2 and the equivalents of ubiquitin. Although the sequences have only 18% identity, the protein folds of SUMO-2 and ubiquitin are very similar, even without insertion (Fig. 1). Yet these two classes of proteins have very different functions, which may be explained by the disparate surface charge distributions [6].
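
The effect of a distance criterion on the reported r.m.s.d. can be illustrated with a small calculation. The sketch below assumes the two coordinate sets are already superimposed and listed in matching residue order, and it simply drops pairs at or beyond the cutoff before averaging. It is not the superposition procedure used for the comparisons in this paper, and the coordinates are hypothetical.

```python
import math


def rmsd_with_cutoff(coords_a, coords_b, cutoff=2.0):
    """Return (r.m.s.d., number of pairs) over paired atoms closer than cutoff (Å)."""
    squared_distances = []
    for atom_a, atom_b in zip(coords_a, coords_b):
        d2 = sum((x - y) ** 2 for x, y in zip(atom_a, atom_b))
        if d2 < cutoff ** 2:
            squared_distances.append(d2)
    if not squared_distances:
        raise ValueError("no atom pairs within the cutoff")
    rmsd = math.sqrt(sum(squared_distances) / len(squared_distances))
    return rmsd, len(squared_distances)


# Hypothetical, already superimposed Cα positions (Å); not taken from any PDB entry.
model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
model_b = [(0.5, 0.0, 0.0), (3.8, 1.0, 0.0), (7.6, 0.0, 1.5), (11.4, 3.0, 0.0)]

rmsd, n_pairs = rmsd_with_cutoff(model_a, model_b, cutoff=2.0)
```

In this toy case the fourth pair, 3.0 Å apart, is excluded, so the statistic is computed over three pairs only. This is why an r.m.s.d. should always be read together with the number of matched atoms and the cutoff used.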
Significant difference between the yeast SMT3 crystal structure and the human SUMO-1 NMR structure has been observed by Mossessova and Lima [8]. In Fig. 3D the SUMO-2 model is superimposed with those of SMT3 (1EUV) and SUMO-1 (1A5R). Based on a distance criterion of 2.0 Å, the r.m.s.d. is 1.096 Å between 43 pairs of Cα atoms in SUMO-1 (NMR) and SUMO-2 (type I crystal). Under the same condition, the r.m.s.d. is 0.918 Å between 67 Cα pairs in SUMO-2 and SMT3, and it is 0.470 Å for 40 matched pairs with a distance criterion of 1.0 Å. Therefore, the crystal structure of human SUMO-2 is more similar to that of yeast SMT3 than to the NMR structure of SUMO-1. The difference between SUMO-2 crystal structure and SUMO-1 NMR structure is particularly evident in the regions of 28–43 and 71–83, that correspond to the strand β2, the N terminus of the helix α1, the helix α2, and the connecting loop to the strand β5 (Fig. 3A,D).
Table 1. X-ray data statistics for SUMO-2 crystals. Numbers in parentheses are for the highest resolution shells.
Crystal form                                                        Type I                      Type II
Data collection
Space group                                                         R3 (hexagonal indexing)     R3 (hexagonal indexing)
Unit cell (Å)                                                       a = b = 75.25, c = 29.17    a = b = 74.96, c = 33.23
X-ray source                                                        MicroMax 002                NSRRC BL17B2
Wavelength (Å)                                                      1.5418                      1.0717
Detector                                                            RAXIS-IV++                  RAXIS-IV++
Crystal-to-film distance (mm)                                       100                         83.4
Oscillation range (°)                                               1.0                         1.5
Mosaicity (°)                                                       0.614                       0.292
Number of frames                                                    186                         145
Resolution range (Å)                                                50–1.6 (1.66–1.60)          20–1.2 (1.24–1.20)
Number of observations                                              42023 (2120)                141402 (11874)
Unique reflections                                                  8015 (690)                  21781 (2109)
Completeness (%)                                                    98.5 (85.1)                 100.0 (99.8)
Average I/σ(I)                                                      45.5 (5.3)                  39.6 (2.6)
Rmerge (%)                                                          3.9 (24.6)                  4.6 (55.9)
Refinement
Software                                                            CNS 1.1                     SHELX-97
Total reflections used [F > 0σ(F)]                                  7868 (633)                  20948 (1924)
R for 95% working data set                                          0.169 (0.266)               0.119 (0.217)
Rfree for 5% test data set                                          0.190 (0.273)               0.185 (0.239)
r.m.s.d. from ideal bond lengths (Å)                                0.017                       0.013
r.m.s.d. from ideal bond angles (°)                                 1.8                         2.3
r.m.s.d. from ideal dihedral angles (°)                             27                          26
r.m.s.d. from ideal improper angles (°)                             1.3                         1.8
Ramachandran plot: number of residues in most favored regions (%)   97.1                        96.8
In additional allowed regions (%)                                   2.9                         3.2
Average B-values/number of atoms for protein backbone (Å²)          17.7/312                    18.4/288
For protein side chains (Å²)                                        22.8/322                    27.6/297
For water molecules (Å²)                                            34.3/67                     42.8/127
Table 2. Refinement procedures of the SUMO-2 crystals.
Description of steps                      Protein          Water   Resolution   R/Rfree
Type I crystal, yeast SMT3 model          13–98 (SMT3)             3.0 Å        0.464
Delete N- and C-termini, insert Asp26     16–88 (SUMO-2)           2.0 Å        0.339/0.371
Add water molecules, B-value refinement   16–88            38      1.6 Å        0.190/0.224
Extend the termini, add more waters       12–89            67      1.6 Å        0.169/0.190
Type II crystal, type I model             12–89            67      1.5 Å        0.409
Delete N- and C-termini, remove waters    16–87            0       1.5 Å        0.375
Modify N-terminus, add water molecules    17–87            102     1.2 Å        0.191/0.205
Use SHELX, anisotropic B-values           17–87            102     1.2 Å        0.133/0.190
Extend C-terminus, add more waters        17–88            127     1.2 Å        0.119/0.185
Such a large difference between the NMR and crystal structures may explain the fact that we were not able to solve our crystal structure by the molecular replacement method using SUMO-1 NMR structure as the starting model. The high-resolution NMR structure of SUMO-1 determined later using heteronuclear NOE also showed difference from the structure of 1A5R [7]. Interestingly, this new SUMO-1 NMR structure is similar to the SMT3 NMR structure, whereas significant deviations between the crystal structure and solution structure of SMT3 were also observed [9]. Therefore, the deviations may be due to different environments and different experimental techniques used in the structure determinations.
Surface potential and functional difference
The mechanisms of protein ubiquitination and SUMOylation are similar: both involve the activating, conjugating, and ligation enzymes E1, E2 and E3. A peptidase is also required to remove the C-terminal peptide of a SUMO protein to render the mature form, which has the C-terminal Gly-Gly motif for conjugation with target proteins [4]. In yeast, an E1 specific for SUMO has been identified as a large heterodimeric Aos1/Uba2 of 110 kDa, and there is a heterodimeric homologue SAE1/SAE2 in man. The E2 in both human and yeast is a highly conserved Ubc9 of 18 kDa, whereas the E3 proteins have a broader definition and comprise several subclasses [4]. The enzymes Ulp1 and Ulp2 in yeast are located in the nuclear pore complex and nucleoplasm, and they are the protease and isopeptidase for processing SUMO precursor and deSUMOylation of target proteins, whereas in mammals the Ulp1 family comprises several proteases with various localizations [24]. Despite the similar mechanism, ubiquitination and SUMOylation pathways are different, involving two distinct sets of enzymes, and in some aspects they are competitive [25]. As first proposed by studying the SUMO-1 NMR structure, the functional difference is expressed in the surface charge distributions [6]. In Fig. 4A, the surface of SUMO-2 protein shows a region with strong negative charge potential. In contrast, the corresponding region of ubiquitin is mostly neutral (data not shown). Presumably this is the basis for them to interact differently with the various enzymes and other proteins.
The interactions between SUMO-1 and Ubc9 have been studied by NMR chemical shift perturbation experiments [26,27]. Three major regions had the most significant changes; these are indicated in the sequences of Fig. 1 and mapped on the surface of our SUMO-1 model in Fig. 4B. The positively charged Lys25 (Lys21 in SUMO-2) and a cluster of four negatively charged amino acids Glu83-Glu84-Glu85-Asp86 (Glu79-Asp80-Glu81-Asp82 in SUMO-2) are supposed to interact with Ubc9. These two regions are conserved among four human SUMO proteins as well as the yeast SMT3 protein (Fig. 1). In the crystal structure of yeast Ulp1–SMT3 complex, the Ulp1 protein makes direct contact not only with the C-terminal segment that contains the functional Gly-Gly motif, but also with the region Arg64–Arg71 (Fig. 1). These correspond to Arg59–Pro66 in SUMO-2 and, with an adjacent Arg61 substituting Leu66 in SMT3, the surface features in this region are also conserved. However, interactions between SUMO and other proteins, including E3, may be established with other surface regions.
Although the sequences of human SUMO-2 and -3 are 87% identical, they are located in different cellular compartments: SUMO-2 was found in nuclear bodies but SUMO-3 was located in the cytoplasm [10]. The surface charge distribution of SUMO-2/-3 is even more similar. When these two protein surfaces are compared, the only visible difference corresponds to residue 77, which is a negatively charged Glu in SUMO-2, but is a positively charged Arg in SUMO-3. On the other hand, SUMO-1 is 47% identical to SUMO-2 in sequence, and has a longer N-terminal arm. The resulting difference in their surface properties can be attributed to at least 10 residues. These include Glu33, Lys48, Glu49, Gln53, Asn60, Leu65, Arg70, Lys78, Gly81 and Glu93 in SUMO-1, whereas the corresponding surface residues in SUMO-2 are Val29, Met44, Lys45, Glu49, Arg56, Arg61, Pro66, Ala74, Glu77 and Gln89, respectively. The most prominent is a concave region shown in Fig. 4C and D, which is flanked by the helix α1 and the strands β3/β4 (Fig. 3A). This region is neutral in SUMO-2 but positively charged in SUMO-1, probably caused by the substitution of Met44 in SUMO-2 with Lys48 in SUMO-1, as shown in Fig. 4E and F. In particular, the concave surface is near the C terminus, and thus may serve as a potential site for discrimination between SUMO-1 and -2 in human cells. The flexible N-terminal arms of SUMO-1, -2 and -3 proteins, which have different lengths, may also be involved in the interactions with other proteins, whereas ubiquitin does not have such an equivalent.
Crystal packing and oligomeric assembly
The SUMO-2 structure presented in this paper is the first high-resolution crystal structure of human SUMO protein. The two crystal forms of truncated SUMO-2 studied here are not isomorphous, but the crystal packing is similar. Each protein molecule is in lattice contact with 10 symmetry-related molecules via five types of contact interfaces. The total areas buried by the lattice contact interfaces are 3412 Å² in type I crystal (1.6 Å) and 2211 Å² in type II crystal (1.2 Å), whereas the molecular surface areas of the SUMO-2 protein models, containing residues 12–89 and 17–88, are 5264 Å² and 4866 Å², respectively.
The first and most conserved interface is between molecules related by the crystallographic threefold axis. The buried areas are 856 Å² and 821 Å² on each SUMO-2 monomer in the type I and type II crystals, respectively, corresponding to about one-quarter and more than one-third of the contact surfaces. The interactions include two hydrogen bonds between backbone atoms of Gly27(O)–Lys33*(N) and Val29(N)–Gln31*(O), and a salt bridge between the side chains of Asp26 and Arg50*. (Amino acid residues of the symmetry-related molecules are denoted by asterisks.) The latter is also hydrogen bonded to Tyr47(OH) and Gln51(OE1). Such interactions, particularly those between the strands β2, may stabilize a possible trimeric assembly of SUMO-2 in solution, shown in Fig. 5. The other four interfaces are not all conserved, and their buried surface areas are much larger in type I crystal than in type II. Because the c-axis is significantly shorter, more lattice interactions were observed in type I crystal. These include docking of the flexible N-terminal segment onto a neighbouring molecule.
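
The proportions mentioned for the threefold interface follow directly from the buried and total surface areas reported in this section. The short sketch below only restates that arithmetic; the dictionary names are illustrative and all numbers are the values given in the text.

```python
# Areas in Å^2, as reported in the text for the two crystal forms.
total_buried = {"type I": 3412.0, "type II": 2211.0}
threefold_interface = {"type I": 856.0, "type II": 821.0}
molecular_surface = {"type I": 5264.0, "type II": 4866.0}

# Share of all lattice-contact area contributed by the threefold interface.
threefold_share = {
    form: threefold_interface[form] / total_buried[form] for form in total_buried
}
# Share of the whole molecular surface that is buried by lattice contacts.
buried_share = {
    form: total_buried[form] / molecular_surface[form] for form in total_buried
}

for form in total_buried:
    print(
        f"{form}: threefold interface = {100 * threefold_share[form]:.1f}% of contacts, "
        f"lattice contacts bury {100 * buried_share[form]:.1f}% of the surface"
    )
```

The threefold interface accounts for roughly 25% of the buried area in type I and roughly 37% in type II, matching the "about one-quarter" and "more than one-third" in the text, while lattice contacts as a whole bury a considerably larger share of the molecular surface in type I than in type II.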
Polymers of ubiquitin have been studied extensively since they were discovered [28]. The site of self-conjugation is Lys48. This residue corresponds to Gln65 in SUMO-2 and is conserved in SUMO-1 and -3 (Fig. 1). Consequently, SUMO does not form polymers in the same manner as ubiquitin. However, in a recent study [11], oligomers of SUMO-2/-3 were identified in vitro due to the existence of VKXE motif, a specific consensus SUMOylation site, in the N-terminal arm. The distance between Cα atoms of the N-terminal Thr12 and C-terminal Gln89* of neighbouring SUMO-2 molecules related by the triad axis is 20.6 Å in type I crystal, and that between His17 and Gln88* in type II crystal is 20.7 Å, comparable to the distance between Cα atoms separated by six peptide bonds in extended conformations. Thus, in the trimer, it is possible for the Lys11 of one SUMO-2 molecule to form an isopeptide bond with the Gly93 of another.
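
Distances between atoms of molecules related by the threefold axis, such as those quoted above, can be obtained by applying the symmetry operation in fractional coordinates and then converting to Cartesian coordinates. The sketch below assumes a threefold axis along c through the cell origin in the hexagonal setting, for which one rotation maps (x, y, z) to (−y, x − y, z). The fractional coordinates are hypothetical placeholders and are not taken from the deposited models.

```python
import math


def threefold(frac):
    """Rotate a fractional coordinate by 120° about the c axis (hexagonal axes)."""
    x, y, z = frac
    return (-y, x - y, z)


def to_cartesian(frac, a, c):
    """Convert fractional to Cartesian coordinates (Å) for a = b, gamma = 120°."""
    x, y, z = frac
    return (a * (x - 0.5 * y), a * (math.sqrt(3.0) / 2.0) * y, c * z)


def distance(frac_p, frac_q, a, c):
    """Distance (Å) between two points given in fractional coordinates."""
    return math.dist(to_cartesian(frac_p, a, c), to_cartesian(frac_q, a, c))


A, C = 75.25, 29.17  # type I unit cell from the text

# Hypothetical fractional positions of two Cα atoms in the same molecule.
n_terminal_ca = (0.10, 0.30, 0.20)
c_terminal_ca = (0.05, 0.22, 0.35)

# Distance from the N-terminal atom to the C-terminal atom of the symmetry mate.
inter_subunit = distance(n_terminal_ca, threefold(c_terminal_ca), A, C)

# The three copies of one atom form an equilateral triangle around the axis.
mate1 = threefold(n_terminal_ca)
mate2 = threefold(mate1)
sides = (
    distance(n_terminal_ca, mate1, A, C),
    distance(mate1, mate2, A, C),
    distance(mate2, n_terminal_ca, A, C),
)
```

Applying the operation three times returns the starting position, and the three side lengths are equal, which is a quick way to confirm that the operator and the orthogonalization are consistent.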
The crystal structures of diubiquitin and tetraubiquitin showed some alternatives of the quaternary conformations of ubiquitin polymer for efficient recognition by the 26S proteasome, yet no conclusion has been reached due to the inherently flexible intermolecular links [29]. The SUMO-2 trimer in Fig. 5 has a completely different arrangement from those of ubiquitin polymers, and the sites of conjugation are also different. It is uncertain whether the trimer is an oligomerization motif for SUMO-2, and this possibility is currently under investigation.
Acknowledgements
We thank Drs Chia-Cheng Chou, Rey-Ting Guo and Cheng-Chung Lee for their assistance in data collection. We also thank the National Synchrotron Radiation Research Center for beam time allocation. This work was supported by grants from National Science Council (NSC 92-3112-B-110-001 and NSC 93-3112-B-110-001) to S.S.L.L. and from Academia Sinica to A.H.J.W.
References
1. Pickart, C.M. (2004) Back to the future w ith ubiquitin. Cell 116,
181–190.
2. Mu
¨
ller,S.,Hoege,C.,Pyrowolakis,G.&Jentsch,S.(2001)
SUMO, ubiquitin’s mysterious cousin . Nat. Rev. Mol. Cell Biol. 2,
202–210.
3.Mannen,H.,Tseng,H M.,Cho,C L.&Li,S.S L.(1996)
Cloning and e xpression of human homolog H SMT3 to yeast
SMT3 suppressor of MIF2 mutations in a centromere protein
gene. Biochem. Biophys. Res. Commun. 222 , 178–180.
4. Melchior, F. (2000) S UMO-nonclassical ubiquitin. Annu. Rev.
Cell Dev. Biol. 16, 591–626.
5. Bohren, K.M., Nadkarni, V., Song, J.H., Gabbay, K.H. &
Owerbach, D. (2004) A M55V polymorphism in a novel SUMO
gene (SUMO-4) differentially activates h eat s hock transcription
factors and is asso ciated with su sceptibility to t ype I diabetes
mellitus. J. Biol. Chem. 279, 27233–27238.
6. Bayer, P., Arndt, A., Metzget, S., Mahajan, R . & Melchior, F.
(1998) St ructure d etermination of the small ubiquitin-related
modifier SUMO-1. J. M o l. Biol. 280, 275–286.
7. Jin, C., Shiyanova, T., Shen, Z. & Liao, X. (2001) Heteronuclear
nuclear magnetic resonance assignments, structure and dynamics
of SUMO-1, a human ubiquitin-like protein. Int. J. Biol. Macromol.
28, 227–234.
8. Mossessova, E. & Lima, C.D. (2000) Ulp1-SUMO crystal struc-
ture and genetic analysis reveal conserved interactions and a
regulatory element essential for cell growth in yeast. Mol. Cell 5,
865–876.
9. Shen, Z. & Liao, X. (2002) Solution structure of a yeast
ubiquitin-like protein Smt3: the role of structurally less defined sequences in
protein-protein recognitions. Protein Sci. 11, 1482–1491.
10. Su, H.L. & Li, S.S.-L. (2002) Molecular features of human
ubiquitin-like SUMO genes and their encoded proteins. Gene 296,
65–73.
11. Tatham, M.H., Jaffray, E., Vaughan, O.A., Desterro, J.M.,
Botting, C.H., Naismith, J.H. & Hay, R.T. (2001) Polymeric chains of
SUMO-2 and SUMO-3 are conjugated to protein substrates by
SAE1/SAE2 and Ubc9. J. Biol. Chem. 276, 35368–35374.
12. Otwinowski, Z. & Minor, W. (1997) Processing of X-ray diffraction
data collected in oscillation mode. Methods Enzymol. 276,
307–326.
13. Brünger, A.T., Adams, P.D., Clore, G.M., DeLano, W.L., Gros,
P., Grosse-Kunstleve, R.W., Jiang, J.S., Kuszewski, J., Nilges, M.,
Pannu, N.S., Read, R.J., Rice, L.M., Simonson, T. & Warren, G.L.
(1998) Crystallography and NMR system: a new software suite for
macromolecular structure determination. Acta Crystallogr. D54,
905–921.
14. Sheldrick, G.M. & Schneider, T.R. (1997) SHELXL: high
resolution refinement. Methods Enzymol. 277, 319–343.
15. Jones, T.A., Zou, J.Y., Cowan, S.W. & Kjeldgaard, M. (1991)
Improved methods for building protein models in electron density
maps and the location of errors in models. Acta Crystallogr. A47,
392–400.
16. Barton, G.J. (1993) ALSCRIPT: a tool to format multiple
sequence alignments. Protein Eng. 6, 37–40.
Fig. 5. Trimer of SUMO-2. In both crystal forms the conserved
interactions between three molecules related by the crystallographic
threefold axis suggest the possible existence of a trimeric assembly in
solution. A noteworthy feature is the association of three strands β2
around the triad axis, held together by several hydrogen bonds
between backbone atoms.
17. Kraulis, P.J. (1991) MOLSCRIPT: a program to produce both
detailed and schematic plots of protein structure. J. Appl. Crystallogr.
24, 946–950.
18. Esnouf, R.M. (1997) An extensively modified version of MolScript
that includes greatly enhanced coloring capabilities. J. Mol. Graph.
15, 132–134.
19. Merritt, E.A. & Murphy, M.E.P. (1994) Raster3D, Version 2.0. A
program for photorealistic molecular graphics. Acta Crystallogr.
D50, 869–873.
20. Nicholls, A., Sharp, K.A. & Honig, B. (1991) Protein folding and
association: insights from the interfacial and thermodynamic
properties of hydrocarbons. Proteins 11, 281–296.
21. Collaborative Computational Project, Number 4 (1994) The
CCP4 suite: programs for protein crystallography. Acta Crystallogr.
D50, 760–763.
22. Matthews, B.W. (1968) Solvent content of protein crystals. J. Mol.
Biol. 33, 491–497.
23. Luzzati, P.V. (1952) Traitement statistique des erreurs dans la
determination des structures cristallines. Acta Crystallogr. 5,
802–810.
24. Melchior, F., Schergaut, M. & Pichler, A. (2003) SUMO:
ligases, isopeptidases and nuclear pores. Trends Biochem. Sci. 28,
612–618.
25. Gill, G. (2003) Post-translational modification by the small
ubiquitin-related modifier SUMO has big effects on transcription
factor activity. Curr. Opin. Genet. Dev. 13, 108–113.
26. Liu, Q., Jin, C., Liao, X., Shen, Z., Chen, D.J. & Chen, Y. (1999)
The binding interface between an E2 (UBC9) and a ubiquitin
homologue (UBL1). J. Biol. Chem. 274, 16979–16987.
27. Tatham, M.H., Kim, S., Yu, B., Jaffray, E., Song, J., Zheng, J.,
Rodriguez, M.S., Hay, R.T. & Chen, Y. (2003) Role of an
N-terminal site of Ubc9 in SUMO-1, -2, and -3 binding and
conjugation. Biochemistry 42, 9959–9969.
28. Chan, N.L. & Hill, C.P. (2001) Defining polyubiquitin chain
topology. Nat. Struct. Biol. 8, 650–652.
29. Phillips, C.L., Thrower, J., Pickart, C.M. & Hill, C.P. (2001)
Structure of a new crystal form of tetraubiquitin. Acta Crystallogr.
D57, 341–344.
4122 W.-C. Huang et al. (Eur. J. Biochem. 271) © FEBS 2004