Physiological truncation and domain organization of a novel uracil-DNA-degrading factor
Mária Pukáncsik¹, Angéla Békési¹, Éva Klement², Éva Hunyadi-Gulyás², Katalin F. Medzihradszky²,³, Jan Kosinski⁴,⁵, Janusz M. Bujnicki⁴,⁶, Carlos Alfonso⁷, Germán Rivas⁷ and Beáta G. Vértessy¹
1 Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, Budapest, Hungary
2 Proteomics Research Group, Biological Research Center, Hungarian Academy of Sciences, Szeged, Hungary
3 Department of Pharmaceutical Chemistry, University of California, San Francisco, CA, USA
4 Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, Warsaw, Poland
5 PhD School, Institute of Biochemistry and Biophysics PAS, Warsaw, Poland
6 Institute of Molecular Biology and Biotechnology, Adam Mickiewicz University, Poznan, Poland
7 Chemical and Physical Biology, Centro de Investigaciones Biológicas, Madrid, Spain
Keywords
cell death; DNA; nuclease; protein structural modeling; uracil
Correspondence
B. G. Vértessy, Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, H-1113, Budapest, Karolina út 29, Hungary
Fax: +36 1 466 5465
Tel: +36 1 279 3116
E-mail:
(Received 2 April 2009, revised 16 December 2009, accepted 18 December 2009)
doi:10.1111/j.1742-4658.2009.07556.x
Uracil in DNA is usually considered to be an error, but it may be used for signaling in Drosophila development via recognition by a novel uracil-DNA-degrading factor (UDE) [Bekesi A et al. (2007) Biochem Biophys Res Commun 355, 643–648]. The UDE protein has no detectable similarity to any other uracil-DNA-binding factors, and has no structurally or functionally described homologs. Here, a combination of theoretical and experimental analyses reveals the domain organization and DNA-binding pattern of UDE. Sequence alignments and limited proteolysis with different proteases show extensive protection by DNA at the N-terminal duplicated conserved motif 1A/1B segment, and a well-folded domain within the C-terminal half encompassing conserved motifs 2–4. Theoretical structure prediction suggests that motifs 1A and 1B fold as similar α-helical bundles, and reveals two conserved positively charged surface patches that may bind DNA. CD spectroscopy also supports the presence of α-helices in UDE. Full functionality of a physiologically occurring truncated isoform in Tribolium castaneum lacking one copy of the N-terminal conserved motif 1 is revealed by activity assays of a representative truncated construct of Drosophila melanogaster UDE. Gel filtration and analytical ultracentrifugation results, together with analysis of predicted structural models, suggest a possible dimerization mechanism for preserving functionality of the truncated isoform.
Structured digital abstract
MINT-7385914: UDE (uniprotkb:Q961C4) and UDE (uniprotkb:Q961C4) bind (MI:0407) by cosedimentation in solution (MI:0028)
Abbreviations
DmUDE, Drosophila melanogaster uracil-DNA-degrading factor; DmrcUDE, recombinant Drosophila melanogaster uracil-DNA-degrading factor; MQAP, model quality assessment program; TcUDE, Tribolium castaneum truncated uracil-DNA-degrading factor isoform; UDE, uracil-DNA-degrading factor; UDG, uracil-DNA glycosylase.
FEBS Journal 277 (2010) 1245–1259 © 2010 The Authors Journal compilation © 2010 FEBS
Introduction
The nucleobase uracil is not a normal constituent of DNA, although it provides the same Watson–Crick interaction pattern for adenine as does thymine (i.e. 5-methyl-uracil), and is actually used as the adenine-counterpart base in RNA. Despite its usual absence, there are two physiological ways for uracil to appear in DNA: cytosine deamination and thymine replacement. Cytosine-to-uracil transitions via hydrolytic deamination are among the most frequently occurring spontaneous mutations. These generate premutagenic U:G mispairs [1,2]. Thymine replacement by uracil can occur if the cellular dUTP/dTTP ratio increases, as most DNA polymerases will incorporate either uracil or thymine against adenine, based solely on the availability of the corresponding building block nucleotides [3,4]. Thymine-replacing uracil moieties are not mutagenic, as they provide the same genomic information, but may perturb the binding of factors that require the 5-methyl group on the thymine ring for recognition.
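The dependence of thymine replacement on nucleotide availability can be illustrated with a minimal sketch. It assumes, as a simplification not taken from the paper, that a polymerase shows no discrimination at all between dUTP and dTTP, so that the chance of placing uracil opposite adenine equals the share of dUTP in the combined pool. The function name and the example pool ratios are hypothetical.

```python
def expected_uracil_fraction(dutp: float, dttp: float) -> float:
    """Fraction of adenine-paired positions expected to receive uracil.

    Assumption of this sketch: incorporation is proportional to pool size
    only (no polymerase preference, no repair).
    """
    total = dutp + dttp
    if total <= 0:
        raise ValueError("nucleotide pools must sum to a positive value")
    return dutp / total


if __name__ == "__main__":
    # Hypothetical dUTP/dTTP ratios, with dTTP fixed at 1 arbitrary unit.
    for ratio in (0.01, 0.1, 1.0):
        fraction = expected_uracil_fraction(ratio, 1.0)
        print(f"dUTP/dTTP = {ratio}: expected uracil fraction = {fraction:.3f}")
```

Under this assumption, a rising dUTP/dTTP ratio translates directly into a larger share of uracil-substituted positions, which is the qualitative point made above.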
There are also two mechanisms to ensure uracil-free
DNA: prevention and excision. dUTPases prevent
uracil incorporation into DNA by removing dUTP
from the DNA polymerase pathway [5]. Uracil in
DNA, produced by either cytosine deamination or
uracil misincorporation, is excised by uracil-DNA gly-
cosylases (UDGs) in the base excision repair pathway
[6,7]. Among the different UDGs, the protein product
of the ung gene (termed UNG) is by far the most effi-
cient in catalyzing uracil excision [8]. UNG is responsi-
ble for most of the repair process, as its mutation in
Escherichia coli, mouse and human has been found to
induce a considerable increase in uracil content [9–11].
Null mutations in the dUTPase gene (dut) result in a
nonviable phenotype that can be rescued by a second
null mutation in the ung gene. The dut⁻ ung⁻ genotype
presents mass uracil incorporation into DNA [9,12].
Interestingly, an analogous situation, with simulta-
neous lack of dUTPase and UNG activities, arises in
Drosophila larvae under physiological conditions. On
the one hand, the ung gene coding for the major UDG
enzyme is not present in the Drosophila genome [13].
On the other hand, it has been shown that the
dUTPase level is under the limit of detection in larval
tissues, and that the enzyme is present exclusively in
the imaginal disks [14]. Simultaneous lack of UNG
and dUTPase may lead to accumulation of uracil-
substituted DNA in fruitfly larval tissues. A specific
protein termed uracil-DNA-degrading factor (UDE),
which recognizes and degrades uracil-DNA, was
identified in Drosophila late larvae and pupae,
strengthening the hypothesis that Drosophila melanog-
aster may use uracil-DNA as a signal to switch on
metamorphosis-related cell death [15–17].
UDE is the first member of a new protein family
whose members recognize uracil-DNA. It has no
glycosylase activity, and its sequence does not show
any appreciable similarity to those of other nucleases
or uracil-DNA-recognizing proteins [15] (Fig. 1). Sig-
nificantly similar protein sequences were found only in
translated genomes of other pupating insects, but no
structural or functional data have been published on
any of these putative proteins. In all of these sequences
of homologous proteins, four distinct conserved
sequence motifs could be identified (motifs 1–4), the
first of which is substantially longer and is usually
present in two copies (motifs 1A and 1B). Comparison
of these motifs with motifs in UDGs does not offer any
clue regarding the structure and function of UDE, as no
apparent similarity could be observed (Fig. 1B–E) [18].
Investigation of this protein may therefore offer new
insights into the physiological role and catalytic
mechanism of nucleases.
To this end, in the present study we probed the domain organization of UDE from D. melanogaster, expressed as a recombinant protein (DmrcUDE), by limited proteolysis, and revealed that a specific truncated fragment lacking the N-terminus may fold into a stable conformation. Interestingly, we also identified such a truncated physiologically occurring UDE isoform from the pupating insect Tribolium castaneum (TcUDE) [19]. The TcUDE isoform lacks one copy of the N-terminal duplicated first motif. We generated the respective segment from DmrcUDE by chemical cleavage with hydroxylamine, and found that this truncated segment retains catalytic specificity and activity. The structural results therefore offer an explanation for the physiological existence of the truncated isoform.
De novo modeling was performed using ROSETTA, and a 3D structural model was constructed for the tandemly duplicated N-terminal motifs 1A and 1B. The model suggests that both motifs comprise similar three-helical bundles, with the same topology and relative orientation of α-helices. A high content of helical secondary structure in UDE was also independently confirmed by CD. The predictions, together with the domain organization studies, offer a model of DNA binding to an extended surface on the protein along the conserved motifs.
Results
Identification of a physiologically occurring
truncated isoform of UDE
BLAST searches indicated that UDE has detectable
homologs only in pupating insects (Fig. 1A). The mul-
tiple sequence alignment shows four conserved motifs
(Fig. 1A,E). The first extended UDE motif is present
in two highly similar copies. The UDE homolog from
T. castaneum contains only one copy of motif 1, sug-
gesting that lack of the first motif may still result in a
functional protein (Fig. 1A).
Protein function preserved in a truncated isoform M. Pukáncsik et al.
[Fig. 1 image not reproduced. Panel A: sequence alignment. Panel B: UDG families MUG/TDG, AUDG, UDGX, SMUG, UNG and DRUDG, sharing a common α/β fold with motifs 1, 2 and 3. Panel C: UDE motif organization 1A, 1B, 2, 3, 4. Panels D and E: consensus sequences, listed below.]
UDG families   Motif 1      F/Y   Motif 2   Motif 3
UNG            VhhLGQDPYH   F     hFhhWG    hhcppHPSP
MUG/TDG        hhhxGINPGL   F/Y   hhhFxG    haVhPppSh
UDGX           hhhhGShPxx   Y     hhhxpG    hhxLPSTSx
AUDG           hhhhGExPGx   F     hhhxhG    hhhxaHPSh
DRUDG          xLxLLExPGP   f     VVhxLG    xhxxxHPSh
SMUG           hhFhGMNPGP   F     hhVhVG    VxxLxHPSP
UDE motifs
1A/1B          GFKDxxxAxxTLxxLxxRDxpYpxxxhxGLhxxAKRVLxxTKxExKhxxIKxAhxxhEpaL
2              GpYKcLRp
3              TWDIxRN
4              KxFpxcxxxPTxxHLxxIxWAYSxpxxKhK
Fig. 1. Sequence alignment of UDE homologs in D. melanogaster and T. castaneum, and conserved motifs in UDE and members of the UDG superfamily. (A) Alignment of D. melanogaster and T. castaneum UDE homologs. Gray background: conserved motifs. Red letters: strictly conserved residues. (B) Evolutionary relationship and organization of conserved motifs among UDG proteins [18]. Gray background: uracil-DNA-recognizing proteins present in D. melanogaster. (C) Organization of conserved motifs in UDE. (D, E) Consensus sequences of UDG (D) and UDE (E) motifs. Upper-case letters: conserved residues. Lower-case letters: residues with conserved characteristics (h, hydrophobic; a, aromatic; p, polar/charged). Nonconserved positions are indicated by x. A conserved F/Y residue, overlapping with the uracil ring, is invariably present C-terminal to motif 1 in UDGs. Underlined Asp/Glu residues in UDG motif 1 are involved in catalysis; the underlined His in UDG motif 3 is suggested to stabilize reaction intermediates. Note the lack of detectable similarities between UDE and UDG motifs.
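The consensus notation of Fig. 1 (upper-case for conserved residues, h/a/p for residue classes, x for any residue) maps naturally onto a regular expression. The following is a minimal sketch of such a translation, not the procedure used by the authors. The residue sets assigned to h, a and p are assumptions of this sketch, the test sequence is invented, and symbols that the legend does not define (such as the c in motif 2) are deliberately rejected rather than guessed.

```python
import re

# Residue classes for the lower-case legend symbols.
# The exact membership of each class is an assumption of this sketch.
CLASSES = {
    "h": "AVLIMFWC",   # hydrophobic (assumed set)
    "a": "FWYH",       # aromatic (assumed set)
    "p": "STNQDEKRH",  # polar/charged (assumed set)
}


def consensus_to_regex(consensus: str) -> "re.Pattern[str]":
    """Translate a Fig. 1-style consensus string into a compiled regex."""
    parts = []
    for symbol in consensus:
        if symbol.isupper():
            parts.append(symbol)            # strictly conserved residue
        elif symbol == "x":
            parts.append(".")               # nonconserved position
        elif symbol in CLASSES:
            parts.append("[" + CLASSES[symbol] + "]")
        else:
            raise ValueError(f"undefined consensus symbol: {symbol!r}")
    return re.compile("".join(parts))


def find_motif(consensus: str, sequence: str) -> list:
    """Return 1-based start positions of non-overlapping matches."""
    pattern = consensus_to_regex(consensus)
    return [match.start() + 1 for match in pattern.finditer(sequence)]


if __name__ == "__main__":
    toy_sequence = "MKTWDIARNLLTWDIGRN"    # invented, not a UDE sequence
    print(find_motif("TWDIxRN", toy_sequence))
```

Run against a real UDE homolog, the same function would report where motif 3 (TWDIxRN) occurs. Stricter or looser residue classes can be swapped in by editing `CLASSES`.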
To confirm the in silico prediction of the UDE-like protein product in Tribolium, extracts of the insect larvae were investigated by western blot, using the polyclonal antiserum produced against DmrcUDE. As expected from the high sequence similarity, the antiserum recognized the Tribolium protein (TcUDE) as well (Fig. 2). The blot clearly indicates that larval extract from T. castaneum contains a single protein that reacts with the UDE-specific antibody. This positive band is found at a position corresponding to a much lower molecular mass than that of DmrcUDE and that of the physiological form of D. melanogaster uracil-DNA-degrading factor (DmUDE). The altered position of TcUDE was in agreement with the genomic data (Fig. 1A), and led to the conclusion that the physiologically occurring TcUDE lacks the N-terminal segment. These results suggest that an isoform of UDE lacking motif 1A may fold on its own, and may form a functional protein.
Domain organization studies using limited
proteolysis
To delineate the domain organization of the UDE pro-
tein more precisely, limited proteolysis experiments were
performed. Three proteases with different specificities
were used. Experiments were conducted with DmrcUDE
alone, and also in the presence of added DNA to study
potential DNA-binding protein segments.
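Before interpreting a limited-proteolysis pattern, it helps to list where each protease could cut at all. The sketch below enumerates potential sites under simplified textbook rules: trypsin after Lys/Arg, high-specificity chymotrypsin after Phe/Tyr/Trp, and Asp-N before Asp. These rules ignore known exceptions (for example, a following proline), the sequence is an invented toy example rather than UDE, and the function name is hypothetical.

```python
# Simplified specificity rules; exceptions such as Pro at P1' are ignored.
RULES = {
    "trypsin": ("KR", "after"),        # cleaves C-terminal to Lys/Arg
    "chymotrypsin": ("FYW", "after"),  # cleaves C-terminal to Phe/Tyr/Trp
    "asp_n": ("D", "before"),          # cleaves N-terminal to Asp
}


def potential_sites(sequence: str, protease: str) -> list:
    """Return peptide bonds as (P1, P1') pairs of 1-based residue numbers."""
    residues, side = RULES[protease]
    bonds = []
    for position, amino_acid in enumerate(sequence, start=1):
        if amino_acid not in residues:
            continue
        if side == "after" and position < len(sequence):
            bonds.append((position, position + 1))
        elif side == "before" and position > 1:
            bonds.append((position - 1, position))
    return bonds


if __name__ == "__main__":
    toy_sequence = "MKWDAYRDE"  # invented example
    for name in RULES:
        print(name, potential_sites(toy_sequence, name))
```

Comparing such a list of potential sites with the bonds actually cleaved is what distinguishes exposed segments from folded or DNA-protected ones.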
Trypsin was selected first, as the UDE protein con-
tains many potential tryptic cleavage sites (i.e. Lys and
Arg residues) scattered throughout the sequence. Fig-
ure 3A indicates fast initial fragmentation leading to
loss of 5–7 kDa fragments from either the N-terminus
or the C-terminus, or both. This initial fragmentation
is not affected by the presence of DNA. Flexibility of
the N-terminal segment (residues 1–47) is also sug-
gested by the drastic overrepresentation of basic resi-
dues, leading to an extremely high pI (11.5) for this
segment. At later stages of proteolysis, DNA protec-
tion is evident, as a specific fragment persists stably in
the presence of DNA, whereas this fragment is rapidly
degraded in the absence of DNA. Several smaller frag-
ments are produced in relatively large amounts in the
absence of DNA, whereas these peptides are practically absent in the presence of DNA. The data suggest the presence of an inner folded core that probably participates in DNA binding, on the basis of DNA-binding-induced stabilization. The large number of potential tryptic cleavage sites prevented straightforward identification, by MS, of the fragments observed on SDS/PAGE.
For further characterization and localization of pro-
tein segments involved in DNA binding to UDE, two
additional sets of experiments were conducted, using
highly specific chymotrypsin [20] and Asp-N endopro-
teinase. These enzymes have considerably fewer poten-
tial cleavage sites in the protein. In both cases,
protection by DNA is again evident (Fig. 3B,C).
Figure 3B shows that, in the absence of DNA, initial
chymotryptic cleavage removes a segment of about
9.6 kDa from UDE, whereas in the presence of DNA,
the removed peptide is much smaller, around 3 kDa.
MS analysis of the initially cleaved fragments revealed
that the C-terminus remained intact, and the two pep-
tide bonds most sensitive to chymotrypsin could there-
fore be localized at the N-terminus at Trp10 and
Tyr69 in the presence and in the absence of DNA,
respectively (Fig. 3D). DNA binding is therefore asso-
ciated with significant protection at the Tyr69-Arg70
peptide bond located within the conserved motif 1A.
In addition, DNA-binding-induced conformational
changes are also reflected at the Phe104-Glu105 and
Tyr311-Ile312 peptide bonds, which become exposed in
the presence of DNA (Fig. 3D).

[Fig. 2 image not reproduced. Blot lanes: DmUDE, DmrcUDE and TcUDE, with 55 kDa and 36 kDa markers. Motif schemes: DmUDE (1A, 1B, 2, 3, 4) and TcUDE (1, 2, 3, 4).]
Fig. 2. Immunodetection of UDE homolog from T. castaneum. Western blot indicates that polyclonal anti-DmUDE serum recognizes the UDE homolog from T. castaneum that appeared at a lower position than physiological DmUDE or DmrcUDE. Lane 1: D. melanogaster larval extract. Lane 2: purified DmrcUDE. Lane 3: T. castaneum larval extract.
To characterize the potential involvement of the C-terminal region of UDE in DNA binding, Asp-N endoproteinase was also used for limited proteolysis, as the C-terminus of the protein is rather rich in Asp residues (Fig. 3C,D). When it is digested by Asp-N endoproteinase, the primary cleavage removes a short fragment of about 3.4 kDa, independently of the presence of DNA. This loss is in good agreement with a C-terminal cleavage (at Asp333) leading to the loss of 2.6 kDa; cleavage at the first N-terminal Asp (Asp44) would remove a peptide of 6.6 kDa, which is much larger than estimated from the gel electrophoretic analysis. It is evident that, in the absence of DNA, additional cleavages can also occur, yielding 23–25 and 17 kDa polypeptides, as observed on SDS/PAGE. Binding of DNA induces significant protection against all of these cleavages, except at the Asp333 site, which shows the same highly exposed character for Asp-N endoproteinase digestion in the presence and in the absence of DNA.
The results of proteolytic experiments are summarized in Fig. 3D. The segment encompassing motifs 2–4 is evidently a well-folded part of the protein that lacks exposed proteolytic sites even in the absence of DNA [despite the presence of numerous potential tryptic, chymotryptic and Asp-N sites (Figs 1A and 3D)]. Motifs 1A and 1B, on the other hand, are significantly more prone to proteolysis, especially in the absence of DNA. DNA binding provides significant protection against proteolytic cleavage along motifs 1A and 1B, indicating either DNA-binding-induced conformational changes or covering of otherwise exposed proteolytic sites by DNA binding to these segments.
[Fig. 3 image not reproduced. Panels: (A) trypsin digestion, (B) chymotrypsin digestion and (C) Asp-N proteinase digestion, each run without (w/o) and with (w/) U-DNA and sampled at time points between 0′ and 300′, with molecular mass markers (MM). Panel D: motif scheme (His tag, 1A, 1B, 2, 3, 4) with marked residues W10, Y69, F104, W107, F136, Y156, F194, F198 and Y311 (chymotryptic), D44, D66, D126, D179, D193 and D333 (Asp-N), and N111 (hydroxylamine).]
Fig. 3. Initial domain analysis of DmUDE by limited proteolysis. (A) Tryptic digestion pattern. Arrows indicate fragments that are preferentially produced in the absence of DNA; the star shows the detected position of stable fragment persisting in the presence of DNA. (B, C) Limited digestion patterns obtained using high-specificity chymotrypsin and Asp-N endoproteinase. The timescale of limited digestion and the presence or absence of added ligand are indicated at the top of the gel. MM, molecular markers. (D) Summary of cleavage sites identified by MS. Top row: chymotryptic sites. Bottom row: Asp-N sites. Solid arrows indicate cleavage sites that are similarly observable in both the presence and the absence of DNA. Dashed arrows indicate sites protected in the presence of DNA. Dotted arrows indicate cleavage sites detected only in the presence of DNA. The cleavage site of hydroxylamine is marked with a bold arrow.
Motif 1A is dispensable for UDE function
To produce a specific truncated DmUDE isoform mimicking the physiologically occurring protein in T. castaneum, we selected a chemical agent, hydroxylamine, that cleaves peptide bonds exclusively between Asn and Gly [21]. There is only one such peptide bond in DmUDE, at Asn111-Gly112 (Fig. 3D), located between motifs 1A and 1B. Figure 4A shows that, in agreement with the previously determined exposed character of the linker segment between motifs 1A and 1B, hydroxylamine cleaved the protein into an N-terminal Met1–Asn111 and a C-terminal Gly112–Glu355 fragment, as verified by MS. The molecular masses of the cleavage products are 14 and 28 kDa as calculated from the sequence, whereas values of 16 and 32 kDa were estimated from the SDS/PAGE gels. The C-terminal fragment closely corresponds to the physiological TcUDE isoform. The presence of the N-terminal His-tag on DmrcUDE allowed straightforward separation of N-terminal and C-terminal hydroxylamine-cleaved segments by Ni²⁺–nitrilotriacetic acid chromatography (Fig. 4A).
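The selection of hydroxylamine rests on counting Asn-Gly bonds in the sequence. A minimal sketch of that check is given below. It simply scans for "NG" dipeptides and splits the chain after each Asn. The sequence used is an invented toy example, not DmUDE, and the function name is hypothetical.

```python
def hydroxylamine_fragments(sequence: str) -> list:
    """Split a protein sequence at every Asn-Gly bond.

    Returns (first_residue, last_residue, fragment) tuples with 1-based
    residue numbers; the cut falls between Asn and Gly.
    """
    # 1-based position of each Asn that is directly followed by Gly.
    cuts = [i + 1 for i in range(len(sequence) - 1) if sequence[i:i + 2] == "NG"]
    fragments = []
    start = 0
    for cut in cuts:
        fragments.append((start + 1, cut, sequence[start:cut]))
        start = cut
    fragments.append((start + 1, len(sequence), sequence[start:]))
    return fragments


if __name__ == "__main__":
    toy_sequence = "MKANGTE"  # invented example with a single Asn-Gly bond
    for first, last, fragment in hydroxylamine_fragments(toy_sequence):
        print(first, last, fragment)
```

A sequence with exactly one "NG" yields two fragments, which is the situation described for DmUDE (Met1–Asn111 and Gly112–Glu355).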
To check whether the removal of motif 1A alters the
specific function of the protein, we performed catalytic
assays and electrophoretic mobility shift assays with
the purified Gly112–Glu355 C-terminal fragment. Fig-
ure 4B shows that the C-terminal segment preserves
catalytic activity and specificity for uracil-substituted
DNA that do not depend on the presence or absence
of available divalent metal ions. The gel shift indicates
the DNA-binding capability of the C-terminal frag-
ment, and also demonstrates the specific DNA-cleaving
activity (Fig. 4C).
[Fig. 4 image not reproduced. Panel labels: scheme of His-tagged DmrcUDE (motifs 1A, 1B, 2, 3, 4) with the N111 site, giving the N-terminal M1–N111 and C-terminal G112–E355 fragments; gel lanes Intact, HA digested and Purified C-term; mobility shift lanes 0, 50 and 100; U-DNA and control DNA time courses (0′ to 90′); ds and ss U-oligo time courses (0′ to 120′) for G112–E355 DmrcUDE and full-length DmrcUDE, with a 31-mer standard.]
Fig. 4. (A) Production and characterization of the truncated UDE isoform. Cleavage with hydroxylamine (HA) generates the expected fragments. In the schematic representation, the single cleavage site at Asn111 between the 1A and 1B motifs is marked with an arrow. Gel images show gel electrophoretic analysis of hydroxylamine cleavage and purification of the C-terminal motif to homogeneity. (B) Electrophoretic mobility shift assay. The concentration of DmrcUDE Gly112–Glu355 segment used in the experiment is given at the top of the lanes (µg·mL⁻¹). Uracil-DNA plasmid, 20 µg·mL⁻¹, was used in all mixtures. (C–E) Truncated UDE lacking motif 1A retains uracil-DNA-degrading activity. (C) Uracil-DNA or control DNA linearized plasmid was incubated for the indicated time periods with truncated DmrcUDE (Gly112–Glu355 segment). Note degradation (as well as shift) of the uracil-containing DNA plasmid substrate. (D, E) Activities of full-length UDE and Gly112–Glu355 truncated DmrcUDE constructs were compared using uracil-containing fluorescently labeled synthetic double-stranded (ds) and single-stranded (ss) oligonucleotide substrates (incubation times are indicated). Note the specific degradation product very close to the 31mer standard position, indicating that cleavage of the oligonucleotide only occurred at the uracil-containing position. The catalytic activity of the truncated enzyme is still present, but is detectable only on single-stranded substrate.
To clearly identify the cleavage site of the UDE pro-
tein and its truncated form on uracil-containing DNA
substrate, we performed cleavage experiments using
synthetic 60mer single-stranded and double-stranded
oligonucleotides, containing one single uracil moiety in
one of the strands, at the 32nd position. The uracil-
containing strand was labeled with a fluorescent dye to
aid visualization of the reaction (Fig. 4D,E).
Quaternary protein structure of full-length and
truncated proteins
To determine whether the absence of the N-terminus has any effect on the quaternary structure organization of UDE, the native molecular masses for the full-length protein and the C-terminal fragment were determined by analytical gel filtration. The full-length protein eluted at a position corresponding to 52 kDa, which is somewhat larger than the calculated molecular mass of the full-length monomer (41.446 kDa). This difference may indicate partial dimerization in rapid equilibrium, and/or the anomalous gel permeation behavior may indicate that the proteins contain significant amounts of natively unfolded, highly flexible segments. To check the latter suggestion, we performed an in silico analysis using several servers for sequence-based prediction of structural disorder [22–24]. The results are shown in Fig. 5; the different predictors agree in suggesting considerably high flexibility at the N-terminus and C-terminus, as well as in the region between motifs 1A and 1B. Interestingly, the C-terminal Gly112–Glu355 fragment eluted from the gel filtration column at practically the same position as observed for the full-length UDE, corresponding to 52 kDa. As the calculated molecular mass of the monomeric Gly112–Glu355 fragment is 28 kDa, the elution profile strongly suggests that this fragment forms a dimer.
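Agreement between several disorder predictors can be made explicit by keeping only those residues that every predictor flags. The sketch below shows one way to do this, assuming per-residue scores between 0 and 1 and a cutoff of 0.5. Both the cutoff and the minimum segment length are assumptions of this sketch, the score lists are invented, and nothing here reproduces the output format of RONN, IUPred or DISOPRED.

```python
def consensus_disorder(profiles: dict, threshold: float = 0.5,
                       min_length: int = 3) -> list:
    """Return (first, last) 1-based segments flagged by all predictors."""
    lengths = {len(scores) for scores in profiles.values()}
    if len(lengths) != 1:
        raise ValueError("all profiles must cover the same number of residues")
    length = lengths.pop()

    # A residue counts as disordered only if every predictor agrees.
    flagged = [
        all(scores[i] >= threshold for scores in profiles.values())
        for i in range(length)
    ]

    segments = []
    start = None
    for i, is_disordered in enumerate(flagged + [False]):  # sentinel closes last run
        if is_disordered and start is None:
            start = i
        elif not is_disordered and start is not None:
            if i - start >= min_length:
                segments.append((start + 1, i))
            start = None
    return segments


if __name__ == "__main__":
    toy_profiles = {  # invented scores for a 10-residue toy protein
        "predictor_a": [0.9, 0.8, 0.7, 0.2, 0.1, 0.1, 0.6, 0.7, 0.9, 0.9],
        "predictor_b": [0.8, 0.9, 0.6, 0.3, 0.2, 0.1, 0.4, 0.8, 0.8, 0.7],
        "predictor_c": [0.7, 0.7, 0.9, 0.1, 0.1, 0.2, 0.7, 0.6, 0.9, 0.8],
    }
    print(consensus_disorder(toy_profiles))
```

Requiring unanimity is the strictest possible consensus; a majority vote would be a reasonable alternative when predictors disagree at segment borders.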
Analytical ultracentrifugation was also applied to
corroborate the results from the gel filtration studies.
The sedimentation equilibrium technique is reported to
be optimal for determining native molecular masses
[25]. In fact, our results with full-length DmrcUDE
indicate that the determined molecular mass was
42.8 ± 2 kDa, in very close agreement with the mass
calculated from the amino acid sequence (Fig. 6). For
the truncated Gly112–Glu355 construct, the deter-
mined native molecular mass was 49 ± 1.2 kDa, cor-
responding rather closely to a dimer of the truncated
segment (for which the calculated masses are 28 kDa
for the monomer and 56 kDa for the dimer). These
results, in agreement with the gel filtration data, argue
for a native monomer of the full-length protein and a
native dimer for the truncated construct.
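The oligomer assignments above follow from dividing each measured native mass by the calculated monomer mass. The short sketch below repeats that arithmetic with the values quoted in the text. The function and dictionary names are hypothetical.

```python
def apparent_stoichiometry(measured_kda: float, monomer_kda: float) -> float:
    """Ratio of measured native mass to calculated monomer mass."""
    if monomer_kda <= 0:
        raise ValueError("monomer mass must be positive")
    return measured_kda / monomer_kda


# Calculated monomer masses (kDa), as quoted in the text.
MONOMER_KDA = {"full-length": 41.446, "Gly112-Glu355": 28.0}

# Measured native masses (kDa), as quoted in the text.
MEASURED_KDA = {
    ("full-length", "gel filtration"): 52.0,
    ("Gly112-Glu355", "gel filtration"): 52.0,
    ("full-length", "sedimentation equilibrium"): 42.8,
    ("Gly112-Glu355", "sedimentation equilibrium"): 49.0,
}

if __name__ == "__main__":
    for (construct, method), mass in MEASURED_KDA.items():
        ratio = apparent_stoichiometry(mass, MONOMER_KDA[construct])
        print(f"{construct}, {method}: {ratio:.2f} monomer equivalents")
```

Ratios close to 1 point to a monomer and ratios approaching 2 to a dimer; intermediate values, such as the gel filtration result for the full-length protein, are the ones that need the additional explanations discussed in the text.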
[Fig. 5 plot not reproduced. x-axis: number of amino acids (0 to 350); y-axis: disorder probability; traces: RONN, IUPred and DISOPRED.]
Fig. 5. Disorder profile of DmrcUDE. The plot shows sequence position against probability of disorder. Segments of the sequence at the N-terminus and C-terminus and between motif 1A and motif 1B were classified as disordered by three predictor programs (IUPRED, RONN, and DISOPRED).
[Fig. 6 plots not reproduced. Axes: A280 and residuals against radius (cm); inset: C (s) against sedimentation coefficient (S).]
Fig. 6. Determination of UDE oligomer status by analytical ultracentrifugation. Top panel: sedimentation equilibrium gradients of 0.53 mg·mL⁻¹ full-length DmrcUDE (○) and 0.8 mg·mL⁻¹ for DmrcUDE Gly112–Glu355 (□) at 13 400 g as described in Experimental procedures. The solid line shows the fit of the experimental data to single ideal species. Bottom panel: residual distribution as a function of the sedimentation distance (this plot corresponds to the difference between the experimental data and the fitted data for each point). Inset panel: sedimentation coefficient distributions of full-length DmrcUDE (solid line) and DmrcUDE Gly112–Glu355 (dashed line).
Sedimentation velocity experiments revealed that full-length DmrcUDE has a main sedimenting species (82% of the loading concentration) with a standard sedimentation value of 2.6S ± 0.1S, which, together with the sedimentation equilibrium data, is compatible with a protein monomer whose hydrodynamic behavior deviates from that expected for a globular species [calculated frictional ratio (f/f0) = 1.6]; the rest of the protein sediments as faster oligomeric species. The truncated Gly112–Glu355 protein construct showed significant polydispersity, with main peaks at 2.5S, 4.3S, and 6.0S, representing approximately 70%, 20%, and 6%, respectively, of the loading concentration. The 2.5S peak is compatible with a globular protein monomer (f/f0 = 1.3). These data argue for potential monomer self-association into dimers and higher-order oligomers.
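A frictional ratio can be estimated from a molecular mass and a sedimentation coefficient by combining the Svedberg relation, f = M(1 − v̄ρ)/(N_A·s), with the Stokes friction of an anhydrous sphere of the same mass, f0 = 6πηR0 where R0 = (3Mv̄/4πN_A)^(1/3). The sketch below does this in CGS units. The partial specific volume (0.73 cm³/g) and the density and viscosity of water at 20 °C are assumptions of this sketch, not values reported in the paper, and the authors' own calculation may have used different parameters or a hydration term.

```python
import math

AVOGADRO = 6.02214076e23  # mol^-1


def frictional_ratio(mass_da: float, s_svedberg: float,
                     vbar: float = 0.73,    # cm^3/g, assumed typical protein value
                     rho: float = 0.9982,   # g/cm^3, water at 20 degrees C (assumed)
                     eta: float = 0.01002   # poise, water at 20 degrees C (assumed)
                     ) -> float:
    """Estimate f/f0 relative to an anhydrous sphere of equal mass."""
    s = s_svedberg * 1e-13                               # Svedberg -> seconds
    # Svedberg relation solved for the frictional coefficient (g/s).
    f = mass_da * (1.0 - vbar * rho) / (AVOGADRO * s)
    # Radius (cm) and Stokes friction (g/s) of the equivalent sphere.
    r0 = (3.0 * mass_da * vbar / (4.0 * math.pi * AVOGADRO)) ** (1.0 / 3.0)
    f0 = 6.0 * math.pi * eta * r0
    return f / f0


if __name__ == "__main__":
    # Masses and s-values as quoted in the text.
    print(f"full-length monomer: f/f0 = {frictional_ratio(41446.0, 2.6):.2f}")
    print(f"Gly112-Glu355 monomer: f/f0 = {frictional_ratio(28000.0, 2.5):.2f}")
```

With these assumed parameters the estimates come out at roughly 1.7 for the full-length monomer and roughly 1.3 for the truncated monomer, which is the same ordering as the ratios quoted above: the full-length protein behaves as a more extended particle than the truncated construct.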

Structure prediction of UDE reveals a
pseudosymmetrical arrangement of two α-helical
bundles
For structural prediction, the DmUDE full-length
sequence was submitted to the GeneSilico metaserver [26],
which is the gateway providing a unified interface
to several servers for secondary and tertiary
structure predictions. The analysis of predictions of
domain composition suggested that UDE contains an
N-terminal helical region of approximately 30 residues
and at least three structural domains corresponding to
motifs 1A and 1B, and the C-terminus, encompassing
motifs 2, 3 and 4. The C-terminus of 40–50 residues
and the loop connecting motifs 1A and 1B (between
residues 109 and 137) are predicted to be mostly dis-
ordered. All three domains are predicted to be mainly
helical, although the secondary structure predictions
for the third domain were uncertain, as there was no
agreement between alternative servers.
The fold recognition analysis did not reveal any confident matches with known protein structures, suggesting that the UDE 3D structure may exhibit a novel fold. Therefore, to predict at least partially the tertiary structure of UDE, we performed de novo modeling of the region encompassing motifs 1A and 1B, using the ROSETTA program [27]. In total, about 500 000 different models (also known as decoys) were generated, and 10% of the lowest-energy structures were clustered on the basis of their similarity. The representatives of the best clusters were refined with the ROSETTA full atom refinement protocol, and scored with the model quality assessment programs (MQAPs) ProQ [28] and MetaMQAP [29]. Evaluation of the largest clusters revealed that both motifs 1A and 1B comprise similar three-helical bundles, with the same topology and relative orientation of the helices. Nevertheless, the top clusters differed in relative orientation of the two helical bundles to each other (data not shown). Among these clusters, one single cluster contained the specific topology that exhibited pseudosymmetrical orientation of the two motifs. Importantly, members of this cluster exhibited low energy levels and were well scored by MQAPs (ProQ – predicted LGscore in the range 1.2–3.2, and MetaMQAP – predicted rmsd in the range 3–4.2 Å, for the five lowest-energy representatives of the cluster), which indicates a high probability that they resemble the currently unknown native structure.
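The decoy-filtering step can be pictured as two operations: keep the lowest-energy fraction of the models, then group the survivors by structural similarity. The sketch below is a generic illustration of that idea and is not the ROSETTA clustering procedure. It reads the text as selecting the lowest-energy 10% of decoys, which is an interpretation. The leader-style clustering, the distance function and all data are assumptions chosen to keep the example small.

```python
def lowest_energy_percent(decoys: list, percent: int = 10) -> list:
    """Return the lowest-energy `percent` of decoys, sorted by energy."""
    ranked = sorted(decoys, key=lambda decoy: decoy["energy"])
    keep = max(1, len(ranked) * percent // 100)
    return ranked[:keep]


def greedy_cluster(items: list, distance, radius: float) -> list:
    """Leader clustering: join the first cluster whose leader is within radius.

    `items` should be ordered by energy so that each cluster leader is its
    lowest-energy member. `distance` is any pairwise dissimilarity (for
    real models this would be a structural measure such as an RMSD).
    """
    clusters = []
    for item in items:
        for cluster in clusters:
            if distance(cluster[0], item) <= radius:
                cluster.append(item)
                break
        else:
            clusters.append([item])  # no leader close enough: start a new cluster
    return clusters


if __name__ == "__main__":
    # Invented decoys: 'shape' is a stand-in for a structural descriptor.
    toy = [{"id": i, "energy": float(i), "shape": float(i % 3)} for i in range(30)]
    best = lowest_energy_percent(toy, percent=10)
    groups = greedy_cluster(best, lambda a, b: abs(a["shape"] - b["shape"]), 0.5)
    print([[decoy["id"] for decoy in group] for group in groups])
```

For 500 000 decoys the first function would pass 50 000 models to the clustering step. A production workflow would replace the toy distance with a structural comparison and would likely need a more scalable clustering method.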
Figure 7 depicts the predicted model in several
different orientations. The two homologous motifs (1A
and 1B) form a four-helix bundle interaction surface
(Fig. 7A,B). On the surface of the model, a well-con-
served, positively charged surface is well defined. This
may serve as the nucleic acid-binding surface, in agree-
ment with the limited proteolysis data.
Estimation of secondary structural elements by
CD spectroscopy
To verify structural predictions, CD spectroscopy measurements were performed, as CD spectra in the far-UV wavelength (190–240 nm) range are very indicative of different secondary structural elements [30]. Spectra of the intact protein and of the C-terminal fragment Gly112–Glu355 showed double negative maxima at 208 and 222 nm, which are characteristic for the presence of α-helices (Fig. 8). Quantitative evaluation of the spectral data was performed with K2D and SELCON [24,31,32]. The estimated percentages of protein secondary structures from CD spectra reveal 37% α-helices and 18–26% β-structure.
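CD spectra of proteins are usually compared after conversion of the raw signal to mean residue ellipticity, [θ]MRE = θ·MRW/(10·l·c), where θ is the observed ellipticity in millidegrees, MRW = M/(N − 1) is the mean residue weight, l is the path length in cm and c is the concentration in mg·mL⁻¹. The sketch below applies this standard conversion. The mass and residue count are those quoted for the full-length construct, whereas the signal, concentration and path length are hypothetical values chosen only for illustration. Secondary-structure deconvolution as performed by K2D or SELCON is not attempted here.

```python
def mean_residue_ellipticity(theta_mdeg: float, mass_da: float, n_residues: int,
                             conc_mg_per_ml: float, path_cm: float) -> float:
    """Convert observed ellipticity (mdeg) to MRE in deg cm^2 dmol^-1."""
    if n_residues < 2:
        raise ValueError("need at least two residues (one peptide bond)")
    if conc_mg_per_ml <= 0 or path_cm <= 0:
        raise ValueError("concentration and path length must be positive")
    mean_residue_weight = mass_da / (n_residues - 1)   # Da per peptide bond
    return theta_mdeg * mean_residue_weight / (10.0 * path_cm * conc_mg_per_ml)


if __name__ == "__main__":
    # Hypothetical measurement: -10 mdeg at 222 nm, 0.2 mg/mL, 0.1 cm cuvette.
    mre = mean_residue_ellipticity(-10.0, 41446.0, 355, 0.2, 0.1)
    print(f"MRE at 222 nm: {mre:.0f} deg cm^2 dmol^-1")
```

Because the conversion normalizes by concentration, path length and chain length, spectra of the full-length protein and of the Gly112–Glu355 fragment can be overlaid on the same axis.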
Discussion
The potential signaling role of deoxyuridine moieties in
genomes of pupating insects was first suggested by
Deutsch et al. [16], on the basis of the lack of UDG
activity in these insects. The hypothesis stating that
uracil-DNA might be present transiently in larval
stages and that its degradation at the end of larval
stages may contribute to cell death during metamor-
phosis was much debated, owing to independent find-
ings from several laboratories showing the presence of
UDG activity in some developmental stages of Dro-
sophila [33–37]. This debate was resolved by the fully
annotated Drosophila genome, which clearly indicated
the lack of the major UDG gene ung but the presence
of several other genes that encode catalytically much
less efficient UDGs. The absence of dUTPase in larval
stages [14] and our recent discovery of the strictly regu-
lated UDE [15] reinforced the hypothesis on the possi-
ble role of uracil-DNA in Drosophila and suggested a
role for UDE in programmed cell death during meta-
morphosis. Functional analysis of UDE identified this
protein as a novel uracil-recognizing factor [15], with
no similarities to either UDGs [18] or the
ExoIII/Mth212 nuclease [38]. Multiple sequence alignments
of UDE homologs from all available pupating insect
genomes indicated the presence of conserved motifs in
most species, with the same distribution (Fig. 1).
The UDE homolog in T. castaneum lacks one copy
of the N-terminal duplicated first motif (Figs 1 and 2).
TcUDE showed reactivity with the antiserum produced
against DmrcUDE, suggesting that the truncated
TcUDE isoform is a well-folded UDE-like protein. It
was also observable on the blot that the physiological
forms of the proteins from both Drosophila and Tribo-
lium extracts were detected at much higher electro-
phoretic positions than expected from the calculated
molecular mass values: molecular masses estimated
[Figure 8 plot: x-axis, wavelength (nm), 200–260; y-axis, Θ MRE (deg·cm²·dmol⁻¹), –10 000 to 6000; traces, intact DmrcUDE and G112–E355 DmrcUDE.]
Fig. 8. CD spectra of intact UDE (solid line) and C-terminal fragment
(dashed line) confirm the presence of α-helices. MRE, mean
residue molar ellipticity.
[Figure 7 panels: (A) cartoon model colored by motifs (motif 1A, motif 1B; blue and red mark protease cleavage sites); (B) cartoon model colored by sequence conservation; (C) surface model colored by sequence conservation; (D) surface model colored by electrostatic potential (scale –3 kT/e to +3 kT/e). Conservation key: orange, strictly conserved; yellow, conserved; green, variable.]
Fig. 7. Structural model of DmUDE duplication fragment. Structures are shown in two views: front (upper panel) and top (bottom panel). (A)
Cartoon representation. Duplicated motifs 1A and 1B are colored green and orange, respectively, and the nonconserved linker is colored
gray. Peptide bonds protected from proteolytic cleavage on DNA binding are colored blue. The peptide bond between residues 104 and 105,
cleaved only on DNA binding, is colored red. Note that the duplicated fragments are only approximately symmetrical, as the model is of low
resolution and the local conformation of the backbone is uncertain. (B, C) Sequence conservation mapped onto the ribbon diagram (B) or the
molecular surface (C) (conserved residues are colored orange and yellow; variable residues are colored green). (D) Electrostatic potential
mapped onto the molecular surface (positively and negatively charged regions are colored blue and red, respectively). Arrows indicate the
positively charged conserved patches that may accommodate DNA.
[Figure 10 flowchart. Group headings: theoretical analysis; experimental analysis. Box labels: UDE sequence; BLAST; multiple sequence alignment; T. castaneum predicted protein product; western blotting; secondary structure prediction; modeling; identification of conserved surface patches; mapping of electrostatic potential; prediction of DNA-binding site; limited proteolysis (trypsin, Asp-N endoproteinase, chymotrypsin); peptide identification by MS; hydroxylamine cleavage; analysis of C-terminal fragment; DNA binding by EMSA; DNA cleavage assay; quaternary structure by gel filtration; analytical ultracentrifugation; circular dichroism; domain organization.]
Fig. 10. Flowchart scheme of bioinformatics and experimental approaches.
[Figure 9 panel labels: (A) DmUDE with motif 1A, motif 1B and motifs 2,3,4; (B) TcUDE with motif 1 and motifs 2,3,4 in each subunit; (C) sequence alignment.]
Fig. 9. Structural models of DmUDE pseudodimer (A) and TcUDE dimer (B).
Structures are shown in cartoon representation and colored by motif
(motif 1A in DmUDE and motif 1 in TcUDE, dark red; motif 1B,
dark gray; nonconserved segments, light gray). Residues 1–11 of TcUDE
are not shown (the conformation of this fragment is very uncertain).
C-terminal parts corresponding to motifs 2, 3 and 4 are shown
schematically only. (C) Alignment between motif 1 residues for DmUDE
and TcUDE. Identical and conserved residues are colored red and
green, respectively. The helical prediction is indicated. Note the
numerous conserved hydrophobic and polar residues that may
form the dimerization surface.
from the electrophoretic experiment are 53.3 and
41.7 kDa for DmUDE and TcUDE homologs, respectively,
whereas the sequence-based theoretical masses
are 39.9 and 28 kDa. Recombinant DmrcUDE,
expressed in E. coli, did not show such a large deviation
from its expected position (the calculated molecular
mass is 41.446 kDa, and the electrophoretic
estimation is 44.2 kDa). The large shift of the physiological
samples to higher apparent molecular mass
positions on SDS/PAGE may be indicative of some
post-translational modification, the identification of
which is in progress.
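The size of the electrophoretic shift can be made explicit with a few lines of arithmetic. The sketch below only restates the masses quoted in the paragraph above; the sample labels and the helper name `mass_shift` are illustrative and not part of the original work.

```python
# Apparent (SDS/PAGE) and sequence-based masses in kDa, as quoted in the text.
samples = {
    "DmUDE (larval extract)": (53.3, 39.9),
    "TcUDE (larval extract)": (41.7, 28.0),
    "DmrcUDE (recombinant)": (44.2, 41.446),
}


def mass_shift(apparent_kda, theoretical_kda):
    """Return (absolute shift in kDa, shift relative to the calculated mass in %)."""
    delta = apparent_kda - theoretical_kda
    return delta, 100.0 * delta / theoretical_kda


shifts = {name: mass_shift(*pair) for name, pair in samples.items()}

for name, (delta, percent) in shifts.items():
    print(f"{name}: +{delta:.1f} kDa ({percent:.0f}% above the calculated mass)")
```

Under these numbers both physiological samples run roughly 13–14 kDa above their calculated masses, whereas the recombinant protein deviates by less than 3 kDa.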
Limited proteolysis experiments indicated that DNA
binding may occur along the conserved motifs 1A and
1B, as binding induces significant protection against
multiple proteases at peptide bonds scattered throughout
these segments (Fig. 3). Secondary structure prediction
revealed that the duplicated fragments at the
N-terminal end of UDE are mainly helical, and this
observation has also been confirmed by CD measure-
ments (Fig. 8). According to the predicted structural
model of the fragment encompassing motifs 1A and 1B,
the duplicated motifs together form a conserved helical
bundle (Fig. 7A,B). The pseudodimer contains a large
conserved segment on the surface, composed of two
positively charged patches separated by a small region of
negative potential (Fig. 7C,D, arrows). These patches
may correspond to the DNA-binding site and/or catalytic
site. However, on the basis of this predicted model,
the DNA-binding mode and catalytic site residues
cannot be confidently predicted. Interestingly, many
protease cleavage sites that are protected after DNA
binding are located on the opposite, nonconserved and
negatively charged, side of the structure (Fig. 7A).
Therefore, these sites are not likely to be directly steri-
cally protected from cleavage by bound DNA. Instead,
they are probably localized in a region that is partially
flexible or disordered in the absence of the DNA.
The primary structure of TcUDE implied that the
lack of one copy of motif 1 does not necessarily perturb
the formation of a functional protein. This
hypothesis was confirmed by producing the respective
truncated isoform by chemical cleavage of
DmrcUDE (Fig. 4). We therefore conclude that the
physiological form of TcUDE could have the same
unique function and the same putative physiological
role. Native molecular mass estimation by gel filtration
and analytical ultracentrifugation indicated that the
truncated DmrcUDE, representing TcUDE, forms a
homodimer (Fig. 6). A structural model for this
homodimer is shown in Fig. 9. The two homologous
motifs, 1A and 1B, in the DmrcUDE monomer may
lead to the formation of a partial pseudodimer, and
this interaction pattern may be preserved in the suggested
TcUDE dimer (Fig. 9). The predicted model
could not provide direct assessment of the pattern of
interaction between the two copies of motif 1 (1A/1B).
Nevertheless, numerous hydrophobic interactions are
very probable between the conserved apolar residues,
and hydrogen bonds can also easily form between the
polar side chains within the three-helical bundles. In
the dimerized modules, the nucleic acid-binding surface
can also be formed in a manner very similar to that in
the full-length UDE proteins.
Conclusions
The significance of UDE is two-fold: (a) it may be
developed into a versatile molecular biotechnological
tool [39]; and (b) its targeting may yield species-specific
insecticides to be used against, for example, malaria
mosquitoes. Here, we employed a multidisciplinary set
of theoretical and experimental approaches (schematically
described in Fig. 10) to reveal structural and functional
characteristics of UDE. The present data provide insights
into the domain structure and nucleic acid-binding site
of this novel DNA-degrading protein in the context of
sequence motifs that have previously not been
described in nucleases or uracil-recognition proteins.
Experimental procedures
DNA cloning and recombinant protein expression
Recombinant His-tagged UDE corresponding to the
DmUDE (Q961C4) sequence (DmrcUDE) was expressed
and purified as described previously [15]. The truncated
Gly112–Glu355 construct of DmrcUDE, corresponding to
the T. castaneum UDE homolog sequence, was generated
from pET–HisUDE with the following primers: 5′-GAG
ATA TAC ATA TGG GCG GAG GGG CGT CCA GCA
AG-3′ and 5′-AAG CTT GAG CTC GAG CTC CTC CCT
CTT CTT CTT CC-3′. The DNA fragment was cloned into
pET22b (Novagen, Merck, Darmstadt, Germany), using
NdeI and XhoI sites. The recombinant construct included a
His₆ tag and a linker segment at the C-terminus.
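As a quick consistency check, the two primers can be scanned for the recognition sequences of the cloning enzymes. The following is a minimal sketch: the primer strings are copied from the text with spaces removed, the recognition sequences (NdeI, CATATG; XhoI, CTCGAG) are the standard ones, and the function name `site_positions` is an illustrative choice.

```python
# Primer sequences as printed in the text (5'->3', spaces removed).
forward = "GAG ATA TAC ATA TGG GCG GAG GGG CGT CCA GCA AG".replace(" ", "")
reverse = "AAG CTT GAG CTC GAG CTC CTC CCT CTT CTT CTT CC".replace(" ", "")

# Recognition sequences of the two enzymes named in the text.
SITES = {"NdeI": "CATATG", "XhoI": "CTCGAG"}


def site_positions(seq, site):
    """Return the 0-based start positions of every occurrence of `site` in `seq`."""
    width = len(site)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == site]


nde_hits = site_positions(forward, SITES["NdeI"])
xho_hits = site_positions(reverse, SITES["XhoI"])

# The ATG embedded in the NdeI site is followed directly by a glycine codon.
start_codon = forward[nde_hits[0] + 3:nde_hits[0] + 6]
next_codon = forward[nde_hits[0] + 6:nde_hits[0] + 9]

print("NdeI in forward primer at:", nde_hits)
print("XhoI in reverse primer at:", xho_hits)
print("codons after the site:", start_codon, next_codon)
```

Each primer carries exactly one copy of its site, and the ATG within the NdeI site is followed by GGC (Gly), in line with a construct that begins at Gly112.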
Western blotting
Western blotting was performed as described in [15], using
anti-DmUDE serum at 1 : 180 000 dilution as primary antibody,
and peroxidase-labeled secondary antibody. Extracts
from D. melanogaster and T. castaneum larvae were prepared
with the addition of protease inhibitor cocktail
(Sigma-Aldrich, Budapest, Hungary). The same amount of
total protein from each extract was loaded on SDS/PAGE
gels. This was confirmed by running the SDS/PAGE gels
in duplicate, and using one gel for Coomassie protein quantification
by laser densitometry, and the other gel for blotting.
Densitometry data indicated that the protein content
in the lanes differed by less than 4%. Blot results were
detected by enhanced chemiluminescence.
Limited proteolysis
DmrcUDE at 0.5 mg·mL⁻¹ was incubated with 50 ng·mL⁻¹
trypsin (Sigma-Aldrich), or 1.25 µg·mL⁻¹ chymotrypsin
(EMP Biotech GmbH, Berlin, Germany), or 2.5 µg·mL⁻¹
Asp-N endoprotease (Sigma-Aldrich), in the absence or in
the presence of 0.5 mg·mL⁻¹ uracil-DNA plasmid (prepared
in a dut⁻ ung⁻ K12 CJ236 E. coli strain [15]), in 20 mM Hepes
(pH 7.5) containing 150 mM KCl and 1 mM dithiothreitol,
or 50 mM Tris/HCl (pH 8.0) containing 1 mM CaCl₂, or
100 mM Tris/HCl (pH 8.5) containing 5 mM MgCl₂ (for
trypsin, chymotrypsin or Asp-N endoprotease digestions,
respectively). Reactions were run at room temperature and
terminated after different time intervals by the addition
of 1 mM phenylmethanesulfonyl fluoride (for trypsin and
chymotrypsin digestions), or by addition of 5 mM EDTA
and immediate freezing (for Asp-N endoprotease digestion).
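The concentrations above translate into very different substrate-to-protease mass ratios, which is what makes the digestion "limited". A minimal sketch of that arithmetic follows; the concentrations are those given in the text, while the dictionary layout and function name are illustrative.

```python
SUBSTRATE_MG_PER_ML = 0.5  # DmrcUDE concentration from the text

# Protease concentrations from the text, converted to mg/mL.
proteases_mg_per_ml = {
    "trypsin": 50e-6,         # 50 ng/mL
    "chymotrypsin": 1.25e-3,  # 1.25 ug/mL
    "Asp-N": 2.5e-3,          # 2.5 ug/mL
}


def substrate_per_protease(protease_mg_per_ml, substrate_mg_per_ml=SUBSTRATE_MG_PER_ML):
    """Mass of substrate present per unit mass of protease (w/w)."""
    return substrate_mg_per_ml / protease_mg_per_ml


ratios = {name: substrate_per_protease(conc) for name, conc in proteases_mg_per_ml.items()}

for name, ratio in ratios.items():
    print(f"{name}: 1:{ratio:.0f} (protease:substrate, w/w)")
```

With these inputs the protease:substrate ratios are about 1:10 000 for trypsin, 1:400 for chymotrypsin and 1:200 for Asp-N endoprotease.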
Hydroxylamine digestion
Hydroxylamine is a chemical reagent that specifically induces
peptide bond cleavage at the Asn–Gly bond [21]. There is
only one such site in UDE, between residues 111 and 112.
UDE at 2 mg·mL⁻¹ was incubated overnight with 2 M
hydroxylamine in 0.2 M NH₄HCO₃ at 37 °C. The two products
(N-terminal and C-terminal fragments) were dialyzed in
25 mM Hepes (pH 7.5) containing 150 mM KCl and 1 mM
protease inhibitor cocktail, and this was followed by separation
on Ni²⁺–nitrilotriacetic acid resin. The C-terminal
Gly112–Glu355 fragment was stored in 25 mM Hepes (pH
7.5) containing 150 mM KCl and protease inhibitor cocktail,
according to the manufacturer’s suggestion (Sigma-Aldrich).
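Locating Asn–Gly bonds in a sequence is a simple string search. The sketch below illustrates the idea on a short, hypothetical peptide; it is not the UDE sequence, which is not reproduced here, and both function names are illustrative.

```python
def asn_gly_sites(sequence):
    """Return 1-based positions of Asn residues that are directly followed by Gly."""
    return [i + 1 for i in range(len(sequence) - 1) if sequence[i:i + 2] == "NG"]


def hydroxylamine_fragments(sequence):
    """Split a one-letter protein sequence between Asn and Gly of every Asn-Gly pair."""
    fragments, start = [], 0
    for pos in asn_gly_sites(sequence):
        fragments.append(sequence[start:pos])  # fragment ends with the Asn
        start = pos                            # next fragment starts with the Gly
    fragments.append(sequence[start:])
    return fragments


# Hypothetical toy peptide with a single Asn-Gly bond (NOT the UDE sequence).
toy_peptide = "MKTAYLNGSSKEE"

sites = asn_gly_sites(toy_peptide)
fragments = hydroxylamine_fragments(toy_peptide)
print(sites, fragments)
```

Applied to the full UDE sequence, a search of this kind would be expected to report the single site between residues 111 and 112 stated above, giving an N-terminal fragment and the Gly112–Glu355 C-terminal fragment.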
MS
Analysis of the limited proteolysis fragments was performed
either without fractionation or after 1D SDS/PAGE separation.
The unfractionated fragments were analyzed on a
Bruker Reflex III MALDI-TOF mass spectrometer in a sinapinic
acid matrix in positive linear mode. SDS/PAGE-separated
fragments were in-gel digested by trypsin, and the
digests were analyzed by LC-MS/MS as in [40–42].
Catalytic assay
For plasmid substrates, uracil-containing plasmid DNA
was prepared by amplification of normal plasmid DNA
(pSUPERIOR-puro; Invitrogen, Csertex, Budapest, Hungary)
in the dut⁻ ung⁻ K12 CJ236 E. coli strain [15]. Control
plasmid was prepared in the XL1Blue E. coli strain. Plasmids
were purified with a Qiagen plasmid isolation kit, and
linearized with NotI restriction endonuclease. DNA,
50 µg·mL⁻¹, was incubated with 50 µg·mL⁻¹ UDE C-terminal
fragment. The nuclease assay was performed in 25 mM
Tris/HCl (pH 7.5), also containing 0.1 mg·mL⁻¹ BSA and
either 1 mM MgCl₂ or 1 mM EDTA, at 37 °C. At given
reaction times, aliquots were withdrawn and incubated at
65 °C for 15 min (for Mg²⁺-containing reaction mixtures),
or at room temperature in the presence of 60 mM NaOH
for 15 min (for EDTA-containing reaction mixtures). This
treatment resulted in cleavage of abasic sites. Products were
detected by standard ethidium bromide staining after
agarose gel electrophoresis.
Alternatively, assays were also run using synthetic uracil-containing
single-stranded and double-stranded oligonucleotides
(purchased from Eurofins MWG Operon, Ebersberg,
Germany). The uracil-containing oligonucleotide was labeled
at the 5′-end with Cy3 fluorescent dye, and contained a
single uracil moiety at the 32nd position. Its complementary
strand (used for constructing the double-stranded substrate)
contained neither uracil nor a fluorescent label.
The uracil-containing oligonucleotide labeled with Cy3
was 5′-CTC GCA AAT GAA CTG GGC GAT GCG GTC
GCA C[U]A CTT CAC CTC GAA ATC AAC ATC TGA
GTG-3′ (with the uracil position in square brackets).
The complementary oligonucleotide was 5′-CAC TCA
GAT GTT GAT TTC GAG GTG AAG T[A]G TGC GAC
CGC ATC GCC CAG TTC ATT TGC GAG-3′ (with the
adenine position opposite to uracil in the double-stranded
oligonucleotide in square brackets).
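The two oligonucleotide sequences can be checked against the description with a short script. This is a sketch only: the sequences are copied from the text, uracil is treated as pairing with adenine, and the helper name `reverse_complement` is illustrative.

```python
# Sequences copied from the text (5'->3'); spaces are removed programmatically.
u_strand = (
    "CTC GCA AAT GAA CTG GGC GAT GCG GTC GCA CUA CTT CAC CTC "
    "GAA ATC AAC ATC TGA GTG"
).replace(" ", "")
complement = (
    "CAC TCA GAT GTT GAT TTC GAG GTG AAG TAG TGC GAC CGC ATC "
    "GCC CAG TTC ATT TGC GAG"
).replace(" ", "")

# Watson-Crick partners; uracil pairs with adenine, as thymine does.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}


def reverse_complement(seq):
    """Return the DNA strand that base-pairs with `seq`, written 5'->3'."""
    return "".join(PAIR[base] for base in reversed(seq))


uracil_position = u_strand.index("U") + 1               # 1-based position
partner_position = len(u_strand) - uracil_position + 1  # on the complementary strand

print(len(u_strand), uracil_position, partner_position)
print(reverse_complement(u_strand) == complement)
```

The check confirms a 60-mer with its single uracil at position 32, and a fully complementary partner strand in which the adenine facing the uracil is residue 29.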
For preparation of double-stranded substrates, equal
amounts of uracil-containing oligonucleotide and its complementary
strand were incubated at 95 °C for 5 min. For
the assay, 25 pmol of single-stranded or double-stranded
oligonucleotides was incubated with 50 µg·mL⁻¹ full-length
or truncated UDE in a final volume of 10 µL, in 25 mM
Tris/HCl (pH 7.5), also containing 0.1 mg·mL⁻¹ BSA and
1 mM EDTA, at 37 °C. At given reaction times, reaction
mixtures were run on a 10% Tris/borate/EDTA/PAGE gel.
Products were visualized under UV light.
The activity assays performed using the Gly112–Glu355
construct, either produced by cloning or by hydroxylamine
digestion, gave very similar results.
Electrophoretic mobility shift assay of DNA
binding
The protein concentration used is listed in micrograms per
milliliter at the top of the figure. Plasmid DNA,
20 µg·mL⁻¹, was used in all mixtures. The buffer was
25 mM Tris/HCl (pH 7.5), also containing 0.1 mg·mL⁻¹
BSA and 1 mM EDTA. Protein and DNA were mixed, and
the mixtures were loaded on an agarose gel.
Analytical gel filtration analysis
Analytical gel filtration was conducted on a Superdex 200HR
column calibrated with BSA, ovalbumin, chymotrypsin,
and RNase (molecular masses of 67, 43, 25 and 13.7 kDa,
respectively). Calibrating proteins or UDE samples were
applied in a total volume of 500 µL, at a concentration of
1–6 mg·mL⁻¹.
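A calibration of this kind is commonly evaluated by fitting log10(molecular mass) against elution volume and reading unknown masses off the fitted line. The sketch below assumes that approach. The standard masses are those listed above, but the elution volumes are hypothetical placeholders, because the measured values are not given in the text.

```python
import math

# Molecular masses of the calibration proteins (kDa), from the text.
STANDARDS_KDA = {"BSA": 67.0, "ovalbumin": 43.0, "chymotrypsin": 25.0, "RNase": 13.7}

# HYPOTHETICAL elution volumes (mL), for illustration only.
ELUTION_ML = {"BSA": 13.9, "ovalbumin": 14.9, "chymotrypsin": 16.2, "RNase": 17.6}


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


names = list(STANDARDS_KDA)
volumes = [ELUTION_ML[name] for name in names]
log_masses = [math.log10(STANDARDS_KDA[name]) for name in names]
slope, intercept = fit_line(volumes, log_masses)


def estimate_mass_kda(elution_volume_ml):
    """Apparent native mass (kDa) for a given elution volume, from the fitted line."""
    return 10 ** (slope * elution_volume_ml + intercept)


print(f"slope = {slope:.3f} log10(kDa) per mL")
print(f"apparent mass at 14.5 mL (hypothetical sample): {estimate_mass_kda(14.5):.1f} kDa")
```

The fitted slope is negative, because larger proteins elute earlier. An apparent mass obtained in this way depends on molecular shape as well as mass, which is one reason to complement gel filtration with analytical ultracentrifugation.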
Analytical ultracentrifugation analysis
An Optima XL-A analytical ultracentrifuge (Beckman-Coulter,
Palo Alto, CA, USA) was used to perform the
analytical ultracentrifugation experiments. Detection was
performed by means of a UV–visible absorbance detection
system. Experiments were conducted at 20 °C, using an
AnTi50 eight-hole rotor and epon–charcoal standard double-sector
centerpieces (12 mm optical path). Absorbance scans
were taken at the appropriate wavelength (280 nm). Proteins
were used at 0.53 mg·mL⁻¹ for full-length DmrcUDE and
0.8 mg·mL⁻¹ for the Gly112–Glu355 construct. Sedimentation
velocity was determined using 400 µL samples, and
the selected speed was 16 800 g. Differential sedimentation
coefficient distributions, c(s), were calculated by least-squares
boundary modeling of sedimentation velocity data
using the program sedfit [25,43]. From this analysis, the
experimental sedimentation coefficients were corrected for
solvent composition and temperature with the program
sednterp to obtain the corresponding standard sedimentation
values (Fig. 6, inset).
Short-column (85 µL) sedimentation equilibrium runs
were performed at multiple speeds (7900, 13 400 and
20 260 g). After the equilibrium scans, a high-speed centrifugation
run (140 000 g) was performed to estimate the
corresponding baseline offsets. Weight-average buoyant
molecular masses were determined by fitting a single-species
model to the experimental data (Fig. 6), using the heteroanalysis
program [44]. The molecular masses of
DmrcUDE and DmrcUDE Gly112–Glu355 were determined
from the experimental buoyant values, using 0.737 and
0.734 mL·g⁻¹ as their partial specific volumes (calculated
from the amino acid composition, using sednterp [45]).
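The conversion from buoyant mass to molecular mass follows the standard relation M_b = M(1 − v̄ρ), where v̄ is the partial specific volume and ρ the solvent density. The sketch below applies that relation with the partial specific volumes quoted above. The solvent density and the buoyant masses are hypothetical placeholders, as neither is reported in the text.

```python
VBAR_FULL_LENGTH = 0.737  # mL/g, from the text (DmrcUDE)
VBAR_FRAGMENT = 0.734     # mL/g, from the text (Gly112-Glu355)

# HYPOTHETICAL values, for illustration only.
SOLVENT_DENSITY = 1.005        # g/mL, assumed buffer density
BUOYANT_FULL_LENGTH = 10800.0  # Da, assumed fitted buoyant mass
BUOYANT_FRAGMENT = 14500.0     # Da, assumed fitted buoyant mass


def molar_mass_from_buoyant(buoyant_mass_da, vbar_ml_per_g, density_g_per_ml):
    """Invert M_b = M * (1 - vbar * rho) to obtain the molecular mass M."""
    return buoyant_mass_da / (1.0 - vbar_ml_per_g * density_g_per_ml)


mass_full = molar_mass_from_buoyant(BUOYANT_FULL_LENGTH, VBAR_FULL_LENGTH, SOLVENT_DENSITY)
mass_fragment = molar_mass_from_buoyant(BUOYANT_FRAGMENT, VBAR_FRAGMENT, SOLVENT_DENSITY)

print(f"full length: {mass_full / 1000:.1f} kDa")
print(f"fragment:    {mass_fragment / 1000:.1f} kDa")
```

Because the buoyancy term (1 − v̄ρ) is only about 0.26 for a typical protein, small errors in v̄ or ρ are amplified roughly threefold in the resulting molecular mass.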
CD spectroscopy
Far-UV CD spectra (190–240 nm) were recorded on a JASCO
720 spectropolarimeter, using 1 mm pathlength cuvettes
thermostatted at 25 °C and a Neslab RTE-100 computer-controlled
thermostat. Full-length DmrcUDE and Gly112–Glu355
DmrcUDE fragment at 0.2 mg·mL⁻¹ were measured
in 20 mM potassium phosphate buffer (pH 7.5). Three scans
of every spectrum were averaged. Spectral data processing
was performed with the built-in jasco software of the
spectropolarimeter. Far-UV CD spectra were processed by k2d
and selcon to estimate the fraction of secondary structural
elements.
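Raw ellipticity readings are usually converted to mean residue molar ellipticity before secondary structure estimation. The sketch below uses the commonly applied conversion [θ]MRE = θ(mdeg) × MRW / (10 × l(cm) × c(mg·mL⁻¹)). That the authors used exactly this expression is an assumption. The path length and concentration are from the text, whereas the mean residue weight and the observed ellipticity are hypothetical.

```python
PATH_CM = 0.1         # 1 mm cuvette, from the text
CONC_MG_PER_ML = 0.2  # protein concentration, from the text

# HYPOTHETICAL values, for illustration only.
MEAN_RESIDUE_WEIGHT_DA = 110.0  # assumed typical mean residue weight
THETA_OBSERVED_MDEG = -10.0     # assumed raw reading at 222 nm


def mean_residue_weight(molecular_mass_da, n_residues):
    """Mean residue weight: molecular mass divided by the number of peptide bonds."""
    return molecular_mass_da / (n_residues - 1)


def mean_residue_ellipticity(theta_mdeg, mrw_da, path_cm, conc_mg_per_ml):
    """Convert observed ellipticity (mdeg) to [theta]MRE in deg cm^2 dmol^-1."""
    return theta_mdeg * mrw_da / (10.0 * path_cm * conc_mg_per_ml)


mre = mean_residue_ellipticity(
    THETA_OBSERVED_MDEG, MEAN_RESIDUE_WEIGHT_DA, PATH_CM, CONC_MG_PER_ML
)
print(f"[theta]MRE = {mre:.0f} deg cm^2 dmol^-1")
```

With these placeholder inputs the result is about −5500 deg·cm²·dmol⁻¹, which falls within the range plotted in Fig. 8.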
Bioinformatics analyses
Sequences of putatively homologous UDE proteins were
identified by similarity searches with the tblastp program
and genomic blast at the NCBI. A multiple sequence
alignment of UDE proteins was calculated by clustalw [46].
Prediction of domain boundaries, secondary structure
and fold recognition were conducted via the genesilico
metaserver [26], which is a gateway providing a unified
interface to several servers for secondary and tertiary structure
predictions. Independent runs were performed for the
full-length UDE sequence (NCBI GI number: 28572066), a
variant without the N-terminus (amino acids 42–355), a
region encompassing motifs 1A and 1B (amino acids 42–205)
as well as both motifs separately (amino acids 42–111
or 129–198), and the C-terminal region alone, without the
disordered C-terminus (amino acids 210–320).
Structural modeling was performed with rosetta, using
a standard low-resolution ab initio procedure followed by a
full atom refinement. In order to explore broader conformational
space, several independent folding simulations
were performed for UDE from D. melanogaster (GI:
28572066; amino acids 41–198) and its two homologs from
Anopheles gambiae (GI: 58377038; amino acids 3–160) and
Bombyx mori [the protein sequence was reconstructed from
two contigs using NCBI ORF prediction on contig 6156
(GI: 46642882) and contig 413277 (GI: 46731897); amino
acids 53–209]. This protocol is similar to the general de
novo rosetta protocol used in casp6 [47]. Here, in the first
low-resolution stage, about 100 000 decoys (i.e. preliminary
models) were generated for each homolog. Additionally,
separate simulations were run with options promoting more
compact structures. Decoy sets from all simulations were
clustered independently using a quality threshold algorithm,
in such a way as to obtain the biggest cluster of minimal
size 25 and a maximal rmsd threshold of 5 Å. Then, the
centroid decoys (i.e. cluster members closest to the average
structure of the cluster) and the five lowest-energy decoys
from each cluster of size greater than 10 members were
selected for the full atom refinement. Next, the refined
structures of homologs were used as templates for modeling
the corresponding set of structures of UDE from D. melanogaster.
All resulting structures were scored with proq [28]
and metamqap [29], and the final model was selected on
the basis of the scores and evaluation of the approximate
pseudosymmetry between duplicated fragments.
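The clustering step can be illustrated with a generic quality threshold (QT) procedure operating on a pairwise distance matrix. The sketch below is a textbook-style greedy QT implementation written for illustration, not the code used in this work. The toy distances stand in for pairwise rmsd values between decoys, and the medoid is used as a simple stand-in for the "centroid decoy" described above.

```python
def qt_cluster(dist, threshold, min_size):
    """Greedy quality-threshold clustering on a symmetric distance matrix.

    Repeatedly grows one candidate cluster per remaining item, keeping every
    pairwise distance within `threshold`, and accepts the largest candidate
    until no candidate reaches `min_size`.
    """
    remaining = set(range(len(dist)))
    clusters = []
    while remaining:
        best = []
        for seed in sorted(remaining):
            cluster = [seed]
            candidates = remaining - {seed}
            while candidates:
                # Largest distance from a candidate to any current member.
                def spread(j):
                    return max(dist[j][k] for k in cluster)

                nxt = min(sorted(candidates), key=spread)
                if spread(nxt) > threshold:
                    break
                cluster.append(nxt)
                candidates.remove(nxt)
            if len(cluster) > len(best):
                best = cluster
        if len(best) < min_size:
            break
        clusters.append(sorted(best))
        remaining -= set(best)
    return clusters


def medoid(cluster, dist):
    """Member with the smallest summed distance to all other members."""
    return min(cluster, key=lambda i: sum(dist[i][j] for j in cluster))


# Toy data: points on a line; absolute differences play the role of rmsd values.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 30.0]
toy_dist = [[abs(a - b) for b in points] for a in points]

toy_clusters = qt_cluster(toy_dist, threshold=3.0, min_size=2)
toy_medoids = [medoid(c, toy_dist) for c in toy_clusters]
print(toy_clusters, toy_medoids)
```

For the modeling described above, the corresponding settings would be a threshold of 5 Å and a minimal cluster size of 25, applied to a matrix of pairwise rmsd values between decoys. With about 100 000 decoys this naive implementation would be far too slow, so it should be read as an explanation of the idea rather than a practical tool.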
The electrostatic potential was calculated using apbs [48]
and mapped on the molecular surface with pymol [49].
Acknowledgements
This work was supported by the following grants: the
Hungarian Scientific Research Fund (OTKA K68229),
Howard Hughes Medical Institute (#55005628 and
#55000342), Alexander von Humboldt Foundation,
GVOP-3.2.1 2004-05-0412/3.0, JÁP_TSZ_071128_TB_INTER
from the National Office for Research and
Technology of Hungary, FP6 STREP 012127, FP6
SPINE2c LSHG-CT-2006-031220, and TEACH-SG
LSSG-CT-2007-037198 from the EU to B. G. Vértessy.
A. Békési was supported by NKTH-OTKA H07-BEL74200.
Research in the laboratory of J. M.
Bujnicki has been supported by an FP6 grant from the
European Union (‘DNA ENZYMES’ MRTN-CT-2005-019566)
and by an NIH grant (1R01GM081680).
J. Kosinski was a PhD student at the Postgraduate
School of Molecular Medicine, Medical University of
Warsaw, and had a START fellowship from the Foundation
for Polish Science. Kromat Ltd is gratefully
acknowledged for providing the use of an Agilent 1100
nanoLC-XCT Plus IonTrap system. C. Alfonso and
G. Rivas are holders of grant BIO2008-04478-C03-03
from the Spanish Ministerio de Ciencia e Innovación.
References
1 Lindahl T (1993) Instability and decay of the primary
structure of DNA. Nature 362, 709–715.
2 Krokan HE, Drablos F & Slupphaug G (2002) Uracil
in DNA – occurrence, consequences and repair. Onco-
gene 21, 8935–8948.
3 Pearl LH & Savva R (1996) The problem with pyrimi-
dines. Nat Struct Biol 3, 485–487.
4 Mosbaugh DW (1988) Purification and characterization
of porcine liver DNA polymerase gamma: utilization of
dUTP and dTTP during in vitro DNA synthesis.
Nucleic Acids Res 16, 5645–5659.
5 Vertessy BG & Toth J (2008) Keeping uracil out of
DNA: physiological role, structure and catalytic
mechanism of dUTPases. Acc Chem Res 42, 97–106.
6 Krokan HE, Standal R & Slupphaug G (1997) DNA
glycosylases in the base excision repair of DNA.
Biochem J 325, 1–16.
7 Dogliotti E, Fortini P, Pascucci B & Parlanti E (2001)
The mechanism of switching among multiple BER path-
ways. Prog Nucleic Acid Res Mol Biol 68, 3–27.
8 Kavli B, Sundheim O, Akbari M, Otterlei M, Nilsen H,
Skorpen F, Aas PA, Hagen L, Krokan HE & Slupph-
aug G (2002) hUNG2 is the major repair enzyme for
removal of uracil from U:A matches, U:G mismatches,
and U in single-stranded DNA, with hSMUG1 as a
broad specificity backup. J Biol Chem 277,
39926–39936.
9 Lari SU, Chen CY, Vertessy BG, Morre J & Bennett
SE (2006) Quantitative determination of uracil residues
in Escherichia coli DNA: contribution of ung, dug, and
dut genes to uracil avoidance. DNA Repair (Amst) 5,
1407–1420.
10 Nilsen H, Rosewell I, Robins P, Skjelbred CF, Ander-
sen S, Slupphaug G, Daly G, Krokan HE, Lindahl T &
Barnes DE (2000) Uracil-DNA glycosylase (UNG)-
deficient mice reveal a primary role of the enzyme
during DNA replication. Mol Cell 5, 1059–1065.
11 Kavli B, Andersen S, Otterlei M, Liabakk NB, Imai K,
Fischer A, Durandy A, Krokan HE & Slupphaug G
(2005) B cells from hyper-IgM patients carrying UNG
mutations lack ability to remove uracil from ssDNA and
have elevated genomic uracil. J Exp Med 201, 2011–2021.
12 el-Hajj HH, Wang L & Weiss B (1992) Multiple mutant
of Escherichia coli synthesizing virtually thymineless
DNA during limited growth. J Bacteriol 174, 4450–4456.
13 Drysdale R (2003) The Drosophila melanogaster genome
sequencing and annotation projects: a status report.
Brief Funct Genomic Proteomic 2, 128–134.
14 Bekesi A, Zagyva I, Hunyadi-Gulyas E, Pongracz V,
Kovari J, Nagy AO, Erdei A, Medzihradszky KF &
Vertessy BG (2004) Developmental regulation of
dUTPase in Drosophila melanogaster. J Biol Chem
279, 22362–22370.
15 Bekesi A, Pukancsik M, Muha V, Zagyva I, Leveles I,
Hunyadi-Gulyas E, Klement E, Medzihradszky KF,
Kele Z, Erdei A et al. (2007) A novel fruitfly protein
under developmental control degrades uracil-DNA.
Biochem Biophys Res Commun 355, 643–648.
16 Deutsch WA (1995) Why do pupating insects lack an
activity for the repair of uracil-containing DNA? One
explanation involves apoptosis. Insect Mol Biol 4, 1–5.
17 Dudley B, Hammond A & Deutsch WA (1992) The
presence of uracil-DNA glycosylase in insects is
dependent upon developmental complexity. J Biol Chem
267, 11964–11967.
18 Aravind L & Koonin EV (2000) The alpha/beta fold
uracil DNA glycosylases: a common origin with diverse
fates. Genome Biol 1, RESEARCH0007, doi:10.1186/gb-2000-1-4-research0007.
19 Richards S, Gibbs RA, Weinstock GM, Brown SJ,
Denell R, Beeman RW, Gibbs R, Bucher G, Friedrich
M, Grimmelikhuijzen CJ et al. (2008) The genome of
the model beetle and pest Tribolium castaneum. Nature
452, 949–955.
20 Keil B (1992) Specificity of Proteolysis. Springer-Verlag,
Berlin-Heidelberg-New York.
21 Bornstein P & Balian G (1977) Cleavage at Asn-Gly
bonds with hydroxylamine. Methods Enzymol 47,
132–145.
22 Dosztanyi Z, Csizmok V, Tompa P & Simon I (2005)
IUPred: web server for the prediction of intrinsically
unstructured regions of proteins based on estimated
energy content. Bioinformatics 21, 3433–3434.
23 Yang ZR, Thomson R, McNeil P & Esnouf RM (2005)
RONN: the bio-basis function neural network technique
applied to the detection of natively disordered regions
in proteins. Bioinformatics 21, 3369–3376.
24 Ward JJ, Sodhi JS, McGuffin LJ, Buxton BF & Jones
DT (2004) Prediction and functional analysis of
native disorder in proteins from the three kingdoms of
life. J Mol Biol 337, 635–645.
25 Schuck P, Perugini MA, Gonzales NR, Howlett GJ &
Schubert D (2002) Size-distribution analysis of proteins
by analytical ultracentrifugation: strategies and applica-
tion to model systems. Biophys J 82, 1096–1111.
26 Kosinski J, Cymerman IA, Feder M, Kurowski MA,
Sasin JM & Bujnicki JM (2003) A ‘FRankenstein’s
monster’ approach to comparative modeling: merging
the finest fragments of fold-recognition models and
iterative model refinement aided by 3D structure
evaluation. Proteins 53(Suppl 6), 369–379.
27 Mitrophanous K, Yoon S, Rohll J, Patil D, Wilkes F,
Kim V, Kingsman S, Kingsman A & Mazarakis N
(1999) Stable gene transfer to the nervous system
using a non-primate lentiviral vector. Gene Ther 6,
1808–1818.
28 Wallner B, Fang H & Elofsson A (2003) Automatic
consensus-based fold recognition using Pcons, ProQ,
and Pmodeller. Proteins 53(Suppl 6), 534–541.
29 Kosinski J, Gajda MJ, Cymerman IA, Kurowski MA,
Pawlowski M, Boniecki M, Obarska A, Papaj G,
Sroczynska-Obuchowicz P, Tkaczuk KL et al. (2005)
FRankenstein becomes a cyborg: the automatic recom-
bination and realignment of fold recognition models in
CASP6. Proteins 61(Suppl 7), 106–113.
30 Whitmore L & Wallace BA (2008) Protein secondary
structure analyses from circular dichroism spectroscopy:
methods and reference databases. Biopolymers 89,
392–400.
31 Andrade MA, Chacon P, Merelo JJ & Moran F (1993)
Evaluation of secondary structure of proteins from UV
circular dichroism spectra using an unsupervised learn-
ing neural network. Protein Eng 6, 383–390.
32 Deleage G & Geourjon C (1993) An interactive graphic
program for calculating the secondary structure content
of proteins from circular dichroism spectrum. Comput
Appl Biosci 9, 197–199.
33 Deutsch WA & Spiering AL (1982) A new pathway
expressed during a distinct stage of Drosophila develop-
ment for the removal of dUMP residues in DNA. J Biol
Chem 257, 3366–3368.
34 Deutsch WA (1987) Enzymatic studies of DNA repair
in Drosophila melanogaster. Mutat Res 184, 209–215.
35 Morgan AR & Chlebek J (1989) Uracil-DNA glycosy-
lase in insects. Drosophila and the locust. J Biol Chem
264, 9911–9914.
36 Green DA & Deutsch WA (1983) Repair of alkylated
DNA: Drosophila have DNA methyltransferases but
not DNA glycosylases. Mol Gen Genet 192, 322–325.
37 Breimer LH (1986) A DNA glycosylase for oxidized
thymine residues in Drosophila melanogaster. Biochem
Biophys Res Commun 134, 201–204.
38 Georg J, Schomacher L, Chong JP, Majernik AI, Raa-
be M, Urlaub H, Muller S, Ciirdaeva E, Kramer W &
Fritz HJ (2006) The Methanothermobacter thermautot-
rophicus ExoIII homologue Mth212 is a DNA uridine
endonuclease. Nucleic Acids Res 34, 5325–5336.
39 Békési A, Felföldi F, Pukáncsik M, Zagyva I & Vértessy
GB (2008) USA Patent Application No. 11/160040.
40 Varga B, Barabas O, Kovari J, Toth J, Hunyadi-Gulyas
E, Klement E, Medzihradszky KF, Tolgyesi F, Fidy J
& Vertessy BG (2007) Active site closure facilitates
juxtaposition of reactant atoms for initiation of
catalysis by human dUTPase. FEBS Lett 581,
4783–4788.
41 Nemeth-Pongracz V, Barabas O, Fuxreiter M, Simon I,
Pichova I, Rumlova M, Zabranska H, Svergun D,
Petoukhov M, Harmat V et al. (2007) Flexible segments
modulate co-folding of dUTPase and nucleocapsid
proteins. Nucleic Acids Res 35, 495–505.
42 Dubrovay Z, Gaspari Z, Hunyadi-Gulyas E,
Medzihradszky KF, Perczel A & Vertessy BG (2004)
Multidimensional NMR identifies the conformational
shift essential for catalytic competence in the 60-kDa
Drosophila melanogaster dUTPase trimer. J Biol Chem
279, 17945–17950.
43 Schuck P (2000) Size-distribution analysis of macromol-
ecules by sedimentation velocity ultracentrifugation and
Lamm equation modeling. Biophys J 78, 1606–1619.
44 Cole JL (2004) Analysis of heterogeneous interactions.
Methods Enzymol 384, 212–232.
45 Laue TM, Shah BD, Ridgeway TM & Pelletier
SL (1992) Computer-aided interpretation of analytical
sedimentation data for proteins. In Analytical
Ultracentrifugation in Biochemistry and Polymer
Science. pp. 90–125, Royal Society of Chemistry,
Cambridge.
46 Larkin MA, Blackshields G, Brown NP, Chenna R,
McGettigan PA, McWilliam H, Valentin F, Wallace
IM, Wilm A, Lopez R et al. (2007) Clustal W and
Clustal X version 2.0. Bioinformatics 23, 2947–2948.
47 Clark AG, Eisen MB, Smith DR, Bergman CM, Oliver
B, Markow TA, Kaufman TC, Kellis M, Gelbart W,
Iyer VN et al. (2007) Evolution of genes and genomes
on the Drosophila phylogeny. Nature 450, 203–218.
48 Studebaker AW, Balendiran GK & Williams MV
(2001) The herpesvirus encoded dUTPase as a potential
chemotherapeutic target. Curr Protein Pept Sci 2,
371–379.
49 DeLano WL (2002) The PyMOL Molecular Graphics
System. DeLano Scientific, San Carlos, CA.