Drosophila proteins involved in metabolism of uracil-DNA
possess different types of nuclear localization signals
Gábor Merényi1, Emese Kónya1 and Beáta G. Vértessy1,2
1 Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, Budapest, Hungary
2 Department of Applied Biotechnology, Budapest University of Technology and Economics, Budapest, Hungary
Keywords
cellular trafficking; Drosophila melanogaster;
dUTPase; nuclear localization signal;
uracil-DNA degrading factor
Correspondence
B.G. Vértessy, Institute of Enzymology,
Biological Research Center, Hungarian
Academy of Sciences, Karolina út 29,
H-1113 Budapest, Hungary
Fax: +36 1 466 5465
Tel: +36 1 279 3116
E-mail:
(Received 4 August 2009, revised 23
February 2010, accepted 1 March 2010)
doi:10.1111/j.1742-4658.2010.07630.x
Adequate transport of large proteins that function in the nucleus is
indispensable for cognate molecular events within this organelle. Selective
protein import into the nucleus requires nuclear localization signals (NLS)
that are recognized by importin receptors in the cytoplasm. Here we
investigated the sequence requirements for nuclear targeting of Drosophila
proteins involved in the metabolism of uracil-substituted DNA: the recently
identified uracil-DNA degrading factor, dUTPase, and the two uracil-DNA
glycosylases present in Drosophila. For the uracil-DNA degrading factor,
NLS prediction identified two putative NLS sequences [PEKRKQE(320–326)
and PKRKKKR(347–353)]. Truncation and site-directed mutagenesis
using YFP reporter constructs showed that only one of these basic
stretches is critically required for efficient nuclear localization in insect
cells. This segment corresponds to the well-known prototypic NLS of SV40
T-antigen. An almost identical NLS segment is also present in the
Drosophila thymine-DNA glycosylase, but no NLS elements were
predicted in the single-strand-specific monofunctional uracil-DNA glycosylase
homolog protein. This latter protein has a molecular mass of 31 kDa,
which may allow NLS-independent transport. For Drosophila dUTPase,
two isoforms with distinct features regarding molecular mass and
subcellular distribution were recently described. In this study, we characterized the
basic PAAKKMKID(10–18) segment of dUTPase, which has been
predicted to be a putative NLS by in silico analysis. Deletion studies, using
YFP reporter constructs expressed in insect cells, revealed the importance
of the PAA(10–12) tripeptide and the ID(17–18) dipeptide, as well as the
role of the PAAK(10–13) segment in nuclear localization of dUTPase. We
constructed a structural model that shows the molecular basis of such
recognition in three dimensions.
Abbreviations
NLS, nuclear localization signal; LD-DUT, long isoform of dUTPase; NTT-DUT, N-terminally truncated short isoform of dUTPase; SMUG1,
single-strand-specific monofunctional uracil-DNA glycosylase 1; T-ag, T-antigen; TDG, thymine-DNA glycosylase; UDE, uracil-DNA degrading
factor.
2142 FEBS Journal 277 (2010) 2142–2156 © 2010 The Authors Journal compilation © 2010 FEBS
Introduction
In eukaryotic organisms, proteins with cognate nuclear
function must penetrate the nuclear envelope after
translation in the cytoplasm. Nuclear import and
export of proteins can proceed by active or passive
transport, or as a member of protein complex actively
targeted into the nucleus [1–3]. For this latter
mechanism, which is the major one for proteins larger than
30–35 kDa, specific and direct nuclear targeting
requires the presence of a nuclear localization signal
(NLS), which is the relevant sequence information in
the primary structure of proteins. Several NLS
sequence motifs have been identified to date, and there
is no unique well-defined consensus amino acid
sequence for all NLS [4,5]. However, major common
characteristics of these sequences are (i) a high content
of basic amino acid residues such as lysine (K) and
arginine (R), and (ii) the presence of conserved pro-
line(s) (P) potentially involved in breaking secondary
structural elements within the NLS. One group of simple
NLS includes monopartite motifs, generally defined
as a short amino acid region consisting of 4–6 basic
residues in a row, like the classic NLS of SV40 large
T-antigen (SV40 T-ag) [6]. Another type of NLS, such
as the NLS of nucleoplasmin in Xenopus laevis, com-
prises bipartite motifs, which contain two distinct
stretches of positively charged clusters separated by a
mutation-tolerant linker region [7]. In addition,
sequences containing several neutral or even negatively
charged conserved residues may also act as functional
monopartite NLS, with the negatively charged
aspartate/glutamate (D/E) also contributing to NLS
function [8]. Interestingly, the NLS of human RanBP3 [9]
is an unusual signal with close homology to the NLS
of c-Myc [10].
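The difference between monopartite and bipartite motifs described above can be illustrated with a small scan for runs of consecutive basic residues. This is only a sketch: the peptide sequences are the SV40 T-ag and nucleoplasmin entries listed in Table 2, while the function name `basic_runs` and the minimum run length of two are assumptions made for illustration. It is not the PSORTII prediction method.

```python
import re

def basic_runs(seq, min_len=2):
    """Return (1-based start, run) for each stretch of consecutive K/R residues.

    Hypothetical helper for illustration; min_len is an arbitrary choice.
    """
    pattern = re.compile(r"[KR]{%d,}" % min_len)
    return [(m.start() + 1, m.group()) for m in pattern.finditer(seq)]

examples = {
    "SV40 T-ag": "PKKKRKV",                   # monopartite, Table 2
    "nucleoplasmin": "KRPAATKKAGQAKKKKLDK",   # bipartite, Table 2
}
for name, seq in examples.items():
    print(name, basic_runs(seq))
```

Under these assumptions the SV40 T-ag peptide gives a single run of five basic residues, whereas the nucleoplasmin peptide gives several shorter, separated runs, which is the pattern the text describes for a bipartite signal.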
Nuclear proteins containing NLS motifs can enter
the nucleus via the nuclear pore, utilizing a strictly
organized mechanism maintained by karyopherin
molecules and the nuclear pore complex [11]. The nuclear
pore complex is a large protein complex consisting of
multiple subunits and located in the nuclear
membrane. It is also the main route for exchange of
small particles, e.g. ions and nucleotides, between the
nuclear and cytosolic compartments. Importin β,
a type of karyopherin molecule, is a nuclear transport
receptor, which can bind its molecular cargo either
directly or indirectly through adaptor proteins such as
importin α. Importin β is unable to bind directly to
classical nuclear targeting motifs such as the NLS of
SV40 T-ag or the NLS of nucleoplasmin, but can
mediate nuclear import indirectly in association with
importin α. Importin α possesses two major domains
for its adaptor function, the importin β binding (IBB)
domain at its N-terminus and the C-terminal
NLS-binding domain. In the absence of importin β, an
auto-inhibiting part of the IBB domain forms an
intramolecular interaction with the NLS-binding domain,
preventing the association with NLS on the cargo
protein. Thus, the presence or absence of importin β
regulates the NLS binding ability of importin α. The
relatively large NLS-binding domain of importin α
consists of ten armadillo repeats, each comprising
three α-helices. In association with each other, the
armadillo repeats form a large concave superhelical
molecular surface. The NLS peptide of the cargo binds
in extended conformation to the binding pockets of
the superhelical surface of importin α. These binding
pockets contain several conserved residues (e.g.
asparagine, tryptophan and negatively charged residues)
involved in hydrophobic and electrostatic interactions
with the positively charged residues of the NLS (see [1]
for recent review).
Here, we wished to identify and characterize NLS
for Drosophila melanogaster proteins involved in
uracil-DNA metabolism. Four such major proteins
have been described to date: (i) the newly identified
uracil-DNA degrading factor (UDE) [12,13], (ii) dUTPase,
which is responsible for prevention of uracil
incorporation into DNA [14], and (iii) two DNA
glycosylases, thymine-DNA glycosylase (TDG) [15] and
the single-strand-specific monofunctional uracil-DNA
glycosylase 1 (SMUG1) homolog protein.
The UDE protein, encoded by the CG18410 gene in
the D. melanogaster genome, was recently identified in
a pull-down screen on uracil-DNA from larval extracts
[12]. In vitro studies have shown that this protein spe-
cifically degrades uracil-containing DNA, but lacks
any appreciable homology to previously described ura-
cil-DNA-recognizing proteins. CG18410 gene expres-
sion may be under developmental control, and the
protein has been suggested to play a role in metamor-
phosis in Drosophila. The subcellular localization of
this protein had not been characterized.
dUTPase catalyzes the cleavage of dUTP into
dUMP to control cellular dUTP/dTTP ratios, and is
an essential enzyme in both prokaryotes and eukary-
otes [16,17]. Lack of dUTPase leads to uracil-substi-
tuted DNA that perturbs base excision repair,
resulting in DNA fragmentation and thymine-less cell
death [14]. Most dUTPases are homotrimers with
native molecular masses of approximately 50–65 kDa
[18–24]. Both human and D. melanogaster cells contain
a nuclear isoform of dUTPase, and the NLS segment
of the human enzyme has been investigated in detail
[25]. In D. melanogaster dUTPase, a similar N-terminal
segment was recently proposed as the NLS region [26].
In D. melanogaster, two physiological isoforms of the
enzyme were identified, with apparent molecular
masses of 69 and 63 kDa for the native homotrimers
(termed long isoform, LD-DUT, and the N-terminally
truncated short isoform, NTT-DUT, respectively) [27].
Only LD-DUT contains the complete putative NLS
sequence [PAAKKMKID(10–18)], while NTT-DUT
lacks 14 residues at the N-terminus. This segment
shows a high degree of flexibility and cannot be
located in the 3D structure of the protein determined
by X-ray crystallography (PDB ID 3ECY) [21].
G. Merényi et al. Characterization of NLS segments
Uracil-DNA glycosylases are the key repair enzymes
that remove uracil from DNA by catalyzing cleavage
of the N-glycosidic bond [28]. To perform this function
in eukaryotic cells, these enzymes must reside in the
nuclear or mitochondrial compartments [29]. There
are four or five major families of uracil-DNA glycosy-
lases, but only two of these are encoded in the
D. melanogaster genome [30]. The molecular masses of
these two glycosylases, based on reported sequences
[15], are 191 kDa for TDG and 31 kDa for the
SMUG1 homolog. No quantitative data are available
indicating potential oligomerization for the monomeric
species, and the family member uracil-DNA glycosylase
is a monomer [31].
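The sequence-based masses quoted above can be checked roughly against the ORF lengths given in Table 1 (1738 residues for TDG, 280 for the SMUG1 homolog). The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that is not taken from the paper; the function name is likewise hypothetical.

```python
AVG_RESIDUE_MASS_DA = 110.0  # assumed rule-of-thumb average, not a value from the paper

def approx_mass_kda(n_residues):
    """Rough molecular mass estimate (kDa) from the number of residues."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0

orf_lengths = {"TDG": 1738, "SMUG1 homolog": 280}  # ORF lengths from Table 1
for protein, n in orf_lengths.items():
    print(f"{protein}: ~{approx_mass_kda(n):.0f} kDa")
```

With this approximation the estimates come out close to the 191 kDa and 31 kDa values stated in the text, which places TDG far above, and the SMUG1 homolog near, the 30–35 kDa range mentioned in the Introduction.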
In the present study, we aimed to (i) determine the
subcellular distribution of UDE, (ii) identify sequence
determinants essential for nuclear translocation in
proteins involved in uracil-DNA metabolism in
Drosophila, and (iii) functionally characterize these NLS. Based
on in silico prediction, we fused various sequence seg-
ments from the ORF of UDE and dUTPase to the
yellow fluorescent protein (YFP) and generated chimeric
reporter constructs. In addition, to characterize the
essential and sufficient amino acids of the NLS, we
performed deletion studies and site-directed mutagenesis
on the putative NLS regions. For transient transfection
studies, we used the homogeneous Sf9 insect cell
line, which has superior characteristics for subcellular
sorting analysis compared with the Drosophila
Schneider 2 cell line: its generation time is convenient
and its morphology allows straightforward microscopic
detection of cellular compartments.
Results and Discussion
Subcellular targeting of UDE
Nuclear targeting of UDE may be critical for performance
of the suggested degradation function on genomic
DNA containing uracil [12]. In silico prediction
(using PSORTII [32]) suggested two individual clusters
of residues as a putative NLS region, separated by
21 amino acids, in the C-terminus of the protein
(Fig. 1A, and Tables 1 and 2). The first cluster (NLS1),
PEKRKQE(320–326), consists of both positively and
negatively charged residues. The second stretch (NLS2),
PKRKKKR(347–353)E, is located at the very end of the
C-terminus and has a high proportion of positively
charged amino acids. The residues within the numbered
ranges are predicted to be part of the NLS. Each
sequence starts with the neutral amino acid proline
and ends its context with glutamic acid.
We fused the full-length UDE, containing these two
predicted sequences, to the N-terminus of YFP. After
Sf9 cell transfection using the chimera construct, fluo-
rescence was observed on samples of fixed cells. The
22.2 kDa YFP alone, used as a control, could pene-
trate non-selectively through the nuclear pore, most
probably because its smaller molecular mass allows
passive diffusion. Fluorescence microscopy analysis
showed that the YFP-tagged UDE has an exclusive
nuclear localization in Sf9 cells (Fig. 2A and Table 3).
In the control experiment, YFP alone was observed
throughout the cell (Fig. 2K and Table 3). These data
demonstrate that the wild-type UDE is targeted specifi-
cally and exclusively into the nucleus, in agreement
with its putative nuclear function in insect cells.
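The charge composition of the two predicted segments, as described earlier in this section, can be tallied with a few lines of code. This is a minimal sketch: it counts only lysine/arginine as basic and aspartate/glutamate as acidic (histidine is ignored by assumption), and the helper name is hypothetical.

```python
def charge_profile(peptide):
    """Count basic (K, R) and acidic (D, E) residues in a peptide."""
    basic = sum(peptide.count(res) for res in "KR")
    acidic = sum(peptide.count(res) for res in "DE")
    return {"basic": basic, "acidic": acidic, "length": len(peptide)}

predicted = {"NLS1": "PEKRKQE", "NLS2": "PKRKKKR"}  # sequences from the text
for name, peptide in predicted.items():
    print(name, charge_profile(peptide))
```

Under this counting, NLS1 carries three basic and two acidic residues, while NLS2 carries six basic residues and no acidic ones, in line with the description of NLS2 as the more strongly basic stretch.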
Subcellular distribution of C-terminal truncated
forms of UDE
To test whether the nuclear import of UDE requires
either or both of the predicted signals, various
C-terminally truncated UDE species were linked to the
N-terminus of the YFP reporter (Fig. 1). In the first
construct, UDEΔ(316–355)–YFP, a large part of the
C-terminus was deleted, including both putative NLS
segments. In the second construct, UDEΔ(346–355)–YFP,
the last ten residues of the C-terminus were
removed, including the PKRKKKR(347–353)
sequence. The reporter constructs were introduced into
Sf9 cells and subsequently analyzed by fluorescence
microscopy. The results show that lack of the
full-length flexible C-terminal region, containing both of
the predicted signals, totally abolished the nuclear dis-
tribution, causing significant cytoplasmic retention of
UDE (Fig. 2B and Table 3). When the last ten residues
of the C-terminus, including only the second predicted
NLS, were deleted, the pattern of subcellular distribu-
tion was also exclusively cytoplasmic (Fig. 2C). These
results suggest that the PEKRKQE(320–326) sequence
on its own is not able to translocate the protein into
the nuclear compartment. In contrast, the presence of
the PKRKKKR(347–353) sequence, consisting of six
contiguous positively charged amino acids, is critical
for exclusive nuclear localization of UDE. The
PKRKKKR(347–353) segment is almost identical to
the NLS of SV40 T-ag, indicating a powerful capabil-
ity for function as an NLS.
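The logic of the truncation experiments can be summarized as a range-overlap check: a predicted signal is retained only if the deleted residue range does not touch it. The sketch below uses the residue ranges given in the text; the function name and data layout are assumptions for illustration and say nothing about the observed localization itself.

```python
from typing import List, Tuple

# Predicted NLS residue ranges in UDE, taken from the text (inclusive numbering).
NLS_SEGMENTS = {"NLS1": (320, 326), "NLS2": (347, 353)}

def surviving_nls(deleted: Tuple[int, int]) -> List[str]:
    """Names of predicted NLS segments left fully intact by an inclusive deletion."""
    d_start, d_end = deleted
    return [name for name, (start, end) in NLS_SEGMENTS.items()
            if end < d_start or start > d_end]

print(surviving_nls((316, 355)))  # construct with residues 316-355 deleted
print(surviving_nls((346, 355)))  # construct with residues 346-355 deleted
print(surviving_nls((1, 319)))    # construct with residues 1-319 deleted
```

The three calls reproduce the "NLS sequence present" entries of Table 3 for these constructs: none, NLS1 only, and both segments.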
Subcellular targeting of UDE containing specific
site mutations in the NLS sequence
To extend our investigations, we generated separate
mutations to identify amino acids responsible for the
nuclear targeting function of the PKRKKKR sequence
(Fig. 1). The K350A/K351A double mutation slightly
altered the pattern of subcellular distribution, indicating
attenuation of the nuclear targeting effect (Fig. 2D
and Table 3). The K350A/K351A/K352A/R353A
quadruple mutation also perturbed the exclusive
nuclear targeting of UDE, resulting in significant cytoplasmic
retention (Fig. 2E). Based on these results, the
PKRKKKR(347–353) sequence is suggested to be a
strong NLS sequence with high mutation tolerance. In
accordance with the putative segments defined by
in silico prediction (Table 1), it was found that the
presence of the KPKR(346–349) segment is sufficient
for partial nuclear localization of the protein.
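The mutant sequences listed in Table 3 follow directly from applying the named substitutions to residues 346–353 of UDE. A minimal sketch is shown below; the helper name and the mutation-string parsing are assumptions for illustration, and the check on the wild-type residue is there only to catch numbering mistakes.

```python
NLS2_START = 346          # residue number of the first K in KPKRKKKR (Tables 1 and 3)
NLS2_WT = "KPKRKKKR"      # UDE residues 346-353

def apply_substitutions(segment, start, substitutions):
    """Apply point mutations written like 'K350A' to a numbered segment."""
    residues = list(segment)
    for sub in substitutions:
        wild_type, position, new = sub[0], int(sub[1:-1]), sub[-1]
        index = position - start
        if residues[index] != wild_type:
            raise ValueError(f"{sub}: expected {wild_type} at {position}")
        residues[index] = new
    return "".join(residues)

double = apply_substitutions(NLS2_WT, NLS2_START, ["K350A", "K351A"])
quadruple = apply_substitutions(
    NLS2_WT, NLS2_START, ["K350A", "K351A", "K352A", "R353A"])
print(double, quadruple)  # KPKRAAKR KPKRAAAA
```

Both results leave the KPKR(346–349) tetrapeptide untouched, which is the segment the text identifies as sufficient for partial nuclear localization.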
Subcellular targeting potential of the predicted
UDE NLS1 and NLS2 sequences
To determine whether either of the two predicted NLS
sequences possesses strong nuclear targeting potential on
its own, the PEKRKQE (NLS1) and PKRKKKR
(NLS2) coding sequences were fused as a C-terminal tag
to YFP protein (Fig. 1). The constructs YFP–UDE-NLS1
and YFP–UDE-NLS2 were transiently transfected
into Sf9 cells. Expression and intracellular appearance
of the fluorescent proteins were observed by
fluorescence microscopy. The results show that the NLS2
segment has selective and powerful targeting potential
for accumulation of YFP in the nucleus (Fig. 2F and
Table 3). The pattern of subcellular distribution of
YFP–UDE-NLS1 was not exclusively nuclear or cyto-
plasmic, although some accumulation was observed
within the nuclear compartment compared to the YFP
control (compare Fig. 2G and K). Further, the C-terminal
portion of UDE was fused to YFP and expressed in
Sf9 cells. This UDEΔ(1–319)–YFP reporter construct
containing both predicted NLS sequences was
exclusively retained in the nucleus (Fig. 2H). After
introducing quadruple mutations
(K350A/K351A/K352A/R353A) into this construct,
UDEΔ[1–319(350AAAA353)]–YFP, the exclusive nuclear
distribution was highly perturbed, but increased nuclear
accumulation was observed compared to YFP alone
(Fig. 2I,K). The last examined reporter construct
YFP–UDE-NLS2Δ(350–353), which possesses only three
basic residues [KPKR(346–349)] from the NLS2 segment
fused to YFP, also showed localization in the
nucleus and the cytoplasm, with some accumulation
within the nucleus (Fig. 2J).
These observations indicate that the NLS2 segment
is a strong monopartite NLS, and that the contribution
of the predicted NLS1 to nuclear localization is
negligible. Within the NLS2 segment, both the KPKR
and the KKKR tetrapeptides contribute to nuclear
localization.
Fig. 1. Scheme of D. melanogaster UDE constructs used in the present study. (A) Position and context of putative nuclear localization
sequences (underlined) within the flexible C-terminus of D. melanogaster UDE are indicated. (B) Schematic representation of various
UDE–YFP reporter constructs. The wild-type (wt), flexible C-terminally truncated [Δ(316–355)] and the NLS truncated [Δ(346–355)] coding
sequences were fused in-frame to the N-terminus of YFP protein, resulting in UDEWT–YFP, UDEΔ(316–355)–YFP and UDEΔ(346–355)–YFP
reporter constructs. The UDE(350AA351)–YFP reporter construct contains the K350A and K351A mutations, and the UDE(350AAAA353)–YFP
reporter construct contains the K350A, K351A, K352A and R353A mutations. The truncated reporter constructs UDEΔ(1–319)–YFP and
UDEΔ[1–319(350AAAA353)]–YFP are also indicated. The relevant regions, positions and mutations of the NLS of UDE are indicated by
differently shaded boxes. (C) The predicted NLS sequences (NLS1 and NLS2) and the deleted variant of NLS2 were fused in-frame to the
C-terminus of the YFP ORF generating the YFP–UDE-NLS1, YFP–UDE-NLS2 and YFP–UDE-NLS2Δ(350–353) reporter constructs. Establishment of
vector constructs was performed as described in Experimental procedures.
Prediction of NLS signals in Drosophila
uracil-DNA glycosylases
Table 1 lists the predicted NLS signals for the TDG
protein. Several clusters of putative localization
signals were observed. Among these, the
PKKRGRKKK(711–719) sequence is almost identical to the
NLS of the SV40 T-ag and also to the UDE NLS
segment. As the SV40 T-ag has been extensively
characterized [33] and we also found in our present
experiments that such a sequence has very strong nuclear
localization potential, we propose that this sequence
also acts as an NLS in the TDG protein. For the
SMUG1 homolog protein, no nuclear localization
signal was predicted by the PSORTII program (Table 1).
Lack of predicted signals cannot be taken as evidence
for the actual absence of NLS segments, as prediction
performs well only for classical NLS. It is also worth-
while noting that the molecular size of SMUG1 may
allow passive translocation to the nucleus.
Subcellular distribution of the D. melanogaster
dUTPase isoforms
For D. melanogaster dUTPase, prediction identified
the PAAKKMK(10–16) segment within PAAKKMKID(10–18)
as a conventional NLS comprising a short cluster of
non-polar and basic residues (Fig. 3, and Tables 1 and
2). To determine the subcellular distribution of
D. melanogaster dUTPase isoforms in the Sf9 cell line,
reporter constructs were created by N-terminal fusion
to YFP (Fig. 3B and Table 3). Cellular targeting of
both isoforms was subsequently determined via cell
transfection experiments followed by fluorescent
microscopic detection. The results show that the long
isoform of dUTPase (LD-DUT) is specifically targeted
into the nucleus, but the short one (NTT-DUT) was
not able to enter into the nuclear compartment and
remained exclusively in the cytoplasm (Fig. 4A,B).
This is in agreement with studies performed in
Drosophila Schneider S2 cells [26].
These results indicated that the presence of the
predicted complete targeting sequence is necessary and
sufficient for exclusive nuclear targeting of the long
isoform (LD-DUT). The partial segment MKID(15–18),
present on the short isoform, cannot drive nuclear
import. In the case of the short isoform (NTT-DUT),
absence of the first 14 residues of the N-terminus,
including the PAAKK(10–14) segment, dramatically
alters the translocation pattern of dUTPase.
Table 1. In silico predictions of putative nuclear localization signals
of Drosophila dUTPase, UDE, TDG and SMUG1 homolog proteins.
To identify the putative nuclear localization sites, the full length
open-reading frame sequences of the proteins were obtained from
the UniProt database and analyzed using PSORTII. Putative signal
sequences, defined as potential NLS regions, are shown, with the
number in parentheses indicating the number of the first residue.
Protein          UniProt ID            ORF length      Sequences defined as putative NLS segments
                 (UniProtKB/TrEMBL)    (amino acids)
UDE              Q961C                 355             KPKR (346), PKRK (347), KRKK (348), RKKK (349),
                                                       KKKR (350), PEKRKQE (320), PKRKKKR (347)
dUTPase          Q9V3I1                188             PAAKKMK (10)
TDG              Q9V4D8                1738            PKKR (711), RKKK (716), RKKH (760), KKKR (1088),
                                                       RPKK (1093), PKKK (1141), KKKR (1142), RPKK (1147),
                                                       PNNRKRQ (114), PMPKKRG (709), PKKRGRK (711),
                                                       PKERKKH (757), PLEKKKR (1085), PKKIKGQ (1094),
                                                       PKKKRGR (1141), PKKLKPA (1148)
SMUG1 homolog    Q9VEM1                280             None
Table 2. Comparison of UDE, dUTPase and TDG NLS segments
with NLS sequences of various proteins. The monopartite
sequences listed show close similarity to either the SV40 T-ag NLS
or the c-Myc NLS segments. The NLS sequences of UDE and TDG
show close homology to the SV40 T-ag NLS, but the D. melanogaster
dUTPase NLS belongs to the c-Myc group. Interestingly, the
NLS segment of human dUTPase is more similar to the first group
of sequences. For comparison, the classic bipartite NLS sequence
of X. laevis nucleoplasmin is shown, which possesses an additional
short cluster of basic residues separated by 10 amino acids from
the basic stretch, which has close homology with the NLS of SV40
T-ag. SV40 T-ag, simian virus 40 large T-antigen [6]; v-Jun, sarcoma
virus 17 oncogene homolog [39]; H2B, histone 2B [40]; UDE,
uracil-DNA degrading factor; human dUTPase [25]; c-Myc,
myelocytomatosis cellular oncogene [10]; RanBP3, Ran binding protein 3 [9].
Protein                             NLS sequence
Monopartite
SV40 T-ag                           PKKKRKV
UDE of D. melanogaster              PKRKKKR
TDG of D. melanogaster              PKKRGRKKK
v-Jun of Homo sapiens               SKSRKRKL
dUTPase of Homo sapiens             PSKRARP
H2B of Saccharomyces cerevisiae     GKKRSKV
c-Myc of Homo sapiens               PAAKRVKLD
dUTPase of D. melanogaster          PAAKKMKID
c-Myc of Xenopus laevis             VSSKRAKLE
RanBP3 of Homo sapiens              PPVKRERTS
Bipartite
Nucleoplasmin                       KRPAATKKAGQAKKKKLDK
Fig. 2. Subcellular localization of D. melanogaster UDE protein and its various sequence derivatives. Fluorescence microscopy observations
show the subcellular distribution of chimeric UDE constructs. (A) Wild-type UDE (UDEWT–YFP) was targeted exclusively to the nucleus.
(B,C) Deletion studies showed that removal of the entire flexible C-terminus or the last ten residues of the C-terminus of the UDE ORF
results in exclusive cytoplasmic localization of chimeric constructs UDEΔ(316–355)–YFP and UDEΔ(346–355)–YFP, respectively. (D) The
reporter construct UDE(350AA351)–YFP, which contains a double K/A mutation, is predominantly located in the nucleus and slightly in the
cytoplasm. (E) Quadruple mutations in the reporter construct [UDE(350AAAA353)–YFP] have an attenuating effect on nuclear localization,
with most of the construct accumulating within the nucleus, although cytoplasmic localization was also observed. (F) The YFP–UDE-NLS2
reporter localized almost exclusively in the nucleus. (G) The YFP–UDE-NLS1 construct was seen in both the nuclear compartment and the
cytoplasm. (H) The UDEΔ(1–319)–YFP reporter, which contains both predicted NLS sequences, was exclusively retained in the nucleus.
(I) The UDEΔ[1–319(350AAAA353)]–YFP construct was seen in both the nucleus and the cytoplasm, but seemed to accumulate in the
nucleus. (J) The reporter construct YFP–UDE-NLS2Δ(350–353), which possesses only three basic residues from the NLS segment, did not
show any selective compartmentalization, and was distributed almost equally in the nucleus and the cytoplasm. (K) YFP alone was used as
a negative control. The cellular distribution of YFP was approximately the same within the nuclear and cytoplasmic compartments.
Table 3. Summary of results for the subcellular distributions of reporter constructs. Details of the reporter constructs for dUTPase and
UDE are shown in the first three columns. The observed subcellular localizations of reporter constructs are indicated by plus and minus
signs. Two plus signs indicate distribution between the nuclear and cytoplasmic compartments; one plus sign indicates exclusion from either
the nucleus or the cytoplasm.
Protein    Name of reporter construct        NLS sequence present      Localization
                                             in reporter construct     Nucleus   Cytoplasm
UDE        UDEWT–YFP                         PEKRKQE; KPKRKKKR         +         −
           UDEΔ(316–355)–YFP                 None                      −         +
           UDEΔ(346–355)–YFP                 PEKRKQE                   −         +
           UDE(350AA351)–YFP                 PEKRKQE; KPKRAAKR         +         +
           UDE(350AAAA353)–YFP               PEKRKQE; KPKRAAAA         +         +
           UDEΔ(1–319)–YFP                   PEKRKQE; KPKRKKKR         +         −
           UDEΔ[1–319(350AAAA353)]–YFP       PEKRKQE; KPKRAAAA         +         +
           YFP–UDE-NLS1                      PEKRKQE                   +         +
           YFP–UDE-NLS2                      KPKRKKKR                  +         −
           YFP–UDE-NLS2Δ(350–353)            KPKR                      +         +
YFP        YFP                               None                      +         +
dUTPase    LD-DUTWT–YFP                      PAAKKMKID                 +         −
           NTT-DUTWT–YFP                     MKID                      −         +
           DUT-NLS–YFP                       PAAKKMKID                 +         −
           DUT-NLSΔ(10–12)–YFP               KKMKID                    +         +
           DUT-NLSΔ(10–13)–YFP               KMKID                     +         +
           DUT-NLSΔ(17–18)–YFP               PAAKKMK                   +         +
           DUT-NLSΔ(10–12,17–18)–YFP         KKMK                      +         +
           DUT-NLSΔ(10–13,17–18)–YFP         KMK                       +         +
Fig. 3. Scheme of D. melanogaster dUTPase constructs used in the present study. (A) The position and context of putative nuclear
localization signals (underlined) are indicated in the N-terminus of the long isoform of D. melanogaster dUTPase. (B) The long (LD-DUTWT) and short
(NTT-DUTWT) isoforms of the D. melanogaster dUTPase coding sequences were fused in-frame to the N-terminus of the YFP ORF to
generate the LD-DUT–YFP and NTT-DUT–YFP chimeric constructs, respectively. The relevant motifs, regions and positions of the NLS of dUTPase
are indicated by differently shaded boxes. (C) The NLS sequence (PAAKKMKID) and its truncated sequence variants (KKMKID and KMKID)
were fused in-frame to the N-terminus of the YFP ORF generating the DUT-NLS–YFP, DUT-NLSΔ(10–12)–YFP and the DUT-NLSΔ(10–13)–YFP
reporter constructs. Further reporter constructs, DUT-NLSΔ(17–18)–YFP, DUT-NLSΔ(10–12,17–18)–YFP and DUT-NLSΔ(10–13,17–18)–YFP,
are also indicated, which were generated in the same way, but all lack the ID(17–18) dipeptide. Establishment of vector constructs was
performed by the general cloning method described in Experimental procedures.

Nuclear targeting potential of the dUTPase NLS
sequence and its truncated derivatives
To confirm that the complete putative NLS sequence
has nuclear targeting potential of its own, the
PAAKKMKID coding sequence was fused as an
N-terminal tag to YFP protein (Fig. 3). The construct
(DUT-NLS–YFP) was transiently transfected into Sf9
cells. After cell fixation, the expression and intracellu-
lar localization of the fluorescent protein were
observed by fluorescent microscopy. The results show
that this putative NLS sequence was able to confer
nuclear localization to the YFP protein (Fig. 4C and
Table 3). DUT-NLS–YFP is found predominantly in
the nuclear compartment, demonstrating that this
sequence, which possesses a cluster of basic amino
acids flanked by non-polar and acidic residues, is a
powerful NLS.
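Table 2 places this segment in the c-Myc group of NLS sequences. A simple position-by-position comparison of the two nine-residue peptides from Table 2 shows how closely they agree; the sketch below is an ungapped identity count with a hypothetical helper name, not a substitute for a proper sequence alignment.

```python
def positional_identity(a, b):
    """Count identical positions in two equal-length peptides.

    Returns (number of identities, 1-based positions that differ).
    """
    if len(a) != len(b):
        raise ValueError("ungapped comparison needs equal-length peptides")
    matches = [x == y for x, y in zip(a, b)]
    mismatched = [i + 1 for i, same in enumerate(matches) if not same]
    return sum(matches), mismatched

c_myc = "PAAKRVKLD"     # human c-Myc NLS, Table 2
dutpase = "PAAKKMKID"   # D. melanogaster dUTPase NLS, Table 2
identical, mismatched_positions = positional_identity(c_myc, dutpase)
print(identical, "of", len(c_myc), "identical; differences at", mismatched_positions)
```

Six of the nine positions are identical, and the three differences (R/K, V/M and L/I) pair residues of similar character, consistent with the grouping in Table 2.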
In order to identify amino acid residues that are
essential for NLS function, we constructed truncated
derivatives of the NLS sequence linked to the
N-terminus of the YFP reporter. In the first construct,
DUT-NLSΔ(10–12)–YFP, the neutral PAA tripeptide
was removed and the remaining part of the sequence,
KKMKID, was fused to the YFP reporter. In the second
construct, the PAAK residues were deleted and
the KMKID stretch was fused to the reporter, resulting
in the chimeric fluorescent construct
DUT-NLSΔ(10–13)–YFP. After transfection and subsequent
fixation of Sf9 cells, the NLS potential of the individual
truncated derivatives was monitored by fluorescence
microscopy. Observations show that deletion of the
PAA tripeptide slightly perturbs nuclear localization,
as cytoplasmic fluorescence was also observed
(Fig. 4D). Although the PAA neutral tripeptide alone
may not define subcellular compartmentalization for
proteins, its position upstream of the short cluster of
basic residues may be essential to relax the secondary
structure of polypeptide chain, facilitating the molecular
interaction with importins. Removal of these three
non-basic residues of the dUTPase NLS resulted in
moderate perturbation of nuclear import and accumulation.
In the truncated construct lacking the PAAK
segment, we observed greatly increased cytoplasmic
localization of the fluorescent reporter construct
(Fig. 4E). This observation indicates that removal of
only one positively charged residue in addition to the
PAA tripeptide strongly alters recognition characteristics
within the nuclear import machinery.
Fig. 4. Subcellular localization of the isoforms of D. melanogaster dUTPase and its various NLS sequence derivatives. Fluorescence
microscopy observations reveal the subcellular distribution of chimeric constructs. (A,B) The long isoform of dUTPase (LD-DUTWT–YFP) was
localized to the nucleus exclusively, and the short isoform (NTT-DUTWT–YFP) was present exclusively in the cytoplasm. (C) NLS sequence
studies show that, in the presence of the complete nuclear localization signal, the reporter construct DUT-NLS–YFP is located in the nucleus.
(D) Deletion of the first three residues (PAA), producing construct DUT-NLSΔ(10–12)–YFP, slightly perturbed exclusive nuclear localization,
with some cytoplasmic localization observed. (E) Deletion of the first four residues (PAAK), producing the reporter construct
DUT-NLSΔ(10–13)–YFP, resulted in localization to the nucleus and the cytoplasm in an approximately equal ratio. (F) The subcellular localization of
the reporter construct DUT-NLSΔ(17–18)–YFP, lacking the ID(17–18) dipeptide, was nuclear, with some infiltration into the cytoplasm. (G)
The DUT-NLSΔ(10–12,17–18)–YFP construct, which lacks the tripeptide PAA and the ID(17–18) dipeptide, shows an almost equal
distribution in the nucleus and the cytoplasm. (H) The subcellular targeting of the DUT-NLSΔ(10–13,17–18)–YFP reporter was also not selective,
showing close to equal distribution in the nucleus and the cytoplasm. (I) YFP alone was used as a negative control. The cellular distribution
of YFP was approximately the same within the nuclear and cytoplasmic compartments.
Furthermore, we established and examined three additional NLS–reporter constructs lacking the ID(17–18) dipeptide of the putative NLS sequence. The subcellular distribution of the DUT-NLSΔ(17–18)–YFP construct was nuclear, with some infiltration into the cytoplasm (Fig. 4F). The DUT-NLSΔ(10–12,17–18)–YFP construct, which additionally lacks the first PAA tripeptide, shows an almost equal distribution within the nucleus and the cytoplasm (Fig. 4G). The third reporter construct, DUT-NLSΔ(10–13,17–18)–YFP, which additionally lacks the PAAK residues, also showed close to equal distribution between the nucleus and the cytoplasm (Fig. 4H). These results indicate that the lack of
ID(17–18) might slightly decrease the exclusive nuclear
localization potential of the predicted NLS sequence.
Additional oligopeptide deletions (PAA and PAAK)
have a further negative effect on the nuclear targeting
potential of the NLS sequence examined.
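The deletion series can be derived from the wild-type NLS segment alone. The following minimal Python sketch assumes the residue numbering used in the text (PAAKKMKID spanning positions 10–18) and lists the peptide that remains in each reporter construct. The function name and the construct labels are illustrative choices, not part of the original work.

```python
# Residue numbering follows the text: the NLS segment PAAKKMKID spans positions 10-18.
NLS_START = 10
NLS_WT = "PAAKKMKID"


def delete_ranges(peptide, ranges, start=NLS_START):
    """Return the peptide with the given inclusive residue ranges removed."""
    removed = set()
    for first, last in ranges:
        removed.update(range(first, last + 1))
    return "".join(
        aa for pos, aa in enumerate(peptide, start=start) if pos not in removed
    )


# Illustrative labels for the six NLS reporter constructs described in the text.
CONSTRUCTS = {
    "wild type": [],
    "del(10-12)": [(10, 12)],
    "del(10-13)": [(10, 13)],
    "del(17-18)": [(17, 18)],
    "del(10-12,17-18)": [(10, 12), (17, 18)],
    "del(10-13,17-18)": [(10, 13), (17, 18)],
}

for label, ranges in CONSTRUCTS.items():
    print(f"{label:18s} {delete_ranges(NLS_WT, ranges)}")
```

The sketch reproduces the KKMKID and KMKID stretches named in the text for the first two deletions; the remaining peptides follow from the same numbering.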
Structural model of the Drosophila dUTPase NLS segment in complex with importin α protein
Binding of the NLS segment to importin α has been characterized by in-depth structural studies that allow molecular insight into the specific interactions. Based on the published 3D structure of yeast importin α in complex with the c-Myc NLS segment peptide (PDB ID 1EE4) [34], and the close similarity between the NLS segments of c-Myc and Drosophila dUTPase (Table 2), we modeled this latter peptide onto the c-Myc peptide in the NLS peptide–yeast importin α structure. Figure 5A shows the alignment of the yeast and Drosophila importin α protein sequences, which show 69% similarity and 54% identity within the ten armadillo domains responsible for NLS recognition. Figure 5A also shows the aligned sequence of a mammalian importin α (mouse importin α, which is 94% identical to the human sequence) (PDB ID 1IAL) [35]. For mammalian importins, 3D structures of complexes with other types of NLS peptides have been reported [36–38]. The alignments in Fig. 5A show the high degree of conservation of helical structure and residues interacting with NLS peptides. Figure 5B shows the structural models of the two NLS peptides in complex with yeast importin α (c-Myc NLS peptide in turquoise, Drosophila dUTPase NLS peptide in green), indicating very close superposition of the two NLS segments. The two colors (green and turquoise) overlap considerably, and it is mostly the green color that is seen because the dUTPase NLS peptide was selected to be the ‘upper’ one in PyMOL. Consequently, most of the molecular interactions are equally present in both NLS peptides. Importantly, all importin α amino acids that contain atoms within 4 Å of the NLS peptides (displayed in orange in Fig. 5A,B) are conserved between the yeast and Drosophila importin α proteins, strengthening the assumption that the modeled recognition does take place in the physiological complex. There are two noteworthy differences between the NLS peptides of c-Myc and D. melanogaster dUTPase: lysine at position 14 in the dUTPase NLS is an arginine in c-Myc, while methionine at position 15 in the dUTPase NLS is a valine in c-Myc. With regard to the important role of the PAAK(10–13) segment in the NLS peptide, it is noteworthy that the ε-NH₂ group of the lysine residue at position 13 makes numerous contacts: it is within H-bonding distance of three oxygen atoms of conserved amino acids within importin α (the main-chain oxygen of glycine at position 168, the side-chain hydroxyl oxygen of threonine at position 173, and the side-chain carboxylate oxygen of aspartate at position 210; the numbering of the Drosophila sequence is used). However, the subsequent lysine residue at position 14 (arginine in c-Myc) cannot establish polar interactions with the carboxylate oxygen of aspartate at position 237 (the electrostatic bonding partner of the arginine residue in the c-Myc peptide) due to its shorter
side chain. The methionine residue at position 15, although larger than the valine in the c-Myc peptide, can be accommodated without any steric constraints.

Fig. 5. Modeling the interactions between the D. melanogaster dUTPase NLS segment and importin α protein. (A) Sequence alignment for armadillo domains of Mus musculus (M. mus.), D. melanogaster (D. mel.) and yeast importin α. Residues within the α-helices constituting the armadillo domains are shown on a pink background; residues that contain atoms within 4 Å of the NLS peptides of c-Myc or D. melanogaster dUTPase (see Fig. 5B) are on an orange background. Asterisks indicate identical residues, semicolons and dots show highly conserved or conserved replacements, respectively. Ten armadillo domains (ARM) are shown. (B) Three-dimensional structural model of the NLS peptide–importin α complex. The protein surface is shown for the first five armadillo domains in either pink (for the α-helices) or brown (for other protein parts). The NLS peptides of c-Myc or D. melanogaster dUTPase and importin α residues that contain atoms within 4 Å of the peptides are shown as stick models with atomic coloring (red, oxygen; blue, nitrogen; yellow, sulfur; orange, green or turquoise, carbon atoms of importin α, dUTPase NLS and c-Myc NLS, respectively). For orientation, most residues of the dUTPase NLS are labeled, together with four residues of importin α (see text for details). Note that the dUTPase NLS peptide can adopt a docking conformation equivalent to that of the c-Myc peptide on the importin protein surface.
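The residue differences discussed above can be listed by a position-wise comparison of the two peptides. The sketch below is only illustrative: the dUTPase segment and its numbering (positions 10–18) come from the text, whereas the c-Myc NLS sequence PAAKRVKLD is an assumption taken from the general literature rather than from this section. With that assumed sequence, the comparison reports the two substitutions discussed in the text (positions 14 and 15) and, in addition, a conservative Ile/Leu exchange at position 17.

```python
# dUTPase NLS as given in the text (positions 10-18).
DUT_NLS = "PAAKKMKID"
# Assumed c-Myc NLS sequence (from the literature, not stated in this section).
CMYC_NLS = "PAAKRVKLD"


def differences(peptide_a, peptide_b, start=10):
    """List (position, residue_a, residue_b) for every mismatching position."""
    if len(peptide_a) != len(peptide_b):
        raise ValueError("peptides must have the same length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(peptide_a, peptide_b), start=start)
        if a != b
    ]


for pos, dut_res, myc_res in differences(DUT_NLS, CMYC_NLS):
    print(f"position {pos}: dUTPase {dut_res} / c-Myc {myc_res}")
```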
Conclusions
Adequate cellular sorting of proteins is a vital step in maintenance of the normal homeostatic function of cells. We identified and characterized two types of monopartite nuclear localization sequences of D. melanogaster proteins involved in uracil-DNA metabolism (Tables 2 and 3). The C-terminus of UDE possesses two predicted NLS segments, but experimental analysis showed that one of these is sufficient for exclusive nuclear localization. Several point mutations in the major critical NLS sequence (which is almost identical to the SV40 T-ag NLS) altered the subcellular distribution patterns only moderately, suggesting that this NLS sequence has very high nuclear targeting potential based on the high number of consecutive positively charged amino acids. Enzyme activity measurements performed on a truncated UDE derivative (UDE Q310X) showed that the protein function is not perturbed by removal of the C-terminal NLS segments (Fig. 6). This result suggests that folding is not substantially perturbed by the C-terminal truncation, in agreement with a recent study showing a high degree of disorder in the C-terminus [13]. A very different short cluster of non-polar and basic residues (PAAKKMKID), present on the long isoform of D. melanogaster dUTPase, was found to provide highly efficient NLS function in the dUTPase protein. This segment is localized in a very flexible part of the protein that is not visible in the crystal structure of D. melanogaster dUTPase [21,24]. Deletion studies showed the importance of the presence of the PAA tripeptide and the ID dipeptide, and the role of the PAAK segment in nuclear targeting. These studies also explained why the N-terminally truncated dUTPase isoform is excluded from the nucleus. A structural model of the Drosophila dUTPase NLS segment in complex with importin α protein indicates an important role for the lysine of the PAAK segment by revealing its multiple interactions with importin α. Comparing the dUTPase NLS and the ‘classical’ sequence of UDE NLS, we conclude that mutation tolerance may depend on the predominance of basic residues within the wild-type NLS segment. Importantly, however, neutral and proline residues also contribute to the targeting potential.
Experimental procedures

Materials
Restriction enzymes, T4 DNA ligase and DNA polymerases were from Fermentas (Ontario, Canada), New England Biolabs (Ipswich, MA, USA) and Finnzymes (Espoo, Finland), respectively. The pIZ/V5-His (pIZ) plasmid was purchased from Invitrogen (Carlsbad, CA, USA). Oligonucleotides were synthesized by Eurofins MWG Synthesis GmbH (Ebersberg, Germany). Other materials were obtained from Sigma-Aldrich and Calbiochem (Merck KGaA, Darmstadt, Germany).
Cell line and culture
The Sf9 cell line (derived from Spodoptera frugiperda) was purchased from Gibco (Invitrogen, Carlsbad, CA, USA). Cells were cultured at 26 °C in SFM medium (serum- and protein-free insect medium from Life Technologies Inc., Carlsbad, CA, USA) supplemented with 10% FBS (Gibco), 1 mM L-glutamine and 5 mL·L⁻¹ penicillin/streptomycin.
Generation of UDE reporter constructs
udeWT-yfp-pIZ, udeΔ(346–355)-yfp-pIZ, udeΔ(316–355)-yfp-pIZ and pIZ-yfp vector constructs
To generate the various reporter constructs for UDE, full-length (UDEWT) and truncated derivatives [UDEΔ(316–355) and UDEΔ(346–355)] of the UDE coding sequence were amplified by PCR using the ude-pET19b plasmid [12] as the template, and the following primer pairs: ude-For (5′-CTAGCTAGCATGCCGTCGAGTTGGAGACGGCTAC-3′) with ude-Rev (5′-GTTTAGCGGCCGCTGCTCCTCCCTCTTCTTCTTCC-3′), ude-For with udeΔ(346–355)-Rev (5′-GTTTAGCGGCCGCGCATCCTCGCCATCGGAATCCTG-3′), and ude-For with udeΔ(316–355)-Rev (5′-GTTTAGCGGCCGCCTCGAGATGGCCAGCTTTTCGATGTACTGC-3′), respectively. After DNA digestion with NheI and NotI restriction enzymes (recognition sites GCTAGC and GCGGCCGC in the primers, respectively), amplicons were cloned in-frame to the N-terminus of the YFP coding sequence in the yfp-pRM vector [26]. The resulting vector constructs [udeWT-yfp-, udeΔ(346–355)-yfp- and udeΔ(316–355)-yfp-pRM] were used as templates in PCR reactions with the cloning primers ude-yfp-For (5′-AACTTAAGCTTACCACCATGGCGTCGAGTTGGAGACGGCTACGC-3′) and yfp-Rev (5′-GCTCTGCTCTAGACTCGAGTCACGCTTGTACAGCTCGTCCATGC-3′). Each PCR product [udeWT-yfp, udeΔ(346–355)-yfp and udeΔ(316–355)-yfp] and the empty pIZ vector were digested using HindIII and XbaI enzymes (recognition sites AAGCTT and TCTAGA in the primers, respectively). After digestion, amplicons were cloned into the linearized pIZ vector. In addition, a pIZ-yfp vector construct was generated by PCR using the pRM-yfp vector as the DNA template, and the yfp-For and yfp-Rev cloning primers containing HindIII and XbaI restriction sites, respectively, followed by the general cloning method.

Fig. 6. Uracil-DNA-degrading activity of the Q310X C-terminally truncated mutant UDE protein. Digestion times (0, 15, 30, 60 and 90 min) are given at the top. Samples were denatured at 65 °C for 15 min.
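The restriction sites carried by the cloning primers can be located with a simple string search. The sketch below uses the primer sequences listed above, with the whitespace introduced by typesetting removed, together with the standard recognition sequences of NheI, NotI, HindIII and XbaI. All four sites are palindromic, so searching a single strand is sufficient. The dictionary and function names are illustrative.

```python
# Standard recognition sequences of the four enzymes used for cloning.
SITES = {
    "NheI": "GCTAGC",
    "NotI": "GCGGCCGC",
    "HindIII": "AAGCTT",
    "XbaI": "TCTAGA",
}

# Primer sequences from the text, written 5'->3' without spaces.
PRIMERS = {
    "ude-For": "CTAGCTAGCATGCCGTCGAGTTGGAGACGGCTAC",
    "ude-Rev": "GTTTAGCGGCCGCTGCTCCTCCCTCTTCTTCTTCC",
    "ude-yfp-For": "AACTTAAGCTTACCACCATGGCGTCGAGTTGGAGACGGCTACGC",
    "yfp-Rev": "GCTCTGCTCTAGACTCGAGTCACGCTTGTACAGCTCGTCCATGC",
}


def find_sites(sequence):
    """Map enzyme name to the 0-based index of its first site in the sequence."""
    return {
        enzyme: sequence.find(site)
        for enzyme, site in SITES.items()
        if site in sequence
    }


for name, sequence in PRIMERS.items():
    print(name, find_sites(sequence))
```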
ude(350AA351)-yfp-pIZ vector construct
The K350A and K351A site mutations were produced in the UDE ORF using a QuikChange® site-directed mutagenesis kit (Stratagene, Agilent Technologies Co., La Jolla, CA, USA) according to the manufacturer’s instructions. PCR was performed using the udeWT-yfp-pIZ plasmid as the template and the primers NLS-mut2A-For (5′-GATAAGCCCAAAAGGGCGGCGAAGAGGGAGGAG-3′) and NLS-mut2A-Rev (5′-CTCCTCCCTCTTCGCCGCCCTTTTGGGCTTATC-3′), which contain the desired mutations (the two alanine codons GCG GCG in the forward primer).
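The two mutagenic primers listed above appear to be exact reverse complements of each other, which is a quick consistency check for a primer pair of this kind. A minimal sketch of that check, using only the two sequences given in the text (the helper name is illustrative):

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return sequence.translate(COMPLEMENT)[::-1]


# Mutagenic primers from the text, written without typesetting spaces.
NLS_MUT2A_FOR = "GATAAGCCCAAAAGGGCGGCGAAGAGGGAGGAG"
NLS_MUT2A_REV = "CTCCTCCCTCTTCGCCGCCCTTTTGGGCTTATC"

print(reverse_complement(NLS_MUT2A_FOR) == NLS_MUT2A_REV)
```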
ude(350AAAA351)-yfp-pIZ vector construct
The ude(350AAAA351)-yfp-pIZ vector construct containing the four K350A, K351A, K352A and R353A site mutations was produced by PCR using ude(350AA351)-yfp-pIZ as the template and the cloning primers ude-yfp-For and mut4A-Rev (5′-TTTTTGCGGCCGCTGCTCCTCCGCCGCCGCCGCCCTTTTG-3′). The mut4A-Rev primer contains the desired mutation sites and a NotI recognition site (GCGGCCGC). After digestion, the amplicon was cloned in-frame into the yfp-pIZ vector using the HindIII and NotI restriction sites.
Further UDE reporter constructs
The reporter constructs udeΔ(1–319)-yfp-pIZ and udeΔ[1–319(350AAAA353)]-yfp-pIZ were generated by PCR amplification using UDEd1-319-For (5′-CTAGCAAGCTTACCACCATGGCGCCCGAAAAGCGCAAGCAGGAG-3′) and yfp-Rev as primers, and udeWT-yfp-pIZ and ude(350AAAA351)-yfp-pIZ as templates, respectively. The constructs pIZ-yfp-ude-NLS1, pIZ-yfp-ude-NLS2 and pIZ-yfp-ude-NLS2Δ(350–353) were generated by PCR amplification using UDE320–326-YFP-Rev (5′-CGGCATGGACGAGCTGTACAAGGCACCCGAAAAGCGCAAGCAGGAGGCGTGATTCTAGACAGAGC-3′), UDE346–353-YFP-Rev (5′-CGGCATGGACGAGCTGTACAAGAAGCCCAAAAGGAAGAAGAAGAGGGAGGAGGCGGCGTGATTCTAGACAGAGC-3′), UDE346–349-YFP-Rev (5′-CGGCATGGACGAGCTGTACAAGGCAAAGCCCAAAAGGAAGGCGGCGTGATTCTAGACAGAGC-3′) and yfp-For (5′-CTAGCAAGCTTACCACCATGGTGAGCAAGGGCGAGGAG-3′) as the primers, and pIZ-yfp as the template. Cloning was performed by digestion with HindIII (site in the forward primer) and XbaI (site in the reverse primers), followed by the general cloning method. The generated constructs possess a Kozak sequence to enhance the translation efficiency of the fusion proteins.
Generation of dUTPase reporter constructs
ld-dutWT-yfp-pIZ and yfp-pIZ vector constructs
To fuse the D. melanogaster dUTPase protein to the N-terminus of YFP, the full-length coding sequence was amplified by PCR using the LDdutWT-pET22b recombinant construct as the template [24], together with the forward primer 5′-CTAGCTAGCATGCCATCAACCGATTTCGC-3′ and the reverse primer 5′-GTTTATGCGGCCGCGTAGCAACAGGAGCCGGAGC-3′, containing NheI (GCTAGC) and NotI (GCGGCCGC) recognition sites, respectively. After restriction digestion of DNA, the amplicon was cloned in-frame to the coding sequence of YFP in the yfp-pRM vector [26]. The resulting construct (ld-dutWT-yfp-pRM) was used as the template in a subsequent PCR using primers LD-For (5′-AACTTAAGCTTACCACCATGGCATCAACCGATTTCGCCGACATTC-3′) and yfp-Rev (5′-GCTCTGCTCTAGACTCGAGTCACGCTTGTACAGCTCGTCCATGC-3′) containing HindIII (AAGCTT) and XbaI (TCTAGA) restriction sites, respectively. The digested amplicon was cloned into pIZ vector linearized with the same restriction enzymes, generating the ld-dutWT-yfp-pIZ construct. In addition, a yfp-pIZ vector construct was generated by PCR using the yfp-pRM vector as the DNA template, and the yfp-For and yfp-Rev cloning primers containing HindIII and XbaI restriction sites, respectively, followed by the general cloning method.
ntt-dutWT-yfp-pIZ vector construct
To produce the NTT-DUTWT–YFP reporter construct, the NTT-DUTWT–YFP coding cDNA was amplified by PCR from the ld-dutWT-yfp-pIZ plasmid using the ntt-For (5′-AACTTAAGCTTACCACCATGAAGATCGACACGTGCG-3′) and yfp-Rev cloning primers containing HindIII (AAGCTT) and XbaI (TCTAGA) restriction sites, respectively. After digestion, the amplicon was inserted into the pIZ vector.
dut-NLS-yfp-pIZ, dut-NLSΔ(10–12)-yfp-pIZ and dut-NLSΔ(10–13)-yfp-pIZ vector constructs
To generate the series of NLS–YFP reporter constructs, individual PCR reactions were performed using the yfp-pIZ plasmid as the template, with the forward primers NLSwt-For (5′-CTAGCAAGCTTACCACCATGGCGCCAGCTGCCAAGAAGATGAAGATCGACATGGTGAGCAAGGGCGAGGAGCTG-3′), NLSΔ(10–12)-For (5′-CTAGCAAGCTTACCACCATGGCGAAGAAGATGAAGATCGACATGGTGAGCAAGGGCGAGGAGCTG-3′) and NLSΔ(10–13)-For (5′-CTAGCAAGCTTACCACCATGGCGAAGATGAAGATCGACATGGTGAGCAAGGGCGAGGAGCTG-3′), respectively, and the yfp-Rev primer. The first part of each forward primer consists of the wild-type or truncated [dut-NLSΔ(10–12) and dut-NLSΔ(10–13)] dUTPase NLS region, and the second part contains a complementary segment that hybridizes to the start of the YFP cDNA. HindIII and XbaI sites were used to clone the digested amplicons into linearized pIZ plasmid. We also generated three additional reporter constructs that lack the ID(17–18) dipeptide from the NLS sequence: dut-NLSΔ(17–18)-yfp-pIZ, dut-NLSΔ(10–12,17–18)-yfp-pIZ and dut-NLSΔ(10–13,17–18)-yfp-pIZ. To amplify the desired coding sequences, primers NLSwt-For* (5′-CTAGCAAGCTTACCACCATGGCGCCAGCTGCCAAGAAGATGAAGATGGTGAGCAAGGGCGAGGAGCTG-3′), NLSΔ(10–12)-For* (5′-CTAGCAAGCTTACCACCATGGCGAAGAAGATGAAGATGGTGAGCAAGGGCGAGGAGCTG-3′) and NLSΔ(10–13)-For* (5′-CTAGCAAGCTTACCACCATGGCGAAGATGAAGATGGTGAGCAAGGGCGAGGAGCTG-3′), respectively, were used as the forward primers, with yfp-Rev as the reverse primer and the pIZ-yfp vector as the template. Cloning was performed using HindIII and XbaI digestion according to the general method described above. All constructs except ld-dutWT-yfp-pIZ possess a Kozak sequence to enhance the translation efficiency of the fusion proteins. For these NLS-containing fusion proteins, the molecular mass is only slightly altered (YFP alone is 22.2 kDa, and fused to the NLS sequences its molecular mass is less than 23.2 kDa). Therefore, NLS-containing YFP constructs are still capable of
passive entry into the nucleus, and the NLS potential of
these constructs is reflected in the extent of nuclear accu-
mulation of the constructs (as compared to equal distribu-
tion resulting from passive diffusion).
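The N-terminal extension encoded by each NLS forward primer can be checked by translating the primer from its first ATG. The sketch below is a minimal illustration that assumes the standard genetic code and that translation starts at the first ATG (the one embedded in the Kozak context); it uses the three NLS forward primers from the text, written without typesetting spaces. The helper names are illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_from_first_atg(primer):
    """Translate a 5'->3' sequence starting at its first ATG; drop a partial last codon."""
    start = primer.find("ATG")
    if start < 0:
        raise ValueError("no ATG start codon found")
    coding = primer[start:]
    usable = len(coding) - len(coding) % 3
    return "".join(CODON_TABLE[coding[i:i + 3]] for i in range(0, usable, 3))


# NLS forward primers from the text (illustrative ASCII labels).
NLS_PRIMERS = {
    "NLSwt-For": "CTAGCAAGCTTACCACCATGGCGCCAGCTGCCAAGAAGATGAAGATCGACATGGTGAGCAAGGGCGAGGAGCTG",
    "NLSdel(10-12)-For": "CTAGCAAGCTTACCACCATGGCGAAGAAGATGAAGATCGACATGGTGAGCAAGGGCGAGGAGCTG",
    "NLSdel(10-13)-For": "CTAGCAAGCTTACCACCATGGCGAAGATGAAGATCGACATGGTGAGCAAGGGCGAGGAGCTG",
}

for name, primer in NLS_PRIMERS.items():
    print(name, translate_from_first_atg(primer))
```

Under these assumptions, each primer encodes Met-Ala followed by the wild-type or truncated NLS peptide and then the first residues of YFP (MVSKGEEL).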
Transfection procedure, fixation and DAPI staining
For transient transfection of the Sf9 cell line, cells were plated on circular slides in 24-well plates. After 4 h of attachment, cells were transfected using Lipofectamine™ 2000 (Invitrogen) reagent according to the manufacturer’s protocol. One day after transfection, the transfected cells were washed (2 × 1 min) with 1× NaCl/Pi and fixed in 3% paraformaldehyde/NaCl/Pi for 15 min. After washing (2 × 1 min) with 1× NaCl/Pi, cells were permeabilized with 0.1% Triton X-100/NaCl/Pi for 5 min. To stain the nuclei, the cells were incubated for 5 min at room temperature in 1 µg·mL⁻¹ DAPI (4′,6-diamidino-2-phenylindole) stain solution. Finally, cells were rinsed three times with 1× NaCl/Pi, and once with water. Circular slides were then mounted on microscope slides using FluoroSave reagent (Calbiochem, Merck KGaA, Darmstadt, Germany).
Fluorescence microscopy
Fluorescence microscopy images were obtained using a Leica DMLS fluorescence microscope (Leica Microsystems Inc., Bannockburn, IL, USA). YFP was visualized using excitation and emission wavelengths of 485 and 530 nm, respectively, and DAPI fluorescence was visualized by excitation with UV light (355 nm) and detected at 450 nm. The samples were observed using a 60× oil immersion objective.
Construction of the Q310X C-terminally truncated UDE mutant fragment (containing an N-terminal 6× His tag)
We performed this truncation using a QuikChange® site-directed mutagenesis kit (Stratagene). PCR was performed using pET19b-UDE as a template [12] with the primers 5′-CCGCAGCCGGACAAGCTGAAGTAGTACATCGAAAGGC-3′ (forward) and 5′-CGAAAAGCTACATGATGAAGTCGAACAGGCCGACGCC-3′ (reverse). The DNA sequence was confirmed by sequencing.
Expression and purification of Q310X UDE
Expression and purification of Q310X UDE used in the
plasmid DNA processing assay were performed by Ni-NTA
affinity chromatography as described previously [12].
Plasmid DNA processing assays
Linearized uracil-containing plasmid DNA or control plasmid DNA (20 µg·mL⁻¹; prepared as described previously [12]) and 10 µg·mL⁻¹ Q310X truncated UDE protein were incubated in assay buffer (25 mM Tris-HCl, pH 7.5, 1 mM MgCl₂, 0.1 mg·mL⁻¹ albumin) for 0, 15, 30, 60 or 90 min at 37 °C. After the appropriate reaction time, samples were incubated at 65 °C for 15 min. Products were detected by standard ethidium bromide staining after agarose gel electrophoresis on 0.75% w/w agarose gels [12].
Structural modeling
Swiss-PdbViewer and Modeler software (http://spdbv.vital-it.ch/) were used to generate the model structure of the D. melanogaster dUTPase NLS peptide based on the c-Myc peptide–importin α complex structure (PDB ID 1EE4) [34]. Residues that differ between the two NLS segments were mutated in silico, and the optimization module of the Modeler software was used to find a suitable conformation for these residues. Figure 5B was prepared using PyMOL (www.pymol.org).
Acknowledgements

This work was supported by grants from the Hungar-
ian Scientific Research Fund (OTKA K68229 and
CK-78646), the Howard Hughes Medical Institute
(55005628 and 55000342), the Alexander von Humboldt
Foundation, the National Office for Research and
Technology, Hungary (JÁP_TSZ_071128_TB_INTER),
and the European Union (FP6 SPINE2c LSHG-CT-
2006-031220 and TEACH-SG LSSG-CT-2007-037198)
to B.G.V.
References
1 Cook A, Bono F, Jinek M & Conti E (2007) Structural
biology of nucleocytoplasmic transport. Annu Rev Bio-
chem 76, 647–671.
2 Hoshino A, Hirst JA & Fujii H (2007) Regulation of
cell proliferation by interleukin-3-induced nuclear trans-
location of pyruvate kinase. J Biol Chem 282, 17706–
17711.
3 Vertessy BG, Bankfalvi D, Kovacs J, Low P, Lehotzky
A & Ovadi J (1999) Pyruvate kinase as a microtubule
destabilizing factor in vitro. Biochem Biophys Res
Commun 254, 430–435.
4 Christophe D, Christophe-Hobertus C & Pichon B
(2000) Nuclear targeting of proteins: how many differ-
ent signals? Cell Signal 12, 337–341.
5 Fries T, Betz C, Sohn K, Caesar S, Schlenstedt G &
Bailer SM (2007) A novel conserved nuclear localization
signal is recognized by a group of yeast importins.
J Biol Chem 282, 19292–19301.

6 Kalderon D & Smith AE (1984) In vitro mutagenesis of
a putative DNA binding domain of SV40 large-T.
Virology 139, 109–137.
7 Robbins J, Dilworth SM, Laskey RA & Dingwall C
(1991) Two interdependent basic domains in nucleoplas-
min nuclear targeting sequence: identification of a class
of bipartite nuclear targeting sequence. Cell 64, 615–623.
8 Makkerh JP, Dingwall C & Laskey RA (1996) Compar-
ative mutagenesis of nuclear localization signals reveals
the importance of neutral and acidic amino acids. Curr
Biol 6, 1025–1027.
9 Welch K, Franke J, Kohler M & Macara IG (1999)
RanBP3 contains an unusual nuclear localization signal
that is imported preferentially by importin-α3. Mol Cell
Biol 19, 8400–8411.
10 Dang CV & Lee WM (1988) Identification of the
human c-myc protein nuclear translocation signal. Mol
Cell Biol 8, 4048–4054.
11 Madrid AS & Weis K (2006) Nuclear transport is
becoming crystal clear. Chromosoma 115, 98–109.
12 Bekesi A, Pukancsik M, Muha V, Zagyva I, Leveles I,
Hunyadi-Gulyas E, Klement E, Medzihradszky KF,
Kele Z, Erdei A et al. (2007) A novel fruitfly protein
under developmental control degrades uracil-DNA.
Biochem Biophys Res Commun 355, 643–648.
13 Pukancsik M, Bekesi A, Klement E, Hunyadi-Gulyas E,
Medzihradszky KF, Kosinski J, Bujnicki JM, Alfonso
C, Rivas G & Vertessy BG (2010) Physiological trunca-
tion and domain organization of a novel uracil-DNA-
degrading factor. FEBS J 277, 1245–1259.

14 Vertessy BG & Toth J (2009) Keeping uracil out of
DNA: physiological role, structure and catalytic mecha-
nism of dUTPases. Acc Chem Res 42, 97–106.
15 Hardeland U, Bentele M, Jiricny J & Schar P (2003)
The versatile thymine DNA-glycosylase: a comparative
characterization of the human, Drosophila and fission
yeast orthologs. Nucleic Acids Res 31, 2261–2271.
16 el-Hajj HH, Zhang H & Weiss B (1988) Lethality of a
dut (deoxyuridine triphosphatase) mutation in Escheri-
chia coli. J Bacteriol 170, 1069–1075.
17 Gadsden MH, McIntosh EM, Game JC, Wilson PJ &
Haynes RH (1993) dUTP pyrophosphatase is an essential
enzyme in Saccharomyces cerevisiae. EMBO J 12,
4425–4431.
18 Vertessy BG, Persson R, Rosengren AM, Zeppezauer
M & Nyman PO (1996) Specific derivatization of the
active site tyrosine in dUTPase perturbs ligand binding
to the active site. Biochem Biophys Res Commun 219,
294–300.
19 Nemeth-Pongracz V, Barabas O, Fuxreiter M, Simon I,
Pichova I, Rumlova M, Zabranska H, Svergun D,
Petoukhov M, Harmat V et al. (2007) Flexible segments
modulate co-folding of dUTPase and nucleocapsid pro-
teins. Nucleic Acids Res 35, 495–505.
20 Whittingham JL, Leal I, Nguyen C, Kasinathan G, Bell
E, Jones AF, Berry C, Benito A, Turkenburg JP,
Dodson EJ et al. (2005) dUTPase as a platform for
antimalarial drug design: structural basis for the
selectivity of a class of nucleoside inhibitors. Structure
13, 329–338.

21 Takacs E, Barabas O, Petoukhov MV, Svergun DI &
Vertessy BG (2009) Molecular shape and prominent
role of β-strand swapping in organization of dUTPase
oligomers. FEBS Lett 583, 865–871.
22 Varga B, Barabas O, Takacs E, Nagy N, Nagy P &
Vertessy BG (2008) Active site of mycobacterial
dUTPase: structural characteristics and a built-in sen-
sor. Biochem Biophys Res Commun 373, 8–13.
23 Varga B, Barabas O, Kovari J, Toth J, Hunyadi-Gulyas
E, Klement E, Medzihradszky KF, Tolgyesi F, Fidy J
& Vertessy BG (2007) Active site closure facilitates
juxtaposition of reactant atoms for initiation of
catalysis by human dUTPase. FEBS Lett 581, 4783–4788.
24 Kovari J, Barabas O, Takacs E, Bekesi A, Dubrovay Z,
Pongracz V, Zagyva I, Imre T, Szabo P & Vertessy BG
(2004) Altered active site flexibility and a structural
metal-binding site in eukaryotic dUTPase: kinetic char-
acterization, folding, and crystallographic studies of the
homotrimeric Drosophila enzyme. J Biol Chem 279,
17932–17944.
25 Tinkelenberg BA, Fazzone W, Lynch FJ & Ladner RD
(2003) Identification of sequence determinants of human
nuclear dUTPase isoform localization. Exp Cell Res
287, 39–46.
26 Muha V, Zagyva I, Venkei Z, Szabad J & Vertessy BG
(2009) Nuclear localization signal-dependent and -independent
movements of Drosophila melanogaster
dUTPase isoforms during nuclear cleavage. Biochem
Biophys Res Commun 381, 271–275.
27 Bekesi A, Zagyva I, Hunyadi-Gulyas E, Pongracz V,
Kovari J, Nagy AO, Erdei A, Medzihradszky KF &
Vertessy BG (2004) Developmental regulation of dUT-
Pase in Drosophila melanogaster. J Biol Chem 279,
22362–22370.
28 Pearl LH (2000) Structure and function in the uracil-
DNA glycosylase superfamily. Mutat Res 460, 165–181.
29 Muller-Weeks S, Mastran B & Caradonna S (1998) The
nuclear isoform of the highly conserved human uracil-
DNA glycosylase is an Mr 36,000 phosphoprotein.
J Biol Chem 273, 21909–21917.
30 Aravind L & Koonin EV (2000) The alpha ⁄ beta fold
uracil DNA glycosylases: a common origin with diverse
fates. Genome Biol 1, RESEARCH0007.
31 Mol CD, Arvai AS, Sanderson RJ, Slupphaug G, Kavli
B, Krokan HE, Mosbaugh DW & Tainer JA (1995)
Crystal structure of human uracil-DNA glycosylase in
complex with a protein inhibitor: protein mimicry of
DNA. Cell 82, 701–708.
32 Nakai K & Horton P (1999) PSORT: a program for
detecting sorting signals in proteins and predicting their
subcellular localization. Trends Biochem Sci 24, 34–36.
33 Conti E, Uy M, Leighton L, Blobel G & Kuriyan J
(1998) Crystallographic analysis of the recognition of a
nuclear localization signal by the nuclear import factor
karyopherin α. Cell 94, 193–204.
34 Conti E & Kuriyan J (2000) Crystallographic analysis
of the specific yet versatile recognition of distinct
nuclear localization signals by karyopherin α. Structure
8, 329–338.
35 Kobe B (1999) Autoinhibition by an internal nuclear
localization signal revealed by the crystal structure
of mammalian importin α. Nat Struct Biol 6, 388–397.
36 Fontes MR, Teh T, Jans D, Brinkworth RI & Kobe B
(2003) Structural basis for the specificity of bipartite
nuclear localization sequence binding by importin-α.
J Biol Chem 278, 27981–27987.
37 Chen M-H, Ben-Efraim I, Mitrousis G, Walker-Kopp
N, Sims PJ & Cingolani G (2005) Phospholipid scramb-
lase 1 contains a nonclassical nuclear localization signal
with unique binding site in importin α. J Biol Chem
280, 10599–10606.
38 Chida K & Vogt PK (1992) Nuclear translocation of
viral Jun but not of cellular Jun is cell cycle dependent.
Proc Natl Acad Sci USA 89, 4290–4294.
39 Moreland RB, Langevin GL, Singer RH, Garcea RL
& Hereford LM (1987) Amino acid sequences that
determine the nuclear localization of yeast histone 2B.
Mol Cell Biol 7, 4048–4057.