Characterization of the self-splicing products of two complex Naegleria LSU rDNA group I introns containing homing endonuclease genes

Peik Haugen¹, Johan F. De Jonckheere² and Steinar Johansen¹

¹RNA Research group, Department of Molecular Biotechnology, Institute of Medical Biology, University of Tromsø, Tromsø, Norway; ²Protozoology Laboratory, Scientific Institute Public Health – Louis Pasteur, Brussels, Belgium

The two group I introns Nae.L1926 and Nmo.L2563, found at two different sites in nuclear LSU rRNA genes of Naegleria amoebo-flagellates, have been characterized in vitro. Their structural organization is related to that of the mobile Physarum intron Ppo.L1925 (PpLSU3) with ORFs extending the L1-loop of a typical group IC1 ribozyme. Nae.L1926, Nmo.L2563 and Ppo.L1925 RNAs all self-splice in vitro, generating ligated exons and full-length intron circles as well as internally processed excised intron RNAs. Formation of full-length intron circles is found to be a general feature in RNA processing of ORF-containing nuclear group I introns. Both Naegleria LSU rDNA introns contain a conserved polyadenylation signal at exactly the same position in the 3′ end of the ORFs close to the internal processing sites, indicating an RNA polymerase II-like expression pathway of intron proteins in vivo. The intron proteins I-NaeI and I-NmoI encoded by Nae.L1926 and Nmo.L2563, respectively, correspond to His-Cys homing endonucleases of 148 and 175 amino acids. I-NaeI contains an additional sequence motif homologous to the unusual DNA binding motif of three antiparallel β sheets found in the I-PpoI endonuclease, the product of the Ppo.L1925 intron ORF.

Keywords: group I ribozyme; mobile intron; ribosomal DNA; RNA processing.

About 3% of the ~850 nuclear group I introns in the database contain large ORFs or ORF-like sequences inserted into peripheral loop regions of their corresponding group I ribozymes. Whereas ORFs encoding proteins with a possible structural role have been noted in the green alga Scenedesmus [1] and the fungus Protomyces [2], most nuclear group I intron ORFs correspond to endonucleases (Table 1). All the nuclear endonucleases contain a conserved histidine- and cysteine-rich motif [3,4] directly involved in zinc binding and in the active site of the enzymes [5,6]. The biological role of group I intron endonucleases appears to be in intron homing at the DNA level [7]. Homing is initiated by a double-strand break made by the endonuclease at an intron-less cognate site, proceeds by host-dependent gene conversion, and results in insertion of the group I intron by replication into the intron-less site. The endonucleases I-PpoI and I-DirI from the nuclear group I introns Ppo.L1925 and Dir.S956-1 in the myxomycetes Physarum polycephalum and Didymium iridis have been reported to mediate intron homing in genetic crosses [8,9].

All known nuclear group I introns interrupt the highly expressed small ribosomal subunit (SSU) or large ribosomal subunit (LSU) rRNA genes, and have to be spliced out from the RNA polymerase I transcribed precursor rRNA. An intriguing question is thus how intron proteins encoded by nuclear group I introns are expressed from an RNA polymerase I transcript. A protein-encoding gene in a eukaryotic nucleus is in general transcribed by RNA polymerase II as pre-mRNA. Here, pre-mRNA maturation includes the addition of a methylated guanine to the 5′ end (capping), the removal of spliceosomal introns, and polyadenylation at the 3′ end (reviewed in [10]). In vivo expression analyses of the group I intron endonucleases I-PpoI, I-DirI, and I-NgrI indicate different strategies and solutions [11–13]. Based on Ppo.L1925 trans-integration in yeast rDNA, I-PpoI mRNA was shown to be transcribed by RNA polymerase I and subsequently translated from the excised, but unprocessed, intron RNA [11]. Furthermore, the messenger appeared not to be polyadenylated [14], and sequences downstream of the I-PpoI ORF RNA, preceding the group I ribozyme, were found to be important in both splicing and protein expression [15]. Expression of I-DirI and I-NgrI from twin-ribozyme introns [16] is dependent on novel group I-like ribozymes responsible for the formation of the 5′ end of their mRNAs [12,13], and examination of polysome-associated I-DirI mRNA indicates that maturation also includes the removal of a 51-nucleotide spliceosomal intron and polyadenylation [12].

Many group I introns self-splice as naked RNA in vitro, catalyzed by intron-encoded group I ribozymes. The intron sequences are excised from precursor RNA by a two-step trans-esterification reaction, with a subsequent ligation of flanking exon sequences [17]. Additional ribozyme-catalyzed RNA processing reactions, including intron circularization and internal processing, have been characterized in some intron systems that include the ORF-containing group I introns (Table 2). In vivo analyses of I-PpoI, I-DirI and I-NgrI expression in their original hosts and/or in yeast indicate an essential role of ribozyme-mediated intron RNA processing [11–13,15]. Two Naegleria species, Naegleria sp. NG874 and N. morganensis, contain large nuclear group I introns within the LSU rDNA encoding the putative homing endonucleases I-NaeI and I-NmoI, respectively. In order to gain insight into cellular maturation and processing patterns of the I-NaeI and I-NmoI mRNAs, we analysed their corresponding intron RNAs for self-splicing and self-processing in vitro.

Correspondence to S. Johansen, Department of Molecular Biotechnology, Institute of Medical Biology, University of Tromsø, N-9037 Tromsø, Norway. Fax: +47 77 64 53 50, Tel.: +47 77 64 53 67.
Abbreviations: LSU, large ribosomal subunit; pre-mRNA, precursor messenger RNA; rDNA, ribosomal DNA; rRNA, ribosomal RNA; SSU, small ribosomal subunit.
(Received 26 October 2001, revised 7 January 2002, accepted 22 January 2002)
Eur. J. Biochem. 269, 1641–1649 (2002) © FEBS 2002

Table 1. Nuclear group I introns with His-Cys box motif.

Intron (a) and host                 Intron size (bp)  ORF size (aa) (b)  ORF location (c)  Acc. no.
Nja.S516 (Naegleria jamiesoni)      1307              245                P6/sense          U80250
Nan.S516 (Naegleria andersoni)      1309              245                P6/sense          Z15417
Nit.S516 (Naegleria italica)        1319              245                P6/sense          U80249
Ngr.S516 (Naegleria gruberi)        1316              245                P6/sense          X78278
Ncl.S516 (Naegleria clarki)         1305              245                P6/sense          X78281
Nca.S516 (Naegleria carteri)        1324              245                P6/sense          Y10190
Nae.S516 (Naegleria sp. NG872)      1318              244                P6/sense          AJ001399
Pte.S516 (Porphyra tenera)          972               162                P2/sense          AB013175
Bfu.S516 (Bangia fuscopurpurea)     996               Pseudo             P2/sense          AF342745
Asp.S516 (Acanthamoeba sp. KA/E4)   957               Pseudo             P2/sense          AF349045
Emy.S943 (Ericoid Mycelia)          1755              Pseudo             P8/sense          AF158831
Mte.S943 (Monoraphidium terrestre)  1611              277                P8/a-sense        ref [47]
Dir.S956-1 (Didymium iridis)        1436              244*               P2/sense          X71792
Dir.S956-2 (Didymium iridis)        1203              192*               P8/a-sense        ref [46]
Nga.S1199 (Nectria galligena)       1423              Pseudo             P9/a-sense        Y16424
Emy.S1199 (Ericoid Mycelia)         1330              Pseudo             P9/a-sense        Y158838
Pte.S1506 (Porphyra tenera)         960               Pseudo             P1/a-sense        AB013175
Psp.1506 (Porphyra spiralis)        1056              Pseudo             P1/a-sense        L26177
Bat.1506 (Bangia atropurpurea)      1038              Pseudo             P1/a-sense        L36066
Cal.L1923 (Candida albicans)        962               Pseudo             P2.1/a-sense      AB049125
Ppo.L1925 (Physarum polycephalum)   944               163                P1/sense          L03183
Nae.L1926 (Naegleria NG874)         867               148                P1/sense          AJ311176
Nmo.L2563 (Naegleria morganensis)   940               175                P1/sense          AJ311175

(a) Named according to [18].
(b) Putative endonuclease pseudogenes (Pseudo) due to frame-shifts/truncations. Estimated protein size after the removal of a small spliceosomal intron from pre-mRNA (*).
(c) Group I ribozyme paired element (Pn) interrupted by endonuclease-like ORFs. ORF encoded by the same strand (sense) or opposite strand (a-sense) to that encoding the intron ribozyme and pre-rRNA.

Table 2. RNA processing of ORF-containing nuclear group I introns.

Intron      Ligated exons (a)  Full-length circles (b)  Internal processing sites (c)  Reference
                               (in vitro/in vivo)       (in vitro/in vivo)
Nja.S516    +                  +/NA                     +/NA                           [43]
Nan.S516    +                  +/NA                     +/NA                           [43]
Nit.S516    +                  NA/NA                    +/NA                           [43]
Ngr.S516    +                  NA/+                     +/+                            [13,43]
Dir.S956-1  +                  +/+                      +/+                            [12,44,48]
Dir.S956-2  +                  +/+                      –/NA                           [46]
Psp.S1506   +                  +/NA                     –/NA                           [1]
Ppo.L1925   +                  +/NA                     +/+                            [11,15,20], this work
Nae.L1926   +                  +/NA                     +/NA                           This work
Nmo.L2563   +                  +/NA                     +/NA                           This work

(a) Confirmed (+) ligated exons (LE) by experimental approaches.
(b) Confirmed (+) intron full-length circles (FLC) by experimental approaches.
(c) Presence (+) or absence (–) of ribozyme-catalyzed internal processing sites (IPS). Introns not analyzed (NA).

MATERIALS AND METHODS

Plasmid construction, DNA sequencing, and computer analyses

Introns are named according to the newly proposed nomenclature of group I introns in ribosomal DNA [18], which includes information on the intron insertion site in the SSU (S) or LSU (L) ribosomal DNA genes. The Nae.L1926, Nmo.L2563 and Ppo.L1925 introns were PCR amplified from the corresponding Naegleria sp. (NG874 isolate), N. morganensis (NG236 isolate) and P. polycephalum (Carolina isolate) LSU rDNA segments using the primer sets OP460 (5′-AATTAATACGACTCACTATAGGTCCTGCACACCTTGT-3′)/OP461 (5′-CGCCAGACTAGAGTCA-3′), OP454 (5′-AATTAATACGACTCACTATAGGCGGATAAGGCCAAT-3′)/OP451 (5′-GCTCACGTTCCCTGT-3′), and OP452 (5′-AATTAATACGACTCACTATAGGAACTTACAAAGGCTA-3′)/OP442 (5′-GCCTTTCGAACGTCA-3′), respectively. The PCR products, which contain the introns, some flanking exon sequences, and a primer-generated T7 promoter, were cloned into pUC18 using the SureClone Ligation kit (Amersham Pharmacia Biotech), yielding pT7Nae.L1926, pT7Nmo.L2563, and pT7Ppo.L1925, respectively. The inserts were sequenced using the Thermo Sequenase sequencing kit (Amersham Pharmacia Biotech) and [α-33P]ddNTPs (GATC; 450 µCi·mL⁻¹). The Nmo.L2563 intron was found to be identical to the previously reported sequence [19], except for the addition of 21 nucleotides at the 3′ end of the intron. The Nae.L1926 and Nmo.L2563 introns have been assigned the EMBL/GenBank Data library accession numbers AJ311176 and AJ311175, respectively. Computer analyses of nucleic-acid and amino-acid sequences were performed using the GCG software package programs from the Genetic Computer Group (Version 10; Madison, WI, USA), PHDSEC (Version 1.96; EMBL-Heidelberg, Germany), and PSIPRED (Version 2.0; Protein Bioinformatics Group, Brunel, UK).
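
The "primer-generated T7 promoter" in the forward primers listed above can be made explicit with a short check. The following is a minimal sketch, not part of the original methods: it assumes the commonly used minimal T7 promoter sequence TAATACGACTCACTATAG, with transcription starting at its final G, and the function name `split_t7_primer` is purely illustrative.

```python
# Minimal T7 promoter core (assumed here); transcription starts at its final G.
T7_CORE = "TAATACGACTCACTATAG"

# Forward primers as given in the text (5' to 3').
FORWARD_PRIMERS = {
    "OP460": "AATTAATACGACTCACTATAGGTCCTGCACACCTTGT",
    "OP454": "AATTAATACGACTCACTATAGGCGGATAAGGCCAAT",
    "OP452": "AATTAATACGACTCACTATAGGAACTTACAAAGGCTA",
}


def split_t7_primer(primer):
    """Split a T7-tailed primer into (upstream part, transcribed part).

    The transcribed part begins at the +1 G, i.e. the last base of T7_CORE.
    """
    start = primer.find(T7_CORE)
    if start == -1:
        raise ValueError("no T7 promoter core found in primer")
    plus_one = start + len(T7_CORE) - 1  # index of the +1 G
    return primer[:plus_one], primer[plus_one:]


for name, seq in FORWARD_PRIMERS.items():
    upstream, transcribed = split_t7_primer(seq)
    print(f"{name}: promoter tail = {upstream}, transcribed = {transcribed}")
```

Under these assumptions, the transcribed part of OP454 is the untailed primer OP450 (5′-GCGGATAAGGCCAAT-3′, used later for RT-PCR) preceded by one additional G.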
In vitro transcription and splicing

The intron RNA was transcribed in vitro by T7 RNA polymerase from the linearized templates pT7Nae.L1926 (BamHI), pT7Nmo.L2563 (HindIII), and pT7Ppo.L1925 (HindIII). The RNA was uniformly labelled using [α-35S]CTP (10 µCi·µL⁻¹; Amersham Pharmacia Biotech), and subjected to self-splicing conditions (40 mM Tris pH 7.5, 0.2 M KCl, 2 mM spermidine, 5 mM dithiothreitol, 10 mM MgCl2 and 0.2 mM GTP) at 50 °C for 0–30 min, all essentially as described previously [1]. Self-spliced RNA was subjected to electrophoresis in a 5% polyacrylamide, 8 M urea gel, and visualized by autoradiography.
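
As an illustration of how the self-splicing conditions translate into pipetting volumes, the sketch below applies the dilution relation C1·V1 = C2·V2 to each component. Only the final concentrations come from the text; the stock concentrations and the 20 µL reaction volume are hypothetical values chosen for the example.

```python
# Final concentrations from the self-splicing conditions, in mM.
FINAL_MM = {
    "Tris pH 7.5": 40.0,
    "KCl": 200.0,
    "spermidine": 2.0,
    "dithiothreitol": 5.0,
    "MgCl2": 10.0,
    "GTP": 0.2,
}

# Hypothetical stock concentrations in mM (assumptions, not from the paper).
STOCK_MM = {
    "Tris pH 7.5": 1000.0,
    "KCl": 2000.0,
    "spermidine": 100.0,
    "dithiothreitol": 100.0,
    "MgCl2": 1000.0,
    "GTP": 10.0,
}

REACTION_UL = 20.0  # hypothetical reaction volume in microlitres


def stock_volumes(final_mm, stock_mm, reaction_ul):
    """Return the stock volume (in µL) needed for each component (C1*V1 = C2*V2)."""
    volumes = {}
    for name, final in final_mm.items():
        stock = stock_mm[name]
        if final > stock:
            raise ValueError(f"stock of {name} is too dilute")
        volumes[name] = final / stock * reaction_ul
    return volumes


volumes = stock_volumes(FINAL_MM, STOCK_MM, REACTION_UL)
remainder_ul = REACTION_UL - sum(volumes.values())  # left for RNA and water

for name, vol in volumes.items():
    print(f"{name}: {vol:.2f} µL")
print(f"RNA + water: {remainder_ul:.2f} µL")
```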
RNA circle and exon junction determination
Ligated exon and circular intron RNAs were isolated from polyacrylamide gels and incubated in 400 µL elution buffer (0.3 M NH4Ac, 0.1% SDS, 10 mM Tris pH 8 and 2.5 mM EDTA pH 8) on a rotating wheel at 4 °C overnight. The RNA was purified using a 0.45-µm filter (Millipore), and ethanol precipitated. PAGE-purified RNA was subsequently subjected to reverse transcription using the First Strand cDNA Synthesis kit (Amersham Pharmacia Biotech) and a downstream primer. Products were amplified by adding an upstream primer, then cloned into pUC18, and finally several clones of each product were DNA sequenced. Nae.L1926 intron RNA was analysed for ligated exon, full-length circle, and −15 circle using the primer sets OP460/OP461, OP460/OP463 (5′-TAGAGCGGTACTATA-3′), and OP460/OP463, respectively. Nmo.L2563 intron RNA was analysed for ligated exon, full-length circle, and −551 circle using the primer sets OP450 (5′-GCGGATAAGGCCAAT-3′)/OP451, OP456 (5′-GAGGCTAAATCTCTTA-3′)/OP494 (5′-AGCTTTACTACACCT-3′), and OP456/OP558 (5′-CCCTACCTTACAGAT-3′), respectively. Finally, the Ppo.L1925 full-length intron RNA circle was analysed by using the primer set OP444 (5′-GGGTGCAGTTCACAGACT-3′)/OP443 (5′-ATGGTACATGGTGCGTTA-3′).
Mapping of internal processing sites
The 5′ ends of the internal processing sites were mapped by primer extension as described previously [20,21]. The linearized plasmids pT7Nae.L1926 and pT7Nmo.L2563 were in vitro transcribed and submitted to self-splicing conditions for 60 min. The transcribed RNA was subsequently purified in several steps including phenol/chloroform extraction, RQ1 DNase (Promega) digestion for 20 min at 37 °C followed by enzyme inactivation for 10 min at 70 °C, and finally separation in a MicroSpin S-400 HR column (Amersham Pharmacia Biotech). Purified Nae.L1926 and Nmo.L2563 intron RNAs were annealed to the oligo primers OP463 and OP558. The reverse transcription reactions were performed using the SuperScript II (Gibco BRL) enzyme with 10 µCi [α-35S]dCTP (Amersham Pharmacia Biotech) as the label. DNA sequencing ladders were prepared from pT7Nae.L1926 and pT7Nmo.L2563 in parallel using the same primers and run adjacent to the primer extension products as markers.
RESULTS

Large ORF-containing group IC1 introns in the LSU rDNA from two Naegleria species

Screening analyses of LSU rDNA from a number of Naegleria species and lineages revealed large group I introns at two distinct locations in the Naegleria sp. isolate NG874 and N. morganensis isolate NG236 [19,22,23]. The 867-bp NG874 intron (named Nae.L1926) is inserted at position 1926 in the LSU rDNA (according to the Escherichia coli LSU rDNA sequence numbering), at the same site as reported in the distantly related protists Rotaliella and Skeletonema [24,25] and only one nucleotide downstream of the well studied nuclear group I introns in Physarum and Tetrahymena [20,26]. Four out of nine analyzed strains of this particular Naegleria species contain almost identical versions of Nae.L1926 [23]. The 940-bp group I intron in N. morganensis (named Nmo.L2563) has the same location in LSU rDNA (position 2563) as introns found in the fungi Beauveria and Gaeumannomyces [see 19]. Nae.L1926, Nmo.L2563 and the Physarum intron Ppo.L1925 (PpLSU3) are the only known nuclear LSU rDNA group I introns harboring ORFs (Table 1). Secondary structure models of the Naegleria LSU rDNA introns are presented in Fig. 1A,B, and are based upon known two- and three-dimensional features of group I intron structures [27–31]. These introns are typical group IC1 introns with a structural organization resembling the introns in Physarum and Tetrahymena (Fig. 1C,D). Despite being inserted at different positions in LSU rDNA, Nae.L1926 and Nmo.L2563 are close relatives sharing about 95% sequence identity in the catalytic core of the group I ribozymes. Nae.L1926 and Nmo.L2563 harbor ORFs as extension sequences in the P1 loop segment.

ORF-proteins from Nae.L1926 and Nmo.L2563 are members of the His-Cys homing endonuclease family

The Nae.L1926 and Nmo.L2563 encoded proteins appear to be 148 and 175 amino acids in size, respectively (Fig. 2A). Both proteins harbor the conserved His-Cys box motif (Fig. 2B) present in all nuclear intron homing endonucleases [3,4,6,7,19,32,33], and have been named I-NaeI and I-NmoI. Detailed structural and functional analyses of the related I-PpoI homing endonuclease, encoded by the Ppo.L1925 intron, support the hypothesis that the His-Cys motif is directly involved in the two zinc ion coordination sites [5]. Whereas I-NaeI contains a His-Cys box typical of the two zinc binding motifs, I-NmoI seems to lack the most C-terminal motif.

DNA binding and target recognition of I-PpoI have been characterized by biochemical and structural approaches [5,34,35], and revealed an unusual DNA binding motif consisting of three antiparallel β sheets (β-3, β-4 and β-5). This motif has so far only been recognized in I-PpoI and in the Tn916 integrase [32,36]. Nae.L1926 and Ppo.L1925 introns are located at an almost identical site in the LSU rDNA, suggesting that I-NaeI recognizes and binds to the same DNA target sequence as I-PpoI. Interestingly, I-NaeI was found to contain a sequence motif, located approximately 15–40 residues N-terminal of the His-Cys box, with several similarities to the I-PpoI DNA binding domain (Fig. 2A,C). Critical residues such as Q63 and K65 in β-4 of I-PpoI, known to directly contact the target sequence, are conserved in I-NaeI. Furthermore, structural predictions using the PREDICTPROTEIN and PSIPRED servers (see Materials and methods) support with high probability β sheet configurations within the motif (Fig. 2C). These findings resemble those for the LAGLI-DADG family of homing endonucleases, where residues constituting the DNA binding domain show low sequence conservation among enzymes that recognize the same DNA sequence [37].

Fig. 1. Secondary structure models of LSU rDNA group IC1 introns. (A,B) Naegleria; (C) Physarum; and (D) Tetrahymena. The paired segments P1–P10 are indicated according to Cech et al. [28]. ORFs are located as P1 extension sequences. Intron positions are numbered starting with the first nucleotide of the intron as number 1. Upper case letters represent intron sequences and lowercase letters represent exon sequences. Arrows indicate internal processing sites (IPS) derived from primer extension analysis (PE) or circle junction determination (C). Ppo.L1925 and Tth.L1925 are synonyms for PpLSU3 and TtLSU1, respectively (see Materials and methods).

The Nae.L1926 and Nmo.L2563 introns self-splice in vitro from their precursor RNAs

Group I intron self-splicing proceeds by two sequential trans-esterification reactions resulting in exon ligation and intron excision, and has been well studied in the Tetrahymena intron Tth.L1925 (reviewed in [38]) and its cognate Physarum intron Ppo.L1925 [11,20,39,40]. The structural features of Nae.L1926 and Nmo.L2563 (Fig. 1A,B) have significant similarities to Ppo.L1925 and Tth.L1925 (Fig. 1C,D), and we predicted that both the Naegleria LSU rDNA introns self-splice in vitro as naked RNA. To test for self-splicing activity, the corresponding linearized plasmids (see Materials and methods) containing the introns and some flanking exon sequences were transcribed using T7 RNA polymerase, and the corresponding RNAs were subjected to splicing conditions. Representative time course experiments from gel analyses are shown in Fig. 3A. Here, the precursor RNA (RNA 2) and the two products from the self-splicing reaction, excised intron (RNA 3) and ligated exon (RNA 6), can be identified by size. Several additional RNA species appeared on the gels, corresponding to nonligated 5′ and 3′ exons (RNAs 8 and 7), circular intron sequences (RNA 1), ORF-containing RNA (RNA 4), and free ribozyme (RNA 5). Ligated exons (RNA 6) from both splicing reactions were eluted and purified from the polyacrylamide gels, amplified by RT-PCR and then cloned into plasmid vectors. DNA sequencing of four independent clones from each of the introns confirmed that both Nae.L1926 and Nmo.L2563 excise from their corresponding precursor RNAs and correctly ligate the exons (Fig. 3B).
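
The two sequential trans-esterification steps described above can be followed with a toy string model. This is only a sketch of the general group I pathway, not of the experiments: apart from the 5′ splice-site flank (ucucuu/AAGAA) quoted for Nae.L1926 in this paper, all sequences are hypothetical placeholders, and exons are written in lowercase and the intron in uppercase as in Fig. 1.

```python
EXOGENOUS_G = "G"  # guanosine cofactor (GTP in the splicing buffer)


def first_step(exon5, intron, exon3):
    """Step 1: the exogenous G attacks the 5' splice site.

    Returns the free 5' exon and the G-intron-3' exon intermediate.
    """
    return exon5, EXOGENOUS_G + intron + exon3


def second_step(free_exon5, intermediate, exon3_len):
    """Step 2: the free 5' exon attacks the 3' splice site.

    Returns the ligated exons and the excised linear intron,
    which still carries the added G at its 5' end.
    """
    split = len(intermediate) - exon3_len
    excised_intron = intermediate[:split]
    exon3 = intermediate[split:]
    return free_exon5 + exon3, excised_intron


# Hypothetical precursor: 5' exon, intron (ends in G), 3' exon.
exon5, intron, exon3 = "ggaucucuu", "AAGAACCGUUAGCAUG", "aacgguc"

free_exon5, intermediate = first_step(exon5, intron, exon3)
ligated_exons, excised_intron = second_step(free_exon5, intermediate, len(exon3))

print("free 5' exon:  ", free_exon5)
print("ligated exons: ", ligated_exons)
print("excised intron:", excised_intron)
```

In terms of the gel species named above, the ligated exons correspond to RNA 6, the excised intron to RNA 3, and a 5′ exon that is released in step 1 but never ligated to RNA 8.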
Formation of full-length intron circles is a general feature in the RNA processing of complex group I introns

Gel analysis of the Nae.L1926 and Nmo.L2563 splicing reactions (Fig. 3A) indicates that the slow-migrating RNA species (RNA 1) represent intron circles. To analyze the intron circle junctions, RNA 1a from Nae.L1926 and RNAs 1a and 1b from Nmo.L2563 were eluted from the polyacrylamide gels, purified and amplified by RT-PCR, and finally cloned into plasmid vectors. DNA sequencing of 10 independent clones showed that RNA 1a from the Nae.L1926 intron corresponds to two equally represented circular species (Fig. 4A); a full-length intron circle (five of 10 clones) and an intron circle lacking the first 15 nucleotides (five of 10 clones). Interestingly, the sequence flanking the −15 circularization site (UGUCUA↓AAGAA) is almost identical to that of the intron 5′-splice site region (UCUCUU↓AAGAA), suggesting that a P1-like structure may be formed prior to the −15 circle formation. The results from the Nmo.L2563 RNA 1 are presented in Fig. 4B. Here, the RNA 1a represents full-length intron circles (three of three clones). The RNA 1b species migrates slightly slower than the 1.2-kb precursor RNA (Fig. 3A), but consists only of the 389-nucleotide IC1 ribozyme lacking the first 551 nucleotides of the intron (four of four clones; Fig. 4B). The RNA 1b circle resembles the well characterized Tetrahymena intron Tth.L1925 circles lacking 15 or 19 nucleotides (including the exogenous guanosine) at the 5′ end of the intron [17].
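
The circle sizes and the similarity between the −15 circularization site and the 5′ splice site can be checked with simple arithmetic. The sketch below uses only the intron lengths and the two 11-nucleotide flanking sequences quoted in the text; the function names are illustrative, and circle lengths count templated intron nucleotides only.

```python
def circle_length(intron_len, nt_removed_from_5prime):
    """Length of an intron circle after removing nucleotides from the 5' end."""
    return intron_len - nt_removed_from_5prime


def identical_positions(seq_a, seq_b):
    """Count positions at which two equal-length sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    return sum(a == b for a, b in zip(seq_a, seq_b))


# Intron lengths from the text.
NAE_L1926_LEN = 867
NMO_L2563_LEN = 940

nae_minus15 = circle_length(NAE_L1926_LEN, 15)
nmo_minus551 = circle_length(NMO_L2563_LEN, 551)

# Flanking sequences from the text, with the cleavage site marker removed.
circularization_site = "UGUCUA" + "AAGAA"
splice_site_5prime = "UCUCUU" + "AAGAA"
shared = identical_positions(circularization_site, splice_site_5prime)

print("Nae.L1926 -15 circle:", nae_minus15, "nt")
print("Nmo.L2563 -551 circle:", nmo_minus551, "nt")
print("identical positions:", shared, "of", len(splice_site_5prime))
```

The −551 circle length computed this way agrees with the 389-nucleotide ribozyme circle reported for RNA 1b.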
The Ppo.L1925 self-splicing products have previously been reported [20,39,40], but circular intron RNAs were not characterized. In order to test for Ppo.L1925 circle formation during self-splicing, linearized pT7Ppo.L1925 plasmid containing the intron and some flanking exon sequences was transcribed using T7 RNA polymerase, and the corresponding RNA was subjected to splicing conditions for 90 min. The results from gel analysis of the splicing reactions corroborate the findings reported previously [20,39], including a slow-migrating RNA species presumed to be a circular RNA (data not shown). By the same experimental approach as described above, based on purification, RT-PCR and DNA sequencing, we conclude that Ppo.L1925 generates full-length intron RNA circles during incubation in vitro (four of four clones, Fig. 4C). Although full-length intron RNA circles have been rarely reported among the majority of nuclear group I introns studied, all ORF-containing introns analyzed to date generate these circles both in vitro and in vivo (Table 2).

Fig. 2. Sequence features of endonuclease-like ORF-proteins from the Naegleria introns. (A) Primary sequences of I-NaeI and I-NmoI. Putative DNA binding domain and zinc binding motifs (Zn-I and Zn-II) are underlined. (B) Sequence comparison of I-NaeI, I-NmoI, I-PpoI [5] and I-NjaI [33] His-Cys boxes. Conserved zinc coordination residues are enlarged and bold. The asterisk indicates a discontinuity in the sequence. (C) Structural prediction of a DNA binding motif in I-NaeI, based on a comparison to crystal structure features of I-PpoI [5]. Identical positions are indicated by dots and deletions by dashes. The DNA binding motif was predicted using the two structural prediction servers PSIPRED and PREDICTPROTEIN (PHDsec). Secondary structural elements shown are isolated β bridge (B), extended strand (participates in β ladder; E), 3₁₀ helix (G), hydrogen bonded turn (T) and bend (S).

Both Naegleria LSU rDNA introns have internal processing sites separating ORF RNAs from the group I ribozymes

The splicing reaction of Nae.L1926 and Nmo.L2563, presented in Fig. 3A, generates two major RNA species (RNAs 4 and 5) that could not be explained by the regular splicing pathway. We predicted these RNA species to represent the 5′ half and the 3′ half of the excised intron, an assumption based on the estimated size and on the fact that the similarly organized Ppo.L1925 intron harbours strong internal processing sites separating the ORF RNA from the ribozyme [11,20]. To precisely define the internal processing sites, the Naegleria intron RNAs were incubated under self-splicing conditions for 60 min and subjected to primer extension analysis. The results from the reactions are presented in Fig. 5 and indicate one major processing site located 3′ of position U469 of Nae.L1926 (Fig. 5A) and two sites 3′ of positions C549 and C550 of Nmo.L2563 (Fig. 5B). These sites correspond very well to the reported internal processing sites of the Ppo.L1925 intron [11,20] and circularization sites of Tth.L1925 [17], all located close to, or within, the internal guide sequences (see Fig. 1).
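
As a worked example of what the mapped sites imply, the sketch below splits each intron at its internal processing site(s) and reports the expected lengths of the 5′ fragment (ORF RNA, RNA 4) and the 3′ fragment (ribozyme, RNA 5). It uses only the intron lengths and site positions stated in the text and, as a simplifying assumption, ignores the exogenous guanosine carried at the 5′ end of an excised intron; the function name is illustrative.

```python
def split_at_ips(intron_len, last_5prime_position):
    """Return (5' fragment length, 3' fragment length) for cleavage
    immediately 3' of the given intron position (numbering starts at 1)."""
    if not 0 < last_5prime_position < intron_len:
        raise ValueError("processing site must lie inside the intron")
    return last_5prime_position, intron_len - last_5prime_position


# Intron lengths and mapped internal processing sites from the text.
INTRONS = {
    "Nae.L1926": (867, [469]),       # cleavage 3' of U469
    "Nmo.L2563": (940, [549, 550]),  # cleavage 3' of C549 and C550
}

fragments = {
    name: [split_at_ips(length, site) for site in sites]
    for name, (length, sites) in INTRONS.items()
}

for name, pairs in fragments.items():
    for orf_rna, ribozyme in pairs:
        print(f"{name}: ORF RNA {orf_rna} nt, ribozyme {ribozyme} nt")
```

For Nmo.L2563 the resulting ribozyme fragments differ by only one or two nucleotides from the 389-nucleotide −551 circle, consistent with processing and circularization occurring at nearly the same place.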
DISCUSSION
We have characterized in vitro RNA processing of two Naegleria group I introns, Nae.L1926 and Nmo.L2563, both harbouring ORFs within the L1-loop of group IC1 ribozymes. The intron ORFs correspond to His-Cys homing endonucleases and are named I-NaeI and I-NmoI, respectively. I-NaeI has a motif similar to the antiparallel β sheet DNA binding domain found in I-PpoI. Whereas almost all His-Cys homing endonucleases have two zinc coordination domains, the C-terminal domain appears to be missing in I-NmoI. In vitro analyses show that both introns self-splice, generate full-length RNA circles, and harbour internal processing sites close to, or at, their internal guide sequences.

Fig. 3. Gel analysis of the in vitro self-splicing products of Nae.L1926 and Nmo.L2563. (A) RNA was incubated at self-splicing conditions for 0–30 min and analysed on an 8 M urea/5% polyacrylamide gel. The observed RNAs after 30 min incubation are full-length intron circles (RNA 1a), circles containing only the group IC1 ribozyme (RNA 1b), precursor (RNA 2), excised intron (RNA 3), intron ORF (RNA 4), intron ribozyme (RNA 5), ligated exon (RNA 6), free 3′ exon (RNA 7), and free 5′ exon (RNA 8). M, RNA size marker. The 3′ exon RNA of Nmo.L2563 was run off the gel. (B) Sequencing ladder of amplified ligated exon generated from Nae.L1926 and Nmo.L2563 intron splicing. The RNA was purified from a gel, subjected to RT-PCR amplification, plasmid cloned and sequenced. The corresponding ligated exon RNA sequences are presented below. Arrows indicate exon junctions.

Full-length intron circles
In organisms like Naegleria, group I intron splicing is an essential reaction for the host in order to generate functional rRNAs. However, intron processing reactions like internal cleavage, 3′ SS hydrolysis and intron circularization are more likely to be selfish features of the nucleolar group I introns. All ORF-containing or complex nuclear group I introns tested so far generate full-length intron RNA circles in vivo or when incubated at self-splicing conditions in vitro (Table 2). The biological role of full-length circularization of intron RNA is not clear, but one possibility is as an intermediate in endonuclease expression. I-PpoI is reported to be expressed from the full-length intron and not from the internally processed RNA [15], an observation consistent with the idea that full-length intron circles might be involved. Because I-PpoI mRNA is not polyadenylated in vivo [14], a circular RNA could increase the stability of the messenger, or its translocation from the nucleus to the cytoplasm, prior to translation. Alternatively, full-length intron circles may be involved in intron horizontal transfer at the RNA level [21,41,42].
Internal processing of intron RNA

There are strong links between internal intron RNA processing and the expression of nuclear group I intron homing endonucleases. Functional studies both in vitro and in vivo of twin-ribozyme group I introns in Didymium and Naegleria imply that internal processing, catalysed by a second internal group I-like ribozyme, is an essential step in the expression of the corresponding homing endonuclease genes [12,13,43–45]. In vitro studies of the Ppo.L1925 intron mapped an internal processing site 53 nucleotides downstream of the I-PpoI ORF stop codon, proximal to the internal guide sequence of the splicing ribozyme [20]. Analyses in yeast show that I-PpoI is expressed from an RNA polymerase I transcribed full-length intron RNA, but not from the internally processed RNA [11,15]. Thus, in contrast to the twin-ribozyme introns, the internal processing of Ppo.L1925 intron RNA appears to down-regulate endonuclease expression. The Naegleria LSU rDNA introns have several features in common with Ppo.L1925. They all have large insertions at the same location in P1 (L1-loop) within their group IC1 ribozyme structures. The L1-loop extension sequences contain ORFs with the characteristic histidine and cysteine motifs common among all nuclear homing endonucleases [3]. Finally, Ppo.L1925, Nae.L1926 and Nmo.L2563 generate full-length intron RNA circles as well as internal processing sites at approximately the same positions within the ribozyme structure. These similarities in sequence, organization and in vitro processing might indicate that the internal processing sites have a similar biological role in down-regulating endonuclease expression.

However, Nae.L1926 and Nmo.L2563 differ significantly from Ppo.L1925 in several important aspects. Although the I-NaeI and I-NmoI ORFs appear unrelated in primary sequence, sequence similarities are present (Fig. 6) at the 5′ and 3′ untranslated regions. Here, the Naegleria 3′ untranslated regions are only 13 nucleotides compared to the corresponding 53-nucleotide structured region in Ppo.L1925 [14]. Whereas the I-PpoI RNA harbours no polyadenylation signal and seems not to be polyadenylated in vivo [14], both Naegleria introns contain the AAUAAA consensus polyadenylation signal located exactly 12 and 28 nucleotides upstream of the stop codons (UAG/UAA) and internal processing sites, respectively (Fig. 6). Polyadenylation of homing endonuclease mRNAs has been reported in two different nuclear group I introns in Didymium [12,46]. Both I-DirI and I-DirII mRNAs contain AAUAAA polyadenylation signals ~15 nucleotides upstream of the polyadenylation tails. These observations suggest that both the I-NaeI and I-NmoI mRNAs are polyadenylated in vivo, probably at their internal processing sites, and imply that internal processing stimulates endonuclease expression.

Fig. 4. Analysis of intron RNA circle junctions from (A) Nae.L1926, (B) Nmo.L2563 and (C) Ppo.L1925. Regions corresponding to circle junctions of isolated intron RNAs were amplified by RT-PCR and sequenced. Circle junctions are indicated 3′ to the last residue of the intron (ωG). RNA sequences of junctions corresponding to full-length intron circles (FL), −15 nucleotide circles (−15), and −551 nucleotide circles (−551) are presented in the lower panels.

Fig. 5. Mapping of the internal processing sites. (A) Nae.L1926 and (B) Nmo.L2563. Primer extension products (PE) generated from self-spliced Nae.L1926 and Nmo.L2563 intron RNAs were analysed together with the corresponding DNA sequence marker. The DNA sequence is complementary to the RNA sequence shown in the lower panels. Processing sites are indicated by arrows. The internal guide sequences (IGS) are underlined.

ACKNOWLEDGEMENT
This work was supported by grants to S. J. from The Norwegian Research Council, The Norwegian Cancer Society, and The Aakre Foundation for Cancer Research.
REFERENCES
1. Haugen, P., Huss, V.A., Nielsen, H. & Johansen, S. (1999)
Complex group-I introns in nuclear SSU rDNA of red and green
algae: evidence of homing-endonuclease pseudogenes in the Ban-
giophyceae. Curr. Genet. 36, 345–353.
2. Nishida, H., Tajiri, Y. & Sugiyama, J. (1998) Multiple origins of
fungal group I introns located in the same position o f nuclear SSU
rRNA gene. J. Mol. Evol. 46 , 442–448.
3. Johansen, S., Embley, T.M. & Willassen, N.P. (1993) A family of
nuclear homing endonucleases. Nucleic Acids Res. 21, 4405.
4. Chevalier, B.S. & Stoddard, B.L. (2001) Homing endonucleases:
structural and functional insight into the catalysts of intron/intein
mobility. Nucleic Acids Res. 29, 3757–3774.
5. Flick, K.E., Jurica, M.S., Monnat, R.J. Jr & Stoddard, B.L. (1998)
DNA binding and cleavage by the nuclear intron-encoded homing
endonuclease I-PpoI. Nature 394, 96–101.
6. Elde, M., Willassen, N.P. & Johansen, S. (2000) Functional
characterization of isoschizomeric His-Cys box homing endo-
nucleases from Naegleria. Eur. J. Biochem. 267, 7257–7266.
7. Lambowitz, A.M. & Belfort, M. (1993) Introns as mobile genetic
elements. Annu. Rev. Biochem. 62, 587–622.
8. Muscarella, D.E. & Vogt, V.M. (1989) A mobile group I intron in
the nuclear rDNA of Physarum polycephalum. Cell 56, 443–454.
9. Johansen, S., Elde, M., Vader, A., Haugen, P., Haugli, K. &
Haugli, F. (1997) In vivo mobility of a group I twintron in nuclear
ribosomal DNA of the myxomycete Didymium iridis. Mol.
Microbiol. 24, 737–745.
10. Harford, J.B. & Morris, D.R. (1997) mRNA metabolism and
post-transcriptional gene regulation. In Modern Cell Biology, Vol.
17, pp. 43–146. Wiley-Liss Press, New York.
11. Lin, J. & Vogt, V.M. (1998) I-PpoI, the endonuclease encoded by
the group I intron PpLSU3, is expressed from an RNA poly-
merase I transcript. Mol. Cell. Biol. 18, 5809–5817.
12. Vader, A., Nielsen, H. & Johansen, S. (1999) In vivo expression of
the nucleolar group I intron-encoded I-DirI homing endonuclease
involves the removal of a spliceosomal intron. EMBO J. 18, 1003–
1013.
13. Decatur, W.A., Johansen, S. & Vogt, V.M. (2000) Expression of
the Naegleria intron endonuclease is dependent on a functional
group I self-cleaving ribozyme. RNA 6, 616–627.
14. Lin, J. (2000) Expression of the homing endonuclease I-PpoI
encoded by the mobile nuclear group I intron PpLSU3 from the
ribosomal RNA gene. PhD Thesis, Cornell University, Ithaca,
NY, USA.
15. Lin, J. & Vogt, V.M. (2000) Functional alpha-fragment of
beta-galactosidase can be expressed from the mobile group I
intron PpLSU3 embedded in yeast pre-ribosomal RNA derived
from the chromosomal rDNA locus. Nucleic Acids Res. 28,
1428–1438.
16. Einvik, C., Elde, M. & Johansen, S. (1998) Group I twintrons:
genetic elements in myxomycete and schizopyrenid amoebo-
flagellate ribosomal DNAs. J. Biotechnol. 64, 63–74.
17. Cech, T.R. (1990) Self-splicing of group I introns. Annu. Rev.
Biochem. 59, 543–568.
18. Johansen, S. & Haugen, P. (2001) A new nomenclature of group I
introns in ribosomal DNA. RNA 7, 935–936.
19. De Jonckheere, J.F. & Brown, S. (1998) Three different group I
introns in the nuclear large subunit ribosomal DNA of the
amoeboflagellate Naegleria. Nucleic Acids Res. 26, 456–461.
20. Ruoff, B., Johansen, S. & Vogt, V.M. (1992) Characterization of
the self-splicing products of a mobile intron from the nuclear
rDNA of Physarum polycephalum. Nucleic Acids Res. 20,
5899–5906.
21. Johansen, S. & Vogt, V.M. (1994) An intron in the nuclear ribo-
somal DNA of Didymium iridis codes for a group I ribozyme and a
novel ribozyme that cooperate in self-splicing. Cell 76, 725–734.
22. De Jonckheere, J.F. & Brown, S. (1997) Defining new Naegleria
spp. using ribosomal DNA sequences. Acta Protozool. 36,
273–278.
23. De Jonckheere, J.F. & Brown, S. (2001) A novel ORF-containing
group I intron with His-Cys box in the LSU rDNA of Naegleria.
Acta Protozool. 40, 27–31.
Fig. 6. Putative ORF expression signals. Features from the Physarum
intron are presented on top and include a stable hairpin structure at the
3′ UTR [14]. A comparison between the two Naegleria intron P1
extensions is presented below. Translation start, stop, and polyA sig-
nals are boxed. Identical positions are indicated by dots. 5′SS, 5′ splice
sites; IPS, internal processing sites; UTR, untranslated region; ENase
ORF, endonuclease open reading frame.
1648 P. Haugen et al. (Eur. J. Biochem. 269) © FEBS 2002
24. Pawlowski, J., Bolivar, I., Guiard-Maffia, J. & Gouy, M. (1994)
Phylogenetic position of foraminifera inferred from LSU rRNA
gene sequences. Mol. Biol. Evol. 11, 929–938.
25. Van der Auwera, G. & De Wachter, R. (1998) Structure of the
large subunit rDNA from a diatom, and comparison between
small and large subunit ribosomal RNA for studying stramenopile
evolution. J. Eukaryot. Microbiol. 45, 521–527.
26. Cech, T.R., Zaug, A.J. & Grabowski, P.J. (1981) In vitro splicing
of the ribosomal RNA precursor of Tetrahymena: involvement of
a guanosine nucleotide in the excision of the intervening sequence.
Cell 27, 487–496.
27. Michel, F. & Westhof, E. (1990) Modelling of the three-dimen-
sional architecture of group I catalytic introns based on com-
parative sequence analysis. J. Mol. Biol. 216, 585–610.
28. Cech, T.R., Damberger, S.H. & Gutell, R.R. (1994) Representa-
tion of the secondary and tertiary structure of group I introns. Nat.
Struct. Biol. 1, 273–280.
29. Lehnert, V., Jaeger, L., Michel, F. & Westhof, E. (1996) New
loop–loop tertiary interactions in self-splicing introns of subgroup
IC and ID: a complete 3D model of the Tetrahymena thermophila
ribozyme. Chem. Biol. 3, 993–1009.
30. Cate, J.H., Gooding, A.R., Podell, E., Zhou, K., Golden, B.L.,
Kundrot, C.E., Cech, T.R. & Doudna, J.A. (1996) Crystal struc-
ture of a group I ribozyme domain: principles of RNA packing.
Science 273, 1678–1685.
31. Golden, B.L., Gooding, A.R., Podell, E.R. & Cech, T.R. (1998) A
preorganized active site in the crystal structure of the Tetrahymena
ribozyme. Science 282, 259–264.
32. Jurica, M.S. & Stoddard, B.L. (1999) Homing endonucleases:
structure, function and evolution. Cell. Mol. Life. Sci. 55, 1304–
1326.
33. Elde, M., Haugen, P., Willassen, N.P. & Johansen, S. (1999)
I-NjaI, a nuclear intron-encoded homing endonuclease from
Naegleria, generates a pentanucleotide 3′ cleavage-overhang
within a 19 base-pair partially symmetric DNA recognition site.
Eur. J. Biochem. 259, 281–288.
34. Ellison, E.L. & Vogt, V.M. (1993) Interaction of the intron-
encoded mobility endonuclease I-PpoI with its target site. Mol.
Cell. Biol. 13, 7531–7539.
35. Argast, G.M., Stephens, K.M., Emond, M.J. & Monnat, R.J. Jr
(1998) I-PpoI and I-CreI homing site sequence degeneracy
determined by random mutagenesis and sequential in vitro
enrichment. J. Mol. Biol. 280, 345–353.
36. Wojciak, J.M., Connolly, K.M. & Clubb, R.T. (1999) NMR
structure of the Tn916 integrase-DNA complex. Nature Struct.
Biol. 6, 366–373.
37. Lucas, P., Otis, C., Mercier, J.-P., Turmel, M. & Lemieux, C.
(2001) Rapid evolution of the DNA-binding site in LAGLI-
DADG homing endonucleases. Nucleic Acids Res. 29, 960–969.
38. Cech, T.R. & Herschlag, D. (1996) Group I ribozymes: substrate
recognition, catalytic strategies, and comparative mechanistic
analysis. Nucleic Acids Mol. Biol. 10, 1–17.
39. Rocheleau, G.A. & Woodson, S.A. (1994) Requirements for self-
splicing of a group I intron from Physarum polycephalum. Nucleic
Acids Res. 22, 4315–4320.
40. Rocheleau, G.A. & Woodson, S.A. (1995) Enhanced self-splicing
of Physarum polycephalum intron 3 by a second group I intron.
RNA 1, 183–193.
41. Lykke-Andersen, J. & Garrett, R.A. (1994) Structural character-
istics of the stable RNA introns of archaeal hyperthermophiles
and their splicing junctions. J. Mol. Biol. 243, 846–855.
42. Aagaard, C., Dalgaard, J.Z. & Garrett, R.A. (1995) Intercellular
mobility and homing of an archaeal rDNA intron confers a
selective advantage over intron- cells of Sulfolobus acidocaldarius.
Proc. Natl Acad. Sci. USA 92, 12285–12289.
43. Einvik, C., Decatur, W.A., Embley, T.M., Vogt, V.M. &
Johansen, S. (1997) Naegleria nucleolar introns contain two group
I ribozymes with different functions in RNA splicing and pro-
cessing. RNA 3, 710–720.
44. Einvik, C., Nielsen, H., Westhof, E., Michel, F. & Johansen, S.
(1998) Group I-like ribozymes with a novel core organization
perform obligate sequential hydrolytic cleavages at two processing
sites. RNA 4, 530–541.
45. Einvik, C., Nielsen, H., Nour, R. & Johansen, S. (2000) Flanking
sequences with an essential role in hydrolysis of a self-cleaving
group I-like ribozyme. Nucleic Acids Res. 28, 2194–2200.
46. Vader, A. (1998) Nuclear group I introns of the myxomycetes:
organization, expression and evolution, PhD Thesis, University of
Tromsø, Norway.
47. Oustinova, I. (2000) Molecular phylogeny of the family
Ankistrodesmaceae (Chlorophyta, Chlorophyceae) and analyses
of their group I intervening sequences, PhD Thesis, University of
Erlangen, Germany.
48. Decatur, W.A., Einvik, C., Johansen, S. & Vogt, V.M. (1995) Two
group I ribozymes with different functions in a nuclear rDNA
intron. EMBO J. 14, 4558–4568.