MINIREVIEW
The RNA recognition motif, a plastic RNA-binding platform
to regulate post-transcriptional gene expression
Christophe Maris*, Cyril Dominguez* and Frédéric H T. Allain
Institute for Molecular Biology and Biophysics, Swiss Federal Institute of Technology Zurich, ETH-Hönggerberg, Zürich, Switzerland
History – what defines an RRM?
The RNA recognition motif (RRM), also known as
the RNA-binding domain (RBD) or ribonucleopro-
tein domain (RNP), was first identified in the late
1980s when it was demonstrated that mRNA precur-
sors (pre-mRNA) and heterogeneous nuclear RNAs
(hnRNAs) are always found in complex with proteins
(reviewed in [1]). Biochemical characterizations of the
mRNA polyadenylate binding protein (PABP) and
the hnRNP protein C shed light on a consensus
RNA-binding domain of approximately 90 amino
acids containing a central sequence of eight con-
served residues that are mainly aromatic and posi-
tively charged [2,3]. This sequence, termed the RNP
consensus sequence, was thought to be involved in
RNA interaction and was defined as Lys/Arg-
Gly-Phe/Tyr-Gly/Ala-Phe/Tyr-Val/Ile/Leu-X-Phe/Tyr,
where X can be any amino acid. Later, a second
consensus sequence less conserved than the previously
characterized one [1] was identified. This six residue
sequence located at the N-terminus of the domain
Keywords
RNA recognition motif; protein–RNA
complex; structure–function relationship;
RNA-binding specificity
Correspondence
F. H T. Allain, Institute for Molecular
Biology and Biophysics, Swiss Federal
Institute of Technology Zurich, ETH-Hönggerberg, CH-8093 Zürich, Switzerland
Fax: +41 1 6331294
Tel: +41 1 6333940
E-mail:
Website: />groups/allain_group
*These authors contributed equally to the
work
(Received 16 December 2004, accepted
7 March 2005)
doi:10.1111/j.1742-4658.2005.04653.x
The RNA recognition motif (RRM), also known as the RNA-binding domain
(RBD) or ribonucleoprotein domain (RNP), is one of the most abundant
protein domains in eukaryotes. Based on the comparison of more than 40
structures including 15 complexes (RRM–RNA or RRM–protein), we
reviewed the structure–function relationships of this domain. We identified
and classified the different structural elements of the RRM that are
important for binding a multitude of RNA sequences and proteins. Common
structural aspects were extracted that allowed us to define a structural
leitmotif of the RRM–nucleic acid interface with its variations. Outside of the
two conserved RNP motifs that lie in the center of the RRM β-sheet, the
two external β-strands, the loops, the C- and N-termini, or even a second
RRM domain allow high RNA-binding affinity and specific recognition.
Protein–RRM interactions that have been found in several structures
reinforce the notion of an extreme structural versatility of this domain,
supporting the numerous biological functions of the RRM-containing proteins.
Abbreviations
ACF, APOBEC-1 complementary factor; CBP, cap binding protein; CstF, cleavage stimulation factor; hnRNP, heterogeneous nuclear
ribonucleoprotein; HuD, Hu protein D; LRR, leucine rich repeat; MIF4G, middle domain of the translation initiation factor 4 G; PABP,
polyadenylate binding protein; PIE, polyadenylation inhibition element; PTB, polypyrimidine tract binding protein; RBD, RNA-binding domain;
RNP, ribonucleoprotein; RRM, RNA recognition motif; SR, serine/arginine rich proteins; TLS, translocated in liposarcoma; U1A, U2A′, U2B′:
U1 snRNP proteins A, A′, B′; U2AF, U2 snRNP auxiliary factor; UHM, U2AF homology motif; UPF, up-frameshift protein.
2118 FEBS Journal 272 (2005) 2118–2131 © 2005 FEBS
was defined as Ile/Val/Leu-Phe/Tyr-Ile/Val/Leu-X-
Asn-Leu. The first consensus sequence was therefore
referred to as RNP 1 and the second as RNP 2 (Fig. 1).
It was then shown that this protein domain was
necessary and sufficient for binding RNA molecules
with a wide range of specificities and affinities
(reviewed in [4–6]).
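The two consensus sequences quoted above translate directly into regular expressions. The sketch below is only an illustration: the function name `find_rnp_motifs` and the short test peptides are hypothetical, and the patterns encode the strict consensus as written in the text. Real RRMs frequently deviate at one or more positions, so a strict pattern of this kind will miss genuine domains and should not be read as a domain-detection method.

```python
import re

# Strict consensus patterns transcribed from the text (X = any residue).
# RNP 1: Lys/Arg-Gly-Phe/Tyr-Gly/Ala-Phe/Tyr-Val/Ile/Leu-X-Phe/Tyr
RNP1 = re.compile(r"[KR]G[FY][GA][FY][VIL].[FY]")
# RNP 2: Ile/Val/Leu-Phe/Tyr-Ile/Val/Leu-X-Asn-Leu
RNP2 = re.compile(r"[IVL][FY][IVL].NL")


def find_rnp_motifs(sequence):
    """Return (start, match) pairs for strict RNP 1 and RNP 2 hits.

    Alignment gaps ('-') and spaces are removed before searching, so the
    reported start indices refer to the ungapped sequence (0-based).
    """
    seq = sequence.upper().replace("-", "").replace(" ", "")
    return {
        "RNP1": [(m.start(), m.group()) for m in RNP1.finditer(seq)],
        "RNP2": [(m.start(), m.group()) for m in RNP2.finditer(seq)],
    }


if __name__ == "__main__":
    # Hypothetical short peptides chosen to satisfy the strict consensus.
    print(find_rnp_motifs("SRGFGFVTYA"))
    print(find_rnp_motifs("TLYVGNLSF"))
```

Relaxing individual positions (for example, allowing any residue at RNP 1 position 2) would recover more of the sequences aligned in Fig. 1, at the cost of more false positives; the strict form is kept here because it is the one the text defines.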
Here we review the structural properties of the
RRM domain in its isolated form and in complex with
RNAs and/or proteins. This review shows how such a
simple domain can modulate its fold to recognize
many RNAs and proteins in order to achieve a
multitude of biological functions often associated with
post-transcriptional gene regulation.
An abundant and ancient fold with
multiple biological functions
Genome sequencing projects recently showed that the
RRM is found abundantly in all life kingdoms, inclu-
ding prokaryotes and viruses although at lower abun-
dance than in eukaryotes. To date, only 85 proteins
containing an RRM domain in bacteria (mostly cyano-
bacteria [7]), and six such proteins in viruses have been
identified. Prokaryotic RRM proteins are rather small
(about 100 amino acids) and have a single copy of the
RRM domain. In eukaryotes, the RNA recognition
motif is one of the most abundant protein domains.
To date, a total of 6056 RRM motifs have been identi-
fied in 3541 different proteins (ger.
ac.uk/cgi-bin/Pfam/getacc?PF00076) [8]. In humans,
497 proteins containing at least one RRM have
been identified. Assuming about 20 000–25 000 human
genes, the RRM would therefore be present in about
2% of gene products. In eukaryotic proteins, RRMs
are often found as multiple copies within a protein
(44%, two to six RRMs) and/or together with other
domains (21%). Among the latter, the most abundant
are the zinc fingers of the CCCH and CCHC type
(21% of those with an additional domain), the poly-
adenylate binding protein C-terminal domain (PABP
or PABC, 10%), and the WW domain (9%). Interest-
ingly, contrary to the well known CCHHs that bind
double-stranded DNA or RNA, the CCCH and
CCHC zinc fingers are domains that bind single-stranded
RNA [9,10]. The PABP and the WW domains [11]
are protein–protein interaction domains involved in
translation [12,13] and pre-spliceosome formation,
respectively [14]. By association with different types of
protein domains, the RRM domain can modulate its
RNP2 RNP1
α1 α2β1 β4β3β2
L1 L2 L5L3 L4
10 20 30 40 50 60 70 80
PTB (1SJQ) 60 VIHIRKLPIDVTEGEVISLGLP FGKVTNL LMLKG KNQAFIEMNTEEAANTMVNYYTSVTPVLRGQPIYIQ 147
PTB (1SRJ) 183 RIIVENLFYPVTLDVLH-QIFSK FGTVLKI ITFTKNN QFQALLQYADPVSAQHAKLSLDGQNIYNACCTLRID 282
PTB (1QM9) 338 VLLVSNLNPERVTPQSLFILFGV YGDVQRV KILFNK KENALVQMADGNQAQLAMSHLNGHKLH GKPIRIT 407
PTB (1QM9) 455 TLHLSNIPPSVSEEDLK-VLFSS NGGVVKG FKFFQKD RKMALIQMGSVEEAVQALIDLHNHDLG-ENHHLRVS 531
Cstf-64 (1P1T) 17 SVFVGNIPYEATEEQLK-DIFSE VGPVVSF RLVYDRETGKPKGYGFCEYQDQETALSAMRNLNGREFS GRALRVD 90
LA (1OWX) 244 LKFSGDLDDQTCREDLHILFSNH GEIK WIDFVRGA KEGIILFKEKAKEALGKAKDANNGNLQLRNKEVTWEV 305
TAP (1FO1) 121 KITIPYGRKYDK-AWLLSMIQSKCSVPFTPIEFHYENTRAQFFVEDASTASALKAVNYKILDRENRRISIIINSSAP PHS 290
ALY (1NO8) 106 KLLVSNLDFGVSDADIQ-ELFAE FGTLKKA AVHYDRSGR-SLGTADVHFERKADALKAMKQYNGVPLD GRPMNIQ 178
hnRNP A1 (1UP1) 15 KLFIGGLSFETTDESLR-SHFEQ WGTLTDC VVMRDPNTKRSRGFGFVTYATVEEVDAAMNARP-HKVD GRVVEPK 87
hnRNP A1 (1HA1) 105 KIFVGGIKEDTEEHHLR-DYFEQ YGKIEVI EIMTDRGSGKKRGFAFVTFDDHDSVDKIVIQKY-HTVN GHNCEVR 177
HUD (1FXL) 47 NLIVNYLPQNMTQEEFR-SLFGS IGEIESC KLVRDKITGQSLGYGFVNYIDPKDAEKAINTLNGLRLQ TKTIKV 119
HUD (1FXL) 133 NLYVSGLPKTMTQKELE-QLFSQ YGRIITS RILVDQVTGVSRGVGFIRFDKRIEAEEAIKGLNGQKPSGATEPITVK 206
SXL (2SXL) 126 NLIVNYLPQDMTDRELY-ALFRA IGPINTC RIMRDYKTGYSYGYAFVDFTSEMDSQRAIKVLNGITVR NKRLKV 199
SXL (1SXL) 212 NLYVTNLPRTITDDQLD-TIFGK YGSIVQK NILRDKLTGRPRGVAFVRYNKREEAQEAISALNNVIPEGGSQPLSVR 290
PABP (1CVJ) 12 SLYVGDLHPDVTEAMLY-EKFSP AGPILSI RVCRDMITRRSLGYAYVNFQQPADAERALDTMNFDVIK GKPVRI 84
PABP (1CVJ) 99 NIFIKNLDKSIDNKALYDTFSAF GNILSCK VVCDENGSKGYGFVHFETQEAAERAIEKMNGMLLNDRKVFVGRFKS 175
Nucleolin (1FJE) 309 NLFIGNLNPNKSVAELKVAISEL FAKND LAVVDVRTGTNRKFGYVDFESAEDLEKAL-ELTGLKVF GNEIKLE 380
Nucleolin (1FJE) 396 LLAKNLSFNITEDELKEVFEDAL EIRLVSQ DGKSKGIAYIEFKS EADAEKNLEEKQGAEID GRSVSLY 463
U1A (1DZ5) 11 TIYINNLNEKIKKDELKKSLYAI FSQFGQI LDILVSRSLKMRGQAFVIFKEVSSATNALRSMQGFPFY DKPMRIQ 85
U2B" (1A9N) 8 TIYINNMNDKIKKEELKRSLYAL FSQFGHV VDIVALKTMKMRGQAFVIFKELGSSTNALRQLQGFPFY GKPMRI 81

CBP20 (1H2T) 41 TLYVGNLSFYTTEEQIY-ELFSK SGDIKKI IMGLDKMKKTACGFCFVEYYSRADAENAMRYINGTRLD DRIIRTD 114
Y14 (1P27) 74 ILFVTGVHEEATEEDIH-DKFAE YGEIKNI HLNLDRRTGYLKGYTLVEYETYKEAQAAMEGLNGQDLM GQPISVD 147
UPF3 (1UW4) 52 KVVIRRLPPTLTKEQLQEHLQPM PEHDYFE FFSNDTSLYPHMYARAYINFKNQEDIILFRDRFDGYVFLDNKGQEYPA 131
U2AF65 (1U2F) 150 RLYVGNIPFGITEEAMM-DFFNAQMR-LGGLTQAPG NPVLAVQINQDKNFAFLEFRSVDETTQAM-AFDGIIFQ GQSLKIR 227
U2AF65 (2U2F) 260 KLFIGGLPNYLNDDQVK-ELLTS FGPLKAF NLVKDSATGLSKGYAFCEYVDINVTDQAIAGLNGMQLG DKKLLVQ 333
U2AF35 (1JMT) 66 RSAVSDVEMQEHYDEFFEEVFTEMEEKYGEVEEM NVC-DNLGDHLVGNVYVKFRREEDAEKAVIDLNNRWFN GQPIHA 143
Fig. 1. Sequence alignment of a selection of RRM domains for which the structure has been solved (PDB codes are indicated in brackets).
The alignment was generated by the program
CLUSTALW ( [55] and manually optimized. The conserved RNP 1
and RNP 2 sequences are displayed in yellow. The amino acids highlighted in boxes refer to the aromatic residues important for primary
RNA binding.
C. Maris et al. The RRM domain, a plastic RNA-binding platform
RNA-binding affinity and specificity and diversify its
biological functions.
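The "about 2%" estimate above follows from simple division, which the short sketch below reproduces from the counts quoted in the text. The helper name `percent_of_genes` is hypothetical, and the calculation assumes one gene product per gene, as the text implicitly does.

```python
# Counts quoted in the text.
RRM_MOTIFS = 6056             # RRM motifs identified
RRM_PROTEINS = 3541           # proteins carrying them
HUMAN_RRM_PROTEINS = 497      # human proteins with at least one RRM
HUMAN_GENE_ESTIMATES = (20_000, 25_000)


def percent_of_genes(n_proteins, n_genes):
    """Share of genes encoding such a protein, in percent.

    Assumes one gene product per gene (a simplification).
    """
    return 100.0 * n_proteins / n_genes


motifs_per_protein = RRM_MOTIFS / RRM_PROTEINS
human_percentages = [
    percent_of_genes(HUMAN_RRM_PROTEINS, n) for n in HUMAN_GENE_ESTIMATES
]

if __name__ == "__main__":
    print(f"average RRMs per RRM protein: {motifs_per_protein:.2f}")
    for n_genes, pct in zip(HUMAN_GENE_ESTIMATES, human_percentages):
        print(f"{n_genes} genes: {pct:.1f}% encode an RRM protein")
```

With these inputs the human share falls between roughly 2.0% and 2.5%, depending on which gene-count estimate is used, and each RRM protein carries about 1.7 motifs on average.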
A protein domain in such abundance is necessarily
biologically important and associated with many func-
tions in the cell. Indeed, eukaryotic RRM proteins are
present in all post-transcriptional events: pre-mRNA
processing (for example CstF-64, LA, or UPF3 proteins),
splicing (U2B′, U2AF35, U2AF65, hnRNPA1 or
Y14 proteins), alternative splicing (hnRNPA1, PTB,
sex-lethal, SR proteins), mRNA stability (CBP20,
PABP or HuD), RNA editing (ACF), mRNA export
(TLS), pre-rRNA complex formation (nucleolin),
translation regulation (PABP) and degradation [6]. In
plants, RRM proteins are present in chloroplasts and
are involved in 3′ end processing of chloroplast mRNA
[15]. They have also been discovered in plant mito-
chondria. Their functions, however, remain unclear
[16]. Similarly, their roles in bacteria and viruses are
still unknown. The numerous three-dimensional struc-
tures of the RRM in isolation, and in complex with
RNA or other proteins, shed light on the function of
RRM proteins, as shown below.
The structure of the RRM, a βαββαβ fold
with some variations and extensions
The RRM folds into an αβ sandwich structure with a
β1α1β2β3α2β4 topology (Figs 1 and 2) as demonstrated
by the first structure of an RNA recognition motif,
the N-terminal RRM of U1A [17]. The fold is composed
of one four-stranded antiparallel β-sheet spatially
arranged in the order β4β1β3β2 from left to
right when facing the sheet (Fig. 2, hnRNP A1-RRM
2, front view) and two α-helices (α1 and α2) packed
against the β-sheet. Most of the conserved residues of
the RRM are in the hydrophobic core of the domain
[17] except four conserved residues that contribute to
RNA binding, namely RNP 1 positions 1, 3 and 5
and RNP 2 position 2 (see the following section and
Fig. 1). The RNP 1 and RNP 2 motifs are located in
the central strands of the β-sheet, namely β3 and β1,
respectively, and are highly conserved apart from a
few RRM domains such as ALY and TAP (Fig. 1)
[18,19].
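The difference between the order of the elements along the chain and the order of the strands in the sheet is easy to lose track of, so the following sketch writes both down as plain lists. It is an illustrative data structure only; the identifiers (`SEQUENCE_ORDER`, `SHEET_ORDER`, `central_strands` and so on) are hypothetical, and `b`/`a` stand for β/α.

```python
# Order of secondary structure elements along the polypeptide chain.
SEQUENCE_ORDER = ["b1", "a1", "b2", "b3", "a2", "b4"]

# Left-to-right order of the strands when facing the sheet.
SHEET_ORDER = ["b4", "b1", "b3", "b2"]

# Strand carrying each RNP consensus sequence, as stated in the text.
RNP_STRAND = {"RNP1": "b3", "RNP2": "b1"}


def central_strands(sheet_order):
    """Strands that are not at either edge of the sheet."""
    return sheet_order[1:-1]


def edge_strands(sheet_order):
    """The two external strands of the sheet."""
    return [sheet_order[0], sheet_order[-1]]


def sheet_neighbours(sheet_order):
    """Pairs of strands that lie side by side in the sheet."""
    return list(zip(sheet_order, sheet_order[1:]))


if __name__ == "__main__":
    print("central:", central_strands(SHEET_ORDER))
    print("edges:", edge_strands(SHEET_ORDER))
    print("neighbours:", sheet_neighbours(SHEET_ORDER))
```

Read this way, both RNP motifs sit on the two central strands (β1 and β3), while β4 and β2 form the edges of the sheet.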
To date, more than 30 RRM structures have been
determined either by NMR or X-ray crystallography
and reveal unexpected variations as shown in Fig. 2.
The loops between the secondary structure elements
(loops 1–5 as indicated in Figs 1 and 2) can have
different lengths and are often disordered in the free
form. An exception to this is loop 5 that often forms
a small two-stranded β-sheet (β3′ and β3″) (Fig. 2).
The N- and C-terminal regions, outside the RRM,
are usually poorly ordered in the isolated domains
with a few exceptions where they can adopt a
secondary structure (Fig. 2, PTB-RRM 3, La C-terminal
RRM and CstF-64). In the structures of La C-terminal
RRM [20], U1A N-terminal RRM [21] and
CstF-64 RRM [22], the C-terminus forms an α-helix
that lies on the β-sheet surface, while in PTB-
RRM 2 and 3 it extends the size of the β-sheet by
forming an extra β-strand (β5) antiparallel to β2
[23,24]. CstF-64 RRM also has an additional short
α-helix in its N-terminal region (Fig. 2) [22]. Finally,
secondary structure elements of the domain can be
modified; for example, α-helix 1 in the U2AF35 RRM
is three times longer than in a canonical RRM
(Fig. 2). This unusual helix 1 is involved in protein–protein
interactions [25] (see the RRM–protein complexes
section).

Fig. 2. hnRNPA1 RRM 2, a typical RRM fold
and its structural variations as illustrated by
these different protein structures (hnRNPA1
RRM 2 [52], PTB RRM 3 [23], La C-terminal
[20], CstF-64 RRM [22] and U2AF35 [51]).
This figure was generated with the program
MOLMOL [56].
A true single-stranded nucleic acid
binding domain
Since the first structure of an RRM in complex with
RNA (the N-terminal domain of U1A in complex with
U1snRNA stem-loop II [26]) that founded our under-
standing of RRM–RNA recognition, 10 structures
of RRMs in complex with RNA or DNA (for
hnRNPA1) have been determined either by NMR
[27–30] or X-ray crystallography [31–36]. All of the
structures present intrinsic common features and dif-
ferences in RNA recognition reflecting the remarkable
adaptability of this domain in order to achieve high
affinity and specificity.
Systematic visual analysis of the conserved residues
at the RRM–RNA interface for all 11 published com-
plexes led us to define a common structural archetype
of the RRM–nucleic acid interaction exemplified by
hnRNPA1, an RRM protein binding both DNA and
RNA with high affinity. In the structure of hnRNPA1
RRM 2 in complex with DNA [34] (Fig. 3A), two
deoxynucleotides, A209 and G210, stack on two aromatic
rings located on the β1 (Phe108, RNP 2 position 2) and β3
(Phe150, RNP 1 position 5) strands, respectively
(Fig. 3A). The contacts with these two RNP positions
result in a characteristic arrangement of the nucleic
acid strand on the β-sheet surface in which the 5′ end
is located on the first half of the β-sheet (β4β1) and
the 3′ end on the second half (β3β2) (Fig. 3B). A third
aromatic residue located on β3 (Phe148, RNP 1
position 3) interacts hydrophobically with the sugar
rings of A209 and G210. Finally, a positively charged
side chain (Arg146, RNP 1 position 1) forms a salt
bridge with the phosphate between A209 and G210.
This small set of RRM–nucleic acid interactions, in
the center of the domain, involving four conserved
protein side chains of the RRM consensus sequence
and two nucleotides, illustrates the perfect adaptation
of the RRM for effectively binding single-stranded
nucleic acids of any sequence. Indeed, the essential
chemical elements of this dinucleotide, namely the
two bases, the two sugar rings and the phosphate in
between, are recognized. The two bases are stacked on
conserved aromatic rings, and correspondingly, RNP 2
position 2 and RNP 1 position 5 are planar residues
(Phe, Tyr, His or Trp) in 78% and 72% of the 70
RRMs studied by Birney et al. [6], respectively. The
two sugar rings are in contact with a hydrophobic side
chain (RNP 1 position 3) that is present in 81% (67%
of Phe or Tyr) of the RRMs and finally the negatively
charged phosphodiester group is neutralized by a posi-
tively charged side chain (RNP 1 position 1) present in
68% of the RRMs [6]. Although the residue conserva-
tion at these four positions is strong, these four char-
acteristic contacts are not always found all together
[34]. Among the RRM–RNA/DNA complexes, the
two RRMs of hnRNPA1 in complex with DNA have
all four characteristic contacts, whereas only one to
three of those are found in the other structures
(Fig. 4). The most frequent ones are the two stacking
Fig. 3. hnRNPA1 RRM 2 as a model of
single stranded nucleic acid binding [25].
(A) Structure of hnRNPA1 RRM 2 in complex
with single stranded telomeric DNA
and scheme of the β-sheet annotated with
the conserved RNP 1 and RNP 2 aromatic
residue positions numbered according to
each RNP sequence numbering. The conserved
aromatic residues are highlighted by
green circles [34]. (B) Structural arrangement
of the DNA strand on the β-sheet of
hnRNPA1–RRM 2. (C) Hydrogen bond and
van der Waals interaction network conferring
base-binding specificity (hnRNPA1–RRM 2
complex). This figure was generated
with the program MOLMOL [56].
interactions involving RNP 2 position 2 (always present
except in nucleolin RRM 2 [37]) and RNP 1 position 5
(always present except in CBP20 [36]). The
contacts between the sugars and RNP 1 position 3 are
present in five RRM–RNA complexes (CBP20, PABP
RRM 1, nucleolin RRM 1 and RRM 2 and sex-lethal
RRM 1). The RNP 1 position 1 residue does not
necessarily interact with the phosphate between the
dinucleotide because in all structures apart from
hnRNPA1 it contacts an RNA base or a phosphate
oxygen of other nucleotides. Also, the RRM inter-
actions with the sugar–phosphate backbone are fairly
Fig. 4. The RRM domain, a highly plastic platform for nucleic acid binding. (A) Nucleolin RRM 2-sNRE complex [28]. (B) Sex-lethal
RRM 1–polyU–Tra mRNA [31]. (C) Sex-lethal RRM 2–Tra mRNA precursor complex [31]. (D) hnRNPA1 RRM 1–telomeric DNA complex [34].
(E) Poly(A)-binding protein RRM 1–polyadenylate RNA complex [33]. (F) Heterodimeric nuclear cap binding complex 5′ capped polymerase II
transcripts [36]. In all figures, the RNA is shown in yellow and the protein side chain in green. The ribbon of the RRM is shown in grey. The
N- and C-terminal extensions of the RRM are shown in green and red, respectively. This figure was generated with the program MOLMOL [56].
limited compared to other types of RNA-binding
proteins, such as ribosomal proteins, suggesting a less
important role for this type of interaction [38].
This basic binding platform common to all RRMs is
not in essence sequence-specific as eight of the 16 dinu-
cleotide combinations have already been found: AA
[33], AG [34], CG [28], CA [26], GU [31], UC [28], UG
(S. D. Auweter and F. H T. Allain, unpublished data)
and UU [31], with any type of nucleotide either at the
5′ or the 3′ position. The nucleotides at these two positions
always adopt an anti conformation, except for
a G at the 3′ position, which is always found in a syn
conformation. Specificity of this central dinucleotide recognition
is provided by other non conserved elements of
the RRMs. The two most frequently observed elements
are the protein side chains at the surface of the β-sheet
(RNP 1 position 7 and the two adjacent positions in
β1) (Fig. 3A) and the backbone and side chains of the
few amino acids just C-terminal to β4. These residues
are base-specifically hydrogen-bonded to the RNA or
DNA functional groups as illustrated by the multiple
base–amino acid contacts in hnRNPA1 RRM 2
(Fig. 3C).
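The statement that eight of the 16 possible dinucleotides have been observed on the central binding platform can be checked by enumeration. The sketch below is illustrative: the variable names are hypothetical, the observed set is copied from the list in the text, and the DNA dinucleotide bound by hnRNPA1 is written in the RNA alphabet (AG) as the text does.

```python
from itertools import product

NUCLEOTIDES = "ACGU"

# Central dinucleotides reported in the text (5' base first).
observed = {"AA", "AG", "CG", "CA", "GU", "UC", "UG", "UU"}

# All 4 x 4 ordered combinations.
all_dinucleotides = {"".join(pair) for pair in product(NUCLEOTIDES, repeat=2)}

# Combinations not yet seen in a structure, according to the text.
missing = sorted(all_dinucleotides - observed)

# Bases seen at each of the two positions.
five_prime_bases = {d[0] for d in observed}
three_prime_bases = {d[1] for d in observed}

if __name__ == "__main__":
    print(f"{len(observed)} of {len(all_dinucleotides)} combinations observed")
    print("not yet observed:", ", ".join(missing))
    print("5' bases seen:", sorted(five_prime_bases))
    print("3' bases seen:", sorted(three_prime_bases))
```

All four bases occur at both the 5′ and the 3′ position in the observed set, which is the basis for describing the platform as not intrinsically sequence-specific.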

A highly plastic domain to achieve high
RNA-binding affinity and specificity
Many RRMs bind RNA with high affinity (in the nM
range) and high sequence-specificity, in particular all
those whose structures have been determined to date.
Nevertheless, sequence-specificity does not necessarily
imply high affinity, e.g. PTB that specifically recognizes
pyrimidine tracts but does not provide sufficient
binding enthalpy to reach nM affinity (F. C. Oberstrass,
S. D. Auweter and F. H T. Allain, unpublished
data). To achieve higher affinity, some RRM proteins
use the two external β4 and β2 strands, while others
use the loops 1, 3 or 5, or the C- and N- termini [39].
In many proteins, multiple RRMs associate to bind
longer nucleotide stretches. In these cases, the interdo-
main linker is an essential component of RNA recogni-
tion. In addition, the RNA secondary structure can be
an important determinant of the protein binding affin-
ity. All of these aspects are presented in detail below.
Role of the two external β-strands and the loops
The β-sheet surface of an RRM can be modulated by
using only one or up to four β-strands for RNA binding.
Figure 4 clearly illustrates that the β-sheet surface
is not used to the same extent in each RRM–nucleic
acid complex. Exceptionally, in hnRNPA1 RRM 1,
each β-strand binds one nucleotide, the DNA being
spread on the β-sheet from β4 to β2 in the 5′–3′ direction.
More often, the nucleotide at the 5′ end of the
central dinucleotide contacts the loops at the bottom
of the β-sheet (loop 1 and loop 3 in particular,
Fig. 4C) and the one at the 3′ end stacks over the previous
nucleotide (Fig. 4A). In PABP RRM 1, it is different
again; while A6 and A8 stack on the protein side
chains at the canonical positions on β1 and β3, respectively,
the nucleotide in between, A7, interacts with
loop 3 (Fig. 4E).
Role of the N- and C-terminal regions
The N- and C-terminal regions of the RRM are often
of crucial importance to dramatically enhance the
RNA-binding affinity by increasing the protein–RNA
interaction network. In most RRM–RNA complexes,
the base stacking on the aromatic residue at RNP 2
position 2 is sandwiched either by a protein side chain
from the N-terminal region (CBP20) or by one from
the C-terminal region of the RRM (Fig. 4D–F) [36].
This side chain can be one residue after the end of β4
as in U1A [26,27] or 16 residues afterwards as in
hnRNPA1 RRM 1 [34] (Fig. 4D). The C-terminus of
hnRNPA1 RRM 1 is particularly interesting because it
is unstructured in the free form and becomes ordered
upon DNA binding forming a 3₁₀ helix. This structural
rearrangement reinforces the concept of binding by
induced fit, initially proposed with the structure of the
U1A–RNA complex [27]. Side chain residues of this
helix, His101 and Arg92, stack over A203 and G204,
respectively (Fig. 4D) [34].
The C-terminus can also contribute to differentiating
RNA from DNA by interacting with the 2′OH group
of the sugar ring as shown in Fig. 4B,E. The hydroxyl
group can act as a hydrogen bond acceptor interacting
with protein side chains (Fig. 4E, Arg94; Fig. 4B
Arg202) as well as with the backbone amide (Fig. 4B,
Gly205) and/or as a hydrogen bond donor interacting
with the carbonyl oxygen of the protein backbone [38].
Other parts of the RRM domain, such as the β2-strand
and the loops, also interact with the 2′OHs and help
to discriminate RNA from DNA [26,31,33,35].
The C-terminal region does not always enhance, but
can also inhibit RNA binding as shown in the struc-
ture of CBP20 [36] (Fig. 4F). Two residues (Asn116
and Arg123) of the C-terminus form a salt bridge
located above the RNP 1 residue at position 5 (Phe85)
preventing any RNA binding at this key position.
Similarly in PTB, the C-terminal region of all the
RRMs hydrophobically interacts with RNP 1 position
5, thereby masking this binding site (F. C. Oberstrass,
S. D. Auweter and F. H T. Allain, unpublished data).
Role of the RNA secondary structure
in RRM binding
Some proteins such as the N-terminal RRM of U1A
bind single-stranded RNA with high affinity only if the
RNA is embedded within a secondary structure, stem
loop (hairpin loop II of U1 snRNA [26]) or internal
loop (the regulatory element of the U1A 3′ untranslated
region [27]). For example, the U1A protein that recog-
nizes a stem loop has a much weaker affinity (104-fold)
for a single-stranded 23-mer RNA with no base pairs,
even though the proper single-stranded recognition
sequence is present [26]. U1A RRM 1 specifically recog-
nizes the secondary structure of the target RNA
through its loops 1 and 3 binding to a specific base pair.
In the case of U1A bound to a fragment of U1 snRNA
hairpin II, Arg52 (loop 3) makes crucial interactions
with the closing loop GC base pair and its substitution
by Glu completely abolishes RNA binding [26]
(Fig. 5A). U1A not only binds a stem loop but also an
internal loop [27,29]. This ability to bind RNA in different
environments shows the adaptability of the proteins
to recognize different secondary structures as long as
the key protein–RNA interactions are conserved. The
closely related U2B′ RRM binds the same hexanucleotide
sequence, AUUGCA, as U1A but within a different
stem loop (U2 snRNA hairpin IV) and only when
in complex with U2A′ (Fig. 5B). The adaptability of
the RRM domain is further illustrated here, as the key
residue Arg52 still interacts with the RNA stem
although the closing base pair is a UU base pair in
U2snRNA SLIV instead of a GC in U1snRNA SLII.
While both U1A and U2B′ recognize the bases at
the top of the stem through numerous hydrogen
bonds, nucleolin contacts the nucleolin recognition ele-
ment (sNRE) RNA stem essentially by van der Waals
interactions [28] (Fig. 5C). The two RRMs of nucleolin
sandwich the seven nucleotide loop and RRM 1 and
its C-terminal part recognize the unusual loop E structure
[28]. The substitution of the loop E by two GC
base pairs separated by a bulge increases the dissociation
constant more than 100-fold (from 5 nM to
0.8 µM) [30] and, as shown in Fig. 5D, this substitution
eliminates all van der Waals interactions (only one
hydrogen bond from Lys95 is retained). The double-
stranded stem is important for two reasons: first, it
restricts the conformation of the RNA loop and reduces
the entropy loss accompanying protein binding;
and second, some structural features of the RNA such
as the base pair (U1A and U2B′) or loop E (nucleolin)
that closes the RNA loop, are crucial for positioning
the RRM onto the RNA. It was postulated that the
RNA structure is essential because it induces conform-
ational changes in order to reach the bound state
[27,40].
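To put the nucleolin numbers quoted above on an energy scale, the change in dissociation constant can be converted into a difference in binding free energy with the standard relation ΔΔG = RT ln(Kd,variant / Kd,reference). The sketch below is a back-of-the-envelope illustration, not a calculation from the original work: the function names are hypothetical and the temperature of 298.15 K is an assumption, since the text does not state the measurement conditions.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant in kcal/(mol*K)


def fold_change(kd_variant, kd_reference):
    """Ratio of two dissociation constants given in the same units."""
    return kd_variant / kd_reference


def ddg_kcal_per_mol(kd_variant, kd_reference, temperature_k=298.15):
    """Loss of binding free energy, ddG = RT ln(Kd_variant / Kd_reference).

    temperature_k defaults to 298.15 K, which is an assumption here.
    """
    ratio = fold_change(kd_variant, kd_reference)
    return R_KCAL_PER_MOL_K * temperature_k * math.log(ratio)


# Dissociation constants quoted in the text, in mol/L.
KD_LOOP_E = 5e-9     # nucleolin with the loop E containing RNA (5 nM)
KD_BULGE = 0.8e-6    # loop E replaced by two GC pairs and a bulge (0.8 uM)

if __name__ == "__main__":
    print(f"fold change: {fold_change(KD_BULGE, KD_LOOP_E):.0f}")
    print(f"ddG: {ddg_kcal_per_mol(KD_BULGE, KD_LOOP_E):.1f} kcal/mol")
```

Under these assumptions the 160-fold weaker binding corresponds to roughly 3 kcal/mol of lost binding free energy, an order of magnitude that is plausible for the loss of a set of van der Waals contacts.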

Role of additional RRMs
The combination of two or more RRM domains
allows the continuous recognition of a long nucleotide
sequence (8–10 nucleotides) often drastically increasing
the affinity (Kd < nM). As shown previously, the
β-sheet surface can bind up to four nucleotides and up
to six if loops 1 and 3 contribute extensively to binding
Fig. 5. Role of the RNA secondary structure
in RRM binding. (A) U1A spliceosomal
protein–U1 snRNA hairpin II complex [26].
(B) U2B′–U2A′ protein complex bound to U2
snRNA hairpin IV [32]. (C) Nucleolin–sNRE
complex [28]. The loop E motif is composed
of a sheared base G5-A18 pair, an A6-U17-
G16 and a symmetric (trans-Hoogsteen)
locally parallel A7-A15 base pair. (D) Nucleolin–b2NRE
complex with the loop E motif
substituted by a bulge (U15 between two
GC base pairs) [30]. The color schemes are
the same as in Fig. 4, except that the protein
loops and the C-terminus are shown in
blue. This figure was generated with the
program MOLMOL [56].
(S. D. Auweter and F. H T. Allain, unpublished
data). Thus, recognition of a longer single-stranded
DNA or RNA requires more than one RRM to form
a larger binding platform. Four structures of two con-
secutive RRMs in complex with RNA (sex-lethal [31],
HuD [35], PABP [33] and nucleolin [28,30]) and one
with DNA (hnRNPA1 [34]) have been determined. In
all five cases, the two RRMs and the interdomain lin-
ker cooperatively bind RNA providing high affinity
and specificity. In the free forms of sex-lethal and
nucleolin, the linkers are disordered and the two RRM
domains tumble independently [37,41]. In some cases
(PABP, nucleolin), the interdomain linker (that is the
C-terminal region of the N-terminal RRM as described
above) acts as a bridge, mediating the cooperative
binding of two RRM domains with the RNA. More
interesting is the range of new possible conformations
provided by the association of two RRMs (Fig. 6). In
PABP, a large binding platform is created for the
RNA; in sex-lethal and HuD, the two RRMs form a
cleft in which the RNA lies; and in nucleolin the RNA
is sandwiched between the RRMs. As a consequence
of the relative arrangement of the two domains in sex-
lethal, HuD and nucleolin, several intra-RNA inter-
actions are created upon RNA binding that contribute
to the overall enthalpy of the complex, while in PABP
almost no intra-RNA interactions are present. On the
contrary, hnRNPA1 RRMs 1–2 and PTB RRMs 3–4
(F. C. Oberstrass, S. D. Auweter and F. H T. Allain,
unpublished results) are arranged in such a way that
only distantly located RNA sequences of the same
RNA can bind simultaneously to both RRMs. These
totally opposite topologies might reflect the opposite
function of the various RRM proteins, as both sex-
lethal and HuD are splicing activators, while
hnRNPA1 and PTB are splicing repressors [42].
The RRM, also a protein–protein
interaction domain
Over the last few years, biochemical and structural
studies have shown that the RRM is not only involved
in RNA recognition but also in protein–protein inter-
action. In addition to structures of multiple RRM-
Fig. 6. The RRM–RRM interactions. Several
protein structures either free or in a com-
plex in which two RRM domains interact
are shown. Structures of (A) UP1 in the free
form [53] (pdb:1 lp1), (B) nucleolin in com-
plex with RNA [28] (pdb:1fje), (C) sex-lethal
in complex with RNA [31] (pdb:1b7f),
(D) PABP in complex with RNA [33]
(pdb:1cvj), and (E) U1A homodimer in com-
plex with RNA [29] (pdb:1dz5). The RNA
backbone is shown in yellow (A–E), the
N-terminal RRM domain is displayed green,
C-terminal domain blue, and linker region
red. (F) One monomer of U1A is displayed
green and the other blue. In all cases,
important residues for the protein–protein
interaction are displayed as balls and sticks.
This figure was generated using the programs
MOLSCRIPT and RASTER3D [57,58].

containing proteins as described in the previous sec-
tion, structures of RRM domains in complex with var-
ious proteins or domains have been solved [32,43–51].
Analysis of these structures shows that protein recogni-
tion by RRM domains is very diverse with no general
mechanism emerging. For clarity, we distinguish three
main classes of RRM–protein interactions: between
two RRMs, between an RNA-binding RRM and a
non-RRM protein, and finally between RRMs that do
not bind RNA and another protein.
Protein interaction involving two RRM domains
The first structure showing an interaction between two
RRMs is the N-terminal region of hnRNPA1 (UP1) in
its free form that contains two RRM domains separ-
ated by a short linker [52,53]. The two RRMs form a
compact fold and interact with each other via their
α-helix 2. The interaction is stabilized by two salt bridges
connecting two arginines of the first RRM and
two aspartic acids of the second (Fig. 6A). This
arrangement places the β-sheets of both domains
side by side, forming an extended surface of eight
β-strands. Similarly, PTB RRMs 3 and 4, separated by
a 24 residue linker region, do not tumble independ-
ently in the free form (F. C. Oberstrass, S. D. Auweter
and F. H T. Allain, unpublished data).
These RRM–RRM interactions are not a general
feature of all RRM proteins. In the case of sex-lethal
and nucleolin, in the free proteins, the linker is flexible
and the two RRM domains are independent [28,41].
However, upon RNA binding, the two RRM domains
adopt a fixed orientation and contact each other. In
the nucleolin structure, the RRMs interact via two salt
bridges located in the loops (Fig. 6B) and in the structure
of hnRNPA1, the RRMs interact by salt bridges
located in the α2-helix. Other examples of RNA inducing
RRM–RRM interactions have also been described
in the case of sex-lethal [31], PABP [33], and HuD
[35]. In sex-lethal and HuD, the interdomain interaction
is mainly governed by two hydrogen bonds
between residues located in β1 and β4 of RRM 1 and
in β2 of RRM 2 (Fig. 6C). Furthermore, additional
contacts between RRM 2 and the linker region are
observed. In the case of PABP, the interdomain interactions
are mediated through many salt bridges and
van der Waals contacts between α2 and β4 of RRM 1
and β2 and α1 of RRM 2, respectively (Fig. 6D).
Another interesting example of RRM–RRM interaction is found in the structure of the N-terminal RRM domain of the U1A protein in complex with the polyadenylation inhibition element (PIE) RNA [29]. In this case, two U1A proteins bind cooperatively to the PIE RNA [54]. The structure shows that when bound to RNA, U1A RRM 1 forms a homodimer stabilized by interactions between the two α-helical C-termini (Fig. 6E). On one side, the C-terminal α-helix contains charged residues that interact with the RNA; on the opposite side, it contains hydrophobic residues that constitute the dimer interface.
All of these structures clearly show that RRM
domains can be involved in RRM–RRM interaction in
addition to RNA binding. In most of these complexes,
these additional interactions contribute to the forma-
tion of a larger RNA-binding interface and are there-
fore critical to reach high RNA-binding affinity and
specificity. This feature is likely to be frequently found
in multiple RRM-containing proteins, especially if the
interdomain linker is short.
Protein interaction involving one RRM domain and another domain
In some cases, it has been demonstrated that RRM-containing proteins can associate with RNA only in the presence of another protein that acts as a cofactor. Both U2B″ and CBP20 need a cofactor, U2A′ and CBP80, respectively, to recognize RNA. Structures of these ternary complexes have been solved that partially explain the importance of a cofactor in RNA–RRM binding [32,43–45]. U2A′ consists of five consecutive leucine-rich repeats, and CBP80 of three helical hairpin repeats very similar to the fold of the middle domain of the translation initiation factor 4G (MIF4G). In both cases, the RRM domains of U2B″ and CBP20 interact with the leucine-rich repeat (LRR) motif or the MIF4G domain through their α-helices and loop 4, keeping the β-sheet accessible for RNA binding (Fig. 7). The interactions, however, are different as they are governed mainly by hydrophobic contacts in the U2B″–U2A′ complex, and by salt bridges and hydrogen bonds in the CBP20–CBP80 complex. Furthermore, in the case of CBP20, the N- and C-terminal extensions flanking the RRM domain become structured only when in complex with both RNA and CBP80. As for RRM–RRM interactions, these RRM–protein interactions contribute to RNA-binding specificity, U2A′ contacting the RNA and CBP80 stabilizing both the N- and C-termini of CBP20 RRM, two key components of CBP20–RNA recognition (Fig. 4) [44].
RRM domains involved only in protein recognition
Some proteins containing RRM domains are involved
in protein–protein but not in protein–RNA interactions.
Recently, three-dimensional structures of such proteins in complex partially explained this unexpected behavior of the RRM domain. Two different situations, however, have been reported. In one case, the protein interaction involves the β-sheet of the RRM domain, thus preventing RNA binding, as in the Y14–Magoh complex [46–49] or the UPF2–UPF3 complex [50]. In a second case, the interaction is mediated through the α-helices, leaving the β-sheet solvent-exposed and therefore theoretically able to bind RNA, as with the U2AF35–U2AF65 [51] and the U2AF65–SF1 complexes [46]. In this latter case, it was postulated that the particular behavior of these RRM domains is due mainly to the identity of the amino acids on the surface of the β-sheet (see below [25]).
Y14 and Magoh proteins are part of the exon junction complex that comprises several proteins. Y14 and Magoh form a highly stable complex with nanomolar binding affinity [48]. The C-terminal domain of Y14 has a typical RRM fold and the RNP 1 and RNP 2 amino acid sequences of Y14 are very similar to those of other RRM domains (Fig. 1). However, Y14 does not bind RNA. Structures of the Y14–Magoh heterodimer show that Y14 binds Magoh through its entire β-sheet [46–48] (Fig. 8). This particular mode of complex formation neatly explains why some RRM domains do not have RNA-binding activity. Similarly, in the structure of the UPF2–UPF3 complex involved in nonsense-mediated mRNA decay, the β-sheet of the N-terminal RRM domain of UPF3 binds UPF2 [50]. Although the two RRM proteins both interact through their β-sheet, their interacting proteins, Magoh and UPF2, adopt completely different folds. UPF2 has a totally α-helical MIF4G fold very similar to CBP80, while Magoh has an αβ fold (Fig. 8). Also striking is the fact that both UPF2 and CBP80 adopt a MIF4G fold, but recognize RRM in a totally different manner, UPF2 recognizing the RRM β-sheet and CBP80 the RRM α-helices.
The structures of the splicing factors U2AF35–U2AF65 and U2AF65–SF1 are another example of the diversity encountered in protein–RRM recognition. U2AF65 contains three RRM domains, the two
Fig. 7. The RRM–protein–RNA trimolecular complexes. (A) The U2B″–U2A′–RNA ternary complex [32]. (B) The CBP20–CBP80–RNA complex [36]. The RNA is shown in yellow, the RRM domain in green, and leucine-rich repeats or MIF4G domains in blue. Residues important for the interaction are displayed as balls and sticks. This figure was generated using the programs MOLSCRIPT and RASTER3D [57,58].
Fig. 8. The Y14–Magoh complex [48]. Y14 is shown in green, and Magoh is shown in blue. The RNP 1 and 2 of Y14 are shown in red. This figure was generated using the programs MOLSCRIPT and RASTER3D [57,58].
N-terminal domains binding RNA while the C-terminal domain mediates SF1 interaction. U2AF35 contains a central RRM domain flanked by two zinc finger domains. The structures of U2AF35 RRM in complex with the N-terminal domain of U2AF65 and of the RRM of U2AF65 in complex with the N-terminal domain of SF1 have been solved [46,51]. Surprisingly, in these cases the protein interaction does not involve the β-sheet of the RRM domain, as it does for other non-RNA-binding RRM domains, but the two α-helices. Analysis of the RRM fold in these two structures shows striking differences from the canonical RRM domains, mainly a longer helix α1 (Fig. 2) and the absence of aromatic residues in the RNP 1 and 2 motifs. The authors therefore proposed a novel class of protein recognition motif that they named U2AF homology motif (UHM) [25].
The examples described above define a novel class of RRM domains that are involved in protein but not RNA interactions, suggesting that RRM domains might have evolved from RNA to protein recognition. Although these RRM proteins do not bind RNA, they are all implicated in RNA-related functions such as recognition of the exon junction (Y14), mRNA decay (UPF3) or pre-mRNA splicing (U2AF35 and U2AF65). This evolutionary process can be accompanied by amino acid substitutions in the RNA-binding regions, namely RNP 1 and 2, as proposed for the UHM domain. However, in the case of Y14 and UPF3, it is not entirely clear why these RRM domains that are very similar to the classical ones favor interaction with proteins rather than RNA.
Conclusion and perspectives
The RNA recognition motif is an abundant and very
diverse protein motif found mainly in eukaryotes.
Analysis of the structures of this domain in the free
form as well as in complex with both RNA and pro-
teins shows that this small domain is extremely
diverse in terms of both structure and function. We
are now just starting to understand the structural,
functional, as well as evolutionary aspects of this
domain. It is now clear that the original perception
of the RRM as a simple rigid RNA-binding domain
must evolve and that further biochemical and struc-
tural studies are needed to obtain a full picture of
its role in the cell. Structures of RRM domains in
complex with different RNAs show that this small
compact domain is a central component of RNA
recognition but not the only determinant. N- and
C-terminal extensions, multiplication of RRM
domains or protein cofactors can play an important
role in RNA-binding specificity. This review also raises many questions concerning this domain. First, concerning RNA binding, analysis of the different structures shows that although some conserved aromatic residues are always found at the interface, the topology of the bound RNA is quite different in each complex and the sequence-specificity cannot easily be predicted. Thus, more structures of RRM–RNA complexes are needed to fully understand the determinants of this specificity. Second, RRM
domains are able to bind RNA with affinities ran-
ging from very high to weak, and the structural and
thermodynamic determinants of the RNA-binding
affinity still need to be elucidated. Third, as it is
now demonstrated that some RRM domains are spe-
cific to protein recognition rather than RNA binding,
which of the identified RRM domains are true
RNA-binding domains and which ones are not? In
some cases, the primary sequence can differentiate
between these behaviors, as for the novel UHM
domain, but in other cases, such as Y14 and UPF3,
structural determinants other than the amino acid
sequence must be present but are still unknown and
need to be identified. Fourth, it is established that a
high number of proteins contain both RRM and
auxiliary domains, such as zinc fingers, also involved
in nucleic acid binding. No structural studies, how-
ever, indicate if these two RNA-binding domains
within the same protein influence each other for
RNA binding. Finally, it has recently been discov-
ered that the RRM domain, for a long time thought
to belong exclusively to the eukaryotic world, is also
present in bacteria, viruses and mitochondria. From
an evolutionary point of view, it would be very interesting to investigate the function of this domain in such organisms and maybe discover their common ancestor. In conclusion, further structural investigations on RRM domains possibly coupled with thermodynamic and kinetic studies are still needed to confirm present hypotheses and possibly to reveal more surprises.
Acknowledgements
The authors would like to acknowledge the financial
support of the Fondation Schlumberger pour l’Educa-
tion et la Recherche (postdoctoral fellowship), the
Swiss National Science Foundation (Nr. 31–67098.01),
the Roche Research Fund for Biology at the ETH
Zurich and the SNF NCCR structural biology to
FHTA.
References
1 Dreyfuss G, Swanson MS & Pinol-Roma S (1988) Het-
erogeneous nuclear ribonucleoprotein particles and the
pathway of mRNA formation. Trends Biochem Sci 13,
86–91.
2 Adam SA, Nakagawa T, Swanson MS, Woodruff TK
& Dreyfuss G (1986) mRNA polyadenylate-binding pro-
tein: gene isolation and sequencing and identification of
a ribonucleoprotein consensus sequence. Mol Cell Biol
6, 2932–2943.
3 Swanson MS, Nakagawa TY, LeVan K & Dreyfuss G
(1987) Primary structure of human nuclear ribonucleo-
protein particle C proteins: conservation of sequence
and domain structures in heterogeneous nuclear RNA,
mRNA, and pre-rRNA-binding proteins. Mol Cell Biol 7, 1731–1739.
4 Bandziulis RJ, Swanson MS & Dreyfuss G (1989)
RNA-binding proteins as developmental regulators.
Genes Dev 3, 431–437.
5 Kenan DJ, Query CC & Keene JD (1991) RNA recog-
nition: towards identifying determinants of specificity.
Trends Biochem Sci 16, 214–220.
6 Birney E, Kumar S & Krainer AR (1993) Analysis of
the RNA-recognition motif and RS and RGG domains:
conservation in metazoan pre-mRNA splicing factors.
Nucleic Acids Res 21, 5803–5816.
7 Maruyama K, Sato N & Ohta N (1999) Conservation
of structure and cold-regulation of RNA-binding pro-
teins in cyanobacteria: probable convergent evolution
with eukaryotic glycine-rich RNA-binding proteins.
Nucleic Acids Res 27, 2029–2036.
8 Bateman A, Birney E, Cerruti L, Durbin R, Etwiller L,
Eddy SR, Griffiths-Jones S, Howe KL, Marshall M &
Sonnhammer EL (2002) The Pfam protein families data-
base. Nucleic Acids Res 30, 276–280.
9 Hudson BP, Martinez-Yamout MA, Dyson HJ &
Wright PE (2004) Recognition of the mRNA AU-rich
element by the zinc finger domain of TIS11d. Nat Struct
Mol Biol 11, 257–264.
10 De Guzman RN, Wu ZR, Stalling CC, Pappalardo L,
Borer PN & Summers MF (1998) Structure of the HIV-
1 nucleocapsid protein bound to the SL3 psi-RNA
recognition element. Science 279, 384–388.
11 Sudol M, Sliwa K & Russo T (2001) Functions of WW
domains in the nucleus. FEBS Lett 490, 190–195.

12 Roy G, De Crescenzo G, Khaleghpour K, Kahvejian A,
O’Connor-McCourt M & Sonenberg N (2002) Paip1
interacts with poly(A) binding protein through two
independent binding motifs. Mol Cell Biol 22, 3769–
3782.
13 Kozlov G, Trempe JF, Khaleghpour K, Kahvejian A,
Ekiel I & Gehring K (2001) Structure and function of
the C-terminal PABC domain of human poly(A)-binding
protein. Proc Natl Acad Sci USA 98, 4409–4413.
14 Lin KT, Lu RM & Tarn WY (2004) The WW domain-
containing proteins interact with the early spliceosome
and participate in pre-mRNA splicing in vivo. Mol Cell
Biol 24, 9176–9185.
15 Schuster G & Gruissem W (1991) Chloroplast mRNA
3′ end processing requires a nuclear-encoded RNA-binding
protein. EMBO J 10, 1493–1502.
16 Vermel M, Guermann B, Delage L, Grienenberger JM,
Marechal-Drouard L & Gualberto JM (2002) A family
of RRM-type RNA-binding proteins specific to plant
mitochondria. Proc Natl Acad Sci USA 99, 5866–5871.
17 Nagai K, Oubridge C, Jessen TH, Li J & Evans PR
(1990) Crystal structure of the RNA-binding domain of
the U1 small nuclear ribonucleoprotein A. Nature 348,
515–520.
18 Liker E, Fernandez E, Izaurralde E & Conti E (2000)
The structure of the mRNA export factor TAP reveals
a cis arrangement of a non-canonical RNP domain and
an LRR domain. EMBO J 19, 5587–5598.
19 Perez-Alvarado GC, Martinez-Yamout M, Allen MM,
Grosschedl R, Dyson HJ & Wright PE (2003) Structure
of the nuclear factor ALY: insights into post-transcriptional
regulatory and mRNA nuclear export processes.
Biochemistry 42, 7348–7357.
20 Jacks A, Babon J, Kelly G, Manolaridis I, Cary PD,
Curry S & Conte MR (2003) Structure of the C-term-
inal domain of human La protein reveals a novel RNA
recognition motif coupled to a helical nuclear retention
element. Structure (Camb) 11, 833–843.
21 Avis JM, Allain FH, Howe PW, Varani G, Nagai K &
Neuhaus D (1996) Solution structure of the N-terminal
RNP domain of U1A protein: the role of C-terminal
residues in structure stability and RNA binding. J Mol
Biol 257, 398–411.
22 Perez Canadillas JM & Varani G (2003) Recognition of
GU-rich polyadenylation regulatory elements by human
CstF-64 protein. EMBO J 22, 2821–2830.
23 Conte MR, Grune T, Ghuman J, Kelly G, Ladas A,
Matthews S & Curry S (2000) Structure of tandem
RNA recognition motifs from polypyrimidine tract
binding protein reveals novel features of the RRM fold.
EMBO J 19, 3132–3141.
24 Simpson PJ, Monie TP, Szendroi A, Davydova N, Tyz-
ack JK, Conte MR, Read CM, Cary PD, Svergun DI,
Konarev PV, Curry S & Matthews S (2004) Structure
and RNA interactions of the N-terminal RRM domains
of PTB. Structure (Camb) 12, 1631–1643.
25 Kielkopf CL, Lucke S & Green MR (2004) U2AF
homology motifs: protein recognition in the RRM
world. Genes Dev 18, 1513–1526.
26 Oubridge C, Ito N, Evans PR, Teo CH & Nagai K
(1994) Crystal structure at 1.92 Å resolution of the
RNA-binding domain of the U1A spliceosomal protein
complexed with an RNA hairpin. Nature 372,
432–438.
27 Allain FH, Gubser CC, Howe PW, Nagai K, Neuhaus
D & Varani G (1996) Specificity of ribonucleoprotein
interaction determined by RNA folding during complex
formation. Nature 380, 646–650.
28 Allain FH, Bouvet P, Dieckmann T & Feigon J (2000)
Molecular basis of sequence-specific recognition of pre-
ribosomal RNA by nucleolin. EMBO J 19, 6870–6881.
29 Varani L, Gunderson SI, Mattaj IW, Kay LE, Neuhaus
D & Varani G (2000) The NMR structure of the 38
kDa U1A protein – PIE RNA complex reveals the basis
of cooperativity in regulation of polyadenylation by
human U1A protein. Nat Struct Biol 7, 329–335.
30 Johansson C, Finger LD, Trantirek L, Mueller TD,
Kim S, Laird-Offringa IA & Feigon J (2004) Solution
structure of the complex formed by the two N-terminal
RNA-binding domains of nucleolin and a pre-rRNA
target. J Mol Biol 337, 799–816.
31 Handa N, Nureki O, Kurimoto K, Kim I, Sakamoto H,
Shimura Y, Muto Y & Yokoyama S (1999) Structural
basis for recognition of the tra mRNA precursor by the
Sex-lethal protein. Nature 398, 579–585.
32 Price SR, Evans PR & Nagai K (1998) Crystal structure
of the spliceosomal U2B″–U2A′ protein complex bound
to a fragment of U2 small nuclear RNA. Nature 394,
645–650.
33 Deo RC, Bonanno JB, Sonenberg N & Burley SK
(1999) Recognition of polyadenylate RNA by the
poly(A)-binding protein. Cell 98, 835–845.
34 Ding J, Hayashi MK, Zhang Y, Manche L, Krainer
AR & Xu RM (1999) Crystal structure of the two-
RRM domain of hnRNP A1 (UP1) complexed with
single-stranded telomeric DNA. Genes Dev 13, 1102–
1115.
35 Wang X & Tanaka Hall TM (2001) Structural basis for
recognition of AU-rich element RNA by the HuD pro-
tein. Nat Struct Biol 8, 141–145.
36 Mazza C, Segref A, Mattaj IW & Cusack S (2002)
Large-scale induced fit recognition of an m(7)GpppG
cap analogue by the human nuclear cap-binding complex.
EMBO J 21, 5548–5557.
37 Allain FH, Gilbert DE, Bouvet P & Feigon J (2000)
Solution structure of the two N-terminal RNA-binding
domains of nucleolin and NMR study of the interaction
with its RNA target. J Mol Biol 303, 227–241.
38 Allers J & Shamoo Y (2001) Structure-based analysis of
protein–RNA interactions using the program ENTAN-
GLE. J Mol Biol 311, 75–86.
39 Varani G & Nagai K (1998) RNA recognition by RNP
proteins during RNA processing. Annu Rev Biophys
Biomol Struct 27, 407–445.
40 Showalter SA & Hall KB (2004) Altering the RNA-
binding mode of the U1A RBD1 protein. J Mol Biol
335, 465–480.
41 Crowder SM, Kanaar R, Rio DC & Alber T (1999)
Absence of interdomain contacts in the crystal structure
of the RNA recognition motifs of Sex-lethal. Proc Natl
Acad Sci USA 96, 4892–4897.
42 Grabowski PJ & Black DL (2001) Alternative RNA
Splicing in the nervous system. Prog Neurobiol 65,
289–308.
43 Mazza C, Ohno M, Segref A, Mattaj IW & Cusack S
(2001) Crystal structure of the human nuclear cap bind-
ing complex. Mol Cell 8, 383–396.
44 Mazza C, Segref A, Mattaj IW & Cusack S (2002)
Co-crystallization of the human nuclear cap-binding
complex with a m7GpppG cap analogue using protein
engineering. Acta Crystallogr D Biol Crystallogr 58,
2194–2197.
45 Calero G, Wilson KF, Ly T, Rios-Steiner JL, Clardy
JC & Cerione RA (2002) Structural basis of m7GpppG
binding to the nuclear cap-binding protein complex. Nat
Struct Biol 9, 912–917.
46 Selenko P, Gregorovic G, Sprangers R, Stier G,
Rhani Z, Kramer A & Sattler M (2003) Structural
basis for the molecular recognition between human
splicing factors U2AF65 and SF1/mBBP. Mol Cell
11, 965–976.
47 Fribourg S, Gatfield D, Izaurralde E & Conti E
(2003) A novel mode of RBD-protein recognition
in the Y14-Mago complex. Nat Struct Biol 10, 433–
439.
48 Lau CK, Diem MD, Dreyfuss G & Van Duyne GD
(2003) Structure of the Y14-Magoh core of the exon
junction complex. Curr Biol 13, 933–941.

49 Bono F, Ebert J, Unterholzner L, Guttler T, Izaurralde
E & Conti E (2004) Molecular insights into the interac-
tion of PYM with the Mago-Y14 core of the exon junc-
tion complex. EMBO Report 5, 304–310.
50 Kadlec J, Izaurralde E & Cusack S (2004) The structural
basis for the interaction between nonsense-mediated
mRNA decay factors UPF2 and UPF3. Nat
Struct Mol Biol 11, 330–337.
51 Kielkopf CL, Rodionova NA, Green MR & Burley SK
(2001) A novel peptide recognition mode revealed by
the X-ray structure of a core U2AF35/U2AF65 heterodimer.
Cell 106, 595–605.
52 Xu RM, Jokhan L, Cheng X, Mayeda A & Krainer AR
(1997) Crystal structure of human UP1, the domain of
hnRNP A1 that contains two RNA-recognition motifs.
Structure 5, 559–570.
53 Shamoo Y, Krueger U, Rice LM, Williams KR & Steitz
TA (1997) Crystal structure of the two RNA binding
domains of human hnRNP A1 at 1.75 Å resolution.
Nat Struct Biol 4, 215–222.
54 van Gelder CW, Gunderson SI, Jansen EJ, Boelens
WC, Polycarpou-Schwarz M, Mattaj IW & van Ven-
rooij WJ (1993) A complex secondary structure in U1A
pre-mRNA that binds two molecules of U1A protein is
required for regulation of polyadenylation. EMBO J 12,
5191–5200.
55 Thompson JD, Higgins DG & Gibson TJ (1994) CLUS-
TAL W: improving the sensitivity of progressive multi-
ple sequence alignment through sequence weighting,
position-specific gap penalties and weight matrix choice.
Nucleic Acids Res 22, 4673–4680.
56 Koradi R, Billeter M & Wuthrich K (1996) MOLMOL:
a program for display and analysis of macromolecular
structures. J Mol Graph 51–5, 29–32.
57 Kraulis PJ (1991) MOLSCRIPT: a program to produce
both detailed and schematic plots of protein structures.
J Appl Crystallogr 24, 946–950.
58 Merritt EA & Murphy MEP (1994) Raster3d, Version
2.0: A program for photorealistic molecular graphics.
Acta Crystallogr D Biol Crystallogr 50, 869–873.