REVIEW ARTICLE
The multidrug/oligosaccharidyl-lipid/polysaccharide (MOP)
exporter superfamily
Rikki N. Hvorup, Brit Winnen, Abraham B. Chang, Yong Jiang, Xiao-Feng Zhou and Milton H. Saier Jr
Division of Biological Sciences, University of California at San Diego, USA
The multidrug/oligosaccharidyl-lipid/polysaccharide (MOP)
exporter superfamily (TC #2.A.66) consists of four previ-
ously recognized families: (a) the ubiquitous multi-drug and
toxin extrusion (MATE) family; (b) the prokaryotic poly-
saccharide transporter (PST) family; (c) the eukaryotic
oligosaccharidyl-lipid flippase (OLF) family and (d) the
bacterial mouse virulence factor family (MVF). Of these
four families, only members of the MATE family have been
shown to function mechanistically as secondary carriers,
and no member of the MVF family has been shown to
function as a transporter. Establishment of a common ori-
gin for the MATE, PST, OLF and MVF families suggests a
common mechanism of action as secondary carriers cata-
lyzing substrate/cation antiport. Most protein members of
these four families exhibit 12 putative transmembrane
α-helical segments (TMSs), and several have been shown to
have arisen by an internal gene duplication event; topo-
logical variation is observed for some members of the
superfamily. The PST family is more closely related to the
MATE, OLF and MVF families than any of these latter
three families are related to each other. This fact leads to the
suggestion that primordial proteins most closely related to
the PST family were the evolutionary precursors of all
members of the MOP superfamily. Here, phylogenetic trees
and average hydropathy, similarity and amphipathicity
plots for members of the four families are derived and
provide detailed evolutionary and structural information
about these proteins. We show that each family exhibits
unique characteristics. For example, the MATE and PST
families are characterized by numerous paralogues within a
single organism (58 paralogues of the MATE family are
present in Arabidopsis thaliana), while the OLF family
consists exclusively of orthologues, and the MVF
family consists primarily of orthologues. Only in the PST
family has extensive lateral transfer of the encoding genes
occurred, and in this family as well as the MVF family,
topological variation is a characteristic feature. The results
serve to define a large superfamily of transporters that we
predict function to export substrates using a monovalent
cation antiport mechanism.
Keywords: transport; membranes; proton motive force;
superfamily; phylogeny; drug resistance; polysaccharides;
lipid-linked oligosaccharides.
Major families of drug efflux pumps
Bacterial species that have developed clinical resistance to
antimicrobial agents are increasing in numbers and have
become a serious problem in hospitals [1]. One of the major
mechanisms of drug resistance in both prokaryotes and
eukaryotes involves drug efflux from cells. There are many
drug efflux systems known in bacteria [2–5], and these
belong to five ubiquitous transporter (super)families [6,7].
Four of them (RND, DMT, MFS and MATE; see below)
use drug/H⁺ or Na⁺ antiport to energize drug efflux while
one, the ATP-binding cassette (ABC), uses ATP hydrolysis.
Most characterized members of the resistance/nodula-
tion/division (RND) superfamily [8] function as drug or
heavy metal efflux pumps in Gram-negative bacteria.
Homologues in Gram-positive bacteria serve as lipid
exporters [9]. Small multidrug resistance (SMR) family
pumps within the drug/metabolite transporter (DMT)
superfamily [10] consist of homodimeric or heterodimeric
structures with only four TMSs per subunit [11]. They
export cationic drugs using a simple cation antiport
mechanism involving a conserved glutamyl residue [12].
Drug exporters of the major facilitator superfamily
(MFS) are found within six families, each known to
transport a broad range of structurally distinct drugs
[13,14].
The ABC superfamily of ATP-driven transporters
includes many families that are potentially active in the
uptake or efflux of metabolite analogues and other drugs.
For instance, the oligopeptide uptake transporter of
Salmonella typhimurium takes up aminoglycoside antibiotics
such as kanamycin and neomycin (reviewed in [7]).
Uptake systems segregate from the efflux systems phylogenetically
[15]. Members of this superfamily can also act on
many types of macromolecules, a unique characteristic of
the ABC superfamily [16].
Correspondence to: M. Saier, Division of Biological Sciences,
University of California at San Diego, La Jolla, CA 92093 0116,
USA. Fax: 001 858 534 7108, Tel.: 001 858 534 4084,
E-mail:
Abbreviations: ABC, ATP-binding cassette; MATE, multi-drug and
toxin extrusion; MOP, multidrug/oligosaccharidyl-lipid/polysaccharide;
MPA, membrane-periplasmic auxiliary; MVF, mouse virulence
factor; OLF, oligosaccharidyl-lipid flippase; PST, prokaryotic
polysaccharide transporter; TMS, transmembrane segment.
(Received 23 July 2002, revised 14 November 2002,
accepted 9 December 2002)
Eur. J. Biochem. 270, 799–813 (2003) © FEBS 2003 doi:10.1046/j.1432-1033.2003.03418.x
The MATE family of drug exporters (TC #2.A.66.1)
Only a few members of the multidrug and toxin extrusion
(MATE) family [3] are characterized functionally (Table 1).
These proteins include: (a) NorM and (b) VmrA in
Vibrio parahaemolyticus, a halophilic marine bacterium
that is one of the major causes of food poisoning in Japan
[17–19]; (c) YdhE from Escherichia coli, a close homologue
of NorM [20]; (d) Alf5 from the plant, A. thaliana [21];
(e) VcmA from Vibrio cholerae non-O1, a nonhalophilic
Vibrio species [22] and (f) BexA from Bacteroides thetaiotao-
micron [23]. One member of the family, the yeast Erc1
protein, elevates resistance to the methionine analogue,
ethionine [24].
NorM, VmrA and VcmA have been shown to function
by a drug/Na⁺ antiport mechanism [17,19,22], and other
members of the superfamily may also be drug/Na⁺
antiporters. Several members are annotated as DinF
proteins. The functions of the DinF proteins are unknown,
but expression of some of these proteins has been shown to
be induced by DNA damage [25,26]. A function related to
the export of nucleotides excised from damaged DNA
during photorepair can be postulated.
The polysaccharide transporter (PST) family
(TC #2.A.66.2)
Characterized protein members of the PST family are
generally of 400–500 amino acid residues in size and traverse
the membrane 12 times as putative α-helical TMSs.
Analyses conducted in 1997 [27] showed that members of
the PST family formed two major clusters, one of which was
concerned putatively with lipopolysaccharide O-antigen
repeat unit export (flipping) in Gram-negative bacteria,
the other of which was concerned with exopolysaccharide or
capsular polysaccharide export in both Gram-negative and
Gram-positive bacteria. However, numerous archaeal
homologues are now recognized, and bacteria use PST
systems to export other complex carbohydrates such as
teichuronic acids [28]. The mechanism of energy coupling
for PST exporters is not established.
PST transporters may function together with auxiliary
proteins that regulate transport and allow passage of
complex carbohydrates across both membranes of the
Gram-negative bacterial envelope [29]. Thus, each Gram-
negative bacterial PST system specific for an exo- or
capsular polysaccharide functions in conjunction with
a cytoplasmic membrane-periplasmic auxiliary (MPA)
protein with a cytoplasmic ATP-binding domain (MPA1-
C; TC #8.A.3) as well as an outer membrane auxiliary
protein (OMA; TC #1.B.18) [27]. Each Gram-positive
bacterial PST system functions in conjunction with a
homologous MPA1 + C pair of proteins (TC #8.A.3)
equivalent to an MPA1-C protein of Gram-negative
bacteria. The C-domain has been shown to possess
tyrosine protein kinase activity, suggesting that it functions
in a regulatory capacity [30]. The lipopolysaccharide
exporters may function specifically in the translocation of
the lipid-linked O-antigen side chain precursor from the
inner leaflet of the cytoplasmic membrane to the outer
leaflet, but this possibility has not been established
experimentally.
The oligosaccharidyl-lipid flippase (OLF) family
(TC #2.A.66.3)
N-Linked glycosylation in eukaryotic cells follows a
conserved pathway in which a tetradecasaccharide substrate
(Glc₃Man₉GlcNAc₂) is assembled initially in the
endoplasmic reticular (ER) membrane as a dolichylpyrophosphate
(Dol-PP)-linked intermediate before being transferred
to an asparaginyl-residue in a lumenal protein. An
intermediate, Man₅GlcNAc₂-PP-Dol is made on the cytoplasmic
side of the membrane and translocated across the
membrane so that the oligosaccharide chain faces the ER
lumen where biosynthesis continues to completion [31]. The
exporter in Saccharomyces cerevisiae that catalyzes the
translocation step is the 574 amino acid nuclear division
Rft1 protein with 12 putative TMSs [31]. Homologues are
found in plants, animals and fungi.
Table 1. Functionally characterized members of the multidrug and toxin extrusion (MATE) family.
Organism | Abbreviation | Gene name (Accession number) | Substrates (drug class) | References
Arabidopsis thaliana | Ath57 | Alf5 (BAB02774) | Tetramethylammonium, PVP (polyvinylpyrrolidone), pyrrolidinone | [21]
Escherichia coli | Eco2 | NorM (YdhE) (P37340) | Ciprofloxacinᵃ, berberine, kanamycinᵇ, streptomycinᵇ, acriflavine, tetraphenylphosphonium ion (TPP), chloramphenicolᵉ, norfloxacinᵃ, enoxacinᵃ, fosfomycin, doxorubicinᶜ, trimethoprimᵈ, ethidium bromide, benzalkonium, deoxycholate | [1,25]
Bacteroides thetaiotaomicron | Bth1 | BexA (BAB64566) | Norfloxacinᵃ, ciprofloxacinᵃ, ethidium bromide | [19]
Vibrio cholerae | Vch1 | NorM (VcmA) (Q9KRU4) | Norfloxacinᵃ, ciprofloxacinᵃ, ofloxacinᵃ, daunomycinᶜ, doxorubicinᶜ, streptomycinᵇ, kanamycinᵇ, ethidium bromide, 4′,6-diamidino-2-phenylindole dihydrochloride (DAPI), Hoechst 33342, acriflavine | [10]
Vibrio parahaemolyticus | Vpa1 | NorM (O82855) | Norfloxacinᵃ, ethidium bromide, kanamycinᵇ, ciprofloxacinᵃ, streptomycinᵇ | [25]
Vibrio parahaemolyticus | Vpa3 | VmrA (BAB68204) | 4′,6-diamidino-2-phenylindole (DAPI), TPP, acriflavine, ethidium bromide | [17]
Saccharomyces cerevisiae | Sce3 | Erc1 (382954) | Ethionine | [28]
ᵃ Quinolones and fluoroquinolones; ᵇ aminoglycosides; ᶜ anthracyclines such as adriamycin hydrochloride; ᵈ trimethoprim-sulfamethoxazole; ᵉ miscellaneous antibiotics.
The mouse virulence factor (MVF) family (TC #2.A.66.4)
A single member of the MVF family, MviN of Salmonella
typhimurium, has been shown to be an important
virulence factor for this organism when infecting the mouse
[32]. In several bacteria, genes encoding MviN homologues
occur in operons that also encode the uridylyl transferase,
GlnD, that functions in the regulation of nitrogen meta-
bolism [33]. Nothing more is known about the function of
MviN or any other member of the MVF family. However,
as will be shown below, these proteins are related to
members of the PST and MATE families with greatest
sequence similarity to members of the PST family. It is
therefore possible that MVF family members are related
functionally to PST family members exporting complex
carbohydrates or related substances.
The MOP superfamily
In this paper, we show that the MATE family of drug
exporters, the PST family of polysaccharide exporters, the
OLF family of lipid-linked oligosaccharide exporters and the
MVF family of mouse virulence-related proteins are all
homologous and are, therefore, related by common descent.
We designate the superfamily that includes these four related
families the MOP (MATE/MVF-OLF-PST) superfamily.
Currently sequenced members of these families are identified,
and the distribution of their members in the living world is
determined. While MATE family members are found in all
domains of life (bacteria, archaea and eukaryotes), PST
family members are restricted to prokaryotes (both archaea
and bacteria), OLF family members are restricted to
eukaryotes and MVF family members are restricted to
bacteria. In contrast to the MATE and PST families that
exhibit multiple paralogues in any one organism, the
eukaryotic OLF family is currently very small, consisting
of only eight sequenced orthologous members with no more
than one homologue per organism, and the MVF family,
while much larger, probably also consists primarily of
orthologues. Because at least some members of the prokaryotic-specific
PST family can flip lipid-linked oligosaccharides
(i.e. O-antigen precursors of lipopolysaccharides in
Gram-negative bacteria), members of this family may serve
as the functional counterpart of the oligosaccharidyl lipid
exporters of eukaryotes. The reported sequence analyses lead
us to suggest that prokaryotic oligosaccharidyl-lipid export-
ers were the primordial systems that gave rise to all members
of the MOP superfamily. We tabulate these proteins
according to family and derive reliable multiple alignments
upon which phylogenetic trees are based (see our
ALIGN website).
We also derive average hydropathy, similarity and amphipathicity
plots which allow us to make transmembrane
topological predictions, and these in turn lead to predictions
regarding the evolutionary origins of protein topological
types found within the MOP superfamily. Most importantly,
our results allow us to propose that all members of the MOP
superfamily function as secondary efflux carriers using a
solute/cation antiport mechanism.
Computer methods
Sequences of the proteins that comprise the MATE, PST,
OLF and MVF families were obtained separately by initial
screening procedures involving PSI-BLAST [34] and recursive
PSI-BLAST searches [10] using the SCREENTRANSPORTER
program without iterations [35]. Recognizable members
were retrieved from GENBANK [36], SWISS-PROT and
TREMBL [37] and the nonredundant database, NRDB90 [38]
(e-value 10⁻³). Homologues were retrieved in the period
March–June 2002.
Multiple sequence alignments were constructed using the
CLUSTAL X program [39]. The gap penalty and gap extension
values used with the CLUSTAL X program were 10 and 0.1,
respectively, although other combinations were tried.
Average hydropathy, similarity and amphipathicity plots
were derived from CLUSTAL X alignments using the AVEHAS
program [40]. Phylogenetic trees were derived by the
neighbor-joining method from alignments generated with
the CLUSTAL X program using the BLOSUM62 scoring
matrix. The phylogenetic trees were drawn using the
TREEVIEW program [1,41].
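The neighbor-joining step named above can be illustrated with a small, self-contained sketch. This is not the CLUSTAL X implementation; the function name `neighbor_joining`, the taxon labels and the toy distance matrix are assumptions introduced purely for illustration, and real analyses would start from distances estimated from the multiple alignment.

```python
def neighbor_joining(names, dist):
    """Minimal neighbor-joining sketch (illustrative, not CLUSTAL X).

    names: list of taxon labels.
    dist:  symmetric distance matrix as a list of lists.
    Returns (joins, final_edge), where each join is
    (left, right, left_branch_length, right_branch_length).
    """
    nodes = list(names)
    d = {(a, b): float(dist[i][j])
         for i, a in enumerate(names)
         for j, b in enumerate(names)}
    joins = []
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of every node from all the others.
        r = {a: sum(d[a, b] for b in nodes) for a in nodes}
        # Pick the pair that minimises Q = (n - 2) * d(i, j) - r(i) - r(j).
        pairs = ((a, b) for i, a in enumerate(nodes) for b in nodes[i + 1:])
        a, b = min(pairs, key=lambda p: (n - 2) * d[p] - r[p[0]] - r[p[1]])
        # Branch lengths from the joined pair to the new internal node.
        len_a = d[a, b] / 2 + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d[a, b] - len_a
        new = f"({a},{b})"
        # Distances from the new internal node to the remaining nodes.
        for k in nodes:
            if k not in (a, b):
                d[new, k] = d[k, new] = (d[a, k] + d[b, k] - d[a, b]) / 2
        d[new, new] = 0.0
        nodes = [k for k in nodes if k not in (a, b)] + [new]
        joins.append((a, b, len_a, len_b))
    a, b = nodes
    return joins, (a, b, d[a, b])


# Toy example with five hypothetical taxa.
toy_names = ["a", "b", "c", "d", "e"]
toy_dist = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
toy_joins, toy_final_edge = neighbor_joining(toy_names, toy_dist)
```

In the toy matrix, taxa `a` and `b` are joined first. The remaining joins depend on how ties in the Q criterion are broken, which is why only the join order, and not a unique tree, should be read from such a small example.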
Charge bias analyses of membrane protein topology were
performed using the HMMTOP [42], TMHMM [43], WHAT [44]
and TOPPRED2 [45] programs. Motif searches were conducted
using the MEME program [46]. The statistical significance
of intrafamilial and interfamilial protein sequence similarities
(i.e. between members within each of the four families of
the MOP superfamily as well as between members of the
four constituent families) was established using the GAP (IC)
program [47,48] and the PRSS program [49] with the
BLOSUM62 scoring matrix, a gap opening penalty of −8,
a gap extension penalty of −2 and 500 random shuffles. The
BLAST2 program [50] was used additionally for comparison
of two sequences. The IC program was similarly used for
analyses of internally duplicated intraprotein segments.
Binary comparison scores are expressed in standard deviations
(SD) [51]. A value of 9 SD is deemed sufficient to
establish homology [52]. The TMS-split program [53], that
combines a TMS-prediction program (HMMTOP) with
a multiple alignment program (CLUSTAL W), in conjunction
with the IC program was used to identify internal duplications
within a family. The TMS-ALIGN program [53] was used
to position TMSs in one protein relative to its homologues.
Thus, the positions of the extra TMSs in 14 TMS proteins
relative to their 12 TMS homologues could be determined
using this program.
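The idea of a comparison score expressed in standard deviations can be sketched as follows: the alignment score of two sequences is compared with the distribution of scores obtained after shuffling one of them. This is only an illustration under simplifying assumptions. It uses match/mismatch scoring with a linear gap penalty instead of the BLOSUM62 matrix and the −8/−2 gap penalties quoted above, and the function names and the constant `HOMOLOGY_THRESHOLD_SD` are hypothetical, not part of the GAP, IC or PRSS programs.

```python
import random
import statistics

# Threshold quoted in the text: 9 SD is deemed sufficient for homology.
HOMOLOGY_THRESHOLD_SD = 9


def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        cur = [i * gap]
        for j, cb in enumerate(b, start=1):
            diagonal = prev[j - 1] + (match if ca == cb else mismatch)
            cur.append(max(diagonal, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def comparison_score_sd(a, b, shuffles=500, seed=0):
    """Observed score minus the mean shuffled score, in units of SD."""
    rng = random.Random(seed)
    observed = global_alignment_score(a, b)
    letters = list(b)
    shuffled_scores = []
    for _ in range(shuffles):
        rng.shuffle(letters)  # keeps composition, destroys residue order
        shuffled_scores.append(global_alignment_score(a, "".join(letters)))
    sd = statistics.stdev(shuffled_scores)
    if sd == 0:
        raise ValueError("shuffled scores show no variation")
    return (observed - statistics.mean(shuffled_scores)) / sd
```

The same function could be applied to the two halves of a single protein to probe for an internal duplication, which is the kind of comparison reported for the MATE family.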
The four tables of family members (Tables S1–S4) and
the multiple alignments from which the results reported in
this paper were derived (Figs. S1–S4), as well as additional
supplementary supporting data can be found on our
ALIGN website.
Results
The four families of the MOP superfamily
The general characteristics of the four currently recognized
families within the MOP superfamily are summarized in
Table 2. Columns 1 and 2 present the family abbreviations
and the TC # while column 3 gives the number of family
members identified. The MATE family is the largest with
203 members while the PST, MVF and OLF families are of
decreasing sizes in that order (155, 45 and 8 recognized
members, respectively). As shown by the results presented in
column 4 of Table 2, most of the members of these four
families fall within the same size range. However, a few of
the homologues were much larger (Table 2). Thus, in the
MATE family, two plant proteins, Ath10 and Ath8, have
1094 and 746 amino acid residues, respectively. The
extended hydrophilic regions in these two proteins did not
show sequence similarity with anything else in the databases.
Of greater interest were four large MVF family
homologues, all from high G + C Gram-positive bacteria.
These four proteins were Mle of Mycobacterium leprae
(1206 amino acid residues), Mtu of M. tuberculosis (1184
amino acid residues), Cgl of Corynebacterium glutamicum
(1083 amino acid residues) and Sco of Streptomyces coeli-
color (811 amino acid residues). Except for Sco, these
proteins exhibit domains (residues 720–950) that are
homologous to regions of eukaryotic-type serine/threonine
kinases. Presumably, these C-terminal domains function in
a regulatory capacity, possibly to control the activities of the
N-terminal transporter protein domains.
Column 5 in Table 2 summarizes the topological types
identified within each of the four families of the MOP
superfamily. MATE and OLF family permeases may all
have 12 (or possibly 13) TMSs, but about one-third of all
PST family members are predicted to have 14 TMSs, and
MVF family members have 10, 12, 13, 14 or 15 putative
TMSs. These will be analyzed in greater detail below.
Finally, as summarized in column 6 of Table 2, each of
the four families within the MOP superfamily has
a distinctive organismal distribution. While the MATE
family is present in all three domains of living organisms
(archaea, bacteria and eukaryotes), the PST family is found
both in archaea and bacteria but not in eukaryotes, while
the OLF and MVF family members are restricted to
eukaryotes and bacteria, respectively.
The MATE family
Our searches revealed that the MATE family contains 203
currently sequenced proteins, including representatives from
all three domains of life (Table S1). In this sense, the family
is ubiquitous. The family could be divided into 15
subfamilies (see phylogenetic tree displayed in Fig. 1). Most
of the members are of about 450–550 amino acid residues in
length and possess 12 putative TMSs. The yeast proteins are
larger (up to about 700 residues) whereas the archaeal
proteins are generally smaller. Large transporter size is
a characteristic of the eukaryotic domain while small size is
a characteristic of the archaeal domain [11].
Table S1 presents a summary of the 15 subfamilies
(phylogenetic clusters) of the MATE family. The sub-
families, some of which include sequence divergent proteins,
are as presented in the phylogenetic tree shown in Fig. 1.
The subfamily numbers as well as the names, organismal
sources and abbreviations of the members of the MATE
family are presented in columns 1–4 of Table S1. A short
description of the proteins (column 5), the gene names,
accession numbers and the database sources (GENBANK,
TREMBL and SWISS-PROT) (columns 6–8) are also provided.
Finally, columns 9–10, respectively, present the protein sizes
in numbers of amino acid residues and the numbers of
putative transmembrane α-helical TMSs per polypeptide
chain, based on hydropathy plots. The same format of
presentation is used for tabulation of the proteins of the
PST, OLF and MVF families (see Tables S2–S4).
The functionally characterized members of the family are
found in subfamilies 1 (Sce3), 3 (Ath57), 4 (Eco2, Vch1,
Vpa1), 7 (Vpa3) and 9 (Bth1). Subfamily 1 consists
exclusively of yeast proteins; subfamily 2 includes only
mammalian proteins, and subfamily 3 contains only plant
proteins, mostly from A. thaliana. Most of the other
subfamilies consist exclusively of bacterial and/or archaeal
proteins. Of them, only subfamilies 6, 7 and 10 include
proteins from both of these prokaryotic domains. In
subfamily 14, plant and bacterial proteins cluster very
loosely together. Thus, seven subfamilies are bacterial
specific, three include both archaeal and bacterial proteins,
one is archaeal specific, one includes bacterial and plant
proteins, and three are eukaryotic specific. The three
eukaryotic subfamilies consist of yeast, animal and plant
proteins, respectively (Fig. 1).
Table 2. Characteristics of the families of the MOP superfamily. For the MATE family, two larger homologues were found in Arabidopsis thaliana:
Ath10, 1094 aas, and Ath8, 746 aas (see web table S1, Results section). No sequence similarity was observed for the extra portions of these proteins.
For the OLF family, two homologues in Kluyveromyces lactis, Kla, 417 aas, and A. thaliana, Ath, 401 aas (see web table S3, Results section) are
believed to be fragments. For the MVF family, four larger homologues, all from high G + C Gram-positive bacteria, were found in Mycobacterium
leprae, Mle, 1206 aas, Mycobacterium tuberculosis, Mtu, 1184 aas, Corynebacterium glutamicum, Cgl, 1083 aas and Streptomyces coelicolor,
Sco, 811 aas. Mle, Mtu and Cgl all include C-terminal domains (residues 720–950) that are homologous to each other and to domains in eukaryotic-type
serine/threonine protein kinases. A, archaea; B, bacteria; E, eukaryotes; aas, amino acid residues per polypeptide chain; TMSs, transmembrane
α-helical segments.
Family | TC number | Number of members | Size range (number aas) | Number TMSs | Distribution among organisms
MATE | 2.A.66.1 | 203 | 400–695 | 12; 13 | A, B, E
PST | 2.A.66.2 | 155 | 346–582 | 12; 14 | A, B
OLF | 2.A.66.3 | 8 | 401–547 | 12 | E
MVF | 2.A.66.4 | 45 | 461–555 | 10, 12, 13, 14, 15 | B
Many organisms exhibit multiple MATE family para-
logues. For example, among the bacteria, E. coli and
Bacillus subtilis each have four paralogues, Listeria innocua
and L. monocytogenes each have six and Clostridium perf-
ringens has eight (Table S1). In the eukaryotic kingdom,
both S. cerevisiae and S. pombe have three paralogues, but
these are not all orthologous to each other. Most impres-
sively, A. thaliana has 58 MATE family paralogues. No
archaeon has more than four MATE family paralogues.
Individual paralogues from a single species may either be
closely related, presumably arising from a recent gene
duplication event, or distantly related, arising from an
earlier gene duplication event. Extensive phylogenetic
studies of more than 70 transporter families have shown
that substrate specificity typically correlates with phylogeny
[54–56] although exceptions have been reported [57]. This
fact allows functional predictions for many uncharacterized
transporters.
In Subfamily 7, the Thermotoga maritima homologue
clusters loosely together with three archaeal proteins.
T. maritima is of evolutionary significance because small-subunit
ribosomal RNA phylogeny has suggested that this
bacterium is one of the deepest and most slowly evolving
bacterial lineages [58]. By using whole-genome similarity
comparisons, T. maritima appears to be the most archaeal-
like of all sequenced bacteria. It has been suggested that
much of the similarity between T. maritima and the archaea
is due to a shared ancestry of portions of their genomes as
a result of lateral gene transfer [59]. In subfamilies 6 and 10,
the archaeal proteins are so distant from the bacterial
homologues that the results are probably consistent with
vertical transmission from a common ancestor without
lateral transfer.
Fig. 1. Phylogenetic tree for the multidrug and toxin extrusion (MATE) family. The tree was derived using the CLUSTAL X program. The 15
subfamilies are labeled 1–15 together with the class of organisms from which the included proteins were derived; B, bacteria; Ar, archaea;
An, animals; Y, yeast; Pl, plants. The arrow indicates the probable root of the tree as determined with outlying sequences.
Drug resistances demonstrated for characterized MATE
family drug/Na⁺ antiporters are listed in Table 1. These
proteins mediate resistance to a wide range of cationic dyes,
fluoroquinolones, aminoglycosides and other structurally
diverse antibiotics and drugs. It is interesting to note that
while cationic dyes are generally amphipathic and positively
charged, aminoglycosides are strongly hydrophilic, and
norfloxacin is amphiphilic. Thus, MATE family transporter
substrates are diverse in nature.
Average hydropathy and similarity plots for the MATE
family are shown in Fig. 2A. All 12 peaks of hydrophobicity
are well conserved. Two additional peaks that are very
poorly conserved are found just preceding and following
conserved peak 12. The peak of hydrophobicity preceding
TMS 12 is due to an inserted sequence in just one protein,
Hsp2 of the archaeon Halobacterium sp. NRC-1, while the
C-terminal peak following putative TMS 12 is due to
extension of the animal homologues. These few proteins
may have 13 rather than the usual 12 TMSs that characterize
the MATE family. The ‘extra’ regions in these few
proteins are presumably nonessential for transport function.
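An average hydropathy plot of the kind discussed above can be approximated in a few lines. The sketch below is not the AVEHAS program, whose internals are not described here. It assumes the widely used Kyte-Doolittle hydropathy scale, averages that value over each column of an alignment while ignoring gap characters, and then smooths the profile with a sliding window (19 positions, the window size quoted for Fig. 2). The function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def column_hydropathy(alignment):
    """Mean hydropathy per alignment column; gaps and unknowns are skipped."""
    profile = []
    for column in zip(*alignment):
        values = [KYTE_DOOLITTLE[aa] for aa in column if aa in KYTE_DOOLITTLE]
        profile.append(sum(values) / len(values) if values else 0.0)
    return profile


def sliding_mean(values, window=19):
    """Smooth a profile with a centred window; the ends are not padded."""
    half = window // 2
    return [
        sum(values[i - half:i + half + 1]) / window
        for i in range(half, len(values) - half)
    ]


def average_hydropathy_plot(alignment, window=19):
    """Windowed average hydropathy along an alignment of equal-length rows."""
    return sliding_mean(column_hydropathy(alignment), window)
```

Peaks in the returned profile that span roughly 20 positions are the candidates for transmembrane segments; whether a peak is also conserved has to be judged from a separate similarity profile.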
Figure 3 shows an alignment of the first half of a MATE
family protein with the second half of the same protein
(PAB0243 from Pyrococcus abyssi). The two halves exhibit
40.5% similarity, 29.8% identity and a comparison score of
14.5 SD. These values are sufficient to establish homology
[52]. Homology between the two halves of members of the
PST and MVF families could also be established but not for
the two halves of members of the OLF family.
The PST family
The sequenced proteins of the PST family are tabulated in
Table S2. These proteins are derived exclusively from
bacteria and archaea. However, many diverse groups of
these organisms are represented. The transport functions of
these systems are indicated when gene position or bio-
chemical evidence allows postulation of their substrates.
The format of presentation for Table S2 (as well as Tables
S3 and S4) is as for Table S1. Interestingly, and in contrast
to MATE family members, many PST family members are
predicted to exhibit 14 rather than 12 TMSs. However, few
proteins are predicted to have odd numbers of TMSs (11 or
13). As will be shown below, the extra two TMSs in the 14
TMS PST family proteins are localized to the C-termini of
these proteins.
A dendrogram for the PST family is shown in Fig. 4. Of
the 12 clusters shown, only clusters 1, 2, 6 and 12 are
restricted to the bacterial domain. All other clusters include
both archaeal and bacterial proteins. This surprising
observation shows that protein phylogeny does not correlate
with organism phylogeny. In contrast to most families
of transporters, including the MATE family of the MOP
superfamily, extensive horizontal transfer may have
occurred during the evolution of the PST family.
Fig. 2. Average hydropathy plots (top) and average similarity plots (bottom) for the MATE (A), PST (B), OLF (C) and MVF (D) families. The
AVEHAS program [40] was used to generate the plots with a window size of 19. Alignment position is indicated at the bottom of the figures. The
numbers above the hydropathy plots indicate the numbers of the putative TMSs. In A and B, but not C and D, nonhomologous hydrophilic
extensions were removed prior to graph generation.
In some cases we were able to provide convincing
evidence that horizontal transfer had in fact occurred. For
example, in subfamily 5, Mth3, from the archaeon Methanobacterium
thermoautotrophicum, and Cac1, from the
bacterium Clostridium acetobutylicum, gave a BLAST E-value
of e⁻⁶⁶ with 39% identity and 62% similarity. The gene
encoding the Clostridium acetobutylicum protein (but not
that encoding the Methanobacterium thermoautotrophicum
homologue) showed a G + C content that differed sub-
stantially from that of the DNA of this organism overall
(0.31 for the genome and 0.24 for the gene). These results
taken together provide strong evidence for lateral transfer of
PST family genes across the bacterial–archaeal boundary. It
should be noted that evidence for lateral transfer of genes
encoding cell surface bacterial polysaccharide biosynthetic
enzymes is extensive [60–62].
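The G + C comparison used above as evidence for lateral transfer amounts to computing the fraction of G and C bases in a gene and comparing it with the genome-wide value. A minimal sketch follows; the function names are hypothetical, and the only figures taken from the text are the reported fractions of 0.31 for the genome and 0.24 for the gene.

```python
def gc_fraction(dna):
    """Fraction of G and C among the unambiguous bases of a DNA string."""
    bases = [b for b in dna.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T")
    return sum(b in "GC" for b in bases) / len(bases)


def gc_deviation(gene_gc, genome_gc):
    """Absolute difference between gene and genome G + C fractions."""
    return abs(gene_gc - genome_gc)


# Values reported in the text for the Clostridium acetobutylicum gene.
reported_deviation = gc_deviation(0.24, 0.31)
```

A deviation of this size is suggestive rather than conclusive on its own, which is why it is combined in the text with the high cross-domain sequence similarity.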
Figure 2B shows the average hydropathy (top) and
similarity (bottom) plots for PST family members. Fourteen
peaks of hydropathy are evident, and for each of the first 12
such peaks, there is a corresponding peak of average
similarity. However, the last two peaks of hydropathy are
not well conserved. These two peaks represent the extra
peaks present in a minority of PST family members. The
proteins exhibiting 14 rather than 12 putative TMSs can be
identified by examining the data presented
in Table S2.
About one-third of the PST family members were
predicted to exhibit 14 TMSs. Surprisingly, these proteins
were found in most subfamilies although only in subfamily 8
did the 14 TMS homologues predominate. In some cases,
fairly close homologues were predicted to differ in topology.
For example, Axy1 of Acetobacter xylinum (14 TMSs) and
Pae6 of Pseudomonas aeruginosa (12 TMSs) in cluster 1 had
essentially identical topologies except that Axy1 had
a C-terminal extension including the extra two TMSs that
were lacking in Pae6. In fact, all 14 TMS proteins that were
checked carefully were homologous throughout their first
12 TMSs to the 12 TMSs of their shorter homologues but
had an extra C-terminal 2 TMS segment. It was therefore
concluded that the 14 TMS topological types arose from the
12 TMS proteins by addition of two TMSs at the C-termini.
The phylogenetic analyses suggest that this event has
occurred repeatedly throughout the evolutionary history
of the PST family.
The OLF family
The proteins of the OLF family are presented in Table S3
and the corresponding phylogenetic tree is shown in Fig. 5.
All eukaryotic organisms with a fully sequenced genome
have one and only one OLF family member with the
notable exception of Plasmodium falciparum, a eukaryotic
parasite that lacks N-glycoproteins [63] and also lacks an
OLF family homologue. These proteins are of 401–547
residues in length and display variable numbers of putative
TMSs, from eight to 14. The two proteins with only eight
putative TMSs, from Kluyveromyces lactis and A. thaliana,
may be incomplete sequences, due to incomplete sequencing
and to nonrecognition of exons, respectively. The other
proteins are predicted to have 11–14 TMSs, and this
prediction is dependent on the TMS prediction program
used. The actual numbers of TMSs may be 12 as suggested
by Helenius et al. 2002 [31]. The phylogeny of these proteins
follows that of the organisms with the fungal, plant and
animal proteins segregating as expected. This fact suggests
orthologous relationships for all family members and
therefore suggests a common function (Fig. 5).
The average hydropathy and average similarity plots for
the six full length members of the OLF family are shown in
Fig. 2C. We interpret the results in terms of a 12 TMS
topology, the same as the major topological type observed
Fig. 3. Binary alignment of the first half of a MATE family protein with its second half. This protein is one of the three paralogues from Pyrococcus abyssi (PAB0243). The two halves were aligned using the GAP program with 500 random shuffles, BLOSUM62 as the scoring matrix, a gap opening penalty of −8 and a gap extension penalty of −2. The two halves have a similarity of 40.5%, an identity of 29.8% and a comparison score of 14.5 SD. |, an identity; :, a close conservative substitution; ·, a more distant conservative substitution.
Fig. 4. Phylogenetic dendrogram for the polysaccharide transporter (PST) family. The dendrogram was derived essentially as described in the legend
to Fig. 1. The 12 subfamilies are labeled 1–12 together with the class of organisms from which the included proteins were derived; B, bacteria;
A, archaea.
for the MATE and PST families. However, the first two
putative TMSs are not strongly hydrophobic, and it is
therefore possible that these are localized to the cytoplasmic
side of the membrane as has been shown for members of
the chromate-resistance (CHR) family of transporters
(TC #2.A.51) [64]. Although the proteins of the OLF
family can be hypothesized to exhibit a 6 + 6 TMS
topology with a large, well-conserved cytoplasmic loop
between putative TMSs 6 and 7 (Fig. 2C), homology
between the two halves of these proteins could not be

demonstrated.
As noted above, the first two putative TMSs displayed in
Fig. 2C are quite hydrophilic, and they were therefore
examined in greater detail. When putative TMSs 1 of the
full-length OLF family members were drawn in an α-helical
wheel, the helices (which lack prolyl and glycyl residues)
proved to be strongly amphipathic with three fully conserved
hydrophilic residues (helix residues Q4, R8 and N15)
tightly clustered on one side of the helix. All other residues,
including the fully conserved F12, proved to be strongly
hydrophobic or slightly semipolar (data not shown).
Putative helix 2 was similarly amphipathic with four well-
conserved hydrophilic or semipolar residues (S5, E9, Q12
and S13) localized to one side of the helix. These two helices
could provide a partially hydrophilic transmembrane path-
way for passage of lipid-linked oligosaccharides through the
membrane. Alternatively, these two putative helices may be
localized to the cytoplasmic surface of the membrane. The
remarkable conservation of Q4, R8 and N15 in helix 1
suggests an important function for these residues.
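The clustering of the conserved residues on one face of each helix can be checked with a simple helical wheel calculation. The following Python sketch assumes an ideal α-helix with 3.6 residues per turn (100° per residue), a textbook approximation rather than a parameter stated in this paper; the function names are illustrative only. It reports the smallest arc of the wheel that contains each set of conserved positions.

```python
# Assumption: ideal alpha-helix, 3.6 residues per turn = 100 degrees per residue.
DEGREES_PER_RESIDUE = 100


def wheel_angle(position):
    """Angle (degrees) of a 1-based helix position on a helical wheel."""
    return ((position - 1) * DEGREES_PER_RESIDUE) % 360


def arc_span(positions):
    """Smallest arc (degrees) of the wheel that contains all given positions."""
    angles = sorted(wheel_angle(p) for p in positions)
    if len(angles) < 2:
        return 0
    # Gaps between neighbouring angles, including the wrap-around gap.
    gaps = [b - a for a, b in zip(angles, angles[1:])]
    gaps.append(angles[0] + 360 - angles[-1])
    # The residues occupy everything except the largest empty gap.
    return 360 - max(gaps)


helix1 = [4, 8, 15]       # Q4, R8, N15 (putative helix 1)
helix2 = [5, 9, 12, 13]   # S5, E9, Q12, S13 (putative helix 2)
print(arc_span(helix1), arc_span(helix2))
```

Under this assumption the three helix 1 residues fall within a 40° arc and the four helix 2 residues within a 100° arc, consistent with their localization to one side of each helix.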
The MVF family
The phylogenetic tree for the MVF family is shown in
Fig. 6. Only two organisms, Pseudomonas aeruginosa and
Streptomyces coelicolor, both large genome organisms, have
more than a single MVF family member encoded within
their genomes, and they have only two MVF family
paralogues. Except for the two ‘extra’ paralogues, Pae2
and Sco2 (subfamily 8), all α-, β-, γ- and δ-proteobacterial
proteins (23 proteins) are found in the lower half of the tree.
These fall into three primary clusters: cluster 1 includes only
α-proteobacterial homologues, cluster 2 includes only
β- and γ-proteobacterial homologues, and cluster 3 includes
the one δ-proteobacterial homologue. Within cluster 1, the
phylogenies of all α-proteobacterial homologues follow the
phylogenies of the 16S rRNAs [65], suggesting that they are
orthologues. Within cluster 2, the phylogenies of most
β- and γ-proteobacterial homologues are in accordance with
those of the 16S rRNAs except for Vch which clusters with
Pmu and Hin but should be between Ype and Pae, and Bap
which should be close to Eco [65]. Finally, the separate
clustering of the δ-proteobacterial protein, Bba, distant
from all other homologues, is as expected.
The upper part of the tree shows sequence divergent
proteins from sequence divergent bacteria. Only a few
clusters are noteworthy. Thus, cluster 5 includes all four
high G + C Gram-positive bacterial homologues, cluster 6
includes the two cyanobacterial homologues, cluster 8
includes the two low G + C Gram-positive bacterial
proteins, cluster 11 includes the two ε-proteobacterial
proteins, and cluster 15 includes the chlamydial orthologues.
Thus, with the exception of just four proteins (Bap, Vch,
Pae2 and Sco2) the protein phylogenies follow the organi-
smal (16S rRNA) phylogenies within experimental error.
This fact suggests that most of these bacterial proteins are
orthologues, possibly serving a single function.
The G + C contents and codon usage frequencies for the
four anomalous genes were compared with the correspond-
ing values for the protein-encoding regions of the genomes

of the same organisms. For Bap, the G + C content was
26% for both the gene and the organism. For Vch, both
values were 48%; for Sco2, the values were 76% for the gene
and 72% for the genome; and for Pae2, the values were 72%
for the gene and 67% for the genome. In no case was the
codon usage frequency for the gene significantly different
from that for the organism as a whole. These approaches
therefore failed to provide further evidence for recent
horizontal gene transfer of genes encoding MVF homo-
logues.
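A comparison of this kind can be sketched as follows. The Python example below computes the G + C content and codon frequencies of a coding sequence, together with a crude distance between two codon usage profiles. The sequences are hypothetical toy strings, and the distance measure is an assumption made for illustration; the statistical criterion actually used for the analysis reported here is not specified in the text.

```python
from collections import Counter


def gc_content(seq):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(seq.count(base) for base in "GC") / len(seq)


def codon_frequencies(cds):
    """Relative frequency of each codon in an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


def usage_distance(freq_a, freq_b):
    """Sum of absolute frequency differences over all codons (0 = identical)."""
    codons = set(freq_a) | set(freq_b)
    return sum(abs(freq_a.get(c, 0.0) - freq_b.get(c, 0.0)) for c in codons)


# Hypothetical toy sequences, for illustration only.
gene = "ATGGCCGCGCTGGGCTAA"
genome_cds = "ATGGCGGCCCTGGGCGCGTAA"

print(round(gc_content(gene), 1), round(gc_content(genome_cds), 1))
print(usage_distance(codon_frequencies(gene), codon_frequencies(genome_cds)))
```

A gene acquired recently from a distant donor would be expected to stand out from its host genome on one or both measures, which is the rationale for the comparison described above.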
All MVF family proteins fell within the size range 480–
555 amino acid residues except for the high G + C Gram-
positive bacterial homologues which were large (811–1184
amino acid residues) exhibiting 15 putative TMSs. Three of
these four proteins exhibit soluble protein kinase-like
domains of about 250 residues in their C-terminal regions.
Most of the proteobacterial homologues appear to have 13
TMSs when analyzed with the TMS-ALIGN program [53] (see
Fig. 2D). Thus, there is probably some topological
heterogeneity in the MVF family.
Establishment of homology for the four families
of the MOP superfamily
The superfamily principle states that if A is homologous to
B, and B is homologous to C, then A is homologous to C
[52]. We have previously published the criteria used to
establish homology, namely a comparison score in excess of

9 SD for two protein sequences of greater than 60 amino
acid residues in length [52]. Nine SD corresponds to a
probability of 10⁻¹⁹ that the observed sequence similarity
Fig. 5. Phylogenetic tree for oligosaccharidyl-lipid flippase (OLF)
family proteins. See the legend to Fig. 1 for format of presentation.
arose by chance [51]. In order to establish homology
between two coherent families, it is only necessary to
establish homology between one member of each of these
families. Two such representative examples for each inter-
familial comparison are presented in Table 3 although
many more with comparison scores in excess of 9 SD could
have been selected. When an equivalent number of
nonhomologous proteins were compared (i.e. proteins
of the MOP superfamily with proteins of the major
facilitator superfamily, MFS; TC #2.A.1), values never
exceeded 7 SD.
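The comparison score used here can be illustrated with a minimal Python sketch. It assumes that the score is the real alignment score minus the mean score obtained with shuffled sequences, divided by the standard deviation of the shuffled scores, and, as a further assumption, it converts a score to a probability using the upper tail of a normal distribution. For brevity the alignment uses simple match/mismatch scoring with a linear gap penalty in place of the GAP program, BLOSUM62 and separate gap opening and extension penalties, so absolute values will differ from those in Table 3. All function names and the test sequence are hypothetical.

```python
import math
import random
import statistics


def global_score(a, b, match=2, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def comparison_score(a, b, shuffles=500, seed=0):
    """Real score minus mean shuffled score, in SD units of the shuffled scores."""
    rng = random.Random(seed)
    real = global_score(a, b)
    shuffled = []
    for _ in range(shuffles):
        letters = list(b)
        rng.shuffle(letters)  # keeps composition, destroys residue order
        shuffled.append(global_score(a, "".join(letters)))
    sd = statistics.stdev(shuffled)
    if sd == 0:
        raise ValueError("shuffled scores do not vary")
    return (real - statistics.mean(shuffled)) / sd


def upper_tail(z):
    """One-sided probability of exceeding z under a standard normal (assumption)."""
    return 0.5 * math.erfc(z / math.sqrt(2))


seq = "MKTLAVGIFSPWRYQNDECHMKTAVLGQ"  # hypothetical sequence
print(comparison_score(seq, seq, shuffles=100))
print(upper_tail(9))
```

With the normal approximation, `upper_tail(9)` is about 1.1 × 10⁻¹⁹, in line with the probability quoted for a score of 9 SD.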
Figure S5 shows a binary alignment of an established
MATE family member with an established PST family
member. The two proteins exhibit 38% similarity and 22%
identity with a comparison score of 28 SD. The corres-
ponding alignment for interconnecting the OLF and PST
families is presented in Fig. S6. The two proteins are 37%
similar and 25% identical with a comparison score of
13 SD. The corresponding alignment for interconnecting
the MVF and PST families is shown in Fig. S7. The two
proteins are 36% similar and 26% identical, yielding a
comparison score of 19 SD. These comparisons and at least

one additional representative interfamilial comparison,
giving values in excess of 9 SD, are summarized in Table 3.
Many other binary comparisons gave comparison scores of
greater than 9 SD. However, the values reported in Table 3
are more than sufficient to establish homology.
No member of the OLF family gave a comparison score
in excess of 6 SD with a member of the MATE or MVF
family, and no member of the MVF family gave a score with
a MATE family member as high as the values obtained with
PST family members. The values recorded in Table 3
suggest a relative degree of relatedness of the four families as
indicated in Fig. 7. It is clear that the PST family is more
closely related to the other three families than any two of the
latter are related to each other.
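The way the superfamily principle links the four families can be made explicit with a short Python sketch. It takes the highest comparison score reported in Table 3 for each pair of families, draws a link wherever that score exceeds the 9 SD criterion, and then asks which families can be reached from a given starting family. The graph representation and function names are illustrative assumptions; family pairs that are not listed in Table 3 are simply left unlinked.

```python
from collections import deque

THRESHOLD_SD = 9  # homology criterion stated in the text

# Highest comparison score (SD) reported in Table 3 for each family pair.
best_scores = {
    ("MATE", "PST"): 28,
    ("MATE", "MVF"): 14,
    ("PST", "MVF"): 19,
    ("PST", "OLF"): 13,
}


def build_graph(scores, threshold):
    """Undirected graph with an edge wherever a score exceeds the threshold."""
    graph = {}
    for (a, b), sd in scores.items():
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if sd > threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def reachable(graph, start):
    """All families connected to `start` by a chain of homology links."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph[node] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return seen


graph = build_graph(best_scores, THRESHOLD_SD)
print(sorted(reachable(graph, "OLF")))  # every family linked to OLF
print(sorted(graph["OLF"]))             # direct links of the OLF family
```

In this representation the OLF family is joined to the MATE and MVF families only through the PST family, which is the sense in which the PST family connects the superfamily.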
Identification of interfamilial conserved motifs
The MEME program [46] can be used to identify conserved
motifs in families of proteins, and these can be used to
identify regions of sequence similarity between families
[66]. We therefore applied this program and selected two
interfamilial regions of conservation for the pairs of
Fig. 6. Phylogenetic tree for mouse virulence factor (MVF) family proteins. See the legend to Fig. 1 for format of presentation. The 15 subfamilies
are indicated 1–15.
families interconnected by lines in Fig. 7. The results are
presented in Table 4. Two members of each of the two
families within the MOP superfamily being compared are
presented, illustrating the sequence conservation between
these families. For each family comparison, two motifs

are presented.
When the PST vs. MVF families are compared, the two
fully conserved motifs are: P-L-R-L-P and G…A-V-L-P-T
(Table 4). Comparison of the PST and MATE families gave
the two motifs shown in Table 4, the second of which is
striking in its degree of conservation. The PST/OLF
comparison (Table 4) and the MATE/MVF comparison
(Table 4) are also presented. Although the functional and/
or structural significance of these motifs is not known,
sequence similarity between families is illustrated.
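The notion of a fully conserved motif can be illustrated by extracting the positions that are identical in all aligned sequences. The Python sketch below does this for the first PST vs. MVF motif, using the four sequences listed in Table 4. It is not the MEME algorithm, which discovers motifs statistically; it only reads off the invariant columns of an alignment that is already given, and the function name is hypothetical.

```python
def conserved_columns(sequences, placeholder="."):
    """Consensus string keeping only residues identical in every sequence."""
    lengths = {len(s) for s in sequences}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to equal length")
    return "".join(
        column[0] if len(set(column)) == 1 else placeholder
        for column in zip(*sequences)
    )


# First PST vs. MVF motif from Table 4 (Mth3, Atu3, Pae2, Sco2).
motif1 = ["PYLTRVLGP", "PILARLLSP", "PWLVRLLGP", "PVLVRALAP"]

consensus = conserved_columns(motif1)
print(consensus)                             # P.L.R.L.P
print("-".join(consensus.replace(".", "")))  # P-L-R-L-P
```

The invariant residues recovered in this way are the P, L, R, L and P of the first motif, each separated by one variable position.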
Discussion
In this paper we provide evidence that four previously
recognized families, MATE [3], PST [27], OLF [31] and
MVF [32] comprise a single superfamily. All of the
functionally characterized proteins that comprise three of
these four families transport their substrates with outwardly
directed polarity. In the case of the MATE family, a solute/
cation antiport mechanism is operative, and for the
characterized members of this family, Na⁺ is used
preferentially over H⁺ as the countertransported cation. The
energy coupling mechanisms used by members of the PST
and OLF families have not yet been investigated using
experimental techniques, and a transport function for
a member of the MVF family has not yet been demonstra-
ted. However, inclusion of these four families within a single
superfamily allows us to extrapolate from the MATE family

Table 3. Comparison scores establishing homology for the four families of the MOP superfamily. Comparison scores (expressed in standard deviations, SD), percentage identity and percentage similarity were determined using the GAP program with 500 random shuffles. BLOSUM62 was the scoring matrix. A gap opening penalty of −8 and a gap extension penalty of −2 were used. E-value scores were generated using the BLAST 2 program with BLOSUM62 as the scoring matrix, a gap opening penalty of 11 and a gap extension penalty of 1. Prot, protein.
Family | Prot. 1 | Organism | Acc. No. | Family | Prot. 2 | Organism | Acc. No. | Comparison score (SD) | % identity | % similarity | E-value
MATE | Cac4 | Clostridium acetobutylicum | AAK81286 | PST | Pab1 | Pyrococcus abyssi | Q9UZH4 | 28 | 22 | 38 | 1e-10
MATE | Cac4 | Clostridium acetobutylicum | AAK81286 | PST | Mka1 | Methanopyrus kandleri AV19 | Q8TUX6 | 19 | 20 | 32 | 5e-07
MATE | Bsu3 | Bacillus subtilis | P54181 | MVF | Fnu1 | Fusobacterium nucleatum subsp. nucleatum ATCC 25586 | NP_603606 | 14 | 26 | 36 | 2e-07
MATE | Hsp2 | Halobacterium sp. NRC-1 | AAG20494 | MVF | Mlo1 | Mesorhizobium loti | NP_106007 | 11 | 27 | 35 | 7e-07
PST | Mac4 | Methanosarcina acetivorans str. C2A | Q8TNU8 | MVF | Fnu1 | Fusobacterium nucleatum subsp. nucleatum ATCC 25586 | NP_603606 | 18 | 26 | 41 | 6e-09
PST | Bsu2 | Bacillus subtilis | O34674 | MVF | Nme1 | Neisseria meningitidis Z2491 | AAF40731 | 19 | 26 | 36 | 2e-06
PST | Cpe1 | Clostridium perfringens | Q8XMR5 | OLF | Cel1 | Caenorhabditis elegans | NP_500581 | 13 | 25 | 37 | 1e-04
PST | Cpe1 | Clostridium perfringens | Q8XMR5 | OLF | Spo1 | Schizosaccharomyces pombe | T40744 | 10 | 23 | 36 | 1.4e-02
Fig. 7. Schematic depiction of the relative degrees of relatedness of the
four families of the MOP superfamily.
to the other three families. We propose that all proteins

within these families use a substrate/cation antiport mech-
anism to energize efflux of biologically important molecules,
either small molecules as in the case of MATE family
proteins, or macromolecules as in the case of the PST and
OLF family proteins. Whether the PST and OLF family
porters will prove to use a Na⁺-coupled mechanism as is
true for functionally characterized MATE family members
has yet to be determined, but this possibility should not be
difficult to test. It seems highly likely that MVF family
proteins will prove to be transporters using a similar
mechanism.
Members of the MATE, OLF and MVF families all
proved to exhibit sufficient sequence similarity with PST
family members to establish homology. We could not show
a similar degree of sequence similarity between any member
of the MATE or MVF family and any member of the OLF
family, and the degree of sequence similarity between
members of the MATE and MVF families was substantially
less than between MVF and PST family members. The PST
family is thus the link between the four families of the MOP
superfamily, establishing that all four families are derived
from a single origin [52]. These observations lead to the
suggestion that proteins most closely related to members of
the PST family were the primordial systems of the MOP
superfamily.
If the primordial system was a complex carbohydrate
exporter similar to members of the PST family, then such a
system must have mutated to give rise to a primordial drug

resistance exporter, the precursor of the MATE family. A
similar pathway has been proposed for the evolution of
drug exporters from carbohydrate exporters within the
ABC superfamily [27,67]. The appearance of the eukaryotic
homologues of the OLF family may have resulted from
vertical transmission of a gene encoding a PST family
protein to the developing eukaryotic kingdom. The strictly
orthologous relationships of OLF family members are
consistent with this suggestion.
We have provided evidence that proteins most closely
related to the PST family were the primordial transporters
and therefore suggest that complex polysaccharide export
was the function of these proteins. The presence of
numerous PST family paralogues in many prokaryotes is
consistent with the interpretation that gene duplication
events gave rise to paralogues for the purpose of exporting
structurally dissimilar surface polysaccharides or their lipid-
linked oligosaccharide precursors. The presence of only one
OLF family member per eukaryote exhibiting N-linked
protein glycosylation is consistent with the notion that these
lipid-linked oligosaccharide exporters arose from a single
prokaryotic PST family member, of the same function,
possibly by vertical descent. As some current members of
the PST family are lipid-linked O-antigen exporters [27], this
possibility seems highly plausible. By contrast, the MATE
family probably arose by mutation of a primordial PST
exporter within the prokaryotic domain, and some of the
members of this new family were transmitted to eukaryotes,
possibly vertically, but also during the endosymbiotic
invasion of the eukaryotic cell by blue-green bacteria to

give rise to chloroplasts. The MATE family phylogenetic
analyses (Fig. 1) are consistent with the suggestion that two
distinct pathways for the introduction of these prokaryotic
Table 4. Interfamilial conserved sequence motifs in the MOP superfamily. Two conserved motifs per family comparison were selected for presentation. Column 1 presents the families from which the proteins selected (columns 2 and 5) were taken. The residue numbers of the first residue in each motif are presented in columns 3 and 6. The actual motifs are presented in columns 4 and 7 with residues conserved between families presented in bold print.
Family | Protein | Residue | Sequence | Protein | Residue | Sequence
PST vs. MVF
PST | Mth3 | 37 | PYLTRVLGP | Sto1 | 253 | GV…ALGNVLLPT
PST | Atu3 | 32 | PILARLLSP | Ssol3 | 243 | GV…ALNNVLLPT
MVF | Pae2 | 101 | PWLVRLLGP | Xfa | 283 | GV ALGTVILPT
MVF | Sco2 | 103 | PVLVRALAP | Pmu | 275 | GI AISTVILPT
PST vs. MATE
PST | Lla2 | 84 | IFGIFLVLAFGFGGGII | Mlo3 | 370 | NIGLNVVLIPRFGLWGAAMAT
PST | Bha3 | 403 | IFALATRPELGIMGAAL | Sag1 | 363 | NWLLNLVLIPHYAAYGAAMAT
MATE | Sty1 | 179 | IYGHFGMPELGGIGCGV | Vch3 | 170 | TSVLNLILDPI…LGIDGAAIAT
MATE | Sen1 | 179 | IYGHFGMPELGGIGCGV | Ape1 | 183 | SSILNVILDPI…LGAVGAAVAT
PST vs. OLF
PST | Mlo3 | 376 | VLIPRFGLWGAAMAT | Bsu2 | 59 | GFPAAVSKFVSKYNSKGDY
PST | Sag1 | 369 | VLIPHYAAYGAAMAT | Linn3 | 60 | GIPLAVAKYIAKYNAMEEY
OLF | Spo | 370 | MYIPFMAANGVLEAF | Kla | 108 | GLPLSIILISWQYSNLNSY
OLF | Ath | 273 | LYIIVLAMNGTSEAF | Sce | 121 | GFPLSIGLIAWQYRNINAY
MATE vs. MVF
MATE | Ccr1 | 145 | AEGATL…SLSLPVYA | Cac1 | 163 | DMKTPMKVNL
MATE | Tpa1 | 144 | AEGERY…YTLVPLSF | Pab1 | 153 | DTKTPMKLNI
MVF | Ype | 55 | AEGAFS QAFVPILA | Sty | 392 | DIKTPVKIAI
MVF | Sty | 68 | AEGAFS QAFVPILA | Sen | 365 | DIKTPVKIAI
precursor proteins into eukaryotes may have been followed.

Why plants such as A. thaliana have so many MATE family
paralogues while yeast and animals have so few is an
interesting question, worthy of future experimentation.
We have shown that the four families of the
MOP superfamily have different distributions in the living
world (Table 1). The MATE family is ubiquitous, and
many paralogues are found in eukaryotes as well as
prokaryotes [68]. The PST family is widespread in both
archaea and bacteria, and multiple paralogues are present
in many of these organisms. OLF family proteins are
found only in eukaryotes, and all identified members
are probably orthologues. Finally, MVF family proteins
are found exclusively in bacteria, and only two bacteria
were found to have more than one such homologue. In
fact, with only four exceptions (out of 45 recognized
family members), the protein phylogenies followed the
phylogenies of the 16S rRNAs within experimental error,
leading to the possibility that most of these proteins are
orthologous, serving the same function. What that
function is, is a mystery, but there are a few clues: (a)
loss of the function of the homologue in S. typhimurium
compromises the virulence characteristics of this bacter-
ium in mice; (b) several MVF family members are
encoded within operons that also encode GlnD, the
uridylate transferase that functions in nitrogen metabolic
regulation in many bacteria and (c) homology with
MATE and PST family members leads to the clear
suggestion that MVF proteins are secondary carriers
catalyzing export of some biologically important molecule.
Putting these facts together would lead to the hypothesis

that MVF family proteins export substances (like amino
sugar-containing polysaccharides) that are important for
virulence and are regulated in response to nitrogen
availability. A functional genomic approach should pro-
vide answers to this interesting question.
The proteins of the MOP superfamily exhibit a variety of
topological characteristics (Table 2). Thus, while all or most
MATE and OLF family members appear to exhibit 12
TMSs, the first two TMSs in the OLF family proteins are
strongly amphipathic, and therefore exhibit less hydro-
phobicity than would otherwise be expected. The same is
true of the first two TMSs in the MVF family proteins
which are similarly amphipathic (see Fig. 2C,D, respect-
ively). Whether or not these TMSs are transmembrane or
membrane surface localized has yet to be determined, but
their striking conservation clearly suggests an important
functional role. Additionally, MVF family proteins may
have as few as 10 and as many as 15 TMSs, based on
hydropathy analyses (Table 2).
The pathways taken for the appearance of these topo-
logical types are worthy of consideration. A primordial gene
encoding a six TMS protein must have duplicated internally
to give a 12 TMS protein. Then, two additional TMSs were
added to many PST family proteins, and possibly one or
two was/were added to give 13 or 14 TMS proteins in the
MATE and MVF families. Finally, in high G + C Gram-
positive bacteria, large C-terminal hydrophilic domains,
including homologues of eukaryotic-like serine/threonine
protein kinases, and a fifteenth putative TMS, became
associated with MVF transporters. The function of these
kinase domains is proposed to be related to the regulation of
transport, but as the transport substrates are not known for
MVF family proteins, the significance of this finding cannot
yet be evaluated.
We have noted previously that evidence for horizontal
transfer of genes encoding transport proteins between the
archaeal, bacterial and eukaryotic domains is largely lacking.
However, in this report we provide convincing evidence for
horizontal transfer of genes encoding PST family members
between bacteria and archaea. The PST family includes
polysaccharide (or lipopolysaccharide precursor) exporters,
and as the biosynthetic enzymes for these cell surface
macromolecules are known to have been subject to extensive
lateral transfer [60], it is not surprising that the associated
transporter genes have been subject to similar pressures.
Avoidance of immune surveillance by host animals may
have provided the impetus for such gene transfer events.
Acknowledgements
Work in our laboratory was supported by NIH grants GM55434 and
GM64368 from the National Institute of General Medical Sciences. We
thank Mary Beth Hiller for her assistance in the preparation of this
manuscript.
References
1. Page, R.D.M. (1996) TREEVIEW: an application to display
phylogenetic trees on personal computers. Computer Appl. Biosci. 12,
357–358.
2. Bolhuis, H., van Veen, H.W., Poolman, B., Driessen, A.J. &
Konings, W.N. (1997) Mechanism of multidrug transporters.
FEMS Microbiol. Rev. 21, 55–84.
3. Brown, M.H., Paulsen, I.T. & Skurray, R.A. (1999) The multidrug
efflux protein NorM is a prototype of a new family of transporters.
Mol. Microbiol. 31, 393–395.
4. Nikaido, H. (1996) Multidrug efflux pumps of gram-negative
bacteria. J. Bacteriol. 178, 5853–5859.
5. van Veen, H.W. & Konings, W.N. (1997) Drug efflux proteins in
multidrug resistant bacteria. Biol. Chem. 378, 769–777.
6. Paulsen, I.T., Chen, J., Nelson, K.E. & Saier, M.H. Jr (2001)
Comparative genomics of microbial drug efflux systems. J. Mol.
Microbiol. Biotechnol. 3, 145–150.
7. Saier, M.H. Jr & Paulsen, I.T. (2001) Phylogeny of multidrug
transporters. Cell Dev. Biol. 12, 205–213.
8. Tseng, T.-T., Gratwick, K.S., Kollman, J., Park, D., Nies, D.H.,
Goffeau, A. & Saier, M.H. Jr (1999) The RND permease
superfamily: an ancient, ubiquitous and diverse family that includes
human disease and development proteins. J. Mol. Microbiol.
Biotechnol. 1, 107–125.
9. Cox, J.S., Chen, B., McNeil, M. & Jacobs, W.R. Jr (1999) Com-
plex lipid determines tissue-specific replication of Mycobacterium
tuberculosis in mice. Nature 402, 79–83.
10. Jack, D.L., Yang, N.M. & Saier, M.H. Jr (2001) The drug/
metabolite transporter superfamily. Eur. J. Biochem. 268,
3620–3639.
11. Chung, Y.J., Krueger, C., Metzgar, D. & Saier, M.H. Jr (2001)
Size comparisons among integral membrane transport protein
homologues in bacteria, archaea, and eucarya. J. Bacteriol. 183,
1012–1021.
12. Yerushalmi, H. & Schuldiner, S. (2000) A model for coupling of H(+)
and substrate fluxes based on ‘time-sharing’ of a common
binding site. Biochemistry 39, 14711–14719.
13. Pao, S.S., Paulsen, I.T. & Saier, M.H. Jr (1998) The major facili-
tator superfamily. Microbiol. Mol. Biol. Rev. 62, 1–32.
14. Saier, M.H. Jr, Eng, B.H., Fard, S., Garg, J., Haggerty, D.A.,
Hutchinson, W.J., Jack, D.L., Lai, E.C., Liu, H.J., Nusinew, D.P.,
Omar, A.M., Pao, S.S., Paulsen, I.T., Quan, J.A., Sliwinski, M.,
Tseng, T.-T., Wachi, S. & Young, G.B. (1999) Phylogenetic
characterization of novel transport protein families revealed by
genome analyses. Biochim. Biophys. Acta 1422, 1–56.
15. Saurin, W., Hofnung, M. & Dassa, E. (1999) Getting in or out:
early segregation between importers and exporters in the evolution
of ATP-binding cassette (ABC) transporters. J. Mol. Evol. 48,
22–41.
16. Dassa, E. & Bouige, P. (2001) The ABC of ABCs: a phylogenetic
and functional classification of ABC systems in living organisms.
Res. Microbiol. 152, 211–229.
17. Chen, J., Morita, Y., Huda, N., Kuroda, T., Mizushima, T. &
Tsuchiya, T. (2002) VmrA, a member of a novel class of Na⁺-coupled
multidrug efflux pumps from Vibrio parahaemolyticus.
J. Bacteriol. 184, 572–576.
18. Miwatani, T. & Takeda, Y. (1976) Food poisoning due to Vibrio
parahaemolyticus in Japan. In Vibrio parahaemolyticus, a
Causative Bacterium of Food Poisoning (T. Miwatani &
Y. Takeda, eds), pp. 22–25. Saikon Publishing Co, Tokyo, Japan.
19. Morita, Y., Kataoka, A., Shiota, S., Mizushima, T. & Tsuchiya,
T. (2000) NorM of Vibrio parahaemolyticus is an Na⁺-driven
multidrug efflux pump. J. Bacteriol. 182, 6694–6697.
20. Morita, Y., Kodama, K., Shiota, S., Mine, T., Kataoka, A.,
Mizushima, T. & Tsuchiya, T. (1998) NorM, a putative multidrug
efflux protein, of Vibrio parahaemolyticus and its homolog in
Escherichia coli. Antimicrob. Agents Chemother. 42, 1778–1782.
21. Diener, A.C., Gaxiola, R.A. & Fink, G.R. (2001) Arabidopsis
ALF5, a multidrug efflux transporter gene family member, confers
resistance to toxins. Plant Cell 13, 1625–1637.
22. Huda, M.N., Morita, Y., Kuroda, T., Mizushima, T. & Tsuchiya,
T. (2001) Na⁺-driven multidrug efflux pump VcmA from Vibrio
cholerae non-O1, a non-halophilic bacterium. FEMS Microbiol.
Lett. 203, 235–239.
23. Miyamae, S., Ueda, O., Yoshimura, F., Hwang, J., Tanaka, Y. &
Nikaido, H. (2001) A MATE family multidrug efflux transporter
pumps out fluoroquinolones in Bacteroides thetaiotaomicron.
Antimicrob. Agents Chemother. 45, 3341–3346.
24. Shiomi, N., Fukuda, H., Murata, K. & Kimura, A. (1995)
Improvement of S-adenosylmethionine production by integration
of the ethionine-resistance gene into chromosomes of the yeast
Saccharomyces cerevisiae. Appl. Microbiol. Biotechnol. 42, 730–
733.
25. Mortier Barriere, I., de Saizieu, A., Claverys, J.P. & Martin, B.
(1998) Competence-specific induction of recA is required for full
recombination proficiency during transformation in Streptococcus
pneumoniae. Mol. Microbiol. 27, 159–170.
26. Thoms, B. & Wackernagel, W. (1987) Regulatory role of recF in
the SOS response of Escherichia coli: impaired induction of SOS

genes by UV irradiation and nalidixic acid in a recF mutant.
J. Bacteriol. 169, 1731–1736.
27. Paulsen, I.T., Beness, A.M. & Saier, M.H. Jr (1997) Computer-
based analyses of the protein constituents of transport systems
catalysing export of complex carbohydrates in bacteria. Micro-
biology 143, 2685–2699.
28. Soldo, B., Lazarevic, V., Pagni, M. & Karamata, D. (1999)
Teichuronic acid operon of Bacillus subtilis 168. Molec. Microbiol.
31, 795–805.
29. Whitfield, C. & Roberts, I.S. (1999) Structure, assembly and reg-
ulation of expression of capsules in Escherichia coli. Mol. Micro-
biol. 31, 1307–1319.
30. Vincent, C., Doublet, P., Grangeasse, C., Vaganay, E., Cozzone,
A.J. & Duclos, B. (1999) Cells of Escherichia coli contain a pro-
tein-tyrosine kinase, Wzc, and a phosphotyrosine-protein phos-
phatase, Wzb. J. Bacteriol. 181, 3472–3477.
31. Helenius, J., Ng, D.T.W., Marolda, C.L., Walter, P., Valvano,
M.A. & Aebi, M. (2002) Translocation of lipid-linked oligo-
saccharides across the ER membrane requires Rft1 protein.
Nature 415, 447–450.
32. Kutsukake, K., Okada, T., Yokoseki, T. & Iino, T. (1994)
Sequence analysis of the flgA gene and its adjacent region in
Salmonella typhimurium, and identification of another flagellar
gene, flgN. Gene 143, 49–54.
33. Rudnick, P.A., Arcondéguy, T., Kennedy, C.K. & Kahn, D.
(2001) glnD and mviN are genes of an essential operon in
Sinorhizobium meliloti. J. Bacteriol. 183, 2682–2685.
34. Altschul, S.F., Madden, T.L., Schäffer, A.A., Zhang, J., Zhang,
Z., Miller, W. & Lipman, D.J. (1997) Gapped BLAST and
PSI-BLAST: a new generation of protein database search programs.
Nucl Acids Res. 25, 3389–3402.
35. Zhou, X., Hvorup, R.N. & Saier, M.H. Jr (2003) An automated
program to screen databases for members of protein families.
J. Mol. Microbiol. Biotechnol. in press.
36. Benson, D.A., Karsch-Mizrachi, I., Lipman, D.J., Ostell, J., Rapp,
B.A. & Wheeler, D.L. (2002) GENBANK. Nucleic Acids Res. 30,
17–20.
37. Bairoch, A. & Apweiler, R. (2000) The SWISS-PROT protein
sequence database and its supplement TREMBL in 2000. Nucleic
Acids Res. 28, 45–48.
38. Holm, L. & Sander, C. (1998) Removing near-neighbor
redundancy from large protein sequence collections. Bioinfor-
matics 14, 423–429.

39. Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F. &
Higgins, D.G. (1997) The CLUSTAL_X windows interface: flexible
strategies for multiple sequence alignment aided by quality ana-
lysis tools. Nucl Acids Res. 25, 4876–4882.
40. Zhai, Y. & Saier, M.H. Jr (2001) The AveHAS program for the
determination of average hydropathy, amphipathicity, and simi-
larity. J. Mol. Microbiol. Biotechnol. 3, 285–286.
41. Zhai, Y., Tchieu, J. & Saier, M.H. Jr (2002) A web-based Tree
View (TV) program for the visualization of phylogenetic trees.
J. Mol. Microbiol. Biotechnol. 4, 69–70.
42. Tusnády, G.E. & Simon, I. (1998) Principles governing amino acid
composition of integral membrane proteins: applications to
topology prediction. J. Mol. Biol. 283, 489–506.
43. Sonnhammer, E.L.L., von Heijne, G. & Krogh, A. (1998)
A hidden Markov model for predicting transmembrane
helices in protein sequences. In Proceedings of the Sixth International
Conference on Intelligent Systems for Molecular Biology
(J. Glasgow, T. Littlejohn, F. Major, R. Lathrop, D. Sankoff
& C. Sensen, eds), pp. 175–182, AAAI Press, Menlo Park, CA,
USA.

44. Zhai, Y. & Saier, M.H. Jr (2001b) A web-based program
(WHAT) for the simultaneous prediction of hydropathy,
amphipathicity, secondary structure and transmembrane topology for
a single protein sequence. J. Mol. Microbiol. Biotechnol. 4,
501–502.
45. von Heijne, G. (1992) Membrane protein structure prediction,
hydrophobicity analysis and the positive-inside rule. J. Mol. Biol.
225, 487–494.
46. Grundy, W.N., Bailey, T.L., Elkan, C.P. & Baker, M.E. (1997)
Meta-MEME: motif-based hidden Markov models of protein
families. Comput. Appl. Biosci. 13, 397–406.
47. Devereux, J., Haeberli, P. & Smithies, O. (1984) A comprehensive
set of sequence analysis programs for the VAX. Nucl Acids Res. 12,
387–395.
48. Zhai, Y. & Saier, M.H. Jr (2002) A simple sensitive program for
detecting internal repeats in sets of multiply aligned homologous
proteins. J. Mol. Microbiol. Biotechnol. 4, 29–31.
49. Pearson, W.R. (1990) Rapid and sensitive sequence comparison
with FASTP and FASTA. Methods Enzymol. 183, 63–98.
50. Tatusova, T.A. & Madden, T.L. (1999) BLAST 2 Sequences, a new
tool for comparing protein and nucleotide sequences. FEMS
Microbiol. Lett. 174, 247–250.
51. Dayhoff, M.O., Barker, W.C. & Hunt, L.T. (1983) Establishing

homologies in protein sequences. Meth Enzymol. 91, 524–545.
52. Saier, M.H. Jr (1994) Computer aided analysis of transport pro-
tein sequences: gleaning evidence concerning function, structure,
biogenesis, and evolution. Microbiol. Rev. 58, 71–93.
53. Zhou, X., Yang, N.M., Tran, C.V., Hvorup, R.N. & Saier, M.H.
Jr (2003) Web-based programs for the display and analysis of
transmembrane-helices in aligned protein sequences. J. Mol.
Microbiol. Biotechnol. in press.
54. Paulsen, I.T., Nguyen, L., Sliwinski, M.K., Rabus, R. & Saier,
M.H. Jr (2000) Microbial genome analyses: comparative transport
capabilities in eighteen prokaryotes. J. Mol. Biol. 301, 75–100.
55. Paulsen, I.T., Sliwinski, M.K. & Saier, M.H. Jr (1998) Microbial
genome analyses: global comparisons of transport capabilities
based on phylogenies, bioenergetics and substrate specificities.
J. Mol. Biol. 277, 573–592.
56. Saier, M.H. Jr, Beatty, J.T., Goffeau, A., Harley, K.T., Heijne,
W.H.M., Huang, S.-C., Jack, D.L., Jahn, P.S., Lew, K., Liu, J.,
Pao, S.S., Paulsen, I.T., Tseng, T.-T. & Virk, P.S. (1999) The
major facilitator superfamily. J. Mol. Microbiol. Biotechnol. 1,
257–279.
57. Katayama, T., Suzuki, H., Koyanagi, T. & Kumagai, H. (2002)
Functional analysis of the Erwinia herbicola tutB gene and its
product. J. Bacteriol. 184, 3135–3141.
58. Achenbach-Richter, L., Gupta, R., Stetter, K.O. & Woese, C.R.
(1987) Were the original eubacteria thermophiles? Syst. Appl.
Microbiol. 9, 34–39.
59. Nelson, K.E., Clayton, R.A., Gill, S.R., Gwinn, M.L., Dodson,
R.J., Haft, D.H., Hickey, E.K., Peterson, J.D., Nelson, W.C.,
Ketchum, K.A., McDonald, L., Utterback, T.R., Malek, J.A.,
Linher, K.D., Garrett, M.M., Stewart, A.M., Cotton, M.D., Pratt,
M.S., Philips, C.A., Richardson, D., Heidelberg, J., Sutton, G.G.,
Fleischmann, R.D., Eisen, J.A., White, O., Salzberg, S.L., Smith,
H.O., Venter, J.C. & Fraser, C.M. (1999) Evidence for lateral gene
transfer between archaea and bacteria from genome sequence of
Thermotoga maritima. Nature 399, 323–329.
60. Reeves, P. (1993) Evolution of Salmonella O antigen variation by
interspecific gene transfer on a large scale. Trends Genetics 9,
17–22.
61. Rocchetta, H.L., Burrows, L.L. & Lam, J.S. (1999) Genetics of
O-antigen biosynthesis in Pseudomonas aeruginosa. Microbiol.
Mol. Biol. Rev. 63, 523–553.
62. Wang, L., Andrianopoulos, K., Liu, D., Popoff, M.Y. & Reeves,
P.R. (2002) Extensive variation in the O-antigen gene cluster
within one Salmonella enterica serogroup reveals an unexpected
complex history. J. Bacteriol. 184, 1669–1677.
63. Davidson, E.A. & Gowda, D.C. (2001) Glycobiology of
Plasmodium falciparum. Biochimie 83, 601–604.
64. Nies, D.H., Koch, S., Wachi, S., Peitzsch, N. & Saier, M.H. Jr
(1998) CHR, a novel family of prokaryotic proton motive force-
driven transporters probably containing chromate/sulfate
antiporters. J. Bacteriol. 180, 5799–5802.
65. Yen, M.R., Peabody, C.R., Partovi, S.M., Zhai, Y., Tseng, Y.H. &
Saier, M.H. (2002) Protein-translocating outer membrane porins
of Gram-negative bacteria. Biochim. Biophys. Acta 1562, 6–31.
66. Rabus, R., Jack, D.L., Kelly, D.J. & Saier, M.H. Jr (1999) TRAP
transporters: an ancient family of extracytoplasmic solute-
receptor-dependent secondary active transporters. Microbiology
145, 3431–3445.
67. Reizer, J., Reizer, A. & Saier, M.H. Jr (1992) A new subfamily of
bacterial ABC-type transport systems catalyzing export of drugs
and carbohydrates. Prot. Sci. 1, 1326–1332.
68. Nishino, K. & Yamaguchi, A. (2001) Analysis of a complete
library of putative drug transporter genes in Escherichia coli.
J. Bacteriol. 183, 5803–5812.
Supplementary material
The following material is available from
http://www.blackwellpublishing.com/products/journals/suppmat/EJB/EJB3418/EJB3418sm.htm
Fig. S1. MATE alignments.
Fig. S2. PST alignments.
Fig. S3. OLF alignments.
Fig. S4. MVF alignments.
Fig. S5. Binary alignment of an established MATE family
member with an established PST family member.
Fig. S6. Binary alignment of an established OLF family
member with an established PST family member.
Fig. S7. Binary alignment of an established MVF family
member with an established PST family member.
Table S1. MATE family members.
Table S2. PST family members.
Table S3. OLF family members.
Table S4. MVF family members.
