Looking for the ancestry of the heavy-chain subunits of
heteromeric amino acid transporters rBAT and 4F2hc
within the GH13 α-amylase family
Marek Gabriško1 and Štefan Janeček1,2
1 Institute of Molecular Biology, Slovak Academy of Sciences, Bratislava, Slovakia
2 Department of Biotechnology, Faculty of Natural Sciences, University of SS. Cyril and Methodius, Trnava, Slovakia
Keywords
4F2hc; evolutionary relatedness; oligo-1,6-glucosidase subfamily; rBAT; α-amylase family
Correspondence
Š. Janeček, Institute of Molecular Biology, Slovak Academy of Sciences, Dúbravská cesta 21, SK-84551 Bratislava, Slovakia
Fax: +421 2 59307416
Tel: +421 2 59307420
E-mail:
(Received 15 July 2009, revised 18 September 2009, accepted 12 October 2009)
doi:10.1111/j.1742-4658.2009.07434.x
In an effort to shed more light on the early evolutionary history of the heavy-chain subunits of heteromeric amino acid transporters (hcHATs) rBAT and 4F2hc within the α-amylase family GH13, a bioinformatics study was undertaken. The focus of the study was on a detailed sequence comparison of rBAT and 4F2hc proteins, drawn from as wide a taxonomic spectrum as possible, with enzyme specificities from the α-amylase family. The GH13 enzymes were selected from the so-called GH13 oligo-1,6-glucosidase and neopullulanase subfamilies that represent the α-amylase family enzyme groups most closely related to hcHATs. Within this study, more than 30 hcHAT-like proteins, designated here as hcHAT1 and hcHAT2 groups, were identified in basal Metazoa. Of the GH13 catalytic triad, only the catalytic nucleophile (aspartic acid 199 of the oligo-1,6-glucosidase) could have its counterpart in some 4F2hc proteins, whereas most rBATs contain counterparts for the entire GH13 catalytic triad. Moreover, the 4F2hc proteins lack not only domain B typical for GH13 enzymes, but also a stretch of 40 amino acid residues succeeding the β4-strand of the catalytic TIM barrel. rBATs have the entire domain B as well as a longer loop 4. The higher sequence–structural similarity between rBATs and GH13 enzymes was reflected in the evolutionary tree. At present it is necessary to consider two different scenarios on how the chordate rBAT and 4F2hc proteins might have evolved. The GH13-like protein from the cnidarian Nematostella vectensis might nowadays represent a protein close to the eventual ancestor of the hcHAT proteins within the GH13 family.
Abbreviations
ATG, amino acid transporter glycoprotein; CSR, conserved sequence regions; GH, glycoside hydrolase; HAT, heteromeric amino acid transporter; hcHAT, heavy-chain subunits of heteromeric amino acid transporter; SLC, solute carrier.
FEBS Journal 276 (2009) 7265–7278 © 2009 The Authors Journal compilation © 2009 FEBS
Introduction
To fulfil its metabolic needs, a cell uses specialized transport proteins to perform and control the uptake and efflux of crucial compounds (e.g. sugars, amino acids, nucleotides, inorganic ions and drugs) across the plasma membrane. These proteins have been classified into the phylogenetically derived solute carrier (SLC) families; the current classification counts almost 50 SLC families [1,2]. The sequence similarity between the heavy-chain subunits of heteromeric amino acid transporters (hcHATs) and the α-glucosidases from the α-amylase family [3] was first recognized more than 15 years ago [4]. HATs are proteins composed of a light subunit (SLC7 members) and a heavy subunit (known as rBAT or 4F2hc; SLC3 members), connected by a disulfide bridge [2].
Because of their significance in human pathology (their defects lead to primary inherited aminoacidurias, e.g. failed renal reabsorption of amino acids), HATs have attracted much attention in medical studies (e.g. [2,5–7]). The light subunit is a nonglycosylated hydrophobic 12-helix transmembrane protein, whereas the heavy subunit is a type II membrane N-glycoprotein with an intracellular N-terminal end, a single transmembrane region and a large extracellular C-terminal domain [2]. It is the light subunit that possesses the amino acid transport activity, although without interacting with the heavy subunit it is unable to reach the plasma membrane. Thus, the role of the heavy subunit is to recognize the light subunit and to chaperone it to a proper position in the plasma membrane, i.e. this subunit is not absolutely necessary for the transport activity [8]. Interestingly, however, its C-terminal extracellular domain exhibits sequence similarities to the α-amylase family enzymes [4,9].
In the sequence-based classification of glycoside hydrolases (GHs), the α-amylase family [3] forms the GH-H clan [10], which consists of three GH families: GH13, GH70 and GH77. These enzymes (~30 different EC numbers) should satisfy the following requirements: (a) the catalytic domain is formed by the (β/α)8-barrel fold (i.e. TIM barrel) with a small distinct domain B protruding out from the barrel between the β3-strand and the α3-helix; (b) the catalytic machinery consists of the β4-strand aspartate (nucleophile), β5-strand glutamate (proton donor) and β7-strand aspartate (transition-state stabilizer); (c) the enzymes employ a retaining reaction mechanism; and (d) the sequences contain between four and seven conserved sequence regions (CSRs) covering mainly the β-strands of the catalytic TIM barrel [3,11–13]. Of the three GH families of the clan GH-H, it was the family GH13 that was originally established as the α-amylase family [14–17]. At present it is one of the largest families in the entire classification of GHs [10]. Although the overall sequence identity within GH13 is extremely low [13], the family contains several groups of enzymes exhibiting a higher degree of mutual sequence similarity, so it has recently been divided into subfamilies [18]. Of these, the best resemblance to hcHATs was revealed for the members of the so-called oligo-1,6-glucosidase subfamily [9,19,20]. This was recently confirmed by solving the three-dimensional structure of the C-terminal domain of 4F2hc [21], which most resembles the oligo-1,6-glucosidase from Bacillus cereus [22] and the α-glucosidase from Geobacillus sp. HTA-462 [23].
In hcHATs, the regions of similarity cover sequence segments within the C-terminal extracellular domain. The segments correspond, in fact, to some of the α-amylase family CSRs, namely the β-strands β2, β3, β4 and β8 of the (β/α)8-barrel domain, and for rBAT also to the short stretch near the C-terminus of domain B [9,19]. From the sequence–structure point of view, the basic difference between rBAT and 4F2hc is that rBAT possesses the segment that corresponds to domain B of GH13 enzymes, whereas 4F2hc does not have it [9,19,24,25].
The main goal of the present study was to investigate further the resemblance between hcHAT proteins and the enzymes from the α-amylase family. We therefore carried out a bioinformatics study focused on a detailed comparison of all available rBAT and 4F2hc sequences with GH13 enzyme representatives covering mainly the oligo-1,6-glucosidase subfamily. This could help to elucidate the origin of the hcHAT proteins within the GH13 α-amylase enzyme family, as well as shed some light on the possible evolutionary events leading to the separation of the heavy-chain subunit of these amino acid transporters from the enzymes involved in the metabolism of starch and related saccharides.
Results and Discussion
Evolutionary relationships and
sequence–structural comparison
This study delivers the in silico analysis of 134
sequences consisting of 92 hcHAT proteins (represent-
ing known rBATs and 4F2hc proteins as well as their
newly identified putative homologues) and 42 GH13
enzymes (including four GH13-like sequences)
(Table 1). Their global multiple sequence alignment
(not shown) covers: (a) the N-terminal region, trans-
membrane segment, central TIM barrel domain,
including domain B and the C-terminal domain C for
rBAT proteins (669 residues on average); (b) the cata-
lytic TIM barrel domain, including domain B and the
C-terminal domain C for GH13 enzymes (572 residues
on average); and (c) the N-terminal region, transmem-
brane segment, central TIM barrel domain and the
C-terminal domain C for 4F2hc proteins (542 residues
on average). The length of the entire amino acid sequence
alignment was 1099 positions, but it should be taken into
account that, if the gaps are excluded, the overall number
of comparable positions would be < 100.
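The effect of gaps on the number of comparable positions can be illustrated with a minimal Python sketch. It assumes that the alignment is available as equal-length strings with '-' as the gap symbol; the function name gap_free_columns and the toy sequences are hypothetical and are not taken from the study.

```python
def gap_free_columns(alignment, gap_chars="-."):
    """Return the indices of alignment columns that contain no gap in any sequence."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [
        col
        for col in range(length)
        if all(seq[col] not in gap_chars for seq in alignment)
    ]


# Toy alignment (hypothetical sequences, for illustration only).
toy_alignment = [
    "MK-LIIDLV",
    "MKQLI-DLV",
    "M-QLIIDLV",
]

columns = gap_free_columns(toy_alignment)
print("alignment length:", len(toy_alignment[0]))
print("gap-free columns:", len(columns))
```

In the toy case, three of the nine columns contain a gap in at least one sequence, so only six remain comparable across all sequences. With 134 divergent sequences, the same filter removes most of the alignment.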
Figure 1 illustrates the evolutionary relationships
between the studied hcHAT proteins and GH13
enzymes from the α-amylase family. The tree was
calculated using the neighbour-joining method [26]. Other approaches, such as maximum likelihood [27], maximum parsimony [28], minimum evolution [29] and UPGMA [30], were also used, but they delivered basically similar topologies (not shown).
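Distance-based methods such as neighbour-joining and UPGMA start from a matrix of pairwise distances. As a minimal sketch, assuming the simple p-distance (the text does not state which distance measure was applied) and skipping columns in which either sequence has a gap, such a matrix could be computed as follows. The function names and the toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, gap="-"):
    """Fraction of differing residues over columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    differing = 0
    for res_a, res_b in zip(seq_a, seq_b):
        if res_a == gap or res_b == gap:
            continue  # simplification: gapped columns are skipped for this pair
        compared += 1
        if res_a != res_b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return differing / compared


def distance_matrix(aligned):
    """Map each unordered pair of names (sorted alphabetically) to its p-distance."""
    return {
        (name_a, name_b): p_distance(aligned[name_a], aligned[name_b])
        for name_a, name_b in combinations(sorted(aligned), 2)
    }


# Toy alignment (hypothetical sequences, for illustration only).
toy = {"seqA": "MKLIIDLV", "seqB": "MKLIVDLV", "seqC": "MK-IVDAV"}
matrix = distance_matrix(toy)
for pair, dist in matrix.items():
    print(pair, round(dist, 3))
```

A tree-building method would then cluster the sequences on the basis of this matrix. The sketch stops at the matrix because the clustering step is where the individual methods differ.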
The two main groups of hcHATs (Fig. 1), i.e. those
of rBAT and 4F2hc, form their own clusters within
which taxonomy is respected: (a) for the rBATs from
human via representatives of mammals, birds, lizard,
frogs and fishes to Urochordata (sea squirts) and
Cephalochordata (lancelet); and (b) for the 4F2hc pro-
teins from human via mammals, perhaps omitting
birds (as it is not found in chicken and zebra finch),
lizard, frogs, fishes and platypus to Petromyzon (sea
lamprey), Urochordata (sea squirts) and even Ixodes
(tick). What is also clear is the grouping of the GH13
enzymes, which cover: (a) the representatives of the
individual enzyme specificities from both the oligo-1,6-
glucosidase and neopullulanase GH13 subfamilies [20];
and (b) the additional GH13 α-glucosidases from fungi (yeast), insects and thermophilic (and soil) bacteria. It is worth mentioning that the fungal (yeast) α-glucosidases are clustered with their counterparts from bacilli
and the closely related specificities, such as oligo-1,6-
glucosidase, dextran glucosidase, trehalose-6-phosphate
hydrolase and isomaltulose synthase, whereas the rep-
resentatives of trehalose synthase, amylosucrase and
sucrose phosphorylase share the branch leading also to
members of the neopullulanase GH13 subfamily
together with the intermediary enzymes (Fig. 1). The
overall arrangement of the tree is that the clusters of
true rBAT and 4F2hc proteins are separated from each
other by the GH13 enzymes.
All remaining sequences (except those from nematodes) that could not be classified as true rBAT or true 4F2hc proteins were first designated as hcHAT-like proteins. Then, based on an approximate alignment, which served to construct a preliminary evolutionary tree, these hcHAT-like proteins were divided into hcHAT1 and hcHAT2 groups (Table 1). It is worth mentioning that most of them are hypothetical proteins that in some cases were retrieved from recent complete genome sequencing projects containing raw sequence data still without appropriate annotation. Most hcHAT1 proteins come from insects and, in a wider sense, from the Arthropoda (Daphnia), complemented by Cephalochordata and Echinodermata (both Deuterostomia) and one representative from Cnidaria (Nematostella). The group of hcHAT2 proteins also consists of Arthropoda, i.e. insects accompanied by Daphnia and Ixodes, and two representatives of schistosomes. Interestingly, although present in the subgenus Drosophila, hcHAT2 proteins seem to be lacking in the melanogaster group (subgenus Sophophora).
With regard to hcHAT1 from Aedes aegypti [31] and Drosophila melanogaster [32], these two proteins have already been experimentally confirmed as heavy-chain subunits (CD98hc, i.e. 4F2hc) in an amino acid transporter system analogous to that known in mammals [2,21]. A similar observation was reported for the SPRM1hc from Schistosoma mansoni [33], which in the present study is classified in the hcHAT2 group (Table 1). Although the hcHAT1 and hcHAT2 groups remain independent of each other, both seem to be more closely related to typical 4F2hc proteins than to rBATs (Fig. 1).
Concerning the above-mentioned hcHAT sequences from nematodes, these proteins from Caenorhabditis elegans [34] have been named amino acid transporter glycoproteins (ATGs). Of the two groups, ATG1 and ATG2 (Table 1), the relevant light chains exhibited the transporter function only when combined with ATG2 [34]. In the evolutionary tree (Fig. 1), both ATG clusters (ATG1 and ATG2) from all studied nematodes could represent a counterpart group to hcHAT2 proteins.
As far as the sequence similarities and differences between the hcHAT proteins and GH13 enzymes are concerned, the basic feature discriminating the 4F2hc proteins from both rBATs and GH13 enzymes is the lack of domain B protruding out of the TIM barrel in the place of loop 3 connecting the β3-strand to the α3-helix [9,21]. The sharing of domain B by rBATs and GH13 enzymes, and especially the sequence of the fifth CSR (QPDLN for both human rBAT and Bacillus cereus oligo-1,6-glucosidase) [20] (Fig. 2), may indicate a shorter evolutionary distance for rBATs from the GH13 ancestor common to both rBAT and 4F2hc proteins. A complete domain B with well-conserved β-strands is also present in hcHAT1 proteins. In all other groups, this domain is more or less distorted, culminating in its complete loss in 4F2hc proteins. The presence of the full GH13 domain B in hcHAT1 and the absence of its parts in hcHAT2 indicate the eventual intermediary or primordial character of both hcHAT1 and hcHAT2 with regard to the appearance of typical rBAT and typical 4F2hc proteins in animals. According to our present knowledge, this seems to be evident from Urochordata (Fig. 1).
The second sequence feature clearly visible from the alignment is whether the individual catalytic residues, or even the entire catalytic triad of the GH13 α-amylase family, can be found in the hcHAT representatives. Fort et al. [21] reported that the human 4F2hc does not exhibit any α-glucosidase activity. This is consistent with the almost complete lack of the catalytic triad in all 4F2hc proteins (Fig. 2). It is worth mentioning that, especially in higher animals (mammals and also frogs and fishes), an aspartate (aspartic acid 248 in human 4F2hc; aspartic acid 380 in Fig. 1 as both the N-terminal and transmembrane segments are involved) could be a relic of the GH13 β4-strand catalytic nucleophile [3,11–13], although shifted one position towards the C-terminus (Fig. 2). On the other hand, most rBAT representatives contain all three catalytic residues (Fig. 2), with the exception of those from birds, lizards and frogs (lacking both essential aspartates at the β4- and β7-strands) and also from some fishes (lacking the β4-strand aspartate). This may mean that the possibility of α-glucosidase activity in true rBATs cannot be unambiguously excluded.
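Checking an aligned sequence for the GH13 catalytic triad amounts to inspecting three alignment columns. The following Python sketch illustrates the idea under stated assumptions: the column indices and the toy sequences are hypothetical (they are not the positions of Fig. 2), and the function name triad_status is introduced here only for illustration.

```python
# Expected residues of the GH13 catalytic triad, in order of the β4-, β5- and β7-strands.
TRIAD = (
    ("nucleophile (β4)", "D"),
    ("proton donor (β5)", "E"),
    ("transition-state stabilizer (β7)", "D"),
)


def triad_status(aligned_seq, columns):
    """Report, for each triad role, whether the expected residue sits at its column."""
    if len(columns) != len(TRIAD):
        raise ValueError("exactly three alignment columns are required")
    return {
        role: aligned_seq[col] == expected
        for (role, expected), col in zip(TRIAD, columns)
    }


# Hypothetical toy data: 0-based columns 2, 6 and 10 stand in for the triad positions.
toy_columns = (2, 6, 10)
toy_sequences = {
    "enzyme_like": "AADAAAEAAADA",     # all three residues present
    "nucleophile_only": "AADAAAQAAANA",  # only the β4 aspartate is retained
}

for name, seq in toy_sequences.items():
    status = triad_status(seq, toy_columns)
    print(name, "complete triad" if all(status.values()) else status)
```

Such a check reports only the presence of the residues. It says nothing about activity, which is why the text above treats the α-glucosidase activity of rBATs as an open possibility rather than a conclusion.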
Fig. 1. Evolutionary tree of the hcHAT proteins and the GH13 α-amylase family members. The tree is based on the alignment of complete sequences and calculated including gaps. The numbers represent the bootstrap values. The individual proteins and enzymes are abbreviated as follows (see also Table 1): rBAT, true rBAT proteins; 4F2, true 4F2hc proteins; ATG1 and ATG2, ATGs from nematodes; hcHAT1 and hcHAT2, hcHAT-like proteins covering basal metazoans and arthropods; GH13, GH13-like proteins or enzymes; OGLU, oligo-1,6-glucosidase; AGLU, α-glucosidase; DGLU, dextran glucosidase; T6PH, trehalose-6-phosphate hydrolase; ASU, amylosucrase; SPH, sucrose phosphorylase; IMSY, isomaltulose synthase; TSY, trehalose synthase; CMD, cyclomaltodextrinase; MGA, maltogenic amylase; NPU, neopullulanase; INT, intermediary group between oligo-1,6-glucosidase and neopullulanase subfamilies.
The selected CSRs (Fig. 2) characteristic of the α-amylase enzyme family GH13 [13] illustrate the additional sequence features conserved mutually between
the hcHAT and hcHAT-like proteins and GH13 enzymes, as well as within the individual groups of hcHAT representatives, i.e. rBAT, 4F2hc, hcHAT1, hcHAT2 and ATG groups (Table 1). Overall, and interestingly, the residues that have not yet been revealed to be essential for the GH13 enzymes seem to be well conserved, e.g. (a) a stretch of three hydrophobic aliphatic residues (207_LII in human rBAT) preceding the important aspartate (aspartic acid 98 in oligo-1,6-glucosidase) in region I covering the β3-strand; (b) a segment of up to five residues (307_GVDGF in human rBAT) preceding the functional arginine (arginine 197 in oligo-1,6-glucosidase) in region II of the β4-strand; and (c) more or less the entire region VII, i.e. the β8-strand. The fact that rBATs exhibit more sequence similarities with the GH13 enzymes than the 4F2hc proteins do is also clearly visible in the selected CSRs (Fig. 2). It concerns mainly: (a) tryptophan (tryptophan 161 in human rBAT) in region VI (β2-strand); (b) histidine (histidine 215) at the end of region I (β3-strand); (c) the entire region V in loop 3 (i.e. domain B), being 282_QPDLN in human rBAT; and (d) conservation of the catalytic residues (often the entire catalytic triad). Some of these features can be traced in the sequences of the hcHAT1 and hcHAT2 groups as well as of the ATG proteins (Fig. 2), indicating evolutionary relationships of all these enzymes and proteins and hinting at their eventual evolutionary histories. It is worth mentioning that, to understand the common evolutionary history of hcHAT proteins and GH13 enzymes, it is necessary to re-evaluate the CSR VII covering the β8-strand [13,20], as this segment (obviously without the GH13 functionally important residues) belongs to their best conserved shared sequence parts (Fig. 2). It is also important to note that if the CSRs (Fig. 2) serve to calculate the evolutionary tree (not shown), all hcHAT1 proteins (covering basal metazoans and arthropods) and both ATG groups (ATG1 and ATG2 from nematodes; Table 1) cluster together with rBAT proteins and GH13 enzymes (although with low bootstrap values), whereas the entire hcHAT2 group shares the branch with the 4F2hc proteins.
As no α-glucosidase activity was detected for the human 4F2hc [21], reflecting that only the catalytic nucleophile (aspartic acid 380; Fig. 2) may be preserved, it was of interest to identify the CSRs covering the GH13 functionally important residues in hcHATs. Of all of them (Fig. 2), CSR III (β5-strand with the glutamate acting as a proton donor) is not easily identifiable, even for the enzymatically active GH13 members [13]. Therefore, one of the goals was to align correctly the β5-strands of the hcHAT sequences, which was especially problematic for the 4F2hc proteins completely lacking the catalytic glutamate (Fig. 2). In this regard, the putative GH13-like sequence from the cnidarian Nematostella vectensis containing the β5-strand segment 273_RLLIGE (Fig. 2) should be of special importance from an evolutionary point of view, as it contains the features of both the GH13 enzymes (i.e. the glutamic acid residue in a corresponding position) and typical 4F2hc proteins (i.e. arginine or lysine followed by the stretch of three aliphatic hydrophobic residues, e.g. 405_RLLIAG in human 4F2hc; Fig. 2). This segment preceding the catalytic β5-strand glutamate is also conserved in most insect α-glucosidases, supporting the possibility that the ancestry of the hcHAT proteins within the GH13 α-amylase enzyme family could be rooted in basal metazoans, currently represented by Nematostella vectensis.
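The β5-strand pattern described above (arginine or lysine, three aliphatic hydrophobic residues, one further residue, and then either the catalytic glutamate or another residue) can be expressed as a regular expression. The sketch below is only an illustration: taking the aliphatic residues as V, L and I is an assumption, the function name scan_b5_segment is hypothetical, and the two test strings are the six-residue segments quoted in the text, not full sequences.

```python
import re

# [RK]      arginine or lysine
# [VLI]{3}  three aliphatic hydrophobic residues (assumed here to be V, L or I)
# .         one intervening residue
# (.)       the position that holds the β5-strand glutamate in GH13 enzymes
B5_SEGMENT = re.compile(r"[RK][VLI]{3}.(.)")


def scan_b5_segment(sequence):
    """Return (1-based start, segment, has_glutamate) for every non-overlapping match."""
    return [
        (match.start() + 1, match.group(0), match.group(1) == "E")
        for match in B5_SEGMENT.finditer(sequence)
    ]


# Segments quoted in the text for Nematostella vectensis and human 4F2hc.
for label, segment in (("N. vectensis GH13-like", "RLLIGE"), ("human 4F2hc", "RLLIAG")):
    print(label, scan_b5_segment(segment))
```

Both segments match the shared part of the pattern, but only the Nematostella segment carries the glutamate. This is the combination of features that the text highlights as evolutionarily informative.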
A comparison of the three-dimensional structures of representatives of hcHATs (human 4F2hc, 417 residues [21] and a model of the human rBAT with 535 residues) and GH13 enzymes (Geobacillus sp. HTA-462 α-glucosidase; 531 residues [23]) confirmed the expected higher similarity between rBAT proteins and GH13 enzymes (root-mean-square deviation 1.62 Å between the Cα atoms of 436 corresponding residues) than between 4F2hc proteins and GH13 enzymes (1.67 Å for 293 Cα atoms) as well as rBATs and 4F2hc proteins mutually (1.80 Å for 271 Cα atoms). However, what could be more interesting is the observation that human 4F2hc lacks not only domain B, but also a stretch of 40 amino acid residues succeeding the β4-strand (not shown). The human 4F2hc thus possesses a very short loop 4 connecting the β4-strand to the α4-helix, in contrast to what is seen in both the Geobacillus α-glucosidase and the human rBAT protein (having the entire domain B). Regardless of whether domain B in the GH13 oligo-1,6-glucosidase subfamily members (and also in rBATs) operates in conjunction with the prolonged loop 4, it seems that the consecutive loss of domain B in 4F2hc proteins is connected with a corresponding shortening of loop 4, as the observation can be generalized to all 4F2hc proteins. Note that the GH13 neopullulanase subfamily members [20], possessing a shorter domain B [9,35–37], also lack the longer excursion of the loop 4 segment.
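The root-mean-square deviation quoted above is computed over pairs of corresponding Cα atoms. A minimal Python sketch of the calculation follows, assuming that the two coordinate sets have already been superposed and paired (the superposition step and the program actually used are not described in the text); the function name and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b):
        raise ValueError("both structures must contribute the same number of atoms")
    if not coords_a:
        raise ValueError("at least one atom pair is required")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy Cα coordinates in Å (hypothetical, already superposed).
structure_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
structure_2 = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
print(round(rmsd(structure_1, structure_2), 3))
```

Because the value is averaged over the matched atoms only, the number of corresponding residues matters as much as the deviation itself. The three comparisons above yield similar deviations (1.62, 1.67 and 1.80 Å) but over 436, 293 and 271 Cα atoms, respectively.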
Selection pressure
With regard to the close sequence similarity between the GH13 enzymes and the hcHAT proteins (especially rBATs), it is interesting to compare the selection pressure acting on corresponding stretches of amino acid sequences. For this purpose, the SELECTON tool [38] was chosen. Figure 3 illustrates the similarities and differences in selection pressure acting on the three studied protein groups: mammalian 4F2hc proteins, vertebrate rBATs and insect α-glucosidases. In agreement with the higher degree of sequence similarity between rBAT proteins and GH13 enzymes, the selection pressure was also found to be more similar for these two groups than that observed for 4F2hc and rBAT proteins, as well as for 4F2hc proteins and GH13 enzymes (Fig. 3). Remarkably, there are a few segments, namely those at or around the β2-, β3- and β8-strands (CSRs VI, I and VII, respectively), that exhibit similar selection pressure for all three groups, i.e. rBAT, 4F2hc and α-glucosidases. This indicates that the residues from the above-mentioned segments of both rBAT and 4F2hc proteins, sharing the value of selection pressure with their counterparts from α-glucosidases, may also share their functions. Although for the β3-strand at least the histidine (histidine 103 in Bacillus cereus oligo-1,6-glucosidase) is known to be involved in the active site of GH13 enzymes [3,11,22], no functional role has been assigned to any residue from either the β2- or β8-strands. The results shown here (Fig. 3) could therefore mean that they contribute to the overall structural integrity of the TIM barrel domain. Concerning the GH13 catalytic triad, it is worth mentioning that, in spite of the presence of its residues in rBATs, their positions (especially for the β4 catalytic nucleophile and β7 transition-state stabilizer) are selection neutral, in contrast to the strict purifying selection observed here for α-glucosidases (Fig. 3).
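One simple way to quantify how similar the selection pressure is between two protein groups is to compare the per-site classes along aligned positions. The Python sketch below follows the seven-grade scale given in the legend of Fig. 3 (1 and 2, positive selection; 4 to 7, purifying selection). Treating grade 3 as neutral is an assumption, and the function names and toy profiles are hypothetical rather than part of the SELECTON tool.

```python
def selection_class(score):
    """Translate a per-site grade (1-7) into a selection class."""
    if score in (1, 2):
        return "positive"
    if score == 3:
        return "neutral"  # assumption: the figure legend does not name this grade
    if score in (4, 5, 6, 7):
        return "purifying"
    raise ValueError(f"unexpected grade: {score}")


def profile_agreement(profile_a, profile_b):
    """Fraction of aligned sites at which two profiles fall into the same class."""
    if len(profile_a) != len(profile_b) or not profile_a:
        raise ValueError("profiles must be non-empty and of equal length")
    matches = sum(
        selection_class(a) == selection_class(b)
        for a, b in zip(profile_a, profile_b)
    )
    return matches / len(profile_a)


# Toy per-site grades for two groups over four aligned sites (hypothetical).
group_1 = [1, 5, 7, 3]
group_2 = [2, 4, 3, 3]
print(profile_agreement(group_1, group_2))
```

A per-segment version of the same comparison, restricted to the columns of a given CSR, would correspond more closely to the observation made above for the β2-, β3- and β8-strands.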
Eventual evolutionary scenarios
This study has delivered not only evolutionary relationships (Fig. 1) based on a detailed sequence comparison of all currently available sequences of rBAT, 4F2hc and hcHAT-like proteins with their GH13 enzymatic counterparts (Table 1), but it has also tried to trace the ancestry of hcHAT proteins within the GH13 α-amylase family. In fact, two different evolutionary scenarios could be taken into account: (a) one single event in basal Metazoa and a subsequent split into rBAT and 4F2hc (probably via the hcHAT1 group) in chordates; and (b) two independent branching events, i.e. 4F2hc in the basal Metazoa via HAT-like proteins and rBAT directly from enzymes in deuterostomes. It is worth mentioning here that both scenarios reflect the ancestry of both rBATs and 4F2hc proteins anchored within the GH13 α-amylase family. The only difference is in the path leading from the GH13 enzymes either to rBAT and 4F2hc together or to rBAT and 4F2hc separately. At present it is not possible to draw the evolutionary picture unambiguously.
The first evolutionary scenario, basically consistent
with the one proposed originally [9], means that in
basal Metazoa an ancestor of both the present-day
4F2hc and rBAT proteins was separated from the
GH13 enzymes. The ancestor acquired the N-terminal
and transmembrane segments and, eventually (in most
taxa), duplicated and evolved to give in chordates: (a)
rBATs that have kept most of the GH13 sequence–
structural features, including domain B as well as cata-
lytic residues (often the entire catalytic triad); and (b)
4F2hc that has consecutively lost almost all of the
GH13 characteristic sequence–structural features,
including domain B as well as functional residues
(mainly the catalytic triad). The weak points of this
scenario are: (a) the striking similarity between rBATs
and GH13 enzymes; (b) the higher similarity between
4F2hc and hcHAT-like proteins than between 4F2hc
and rBATs; and (c) the seeming absence of rBAT
ancestors in nematodes and arthropods (Fig. 1).
The other completely different scenario that would
seemingly obey the observation of a generally higher
degree of sequence–structural similarity between
rBATs and GH13 enzymes than between 4F2hc pro-
teins and GH13 enzymes would assume the indepen-
dent evolution of rBATs and 4F2hc proteins. This
eventuality would leave both hcHAT1 and hcHAT2
groups in the history leading to the 4F2hc proteins.
The problems in this scenario would be: (a) the inde-
pendent acquisition of both the N-terminal segment
and the transmembrane region in rBAT and 4F2hc
proteins, which should appear more parsimoniously
only once; and (b) the gain of the analogous function.
Because the family GH13 enzymes are spread throughout the whole taxonomy spectrum from prokaryotes to eukaryotes and are therefore more ancient than the hcHATs (present only in Metazoa), there is only one possible place for rooting the tree, namely on the branch leading to the enzymes originating from non-Metazoa (the eventual outgroup).
Fig. 2. The CSRs of the hcHAT proteins and the GH13 α-amylase family members. A list of the abbreviations of proteins and enzymes can be found in Fig. 1. The segments covering the strands β2, β3, loop 3 (near the C-terminus of domain B connecting the β3-strand and helix 3), β4, β5, β7 and β8 represent the individual CSRs of the α-amylase family [13]. The positions corresponding with the GH13 catalytic triad are boxed. The individual selected residues are highlighted as follows: aspartate and glutamate, red; glycine and proline, black; valine, leucine and isoleucine, grey; phenylalanine and tyrosine, blue; tryptophan, magenta; histidine, cyan; arginine and lysine, green; cysteine, yellow.
It is worth mentioning, however, that if the evolutionary tree of all proteins studied here is based on the alignment of CSRs (Fig. 2), the ATG proteins from nematodes [34] and all hcHAT-like proteins designated here as the hcHAT1 group (Table 1), i.e. hcHAT-like proteins covering the basal Metazoa and Arthropoda, cluster together with both the rBAT proteins and GH13 enzymes, leaving the 4F2hc proteins with the hcHAT2 group on a different branch (tree not shown). It should be pointed out that, despite the fact that the GH13 CSRs could be considered to be something like sequence fingerprints of the GH13 α-amylase family members [13], the tree based on the CSRs is supported by low bootstrap values. It is thus not possible to say which one, hcHAT1 or hcHAT2, is orthologous to rBAT or 4F2hc, if any. Although both hcHAT1 (insect) and hcHAT2 (Schistosoma) representatives have already been shown to function as 4F2hc rather than as rBAT [31–33], their rBAT-like role has not yet been investigated. However, as seen in Fig. 1 (the tree based on the complete alignment), the hcHAT2 group (Arthropoda) clusters with both ATG1 and ATG2 (Nematoda), indicating that the hcHAT2 and ATG proteins are orthologues. Because hcHAT2/ATG are present only in Arthropoda and Nematoda, they probably came from one hcHAT protein (i.e. hcHAT1; cf. Fig. 1) originating from a common ancestor of Ecdysozoa. However, it should be stressed that hcHAT2 proteins (except for those from Schistosoma [33]) were first identified in this study, so further research on their function, and on identifying the light subunit to which they bind, could throw more light on the relationships between various hcHAT proteins. Finally, it should be taken into account that the α-amylase family GH13 is one of the largest GH families, covering several tens of specificities and several thousand sequences [3,13,18], where, for example, it is still complicated to trace the evolutionary history clearly, even just for the animal α-amylases [39].
Fig. 3. Selection pressure acting on rBAT and 4F2hc proteins and GH13 insect α-glucosidases (AGLU). Yellow highlighting (1 and 2) indicates positive selection, whereas red highlighting (4–7) indicates purifying selection. The sequences used for the SELECTON analysis [38] are marked by an asterisk in Table 1. The individual CSRs of the GH13 α-amylase family [13] are boxed; the GH13 catalytic residues are indicated by small yellow boxes. The individual structural parts of the proteins, i.e. the N-terminal and the transmembrane segments, domain A (TIM barrel), domain B and domain C, are indicated by green, yellow, blue and grey shadowing, respectively.
Conclusions
Examples of a close evolutionary relatedness between the TIM barrel enzymes and their counterparts without the catalytic function are not exceptional. For example, in the family GH18 chitinases, several plant proteins, such as narbonin [40] and concanavalin B [41], have been recognized to be former chitinases that have lost their catalytic residues. Even in the GH13 α-amylase family, an enzymatically inactive remote paralogous Amyrel (amylase-related) gene in fruit flies (Diptera) was revealed [42,43]. Moreover,
the events of horizontal gene transfer have also been discussed, as animal-like and plant-like α-amylase genes were found in actinomycetes and other bacteria [44,45]. However, the main goal of the present work was to shed more light on the early history of extant rBAT and 4F2hc proteins. Many sequences of hypothetical hcHAT-like proteins, covering the basal metazoans and arthropods, that are very probably homologous to the heavy-chain partners of the light-chain subunits of the HAT system of chordates have been identified and analysed, which should enable experimentalists in the field to direct their research more appropriately. The significance of this study could be in the fields of: (a) hcHAT protein research (expanding the taxonomy spectrum and eventually focusing on simpler model organisms, e.g. Nematostella vectensis); (b) the α-amylase GH13 family enzymes (identifying the sequence features often omitted in analyses of GH13 enzymes, but also clearly conserved in rBAT and 4F2hc proteins, e.g. the β8-strand segment of the TIM barrel domain); and (c) protein evolution in general. On the basis of our analysis, it seems that the hcHATs are present in animals starting from basal Metazoa, as we were unable to find any of their homologues in fungi, plants and single-cell eukaryotes (protists). On the other hand, it cannot be excluded that in the future some new sequences of hcHAT-like proteins of nonmetazoan origin may become available that, together with the results delivered here, can move our knowledge further.
Materials and methods
As a first step, all available sequences of hcHATs and
hcHAT-like proteins were collected (Table 1) using the
amino acid sequences of human 4F2hc (GenBank
accession number M21904) [24] and rBAT (M95548)
[25] as queries for protein BLAST [46] searches against
the default nonredundant database. Most sequences were
retrieved from the GenBank/RefSeq [47,48], EnsEMBL [49]
and Silkworm [50] databases. Some hcHAT analogues
(Table 1) were obtained by corrections and/or predictions
from recent complete genome sequencing projects
(containing raw and unannotated sequence data) using the
GeneWise program [51], based on sequence comparison with
known hcHAT proteins from related organisms. Overall, 92
hcHAT and hcHAT-like proteins were studied, divided as
follows (Table 1): true rBAT, 21 (subfamily GH13_35);
true 4F2hc, 27 (subfamily GH13_34); ATG [34], 8; and two
groups of hcHAT-like proteins, hcHAT1 (22) and hcHAT2 (14).
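The composition of the hcHAT data set given above can be checked with a few lines of Python. This is only a bookkeeping sketch: the group labels and counts are taken from the text, whereas the dictionary name and its layout are illustrative assumptions and not part of the original analysis.

```python
# Group sizes as stated in the text (Table 1); the names below are illustrative.
hchat_groups = {
    "rBAT (GH13_35)": 21,
    "4F2hc (GH13_34)": 27,
    "ATG": 8,
    "hcHAT1": 22,
    "hcHAT2": 14,
}

# The five groups should add up to the 92 proteins reported in the text.
total = sum(hchat_groups.values())

for name, count in hchat_groups.items():
    share = 100.0 * count / total
    print(f"{name:<16} {count:>3}  ({share:.1f}%)")
print(f"{'total':<16} {total:>3}")
```

The five group sizes add up to the 92 proteins stated in the text.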
Eight key enzyme specificities from the oligo-1,6-glucosidase
subfamily (GH13 subfamilies 4, 16–18 and 29–31, as well as
some GH13 enzymes that remain unassigned) [18,20] were
selected (Table 1) and retrieved from GenBank [47], with
preference given to proteins of known three-dimensional
structure. In addition, three representatives of the
neopullulanase subfamily (GH13_20) [20] were added,
together with three neopullulanase-like enzymes of
intermediary character (GH13_36). Because α-glucosidases are
produced by a wide spectrum of taxonomic groups, GH13
α-glucosidases from bacteria (6), fungi (2) and insects
(16) were also included.
The set of studied proteins was completed by four
hypothetical proteins (marked as GH13-like in Table 1)
that could not be classified unambiguously as either hcHAT
proteins or GH13 enzymes. Although they possess the
catalytic triad and domain B (typical of GH13 enzymes),
they may also possess the transmembrane region (typical
of hcHAT proteins), because at present their sequences
are apparently (sea urchin) or potentially (Nematostella)
incomplete. The sequences from Trichoplax do contain an
N-terminal part preceding the TIM barrel, but it is
dissimilar in sequence to that found in hcHATs.
All sequence alignments were performed using the
program ClustalX2 [52] and then manually tuned with
regard to the CSRs, domain borders and tertiary structures
known from the literature [3,9,12–14,20–22]. Several
preliminary evolutionary trees were calculated using the
neighbour-joining [26], maximum likelihood [27], maximum
parsimony [28], minimum evolution [29] and UPGMA [30]
methods. The final evolutionary tree was calculated on the
European Bioinformatics Institute’s server for ClustalW2
[53] as a Phylip-type tree with neighbour-joining
clustering [26], using the alignment of complete
sequences, including the gaps. The tree was displayed
using the program TreeView [54].
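To illustrate the neighbour-joining clustering [26] mentioned above, the following minimal Python sketch builds an unrooted tree from a pairwise distance matrix. It is not the ClustalW2 implementation and does not use the authors' data: the function name, the internal node labels (node0, node1, ...) and the five-taxon toy matrix are assumptions introduced only for illustration.

```python
from itertools import combinations


def neighbour_joining(names, dist):
    """Return (node, node, branch_length) edges of an unrooted NJ tree.

    names: taxon labels; dist: symmetric distance matrix (list of lists).
    """
    # Dict-of-dicts copy so that nodes can be added and dropped by label.
    d = {a: {b: float(dist[i][j]) for j, b in enumerate(names)}
         for i, a in enumerate(names)}
    nodes = list(names)
    edges = []
    next_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        # Sum of distances from each node to all other active nodes.
        total = {a: sum(d[a][b] for b in nodes) for a in nodes}
        # Join the pair minimising Q(a, b) = (n - 2) d(a, b) - total(a) - total(b).
        a, b = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[p[0]][p[1]] - total[p[0]] - total[p[1]],
        )
        u = f"node{next_id}"  # hypothetical label for the new internal node
        next_id += 1
        # Branch lengths from the joined pair to the new internal node.
        len_a = 0.5 * d[a][b] + (total[a] - total[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        edges += [(a, u, len_a), (b, u, len_b)]
        # Distances from the new node to every remaining node.
        d[u] = {u: 0.0}
        for c in nodes:
            if c not in (a, b):
                d[u][c] = d[c][u] = 0.5 * (d[a][c] + d[b][c] - d[a][b])
        nodes = [c for c in nodes if c not in (a, b)] + [u]
    # Connect the last two nodes with the remaining distance.
    a, b = nodes
    edges.append((a, b, d[a][b]))
    return edges


# Toy additive distance matrix for five taxa (illustrative values only).
toy_names = ["a", "b", "c", "d", "e"]
toy_dist = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]

toy_edges = neighbour_joining(toy_names, toy_dist)
for left, right, length in toy_edges:
    print(f"{left} -- {right}: {length:.2f}")
```

In practice the distance matrix would be derived from the multiple sequence alignment; how gapped positions enter the distances is a setting of the program used and is not modelled here.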
Three-dimensional structures were retrieved from the
Protein Data Bank [55] for human 4F2hc (Protein
Data Bank code: 2DH2; [21]), oligo-1,6-glucosidase
from Bacillus cereus (1UOK; [22]) and α-glucosidase
from Geobacillus sp. strain HTA-46 (2ZE0; [23]). Owing
to its very close sequence-structural similarity to both
of these GH13 glucosidases, the structure of dextran
glucosidase [55a] was not included in the present
comparison. The structure of human rBAT was modelled
with the automated homology modelling program
ESyPred3D [56] using the co-ordinates of the
oligo-1,6-glucosidase (1UOK) as a template. The structures
were superimposed using the MultiProt server [57].
In order to compare the selection pressure acting on
hcHAT proteins and α-amylase family enzymes, three
different groups (mammalian 4F2hc, vertebrate rBAT
and insect α-glucosidases; Table 1) were analysed
using the Selecton tool [38]. The ratio between
nonsynonymous (Ka) and synonymous (Ks) substitutions
was calculated for every codon using the default M8
codon-substitution model [58] on aligned nucleotide
sequences. In the analysis, the default neighbour-joining
tree built by Selecton itself was used, the selection
pressure was not mapped onto a three-dimensional
structure, and the precision level was set to the highest.
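The interpretation of the per-codon Ka/Ks ratio can be sketched as follows: values above 1 point to positive selection, values below 1 to purifying selection and values near 1 to neutral evolution. The Python sketch below only labels ratios that are already known; it does not reproduce the M8 model or the Selecton colour scale. The function name, the tolerance parameter and the example values are hypothetical.

```python
from collections import Counter


def classify_sites(ka_ks, tolerance=0.1):
    """Label each codon site by its Ka/Ks ratio.

    ratio > 1 + tolerance  -> 'positive'   (positive selection)
    ratio < 1 - tolerance  -> 'purifying'  (purifying selection)
    otherwise              -> 'neutral'
    The tolerance is a hypothetical parameter, not a Selecton setting.
    """
    labels = []
    for ratio in ka_ks:
        if ratio > 1 + tolerance:
            labels.append("positive")
        elif ratio < 1 - tolerance:
            labels.append("purifying")
        else:
            labels.append("neutral")
    return labels


# Made-up per-codon ratios, for illustration only.
example_ratios = [2.3, 1.0, 0.2, 0.05, 0.6]
example_labels = classify_sites(example_ratios)
print(example_labels)
print(Counter(example_labels))
```

A summary of this kind gives the share of codons under each regime for a protein group, but the actual estimates depend on the codon-substitution model and the tree used by the server.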
Acknowledgement
This work was supported in part by VEGA grant no.
2/0114/08 from the Slovak Grant Agency for Science.
References
1 Hediger MA, Romero MF, Peng JB, Rolfs A, Takana-
ga H & Bruford EA (2004) The ABCs of solute carri-
ers: physiological, pathological and therapeutic
implications of human membrane transport proteins.
Introduction. Pflugers Arch 447, 465–468.
2 Palacin M, Nunes V, Font-Llitjos M, Jimenez-Vidal
M, Fort J, Gasol E, Pineda M, Feliubadalo L, Chilla-
ron J & Zorzano A (2005) The genetics of heteromeric
amino acid transporters. Physiology 20, 112–124.
3 MacGregor EA, Janecek S & Svensson B (2001) Rela-
tionship of sequence and structure to specificity in the
α-amylase family of enzymes. Biochim Biophys Acta
1546, 1–20.
4 Wells RG & Hediger MA (1992) Cloning of a rat
kidney cDNA that stimulates dibasic and neutral
amino acid transport and has sequence similarity to
glucosidases. Proc Natl Acad Sci USA 89, 5596–5600.
5 Chillaron J, Roca R, Valencia A, Zorzano A & Pala-
cin M (2001) Heteromeric amino acid transporters:
biochemistry, genetics, and physiology. Am J Physiol
Renal Physiol 281, 995–1018.
6 Broer S & Wagner CA (2002) Structure–function rela-
tionships of heterodimeric amino acid transporters.
Cell Biochem Biophys 36, 155–168.
7 Peters T, Thaete C, Wolf S, Popp A, Sedlmeier R,
Grosse J, Nehls MC, Russ A & Schlueter V (2003)
A mouse model for cystinuria type I. Hum Mol
Genet 12, 2109–2120.
8 Reig N, Chillaron J, Bartoccioni P, Fernandez E,
Bendahan A, Zorzano A, Kanner B, Palacin M &
Bertran J (2002) The light subunit of system b0,+ is
fully functional in the absence of the heavy subunit.
EMBO J 21, 4906–4914.
9 Janecek S, Svensson B & Henrissat B (1997) Domain
evolution in the alpha-amylase family. J Mol Evol 45,
322–331.
10 Cantarel BL, Coutinho PM, Rancurel C, Bernard T,
Lombard V & Henrissat B (2009) The Carbohydrate-
Active EnZymes database (CAZy): an expert resource
for glycogenomics. Nucleic Acids Res 37(Database
Issue), D233–D238.
11 Matsuura Y, Kusunoki M, Harada W & Kakudo M
(1984) Structure and possible catalytic residues of
Taka-amylase A. J Biochem 95, 697–702.
12 Kuriki T & Imanaka T (1999) The concept of the
α-amylase family: structural similarity and common
catalytic mechanism. J Biosci Bioeng 87, 557–565.
13 Janecek S (2002) How many conserved sequence
regions are there in the α-amylase family? Biologia
57(Suppl 11), 29–41.
14 MacGregor EA & Svensson B (1989) A supersecondary
structure predicted to be common to several
α-1,4-D-glucan-cleaving enzymes. Biochem J 259, 145–152.
15 Henrissat B (1991) A classification of glycosyl hydro-
lases based on amino acid sequence similarities. Bio-
chem J 280, 309–316.
16 Takata H, Kuriki T, Okada S, Takesada Y, Iizuka M,
Minamiura N & Imanaka T (1992) Action of neo-
pullulanase. Neopullulanase catalyzes both hydrolysis
and transglycosylation at α-(1→4)- and α-(1→6)-
glucosidic linkages. J Biol Chem 267, 18447–18452.
17 Jespersen HM, MacGregor EA, Henrissat B, Sierks
MR & Svensson B (1993) Starch- and glycogen-debranching
and branching enzymes: prediction of structural
features of the catalytic (β/α)8-barrel domain and
evolutionary relationship to other amylolytic enzymes.
J Protein Chem 12, 791–805.
18 Stam MR, Danchin EG, Rancurel C, Coutinho PM &
Henrissat B (2006) Dividing the large glycoside hydrolase
family 13 into subfamilies: towards improved functional
annotations of α-amylase-related proteins.
Protein Eng Des Sel 19, 555–562.
19 Janecek S (2000) Proteins without enzymatic function
with sequence relatedness to the α-amylase family.
Trends Glycosci Glycotechnol 12, 363–371.
20 Oslancova A & Janecek S (2002) Oligo-1,6-glucosidase
and neopullulanase enzyme subfamilies from the
α-amylase family defined by the fifth conserved sequence
region. Cell Mol Life Sci 59, 1945–1959.
21 Fort J, de la Ballina LR, Burghardt HE, Ferrer-Costa
C, Turnay J, Ferrer-Orta C, Uson I, Zorzano A,
Fernandez-Recio J, Orozco M et al. (2007) The struc-
ture of human 4F2hc ectodomain provides a model for
homodimerization and electrostatic interaction with
plasma membrane. J Biol Chem 282, 31444–31452.
22 Watanabe K, Hata Y, Kizaki H, Katsube Y & Suzuki
Y (1997) The refined crystal structure of Bacillus cereus
oligo-1,6-glucosidase at 2.0 Å resolution: structural
characterization of proline-substitution sites for protein
thermostabilization. J Mol Biol 269, 142–153.
23 Shirai T, Hung VS, Morinaka K, Kobayashi T & Ito
S (2008) Crystal structure of GH13 α-glucosidase GSJ
from one of the deepest sea bacteria. Proteins 73, 126–
133.
24 Gottesdiener KM, Karpinski BA, Lindsten T, Stromin-
ger JL, Jones NH, Thompson CB & Leiden JM (1988)
Isolation and structural characterization of the human
4F2 heavy-chain gene, an inducible gene involved in
T-lymphocyte activation. Mol Cell Biol 8, 3809–3819.
25 Lee WS, Wells RG, Sabbag RV, Mohandas TK &
Hediger MA (1993) Cloning and chromosomal locali-
zation of a human kidney cDNA involved in cystine,
dibasic, and neutral amino acid transport. J Clin Invest
91, 1959–1963.
26 Saitou N & Nei M (1987) The neighbor-joining
method: a new method for reconstructing phylogenetic
trees. Mol Biol Evol 4, 406–425.
27 Felsenstein J (1981) Evolutionary trees from DNA
sequences: a maximum likelihood approach. J Mol
Evol 17, 368–376.
28 Eck RV & Dayhoff MO (1966) Atlas of Protein
Sequence and Structure. National Biomedical Research
Foundation, Silver Springs, MD.
29 Rzhetsky A & Nei M (1993) Theoretical foundation of
the minimum-evolution method of phylogenetic infer-
ence. Mol Biol Evol 10, 1073–1095.
30 Sneath PHA & Sokal RR (1973) Numerical Taxonomy.
Freeman, San Francisco, CA.
31 Jin X, Aimanova K, Ross LS & Gill SS (2003) Iden-
tification, functional characterization and expression
of a LAT type amino acid transporter from the
mosquito Aedes aegypti. Insect Biochem Mol Biol 33,
815–827.
32 Reynolds B, Roversi P, Laynes R, Kazi S, Boyd CA &
Goberdhan DC (2009) Drosophila expresses a CD98
transporter with an evolutionarily conserved structure
and amino acid-transport properties. Biochem J 420,
363–372.
33 Krautz-Peterson G, Camargo S, Huggel K, Verrey F,
Shoemaker CB & Skelly PJ (2007) Amino acid trans-
port in schistosomes: characterization of the permease
heavy chain SPRM1hc. J Biol Chem 282, 21767–21775.
34 Veljkovic E, Stasiuk S, Skelly PJ, Shoemaker CB &
Verrey F (2004) Functional characterization of Caen-
orhabditis elegans heteromeric amino acid transporters.
J Biol Chem 279, 7655–7662.
35 Kim JS, Cha SS, Kim HJ, Kim TJ, Ha NC, Oh ST,
Cho HS, Cho MJ, Kim MJ, Lee HS et al. (1999) Crys-
tal structure of a maltogenic amylase provides insights
into a catalytic versatility. J Biol Chem 274, 26279–
26286.
36 Lee HS, Kim MS, Cho HS, Kim JI, Kim TJ, Choi JH,
Park C, Lee HS, Oh BH & Park KH (2002) Cyclomal-
todextrinase, neopullulanase, and maltogenic amylase
are nearly indistinguishable from each other. J Biol
Chem 277, 21891–21897.
37 Hondoh H, Kuriki T & Matsuura Y (2003) Three-
dimensional structure and substrate binding of Bacillus
stearothermophilus neopullulanase. J Mol Biol 326,
177–188.
38 Doron-Faigenboim A, Stern A, Mayrose I, Bacharach
E & Pupko T (2005) Selecton: a server for detecting
evolutionary forces at a single amino-acid site. Bioin-
formatics 21, 2101–2103.
39 Da Lage JL, Danchin EG & Casane D (2007) Where
do animal α-amylases come from? An interkingdom
trip. FEBS Lett 581, 3927–3935.
40 Hennig M, Schlesier B, Dauter Z, Pfeffer S, Betzel
C, Höhne WE & Wilson KS (1992) A TIM barrel
protein without enzymatic activity? Crystal-structure
of narbonin at 1.8 Å resolution. FEBS Lett 306,
80–84.
41 Hennig M, Jansonius JN, Terwisscha van Scheltinga
AC, Dijkstra BW & Schlesier B (1995) Crystal structure
of concanavalin B at 1.65 Å resolution. An
“inactivated” chitinase from seeds of Canavalia ensiformis.
J Mol Biol 254, 237–246.
42 Da Lage JL, Renard E, Chartois F, Lemeunier F &
Cariou ML (1998) Amyrel, a paralogous gene of the
amylase gene family in Drosophila melanogaster and
the Sophophora subgenus. Proc Natl Acad Sci USA
95, 6848–6853.
43 Maczkowiak F & Da Lage JL (2006) Origin and evolution
of the Amyrel gene in the α-amylase multigene
family of Diptera. Genetica 128, 145–158.
44 Janecek S (1994) Sequence similarities and evolutionary
relationships of microbial, plant and animal
α-amylases. Eur J Biochem 224, 519–524.
45 Da Lage JL, Feller G & Janecek S (2004) Horizontal
gene transfer from Eukarya to bacteria and domain
shuffling: the α-amylase model. Cell Mol Life Sci 61,
97–109.
46 Altschul SF, Gish W, Miller W, Myers EW & Lipman
DJ (1990) Basic local alignment search tool. J Mol Biol
215, 403–410.
47 Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J
& Sayers EW (2009) GenBank. Nucleic Acids Res
37(Database Issue), D26–D31.
48 Pruitt KD, Tatusova T, Klimke W & Maglott DR
(2009) NCBI reference sequences: current status, policy
and new initiatives. Nucleic Acids Res 37(Database
Issue), D32–D36.
49 Hubbard TJ, Aken BL, Ayling S, Ballester B, Beal K,
Bragin E, Brent S, Chen Y, Clapham P, Clarke L et al.
(2009) Ensembl 2009. Nucleic Acids Res 37(Database
Issue), D690–D697.
50 Wang J, Xia Q, He X, Dai M, Ruan J, Chen J, Yu G,
Yuan H, Hu Y, Li R et al. (2005) SilkDB: a knowledgebase
for silkworm biology and genomics. Nucleic
Acids Res 33(Database Issue), D399–D402.
51 Birney E, Clamp M & Durbin R (2004) GeneWise and
Genomewise. Genome Res 14, 988–995.
52 Jeanmougin F, Thompson JD, Gouy M, Higgins DG
& Gibson TJ (1998) Multiple sequence alignment with
Clustal X. Trends Biochem Sci 23, 403–405.
53 Thompson JD, Higgins DG & Gibson TJ (1994)
CLUSTAL W: improving the sensitivity of progressive
multiple sequence alignment through sequence weight-
ing, position-specific gap penalties and weight matrix
choice. Nucleic Acids Res 22, 4673–4680.
54 Page RD (1996) TreeView: an application to display
phylogenetic trees on personal computers. Comput
Appl Biosci 12, 357–358.
55 Berman H, Henrick K, Nakamura H & Markley JL
(2007) The worldwide Protein Data Bank (wwPDB):
ensuring a single, uniform archive of PDB data.
Nucleic Acids Res 35(Database Issue), D301–D303.
55a Hondoh H, Saburi W, Mori H, Okuyama M, Nakada
T, Matsuura Y & Kimura A (2008) Substrate recogni-
tion mechanism of alpha-1,6-glucosidic linkage hydro-
lyzing enzyme, dextran glucosidase from Streptococcus
mutans. J Mol Biol 378, 913–922.
56 Lambert C, Leonard N, De Bolle X & Depiereux E
(2002) ESyPred3D: prediction of proteins 3D struc-
tures. Bioinformatics 18, 1250–1256.
57 Shatsky M, Nussinov R & Wolfson HJ (2004) A
method for simultaneous alignment of multiple protein
structures. Proteins 56, 143–156.
58 Yang Z, Nielsen R, Goldman N & Pedersen AM
(2000) Codon-substitution models for heterogeneous
selection pressure at amino acid sites. Genetics 155,
431–449.