Three-dimensional structure of a thermostable native cellobiohydrolase, CBH IB, and molecular characterization of the cel7 gene from the filamentous fungus, Talaromyces emersonii
Alice Grassick(1,*), Patrick G. Murray(1,*), Roisin Thompson(2), Catherine M. Collins(1), Lucy Byrnes(3), Gabriel Birrane(4), Timothy M. Higgins(2) and Maria G. Tuohy(1)
(1) Molecular Glycobiotechnology Group, Department of Biochemistry, (2) Department of Chemistry, and (3) Department of Biochemistry, National University of Ireland, Galway, Ireland; (4) Beth Israel Deaconess Medical Centre, Harvard Institutes of Medicine, Harvard Medical School, Boston, MA, USA
The X-ray structure of native cellobiohydrolase IB (CBH IB) from the filamentous fungus Talaromyces emersonii, PDB 1Q9H, was solved to 2.4 Å by molecular replacement. 1Q9H is a glycoprotein that consists of a large, single domain with dimensions of ~60 Å × 40 Å × 50 Å and an overall β-sandwich structure, the characteristic fold of Family 7 glycosyl hydrolases (GH7). It is the first structure of a native glycoprotein and cellulase from this thermophilic eukaryote. The long cellulose-binding tunnel seen in GH7 Cel7A from Trichoderma reesei is conserved in 1Q9H, as are the catalytic residues. As a result of deletions and other changes in loop regions, the binding and catalytic properties of T. emersonii 1Q9H are different. The gene (cel7) encoding CBH IB was isolated from T. emersonii and expressed heterologously, with an N-terminal polyHis-tag, in Escherichia coli. The deduced amino acid sequence of cel7 is homologous to fungal cellobiohydrolases in GH7. The recombinant cellobiohydrolase was virtually inactive against methylumbelliferyl-cellobioside and chloronitrophenyl-lactoside, but partial activity could be restored after refolding of the urea-denatured enzyme. Profiles of cel7 expression in T. emersonii, investigated by Northern blot analysis, revealed that expression is regulated at the transcriptional level. Putative regulatory element consensus sequences for cellulase transcription factors have been identified in the upstream region of the cel7 genomic sequence.
Keywords: 3D structure; cel7 gene; GH Family 7 glycoprotein; Talaromyces emersonii; thermophilic.
Cellulose is the major constituent of all plant materials and is the most abundant organic molecule on Earth [1,2]. Microbial breakdown of cellulose creates the potential for the production of energy [3–5]. Cellulases are used in waste recycling processes and in the processing of cellulose-rich raw materials for the paper and textile industries [6]. Cellulose is composed of repeating glucose units, where each glucose unit is rotated 180° relative to its neighbours along the main axis, so that the basic repeating unit is cellobiose. Plant cellulose exists in a highly crystalline form. Hydrolysis of cellulose requires the co-operative action of three classes of cellulolytic enzymes, namely endo-β-1,4-glucanases (EC 3.2.1.4), cellobiohydrolases (EC 3.2.1.91) and β-glucosidases (EC 3.2.1.21). The CAZy (carbohydrate-active enzymes) [7] classification system collates glycosyl hydrolase (GH) enzymes into families according to sequence similarity, which has been shown to reflect shared structural features. To date, GH enzymes are members of 87 families, of which 43 have been assigned a retaining mechanism of action and 24 an inverting mechanism; the stereochemical mode of action of the remaining families has yet to be determined. The endoglucanases are commonly characterized by a groove or a cleft into which a linear cellulose chain can fit in a random manner. Classically, exoglucanases such as the cellobiohydrolases (CBHs) possess tunnel-like active sites, which can only accept a substrate chain via its terminal regions [8]. These exo-acting CBH enzymes act by threading the cellulose chain through the tunnel, where successive cellobiose units are removed in a sequential manner. Sequential hydrolysis of a cellulose chain is termed 'processivity' [9]. However, some cellulase enzymes are capable of both endo- and exo-actions [10,11]. Moreover, some GH families include both endo- and exo-enzymes, indicating that the mode of action can be independent of sequence homology and structural fold.
Relatively minor changes in the lengths of relevant loops in the general proximity of the active site in such enzymes may dictate the endo- or exo-mode of action without significant differences in the overall fold. In Trichoderma reesei Cel7A, deletion of the exo-loop (residues 243–256) has been shown to decrease activity against crystalline cellulose. It was therefore postulated that the exo-loop has evolved to facilitate processive hydrolysis of crystalline cellulose by T. reesei Cel7A [12]. Fungal cellulolytic enzymes reported to date comprise a single polypeptide chain, frequently glycosylated, which contains a catalytic domain usually connected to a cellulose-binding domain by a proline/serine/threonine-rich linker [13]. CBHs from Humicola grisea [14], Phanerochaete chrysosporium [15] and Aspergillus niger [16] have been shown to consist solely of a catalytic domain. The best-characterized CBH members of GH7 are Cel7A from T. reesei [8] and Cel7D (CBH58) from P. chrysosporium [17]. Both CBHs consist of two β-sheets that pack face-to-face to form a β-sandwich [18]. Cel7A from T. reesei is composed of long loops, on one face of the sandwich, that form a cellulose-binding tunnel of ~50 Å. The catalytic residues are glutamate 212 and 217, which are located on opposite sides of the active site, separated by an intervening distance consistent with a double-displacement retaining mechanism [18]. Members of GH7 are thought to follow a retaining mechanism of action. Kinetic parameters and enzyme–ligand interactions of GH7 enzymes are well characterized [19–21]. Genes from this family have been cloned and characterized from a variety of fungal sources, including H. grisea [14], T. reesei [22,23], Penicillium janthinellum [24], P. chrysosporium [15] and Aspergillus species [16,25,26], but until recently, never from a truly thermophilic fungal species [27].
Correspondence to M. G. Tuohy, Molecular Glycobiotechnology Group, Department of Biochemistry, National University of Ireland, Galway, Ireland. Fax: +353 91 512504, Tel.: +353 91 524411
Abbreviations: CBH, cellobiohydrolase; CNP, chloronitrophenyl; Cre, catabolite repressor element; GH, glycosyl hydrolase; reCBH, recombinant cellobiohydrolase.
Enzymes: endo-β-1,4-glucanase (EC 3.2.1.4); cellobiohydrolase (EC 3.2.1.91); β-glucosidase (EC 3.2.1.21).
*Both authors contributed equally to this work.
(Received 12 July 2004, revised 21 September 2004, accepted 4 October 2004)
Eur. J. Biochem. 271, 4495–4506 (2004) © FEBS 2004 doi:10.1111/j.1432-1033.2004.04409.x
The thermophilic aerobic fungus, Talaromyces emersonii, isolated from composting biomass, produces a completely thermostable cellulase system that has not been fully characterized to date [21,28–30]. CBH enzymes from T. emersonii have been purified, characterized and assigned to GH families 6 and 7 [21,27,31]. Protein thermostability is not, however, reflected in the overall fold of a protein and is thought to be the result of more localized differences, causing thermophilic enzymes to be somewhat less flexible than mesophilic enzymes [32]. In this article we present the 3D structure of the native CBH IB from T. emersonii, the first structure of any protein from this source and the first structure of a native fungal CBH core (glycoprotein). Molecular cloning, transcriptional regulation analysis and overexpression of the cel7 gene in Escherichia coli are also reported. The 3D structure has been deposited in the Protein Data Bank as 1Q9H.
Experimental procedures
Fungal strain and growth conditions
Mycelia harvested from cultures of T. emersonii strain CBS 814.70 grown at 45 °C on Sabouraud dextrose agar were used to inoculate liquid nutrient media, as described previously [30]. Cultures were grown at 45 °C with shaking at 220 r.p.m. At appropriate time-points, mycelia were harvested by filtration through several layers of fine-grade muslin, washed with 75 mM sodium citrate, pH 7.5, and frozen immediately under liquid nitrogen for nucleic acid extraction.
PCR cloning of genomic DNA
Chromosomal DNA was isolated from T. emersonii mycelia harvested after 24 h of culture on 2% (w/v) glucose, by using the method of Raeder & Broda [33]. Amplification of a DNA fragment encoding a portion of the catalytic domain of T. emersonii cel7 was performed by using PCR and degenerate primers designed from alignments of existing CBH sequences in the databases. Reaction cocktails contained 2.5 U of Qiagen HotStar™ Taq DNA polymerase, 1× buffer (Qiagen, Crawley, West Sussex, UK), 0.5× Q solution, 200 µM of each deoxynucleotide triphosphate, 1.5 mM MgCl₂ and 1 µM of the appropriate gene-specific primers. Reaction conditions for PCR amplification were 94 °C for 15 min (initial DNA polymerase activation), then 30 cycles of 94 °C for 1 min, 50–60 °C for 1 min and 72 °C for 1 min, followed by a final extension of 10 min. PCR products were separated by electrophoresis through a 1.2% (w/v) agarose gel, purified by using a Wizard PCR preps DNA purification system (Promega, Southampton, UK) and subcloned into the pGEM-T easy vector (Promega), following the manufacturer's guidelines. Plasmids were purified from E. coli JM109 cultures by using a spin miniprep kit (Qiagen), and sequenced. Sequencing reactions were carried out by Altabioscience Laboratories (University of Birmingham, Birmingham, UK). Sequence analysis and database similarity searches were performed by using the online program BLAST [34] against protein (BLASTP) and nucleotide (BLASTX and BLASTN) sequences stored at the National Centre for Biotechnology Information (NCBI).
Rapid amplification of cDNA ends
RNA (10 µg), isolated after growth (48 h) of T. emersonii on solka floc (ball-milled cellulose), was used as a template for RACE, which involved a modification of the manufacturer's (Ambion Europe Ltd., Huntingdon, Cambridgeshire, UK) RACE protocol described previously [27]. An aliquot (1 µL) of the reaction mixture was used as a template for performing 5′- and 3′-RACE PCRs by using the outer and inner RACE primers supplied by the manufacturer and the outer and inner gene-specific primers designed from the cel7 PCR products. The cel7 outer and inner RACE primers were as follows: outer 5′-RACE primer 5′-CATGCGGTAAGGGTTGAAGTCACA-3′; inner 5′-RACE primer 5′-GTTTGCTTCCCAGACATCCATC-3′; outer 3′-RACE primer 5′-ATGCTGTGGTTGGATTCCGACTAC-3′; and inner 3′-RACE primer 5′-AACTCCTACGTGACCTACTCGAAC-3′. PCR products were cloned and sequenced as described above.
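As a quick sanity check on the four gene-specific RACE primers listed above, the following minimal sketch reports the length, GC fraction and a rough melting temperature of each. The Wallace rule, Tm ≈ 2(A+T) + 4(G+C), is an assumption chosen here for simplicity; it is not stated to be the method used in this work and is only a crude estimate for oligonucleotides of this length.

```python
# Gene-specific RACE primers as printed in the text (5'->3').
PRIMERS = {
    "outer_5_RACE": "CATGCGGTAAGGGTTGAAGTCACA",
    "inner_5_RACE": "GTTTGCTTCCCAGACATCCATC",
    "outer_3_RACE": "ATGCTGTGGTTGGATTCCGACTAC",
    "inner_3_RACE": "AACTCCTACGTGACCTACTCGAAC",
}


def gc_fraction(seq):
    """Fraction of bases in `seq` that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


def wallace_tm(seq):
    """Rough melting temperature (deg C) from the Wallace rule (an assumed method)."""
    at = sum(base in "AT" for base in seq)
    gc = sum(base in "GC" for base in seq)
    return 2 * at + 4 * gc


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.2f}, Tm ~{wallace_tm(seq)} C")
```

Nearest-neighbour thermodynamic models give more realistic values; the rule above is only useful for comparing primers of similar length against one another.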
Isolation of cel7 cDNA and genomic genes
Full-length cDNA and genomic sequences corresponding to cel7 were amplified from T. emersonii first-strand cDNA and genomic DNA, respectively, by PCR with primers corresponding to the 5′ start and 3′ stop sequences identified in the 5′- and 3′-RACE products. The cel7 sense and antisense primers were 5′-ATGCTTCGACGGGCTCTTCTTCTA-3′ and 5′-TCACGAAGCGGTGAAGGTCGAGTT-3′, respectively. Reactions contained 1.25 U of Pfu DNA polymerase, 1× Pfu reaction buffer, 200 µM of each deoxynucleotide triphosphate and 1 µM of the appropriate gene-specific primers. PCR products were gel purified, subcloned and sequenced as described above.
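The statement that these primers correspond to the start and stop of the coding sequence can be checked directly. The sketch below, which assumes the standard genetic code, translates the sense primer and the reverse complement of the antisense primer; the helper names are illustrative only.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def translate(seq):
    """Translate complete codons of `seq` in frame 1; '*' marks a stop codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


SENSE = "ATGCTTCGACGGGCTCTTCTTCTA"      # cel7 sense primer from the text
ANTISENSE = "TCACGAAGCGGTGAAGGTCGAGTT"  # cel7 antisense primer from the text

print(translate(SENSE))                          # begins with M (ATG start codon)
print(translate(reverse_complement(ANTISENSE)))  # ends with * (TGA stop codon)
```

The sense primer opens with the ATG start codon, and the reverse complement of the antisense primer closes with a TGA stop codon preceded by a serine codon.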
Northern blot analysis and genomic library screening
Northern blot analysis of cel7 expression was carried out as described previously [27]. A T. emersonii Sau3A genomic library was prepared in LambdaGEM-11 (Promega). E. coli KW251 was used as the host strain in the preparation and screening of the genomic library. Plaque lifts were carried out as described by Sambrook et al. [35]. Hybridization was conducted overnight at 68 °C in 5× NaCl/Cit, 0.1% (w/v) N-lauroylsarcosine, 0.02% (w/v) SDS and 1% (w/v) blocking reagent. Full-length digoxigenin (Roche Molecular Biochemicals, Roche Diagnostics Ltd., Lewes, East Sussex, UK)-labelled cel7 (20 ng·mL⁻¹ of hybridization buffer) was used as a probe. Detection was performed according to the manufacturer's instructions. The presence of the full-length gene in positively hybridizing single plaque-forming units was confirmed by PCR, and the plaques were purified by using a lambda purification kit (Qiagen). The plaques were then sequenced directly, using a cel7 gene-specific sequencing primer (5′-GCATTCCTGCCATGTCAG-3′) to generate sequence data for the 5′ region upstream of the ATG start codon.
Expression of cel7 in E. coli
Primers F1 (5′-CACCCAGCAGGCCGGCACGGCG-3′) and R1 (5′-TCACGAAGCGGTGAAGGTCGAGTT-3′), corresponding to the N- and C-terminal regions of the mature protein, were used to amplify cel7 cDNA (the N-terminal signal peptide, i.e. amino acids 1–18, was removed). The CACC at the 5′ end of F1 corresponds to the GTGG overhang in the TOPO® cloning vector (Invitrogen Ltd., Paisley, UK). The purified PCR product was ligated into the pENTR/SD/D-TOPO vector and transformed into One Shot® Top10 E. coli competent cells, according to the manufacturer's instructions. The product of an LR recombination reaction between the entry clone, pE-Cel7, and the destination vector, pDEST-17 (Invitrogen), was transformed into E. coli DH5α library-efficient cells, thereby generating the expression clone, pD-Cel7, with an N-terminal poly-histidine tag. Multiple transformants were analysed by restriction analysis and PCR to confirm the presence and correct orientation of the insert at all stages. For expression, plasmid DNA was purified and transformed into BL21-AI competent E. coli cells (Invitrogen), which were cultured to mid-log phase, and expression was induced by the addition of 0.2% (w/v) arabinose followed by a further growth period of 4 h at 37 °C. Pilot experiments indicated that the CBH protein was expressed in the inclusion body fraction. Cells were harvested by centrifugation (3630 g for 5 min) from a 50 mL culture, and the cell pellet was resuspended in 8 M urea. The cell lysate was sonicated with three 5-s, high-intensity pulses and centrifuged at 1307 g for 15 min to pellet cellular debris, and the supernatant was applied to a nickel-nitrilotriacetic acid purification matrix (Invitrogen). The lysate was allowed to interact with the matrix at room temperature for 30 min with gentle agitation, and the matrix was then washed with 2 volumes of wash solution (containing 8 M urea, 20 mM sodium phosphate, 500 mM NaCl, pH 7.8), followed by 2 volumes of a second wash solution (containing 8 M urea, 20 mM sodium phosphate, 500 mM NaCl, pH 6.0). The column was then washed with 4 volumes of a final wash solution of 50 mM sodium phosphate and 20 mM imidazole, pH 8.0. Recombinant CBH (reCBH) was eluted from the column by application of a solution of 50 mM sodium phosphate, pH 8.0, containing 250 mM imidazole.
Denaturation and refolding of reCBH and enzyme assay
reCBH (1 mg) was denatured by incubation in a solution of 8 M urea/0.1 M Tris/HCl, pH 8.0, in the presence of 100 mM dithiothreitol and 1 mM EDTA, for 2 h at 20 °C. The pH was lowered to pH 4.0 by dropwise addition of 1 M HCl, and the dithiothreitol was removed by dialysis against the same buffer without the dithiothreitol. Denatured and reduced reCBH was diluted 1 : 100 in a buffer solution containing 0.1 M Tris/HCl, pH 8.5/1 mM EDTA/0.3 mM oxidized glutathione/3 mM glutathione, and then incubated in a renaturation buffer containing 2.5 mg of protein disulphide isomerase at 30 °C for 30 h. reCBH was dialysed against 100 mM sodium acetate, pH 5.0, followed by concentration in a Millipore microconcentrator fitted with a 10 kDa cut-off membrane. reCBH activity was measured by incubating 10 µL of renatured enzyme with 100 µL of 1 mM chloronitrophenyl-lactoside (CNP-lactoside), 1 mM 4-nitrophenyl-cellobioside or 50 µM 4-methylumbelliferyl-cellobioside, at 50 °C. Reactions were terminated by the addition of 100 µL of 1 M Na₂CO₃ or 0.2 M glycine/sodium hydroxide, pH 10.5, and the absorbance (405 nm) or UV fluorescence was measured.
Purification of CBH IB
CBH IB was purified from 2% (w/v) solka floc cellulose-induced cultures and characterized as described previously [21]. The purified enzyme was concentrated to 20 mg·mL⁻¹ in 20 mM Tris buffer, pH 7.5, and stored at 4 °C. Peptide sequence information for native CBH IB was determined by Edman degradation on an automated sequenator (J. Gray, University of Newcastle-upon-Tyne, Newcastle-upon-Tyne, UK).

Crystallization and data collection
Native CBH IB from T. emersonii was crystallized by using the hanging-drop vapour-diffusion method with ammonium phosphate (dibasic) as a precipitant at pH 8.5. Crystals of CBH IB, which diffracted to 2.4 Å, were obtained. Data were collected at room temperature on the multipolar wiggler beamline, BW7B, at the DORIS storage ring, EMBL Hamburg Outstation, Germany, using a Mar345 area detector. Data processing indicated that CBH IB crystallized in the tetragonal space group P4₁2₁2, with unit cell dimensions a = b = 74.42 Å, c = 176.92 Å [31].
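For orientation, the unit cell volume follows directly from the reported cell dimensions, because a tetragonal cell has a = b and all angles equal to 90°. The sketch below also includes a helper for the Matthews coefficient; the molecular mass of the crystallized glycoprotein is not given in this section, so it is left as an argument rather than assumed, and the function names are illustrative only.

```python
def tetragonal_cell_volume(a, c):
    """Volume (cubic angstroms) of a tetragonal cell: a = b, all angles 90 degrees."""
    return a * a * c


def matthews_coefficient(cell_volume, molecular_mass_da, n_sym_ops, n_per_asu=1):
    """Matthews coefficient V_M = V / (Z * M), in cubic angstroms per dalton.

    Z is the number of molecules in the cell: symmetry operators multiplied by
    the (assumed) number of molecules per asymmetric unit.
    """
    return cell_volume / (n_sym_ops * n_per_asu * molecular_mass_da)


A_ANGSTROM = 74.42       # a = b, from the text
C_ANGSTROM = 176.92      # c, from the text
N_SYM_OPS_P41212 = 8     # general positions in space group P4(1)2(1)2

volume = tetragonal_cell_volume(A_ANGSTROM, C_ANGSTROM)
print(f"unit cell volume: {volume:.0f} cubic angstroms")
```

Given a molecular mass M in daltons, `matthews_coefficient(volume, M, N_SYM_OPS_P41212)` would indicate whether one molecule per asymmetric unit gives a plausible packing density.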
Structure solution
The structure was solved by molecular replacement by utilizing the program AMORE [36]. Molecular replacement was completed using two separate search models, chosen on the basis of sequence homology. The models used were the catalytic domain of T. reesei Cel7A (PDB 1CEL) and the catalytic domain of P. chrysosporium Cel7D (PDB 1GPI).
Structure refinement
A total of 5% of the reflections in the data set was set aside for free R-factor calculations during refinement. REFMAC5 [37] from the CCP4 [38] suite of programs was used throughout this refinement, with the program TURBO [36] being employed for graphical displays and manipulation of the models. With each round of refinement, maps were produced and the model was rebuilt where electron density supported the changes. Water molecules were located and refined by using the program ARP_WARP [39]. The stereochemical quality of the model was followed by using the program PROCHECK [40].
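The role of the 5% of reflections set aside can be illustrated with the conventional definition of the crystallographic R-factor, R = Σ| |Fo| − |Fc| | / Σ|Fo|, evaluated separately on the working and free sets. The sketch below is a simplified illustration under stated assumptions: it omits the scale factor applied between observed and calculated amplitudes, it uses a plain random split, and it is not how REFMAC5 performs these steps internally.

```python
import random


def r_factor(f_obs, f_calc):
    """Conventional R: sum(| |Fo| - |Fc| |) / sum(|Fo|), with no scaling applied."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def split_free_set(n_reflections, fraction=0.05, seed=0):
    """Randomly assign reflection indices to a working set and a free set."""
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_free = round(n_reflections * fraction)
    free = sorted(indices[:n_free])
    work = sorted(indices[n_free:])
    return work, free


# Toy amplitudes (hypothetical values, for illustration only).
print(r_factor([10.0, 20.0], [9.0, 22.0]))
work, free = split_free_set(1000)
print(len(work), len(free))
```

Because the free reflections are never used to adjust the model, R-free computed on them gives a check against over-fitting that the working R-factor cannot provide.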
Results
Isolation of genomic and cDNA clones
The cel7 degenerate primers amplified a 719 bp PCR product from T. emersonii chromosomal DNA. The product was cloned, sequenced and found to exhibit homology to other fungal cel7 gene sequences. Based on this sequence, 5′- and 3′ outer and inner RACE PCR primers were designed to amplify the 5′- and 3′ ends of the cel7 gene. Sequence analysis confirmed the RACE products to be part of the cel7 gene, which included a 54 bp 5′ untranslated region and a 281 bp 3′ untranslated region, including a polyA tail. The full-length genomic (GenBank AF439935) and cDNA (GenBank AY081766) cel7 clones were amplified from chromosomal DNA and first-strand cDNA, respectively, by using N- and C-terminal gene-specific primers based on the RACE products. Cel7 was encoded by a 1365 bp open reading frame encoding 455 amino acids and interrupted by two introns (52 and 61 bp), with consensus 5′- and 3′ intron splice sites (Fig. 1).
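The lengths quoted above are mutually consistent, as the short worked check below shows. It assumes that the 1365 bp figure counts sense codons only (the stop codon excluded) and uses the 18-residue signal peptide stated in the expression methods; the variable names are illustrative.

```python
ORF_BP = 1365            # open reading frame length reported in the text
INTRONS_BP = (52, 61)    # the two introns reported in the text
SIGNAL_PEPTIDE_AA = 18   # signal peptide, amino acids 1-18

# 1365 bp divides evenly into codons, giving the precursor length.
precursor_aa, remainder = divmod(ORF_BP, 3)

# Removing the signal peptide gives the mature protein length.
mature_aa = precursor_aa - SIGNAL_PEPTIDE_AA

# Coding sequence plus introns gives the genomic span from the start codon
# to the last sense codon (stop codon excluded, by the assumption above).
genomic_span_bp = ORF_BP + sum(INTRONS_BP)

print(precursor_aa, remainder, mature_aa, genomic_span_bp)
```

The ORF length gives 455 codons with no remainder, a mature protein of 437 residues once the signal peptide is removed, and a genomic span of 1478 bp once the two introns are included.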
Sequence analysis
Peptides sequenced from native CBH IB confirmed the identity of the T. emersonii cel7 gene/gene product; the location of these peptides in the deduced polypeptide sequence is given in the legend to Fig. 1. Comparison of the deduced cel7 amino acid sequence from T. emersonii with those from P. chrysosporium (GenBank AAA19802), T. reesei (GenBank CAA49596), A. niger (GenBank AAF04491), H. grisea (GenBank AAD11942) and A. aculeatus (GenBank BAA25183) gave sequence identity values of 65%, 64%, 73%, 51% and 68%, respectively (Fig. 2) [41]. Alignment of the deduced polypeptide sequence of T. emersonii cel7 reveals the presence of a terminal catalytic domain. Other cel7 (cbhI) gene products possessing a catalytic domain exclusively have been identified and include those of H. grisea [42] and A. niger [16]. Cel7 genes from T. reesei [43], P. chrysosporium [44] and A. aculeatus [23], however, encode a modular structure composed of a C-terminal carbohydrate-binding module linked via a proline/serine/threonine-rich linker to the catalytic domain [45]. There are two predicted N-glycosylation sites in the catalytic domain of 1Q9H (Fig. 3), i.e. matches to the Asn-X-Ser/Thr consensus sequence (where X is any amino acid except proline), at Asn267 and Asn431. There are 18 residues corresponding to the signal peptide at the N-terminus of the translated protein product. Alignment of the existing fungal CBH sequences revealed that 1Q9H from T. emersonii comprises features found in both Cel7D of P. chrysosporium and Cel7A of T. reesei.
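Percentage identity values such as those quoted above depend on how the denominator is defined. The sketch below shows one common convention, counting identical residues over alignment columns in which neither sequence has a gap; this convention is an assumption, since the text does not state which one was used, and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


# Two short, hypothetical aligned fragments (not taken from Fig. 2).
print(percent_identity("ACDE-G", "ACDFHG"))
```

Other conventions divide by the length of the shorter sequence or by the full alignment length, which can shift the result by several percentage points for sequences with long insertions.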
Analysis of the T. emersonii cel7 upstream region
Initial screening of 6000 λ phage clones from the T. emersonii Sau3A genomic library identified two positively hybridizing clones. Sequence analysis of the 5′ region upstream from the start codon of the purified cel7 clones revealed putative TATA-like and CCAAT box sequences located upstream of the start codon at bp −99, −132, −340, −1040, −1242, −1348, −1476 and −1694. In filamentous fungi [46], and in higher eukaryotes [47], the CCAAT sequence is known as an upstream activating sequence. The binding sites for putative cellulase transcription factors [activator of cellulase expression I (ACEI) and ACEII] [48,49] are located upstream of the start codon at bp −562, −844, −853 and −1175, while putative binding sites for the catabolite repressor element (Cre) [50,51] are located upstream of the start codon at bp −239, −265, −320, −359, −460, −977, −1404 and −1523.
Northern blot analysis of T. emersonii cel7 expression
Solka floc cellulose, lactose and beechwood xylan induce high levels of cel7 expression in T. emersonii (Fig. 4). Similar cellulase expression with complex cellulose has been documented in P. chrysosporium [52] and T. reesei [53]. Methyl xylose and gentiobiose, a β-1,6-linked glucose disaccharide, induce low levels of cel7 expression, relative to solka floc, in T. emersonii. Gentiobiose has been shown to induce other cellulases in T. emersonii [27]. Other researchers have reported that the CBH A and B genes and the endoglucanase genes from A. niger are also induced by D-xylose [16]. Sophorose, a β-1,2-linked disaccharide of glucose, has previously been shown to be a poor inducer of cellulase activity in T. emersonii [54], although it has been postulated that sophorose could be the natural inducer of cellulase expression in T. reesei. Cellobiose is a poor inducer of the T. emersonii cellulases and did not induce detectable levels of cel7 in this study. Glucose-induced cultures displayed no detectable levels of cel7. Indeed, the addition of 2% (w/v) glucose for 2 h to T. emersonii mycelia, previously cultured on solka floc for 48 h, resulted in the abolition of the cel7 signal. The regulatory proteins CreA [55] and Cre1 [51], similar to Mig1 in Saccharomyces cerevisiae [56], mediate glucose repression in Aspergillus and Trichoderma species. The 5′ upstream region of T. emersonii cel7 has eight potential catabolite repressor-binding sites (SYRGG). The sequence of a gene encoding CreA from T. emersonii has recently been submitted to the GenBank database (AF440004).
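A search for SYRGG sites of the kind described above can be sketched by expanding the IUPAC codes (S = C or G, Y = C or T, R = A or G) into a regular expression and scanning both strands. The coordinate convention used here, in which the last upstream base is −1 and the leftmost forward-strand base of each match is reported, is an assumption, as is the short test fragment, which is hypothetical and not taken from the cel7 sequence.

```python
import re

# IUPAC nucleotide codes needed for the consensus motifs discussed in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "S": "[CG]", "Y": "[CT]", "R": "[AG]", "N": "[ACGT]"}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_sites(upstream, consensus="SYRGG"):
    """Return sorted (position, strand, matched sequence) tuples for each match.

    `upstream` is the sequence immediately 5' of the start codon, so its last
    base is position -1. The position reported is that of the leftmost
    forward-strand base covered by the match (an assumed convention).
    """
    n, k = len(upstream), len(consensus)
    # A lookahead lets overlapping matches be reported as well.
    pattern = re.compile("(?=(" + "".join(IUPAC[c] for c in consensus) + "))")
    hits = [(m.start() - n, "+", m.group(1)) for m in pattern.finditer(upstream)]
    for m in pattern.finditer(reverse_complement(upstream)):
        forward_start = n - (m.start() + k)
        hits.append((forward_start - n, "-", m.group(1)))
    return sorted(hits)


# Hypothetical 12-nt fragment, not taken from the cel7 upstream region.
print(find_sites("TTGCGGGGTTTT"))
```

A five-base degenerate motif matches by chance roughly once every few hundred bases per strand, so sites found this way are candidates only and need experimental support.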
Expression of cel7 in E. coli
A recombinant protein, of ~57 000 relative molecular mass, was expressed in E. coli BL21-AI. Under the conditions tested, reCBH was present in the insoluble inclusion fraction (Fig. 5A). The protein was purified under hybrid conditions (denaturing/renaturing) on a Ni-nitrilotriacetic acid column. reCBH was inactive against CNP-lactoside. Denaturation of reCBH, followed by refolding, with concomitant disulphide bond formation, in the presence of protein disulphide isomerase in renaturation buffer, successfully restored partial biological activity of reCBH against CNP-lactoside and methylumbelliferyl-cellobioside (Fig. 5B).
Structure solution and refinement of CBH IB
Molecular replacement was performed by using the CCP4 (1994) programs contained in the Automated package for Molecular Replacement (AMORE) [33]. Both 1GPI, which was solved at 1.32 Å resolution, and 1CEL, which was solved at 1.8 Å resolution, were used as search models. Rotational and translational searches were performed at different resolutions in the range of 69 Å to 2.4 Å. Rigid body refinement was carried out after each translation function to refine the position of the potential solution. Euler angles, fractional coordinates, correlation coefficients and R factors for the best molecular replacement solution for each model, using P4₁2₁2 as the space group of the 1Q9H crystal, were found. The best solutions had correlation coefficients of 56.1% and 55.3%, and R-factors of 40.0% and 40.9% for 1CEL and 1GPI, respectively.
Refinement of the models produced by AMORE was performed by using the CCP4 program, REFMAC5 [34]. REFMAC5 was used to carry out restrained refinement on X-ray data by using the maximum likelihood method. The R-overall and R-free values from the first round of REFMAC5 cycles on the 1GPI model were 28.6% and 33.6%, respectively, while those for the 1CEL model were 28.9% and 35.5%, respectively. The 1GPI model was used for further analysis. The graphics program TURBO [36] was used to examine both models and the maps produced by REFMAC5. 2Fo-Fc and Fo-Fc maps were used and analysed with contour levels set to 1.0. The amino acid sequence of the 1GPI model was changed to that of 1Q9H, and the model was rebuilt where changes were supported by the electron density. Changes in R factors were used as a guide to improvements to the overall structure. Further rounds of model mutation and rebuilding resulted in a model with an R-factor of 16.1% and an R-free of 22.9% (Table 1).
Fig. 1. Nucleotide, deduced amino acid sequence and relevant features of the Talaromyces emersonii cel7 gene. The stop codon is denoted by an asterisk. The N-glycosylation sites are underlined. Catalytic residues are boxed. Cysteine residues involved in the formation of disulphide bridges are in bold (19–25, 50–71, 61–67, 135–401, 169–207, 173–206, 227–253, 235–240 and 258–334). Four tryptophan residues involved in the glucosyl-binding platform (W38, W40, W371, W380) in the active-site tunnel are in bold italics and underlined. Four peptides sequenced from native CBH IB had complete identity with Y124-D129, Y267-D272, I295-P300 and F445-S455 in the deduced protein sequence.
Electron density maps showed almost continuous density for the backbone of CBH IB. The final model, 1Q9H, included 430 of the 437 amino acid residues of CBH IB. The final two amino acids and the loop region (from amino acids 193–197) were not visible in electron density maps. In addition, no side-chain density was apparent for four residues, which were subsequently modelled as alanine. All of these residues are located on the surface of the protein and are presumed to be disordered. Three N-acetylglucosamine residues and 175 water molecules were located within the model. Average isotropic temperature factors (B factors) for the 1Q9H structure were calculated by using the CCP4 program BAVERAGE. Average isotropic temperature factors for the main chain were 20.91 Å², and root-mean-square deviations (rmsd) from ideal bond lengths and angles were 0.009 Å and 1.281°, respectively. PROCHECK was used to verify the stereochemical quality of the model. The Ramachandran plot showed that 86% of residues lie in the most favoured regions and 13.8% lie in the allowed region, while Ser311 was the only nonglycine residue in the generously allowed region and there were no residues in the disallowed regions. Peptide bond planarity for the main chain was found to be 7.0°, nonbonded interactions were 0.6 per 100 residues, α-carbon tetrahedral distortion was 1.8°, the standard deviation of the hydrogen bond energies was 0.7 and the overall G-factor, a measure of the normality of the structure, was 0.0.
Fig. 2. Multiple sequence alignment of cellobiohydrolase IB (Tal.em; CBH IB) with glycosyl hydrolases from Humicola grisea (Hgrisea; GenBank AAD11942), Aspergillus niger (Aspniger; GenBank AAF04491), Phanerochaete chrysosporium (Phcry; GenBank AAA19802), and Trichoderma reesei (T.reesei; GenBank CAA49596). Residues in white against a black background are amino acids that are identical or have a conserved substitution in all five sequences. Residues in white against a grey background are amino acids that are identical or conserved in four out of the five sequences.
Overall structure of T. emersonii 1Q9H
1Q9H is a large single-domain protein with overall dimensions of ~60 Å × 40 Å × 50 Å (Fig. 6). About one-third of this domain is arranged in two large antiparallel β-sheets, which are stacked face-to-face and are highly curved, forming convex and concave surfaces. The convex and concave sheets of the β-sandwich are composed of seven β-strands. Many of the side-chains in the β-sheets are hydrophobic, and interactions between these residues appear to hold the β-sandwich in position. With the exception of four α-helices and two pairs of short β-strands, the rest of the protein consists almost entirely of loops connecting the β-strands. The loops extending from the β-sandwich form a tunnel, which runs the length of the concave sheet, into which the cellulose substrate can be accommodated. The β-sandwich represents the characteristic fold of GH7 and is also the fold of the legume-lectin family and of GH16 [57].
Fig. 3. Electron density maps at the two N-glycosylation sites. (A) Asn267 with two GlcNAc (2-amino-2-N-acetylamino-D-glucose) residues. (B) Asn431 with one GlcNAc residue.
Fig. 4. Northern blot analysis of Talaromyces emersonii cel7 expression with various carbon sources at 2% (w/v). Glucose at 24 h and 48 h (lanes 1 and 2); methyl glucose at 48 h (lane 3); sorbitol at 48 h (lane 4); galactose at 48 h (lane 5); galactitol at 48 h (lane 6); methyl xylose at 36 h (lane 7); glycerol at 48 h (lane 8); gentiobiose at 48 h (lane 9); cellobiose at 48 h (lane 10); and beech wood xylan at 48 h (lane 11). Time course of cel7 transcription, 24 h, 48 h and 96 h (lanes 12, 13 and 14), after transfer to Solka floc (ball-milled cellulose). Addition of 2% (w/v) glucose to 48 h cultures of T. emersonii cultured on Solka floc, with RNA isolated after a further 2 h (lane 15). Time course of cel7 transcription, 24 h and 48 h (lanes 16 and 17), after transfer to lactose. The bottom panel is the 18S ribosomal RNA loading control.
Fig. 5. SDS/PAGE and activity analysis of recombinant cellobiohydrolase. (A) SDS/PAGE [10% (w/v) gel] analysis of purified recombinant cellobiohydrolase (reCBH). Lane 1, molecular markers; lane 2, uninduced cells; lane 3, induced cells (4 h); and lane 4, purified His6-reCBH. (B) reCBH activity against methylumbelliferyl-cellobioside after 0, 1 and 24 h. Substrate controls, lane 1; enzyme reactions, lane 2.
The loops extending from the β-sandwich are stabilized
by the presence of nine disulphide bonds, which are located
between residues 19–25, 50–71, 61–67, 135–401, 169–207,
173–206, 227–253, 235–240 and 258–334. The N-terminal
glutamine residue is present as the modified pyroglutamate
group, as observed in other GH structures [17,58]. Electron
density corresponding to N-glycosylation is visible at two
asparagine residues, namely Asn267 and Asn431 (Fig. 3). It
was possible to position two N-acetylglucosamine residues,
linked via a β-1,4 bond, at Asn267. A single N-linked
N-acetylglucosamine was seen in the model at position
Asn431.
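The disulphide pairing listed above can be tabulated and checked for internal consistency. The following Python sketch is illustrative only: the names DISULPHIDES, cysteine_positions and residue_number_gaps are hypothetical, and the residue numbers are simply those quoted in the text. It confirms that the nine bonds involve 18 distinct cysteine positions and reports how far apart in the sequence the two partners of each bond lie.

```python
# Disulphide bonds reported for 1Q9H (residue numbers as quoted in the text).
DISULPHIDES = [
    (19, 25), (50, 71), (61, 67), (135, 401), (169, 207),
    (173, 206), (227, 253), (235, 240), (258, 334),
]


def cysteine_positions(bonds):
    """Return the sorted residue numbers of all cysteines in the bonds."""
    return sorted(pos for bond in bonds for pos in bond)


def residue_number_gaps(bonds):
    """Map each bond to the difference between its two residue numbers."""
    return {bond: bond[1] - bond[0] for bond in bonds}


positions = cysteine_positions(DISULPHIDES)

# A cysteine can take part in only one disulphide bond, so the
# 9 bonds must involve 18 distinct positions.
assert len(positions) == len(set(positions)) == 2 * len(DISULPHIDES)

for bond, gap in residue_number_gaps(DISULPHIDES).items():
    print(f"Cys{bond[0]}-Cys{bond[1]}: residue number gap {gap}")
```

The widest gap (135–401) links distant parts of the chain, whereas bonds such as 235–240 close short local loops.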
Structure of T. emersonii 1Q9H, in comparison with P. chrysosporium 1GPI and T. reesei 1CEL
A BLAST search of the Protein Data Bank (PDB) revealed
that the protein structures with the highest sequence
homology to 1Q9H were structures 1GPI and 1CEL, which
are the catalytic domains of CBH Cel7D from P. chrysosporium
[17] and CBH Cel7A [8] from T. reesei, respectively.
P. chrysosporium has a sequence identity of 67% with
Cel7A, while T. reesei has an identity of 66%.
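Percentage identities of the kind quoted above are computed from a pairwise alignment. The sketch below shows one common convention (identical residues divided by the number of columns in which neither sequence has a gap); the text does not state which convention or program settings were used, so this choice is an assumption. The function name percent_identity and the toy sequences are hypothetical and are not fragments of 1Q9H, 1GPI or 1CEL.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical residues over columns without gaps.

    Both arguments must be rows of the same alignment, with '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are excluded from the denominator
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Toy alignment (made-up fragments, NOT real cellobiohydrolase sequences).
toy_a = "QQAGTAT-ENHP"
toy_b = "QQVGTSTAENHP"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which can shift the result by a few per cent.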

While the levels of sequence homology between 1Q9H, 1CEL and
1GPI are similar, the areas of shared homology differ.
Superimposing the C-alpha traces of 1GPI and 1CEL on
1Q9H gave rmsd values of 0.71 Å and 0.67 Å, respectively
(Fig. 7).
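An rmsd between two C-alpha traces is obtained after optimal rigid-body superposition of equivalent atoms. The sketch below uses the Kabsch algorithm with NumPy as one standard way of doing this; the text does not say which program produced the quoted values, so this is not a reconstruction of the authors' procedure. The function name kabsch_rmsd is hypothetical and the coordinates are randomly generated, not taken from any PDB entry.

```python
import numpy as np


def kabsch_rmsd(P: np.ndarray, Q: np.ndarray) -> float:
    """RMSD between two (N, 3) coordinate sets after optimal superposition.

    Row i of P must correspond to row i of Q (for example, aligned C-alpha atoms).
    """
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("P and Q must both have shape (N, 3)")
    # Remove translation by centring both sets on their centroids.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    # Covariance matrix and its singular value decomposition.
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    # Rotate P onto Q and measure the residual deviation.
    diff = Pc @ R.T - Qc
    return float(np.sqrt((diff ** 2).sum() / len(P)))


# Synthetic check: a rotated and translated copy should superimpose exactly.
rng = np.random.default_rng(0)
P = rng.normal(size=(50, 3))
theta = 0.7
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta), np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
Q = P @ Rz.T + np.array([5.0, -3.0, 2.0])
print(f"rmsd after superposition: {kabsch_rmsd(P, Q):.2e}")
```

For real structures the two coordinate arrays would first have to be restricted to structurally equivalent residues, which is a separate alignment step not shown here.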
Substrate-binding subsites
The X-ray structure of the T. reesei CBH, with eight glucose
residues bound (PDB 7CEL), identifies some 20 residues
involved in enzyme–substrate interactions. Superposition of
this structure on 1Q9H shows that all but two of these
residues are conserved and suitably positioned for
interactions with the substrate. Four tryptophan residues form a
glucosyl-binding platform in sites −7, −4, −2 and +1 in the
tunnel of 1CEL; equivalent tryptophan residues are found
in 1Q9H at positions 38, 40, 371 and 380. A tyrosine residue
(Tyr47) present in the T. emersonii CBH IB sequence, and
seen in 1GPI but not in 1CEL, is located at the entrance of
the tunnel, which Munoz et al. suggest may constitute an
additional binding subsite [17]. Three arginine residues in
the product sites of 1CEL (+1, +2 and +3) are proposed
to assist in the binding and positioning of the substrate and
play a role in the recognition of the reducing end of the
cellulose chain. Arginine side-chains are present in all
equivalent locations in 1Q9H (Fig. 7).
Tunnel-forming loops
There are four major loops involved in the cellulose-binding
tunnel in 1CEL. It is postulated that Asn197 and Asn198
Fig. 6. Stereoview of Talaromyces emersonii 1Q9H. The substrate is
superimposed onto 1Q9H. The figure was drawn by using TURBO [36].
Table 1. Final statistics for the structure of Talaromyces emersonii
1Q9H. Values in parentheses refer to the last resolution shell.

Space group                      P4₁2₁2
Unit cell dimensions
  a = b (Å)                      74.42
  c (Å)                          176.92
  α = β = γ (°)                  90.00
Resolution range (Å)             20–2.40
No. of reflections               65121
Completeness (%)                 94.3 (94.3)
Rmerge (I)                       7.8 (36.6)
Mean I > 2σ(I) (%)               78.8 (56.8)
R-factor (%)                     16.1
R-free (%)                       22.9
No. of water molecules           175
No. of sugar molecules           3
rms bond lengths (Å)             0.009
rms bond angle (°)               1.28
Average B main chain (Å²)        20.91
Average B water (Å²)             31.28
4502 A. Grassick et al. (Eur. J. Biochem. 271) © FEBS 2004
make van der Waals interactions with Tyr370 and Tyr371, on
the opposite loop, thus enabling it to form a fully enclosing
tunnel [8]. While sequence analysis shows that 1Q9H
possesses the equivalent Asn residues (Asn193 and
Asn194), one of the tyrosine residues on the opposite loop
is replaced by an alanine (Ala374), forming a more open
tunnel; however, electron density in this area of 1Q9H is
poor. In 1GPI, neither asparagine residue is present and a
histidine and an alanine residue are found in the equivalent
tyrosine positions. The tunnel-forming loop (amino acids
240–248) in 1GPI is significantly shorter than in 1CEL and
1Q9H, owing to a six amino acid deletion, resulting in a more
exposed catalytic site for 1GPI. In 1CEL, three amino acids
form a tight turn over site −6, with Gln101 hydrogen bonding
to the glycosyl residue in site −5, thus forming the lid of the
binding site. The structures 1Q9H and 1GPI have a deletion
of these three residues, thus leading to a more open
substrate-binding site.
Catalytic binding site
Brooks et al. showed, by NMR, that the CBHs I from
T. emersonii have a retaining mechanism of action [20]. This
type of mechanism, as shown by Davies & Henrissat [9],
involves a proton donor and a base separated by approximately 5.5 Å
[59–61]. Henrissat [62] classified all members of GH7,
which catalyse the hydrolysis of the β-1,4-glycosidic bond of
cellulose, as retaining enzymes, i.e. they retain the
configuration of the anomeric carbon. Glu212 and Glu217 have
been identified as the proton donor and acceptor,
respectively, in 1CEL. Sequence analysis of 1Q9H and 1GPI shows
that these residues are conserved, suggesting that they carry
out the same function. Based on the proposed mechanism of
action from 1CEL, Glu209 of 1Q9H may act as the
nucleophile, while the proton donor is likely to be Glu214.
The proposed catalytic residues are separated by 5.57 Å.
The Asp211 residue of 1Q9H is in a position to share a
proton with the nucleophile, in a short hydrogen bond (O–O
distance 2.51 Å; Fig. 8). The residue Glu214 forms a weak
hydrogen bond to Asn138. A platform of hydrophobic
residues has recently been identified as being mechanistically
relevant as a transition-state stabilizing factor in GH family
members [63]. A tyrosine residue (Tyr142), present near the
−1 subsite in 1Q9H, is thought to be involved in this
platform.
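The geometric argument above, in which the measured separation of the two catalytic carboxylates is compared with the separation of approximately 5.5 Å expected for a retaining enzyme, can be expressed as a short calculation. The sketch below is illustrative only. The coordinates are placeholders chosen to lie 5.57 Å apart and are not taken from the 1Q9H entry, and the 1.0 Å tolerance and all function names are assumptions made for the example.

```python
import math

RETAINING_SEPARATION = 5.5  # Å, approximate value quoted in the text
TOLERANCE = 1.0             # Å, arbitrary window chosen for this sketch


def separation(atom_a, atom_b):
    """Euclidean distance in Å between two (x, y, z) coordinates."""
    return math.dist(atom_a, atom_b)


def consistent_with_retaining(distance, target=RETAINING_SEPARATION, tol=TOLERANCE):
    """True if the distance lies within tol of the expected separation."""
    return abs(distance - target) <= tol


# Placeholder coordinates (NOT from the deposited structure),
# standing in for the carboxylate groups of Glu209 and Glu214.
glu209_carboxylate = (0.0, 0.0, 0.0)
glu214_carboxylate = (5.57, 0.0, 0.0)

d = separation(glu209_carboxylate, glu214_carboxylate)
print(f"separation {d:.2f} Å, retaining-like: {consistent_with_retaining(d)}")
```

With real data, the two tuples would be replaced by the carboxylate atom coordinates read from the structure file.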
Discussion
This article presents the first report on the purification and
3D structural determination of a native core CBH protein,
and of the cloning and over-expression of the corresponding
gene, from a thermophilic fungal source. CBH IB is
extremely thermostable with a temperature optimum of
68 °C at pH 5.0 and a half-life (t½) of 68.0 min at 80 °C and
pH 5.0. In comparison, Cel7A from T. reesei has a temperature
optimum of 62 °C over the pH range 3.5–5.6. The cel7
gene from T. emersonii was cloned and the deduced amino
acid sequence used during the structure solution of the native
enzyme. Family 7 contains both CBHs and endoglucanases.
The structure of CBHs is distinguished from that of
endoglucanases by the presence of loops of polypeptide
chain covering the active site residues, which convert the
active site cleft of endoglucanases into the characteristic
tunnel of CBHs [8]. Three CBHs belonging to GH Family 7
(T. reesei 1CEL, P. chrysosporium 1GPI and T. emersonii
1Q9H) are generally similar in structure. The catalytic
domains are single domain proteins with two large
antiparallel β-sheets that stack face-to-face to form a β-sandwich.
The rest of the three CBHs consist almost entirely of loops
Fig. 7. C-alpha trace of 1Q9H (yellow) superimposed on the C-alpha
trace of 5CEL (white), illustrating the more open active site of 1Q9H. The
sugar residues are superimposed in blue. The catalytic residues are
shown in red. The figure was drawn by using TURBO.
Fig. 8. Diagram of the active site of 1Q9H
showing the distance between the proposed
catalytic residues.
connecting the β-strands. However, on closer inspection of
the structures, local variations are reflected in the sequence
differences. The cellulose-binding sites in 1Q9H are more
accessible than those in 1CEL. The absence of the three
amino acids that are observed to form a tight turn over the
−5/−6 subsites in 1CEL confers a more open entrance to the
cellulose-binding sites in 1Q9H. This proposal is supported
by the replacement of Asn7 in 1CEL by a smaller threonine
residue in 1Q9H and 1GPI at the −7 subsite and of Tyr371 in
1CEL by Ala374 in 1Q9H at the −3/−4 subsites. A tyrosine
residue present in 1GPI and 1Q9H, but absent in 1CEL, has
been suggested, by Munoz et al. [17], to be an additional
substrate-binding site. The more open tunnel structure is
probably an adaptation to the lack of a CBM, allowing short
chain oligosaccharides more access to the active site, with
supporting evidence from the higher catalytic rate (kcat)
and catalytic efficiency (kcat/Km) of 1Q9H, 13.4 s⁻¹ and
3.6 s⁻¹·mM⁻¹ [21] (compared with 0.093 s⁻¹ and
0.23 s⁻¹·mM⁻¹ [12] for 1CEL), with the oligosaccharide
derivative 4-NP-lactopyranoside. An insertion of eight
amino acid residues common to 1Q9H, P. chrysosporium
Cel7D and T. reesei endoglucanase Cel7B can be seen.
Although this insertion is located at the outer regions of the
structure, it could potentially have implications for function
and will be the target of future protein engineering studies.
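The kinetic comparison quoted above for 4-NP-lactopyranoside can be made explicit with a short calculation. The sketch below takes the kcat and kcat/Km values given in the text and derives the fold differences between the two enzymes, together with the Km values implied by the relation Km = kcat / (kcat/Km). The implied Km values are derived here for illustration and are not reported in the text; the class and variable names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kinetics:
    name: str
    kcat: float        # catalytic rate, s^-1
    efficiency: float  # kcat/Km, s^-1 mM^-1

    @property
    def km(self) -> float:
        """Implied Michaelis constant in mM, Km = kcat / (kcat/Km)."""
        return self.kcat / self.efficiency


# Values quoted in the text for 4-NP-lactopyranoside.
q9h = Kinetics("1Q9H", kcat=13.4, efficiency=3.6)
cel = Kinetics("1CEL", kcat=0.093, efficiency=0.23)

kcat_fold = q9h.kcat / cel.kcat
efficiency_fold = q9h.efficiency / cel.efficiency

print(f"kcat fold difference (1Q9H/1CEL): {kcat_fold:.0f}")
print(f"kcat/Km fold difference (1Q9H/1CEL): {efficiency_fold:.1f}")
for enzyme in (q9h, cel):
    print(f"{enzyme.name}: implied Km = {enzyme.km:.2f} mM")
```

Because the two data sets come from different studies ([21] and [12]), the fold differences should be read as indicative rather than as a controlled comparison.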
The probable catalytic residues, nucleophile Glu209 and
proton donor Glu214, of 1Q9H are located approximately
on opposite sides of the cleavable glycosidic linkage in the −1/
+1 subsites, with their carboxylic groups 5.57 Å apart. Four
tryptophan residues located along the substrate-binding
tunnel in 1CEL, which are the determinants of the
glycosyl-binding sites, are conserved in 1Q9H. Density was poor for
one of the tunnel-forming loops of 1Q9H (residues 193–197).
The tunnel is composed of loops that are inherently flexible,
and the absence of good density in the loops is perhaps
indicative of their flexibility. It is worth noting that the
structures of T. reesei GH7 CBHs were solved in the presence
of substrates. However, as 1Q9H was solved in the absence of
bound substrate, one could imagine that if a substrate were
present in the structure the loops would close over the
substrate, yielding a structure more like that of 1CEL.
The cel7 gene consists of a 1365 bp open reading frame
encoding 455 amino acids interrupted by two introns. The
deduced amino acid sequence revealed a secretory signal
peptide and a CBH catalytic domain. The 5′ upstream region
of cel7 has eight potential Cre-binding sites, and it is
probable that glucose repression of cellulase transcription is
mediated through a Cre protein in T. emersonii (the gene
sequence for a Cre-like protein has been cloned from
T. emersonii). It has been shown previously that sophorose is
a weak inducer of cellulases in T. emersonii [64], but is the
proposed natural inducer of cellulase expression in T. reesei.
Induction of cel7 and cbhII [27] expression by gentiobiose
suggests that this glucose disaccharide may be the natural
cellulase inducer in T. emersonii and is indicative of an
alternative cellulase induction mechanism in this fungus.
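Potential Cre-binding sites of the kind counted in the 5′ upstream region are usually located by scanning both strands of the promoter for a degenerate consensus motif. The sketch below assumes the consensus 5′-SYGGRG-3′ commonly reported for CreA/Cre1-type repressors; the text does not state which motif or software was used, so this is an assumption. The function names are hypothetical and the test sequence is invented, not the cel7 promoter.

```python
import re

# IUPAC codes needed for the assumed consensus SYGGRG.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "S": "[CG]", "Y": "[CT]", "R": "[AG]"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def motif_to_regex(motif: str) -> str:
    """Translate an IUPAC motif into a regular expression."""
    return "".join(IUPAC[base] for base in motif.upper())


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq: str, motif: str = "SYGGRG"):
    """Return sorted (position, strand, match) tuples for both strands.

    Positions are 0-based and refer to the leftmost base of the site
    on the forward strand.
    """
    seq = seq.upper()
    # The lookahead lets overlapping matches be reported.
    pattern = re.compile(f"(?=({motif_to_regex(motif)}))")
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(seq)]
    rc = reverse_complement(seq)
    for m in pattern.finditer(rc):
        hits.append((len(seq) - m.start() - len(motif), "-", m.group(1)))
    return sorted(hits)


# Invented test sequence (NOT the cel7 upstream region).
toy_promoter = "AACTGGGGTTAACCCCAGTT"
for position, strand, site in find_sites(toy_promoter):
    print(position, strand, site)
```

A motif match alone indicates only a potential site; whether a given site is functional would need experimental confirmation.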
The carbohydrate-binding module and linker region that are
characteristic of some other GH family members were not
encoded in the gene, in contrast to cbh2 from the same
source [27]. Biochemical analysis of the CBHs from
T. emersonii, previously reported from this laboratory, has
shown that the hydrolysis of crystalline cellulose (Avicel) by
CBH IB is 77% lower than observed with CBH IA [21].
Earlier studies revealed that removal of the carbohydrate-binding
module from the T. reesei CBH resulted in a 90%
decrease in activity against Avicel [65,66]. More recently,
Nutt et al. [67] have shown, by progress curve analysis,
that intact CBHs from T. reesei and P. chrysosporium show
higher activities than their corresponding cores against
bacterial microcrystalline cellulose. Takashima et al. [68]
suggest that the exoglucanase (EXO1) of H. grisea displays
lower activity towards crystalline cellulose than the
corresponding CBH I enzyme from this organism. The same study
indicated exo-synergism between EXO1 and CBH I in the
hydrolysis of crystalline cellulose, and a similar co-operativity
between CBHs in T. emersonii may occur. Despite the
reduced activity of CBH IB against crystalline cellulose, the
enzyme hydrolyses Avicel in a processive manner. In
processive cellulose hydrolysis, initial hydrolytic attack
occurs at the chain end, with glucose or cellotetrose produced
only upon initial attack, and with cellobiose being the principal
product of hydrolysis thereafter. During hydrolysis of Avicel
by CBH IB, glucose production is markedly low and
remains constant after the initial hydrolytic attack.
Cellobiose is the predominant product of hydrolysis and increases in
concentration as the reaction proceeds [21]. The exo-loop of
1CEL (amino acids 243–256) forms the roof of the active site
tunnel at the catalytic centre. Deletion of this loop has been
shown to lead to a decreased processivity of 1CEL against
crystalline cellulose [69]. This exo-loop is conserved in 1Q9H
and is presumed to contribute to processivity of CBH IB
against crystalline cellulose. It should be noted, however,
that 1GPI has a natural deletion of the exo-loop, yet
Cel7D, from P. chrysosporium, is able to maintain high
processivity, leading to efficient crystalline cellulose
hydrolysis [69]. Therefore, conclusions drawn for one enzyme
within the same family do not necessarily apply to others
because of different substrate preferences. We were able to
restore biological activity of the denatured reCBH, although
enzyme activity remained very low. 1Q9H has nine
disulphide bridges, and so regeneration of the native CBH
enzyme in high yield by in vitro reoxidation of the reduced,
denatured polypeptide is extremely complex. Expression at
lower temperatures has been carried out and has yielded
similar activity results. Therefore, heterologous expression
studies in other hosts are currently in progress. Future
site-directed mutagenesis of specific residues in 1Q9H should
provide a valuable insight into the structural basis of enhanced
thermostability of the CBH IB protein from T. emersonii.
Acknowledgements
This work was funded by HEA pre-PRTLI and Enterprise Ireland
awards to M.G.T. C.M.C. and R.T. are grateful for junior teaching
fellowships from NUI, Galway, and postgraduate scholarships from
Enterprise Ireland.
References
1. Enari, T.M. (1983) Microbial cellulases. In Microbial Enzymes and
Biotechnology (Fogarty, W.M., ed.), pp. 183–223. Elsevier Applied
Science, London.
2. Coughlan, M.P. (1985) Enzymatic hydrolysis of cellulose: an
overview. Biotechnol. Genet. Eng. Rev. 3, 39–169.
3. Avgerinos, G.C., Fang, H.Y., Biocic, I. & Wang, D.I.C. (1980) A
novel step microbial conversion of cellulosic biomass to ethanol.
Adv. Biotechnol. 2, 119–124.
4. Eveleigh, D.E. (1984) Biofuels and Oxychemicals from Natural
Polymers – a Perspective. American Society of Microbiology,
Washington.
5. Lawford, H.G. & Rousseau, J.D. (2003) Cellulosic fuel ethanol:
alternative fermentation process designs with wild-type and
recombinant Zymomonas mobilis. Appl. Biochem. Biotechnol.
105–108, 457–469.
6. van Wyk, J.P. & Mohulatsi, M. (2003) Biodegradation of
wastepaper by cellulase from Trichoderma viride. Bioresour. Technol. 86,
21–23.
7. Coutinho, P.M. & Henrissat, B. (1999) Carbohydrate-Active
Enzymes. Server at URL: http://afmb.cnrs-mrs.fr/cazy/CAZY/index.html
8. Divne, C., Stahlberg, J., Teeri, T.T. & Jones, T.A. (1998)
High-resolution crystal structures reveal how a cellulose chain is bound
in the 50 Å long tunnel of cellobiohydrolase I from Trichoderma
reesei. J. Mol. Biol. 275, 309–325.
9. Davies, G. & Henrissat, B. (1995) Structures and mechanisms of
glycosyl hydrolases. Structure 3, 853–859.
10. Johnson, P.E., Tomme, P., Joshi, M.D. & McIntosh, L.P. (1996)
Interaction of soluble cellooligosaccharides with the N-terminal
cellulose-binding domain of Cellulomonas fimi, CenC2. NMR and
ultraviolet absorption spectroscopy. Biochemistry 35,
13895–13906.
11. Morag, E., Halevy, I., Bayer, E.A. & Lamed, R. (1991) Isolation
and properties of a major cellobiohydrolase from the cellulosome
of Clostridium thermocellum. J. Bacteriol. 173, 4155–4162.
12. Ossowski, I.V., Stahlberg, J., Koivula, A., Piens, K., Becker, D.,
Boer, H., Harle, R., Harris, M., Divne, C., Mahdi, S., Zhao, Y.,
Driguez, H., Claeyssens, M., Sinnott, M.L. & Teeri, T.T. (2003)
Engineering the exo-loop of Trichoderma reesei cellobiohydrolase,
Cel7A. A comparison with Phanerochaete chrysosporium Cel7D.
J. Mol. Biol. 333, 817–829.
13. Srisodsuk, M., Reinikainen, T., Penttila, M. & Teeri, T.T. (1993)
Role of the interdomain linker peptide of Trichoderma reesei
cellobiohydrolase I in its interaction with crystalline cellulose. J. Biol.
Chem. 268, 20756–20761.
14. Takashima, S., Nakamura, A., Hidaka, M., Masaki, H. &
Uozumi, T. (1999) Molecular cloning and expression of the novel
fungal beta-glucosidase genes from Humicola grisea and Tricho-
derma reesei. J. Biochem. (Tokyo) 125, 728–736.
15. Covert, S.F., Bolduc, J. & Cullen, D. (1992) Genomic
organization of a cellulase gene family in Phanerochaete chrysosporium.
Curr. Genet. 22, 407–413.
16. Gielkens, M.M., Dekkers, E., Visser, J. & de Graaff, L.H. (1999)
Two cellobiohydrolase-encoding genes from Aspergillus niger
require D-xylose and the xylanolytic transcriptional activator XlnR
for their expression. Appl. Environ. Microbiol. 65, 4340–4345.
17. Munoz, I.G., Ubhayasekera, W., Henriksson, H., Szabo, I.,
Pettersson, G., Johansson, G., Mowbray, S.L. & Ståhlberg, J.J. (2001)
Family 7 cellobiohydrolases from Phanerochaete chrysosporium:
crystal structure of the catalytic module of Cel7D (CBH58) at 1.32 Å
resolution and homology models of the isozymes. J. Mol. Biol.
314, 1097–1111.
18. Kleywegt, G.J., Zou, J.Y., Divne, C., Davies, G.J., Sinning, I.,
Stahlberg, J., Reinikainen, T., Srisodsuk, M., Teeri, T.T. &
Jones, T.A. (1997) The crystal structure of the catalytic core
domain of endoglucanase I from Trichoderma reesei at 3.6 Å
resolution, and a comparison with related enzymes. J. Mol. Biol.
272, 383–397.
19. Folan, M.A. & Coughlan, M.P. (1979) The saccharifying ability of
the cellulase complex of Talaromyces emersonii and comparison
with that of other fungal species. Int. J. Biochem. 10, 505–510.
20. Brooks, M.M., Tuohy, M.G., Savage, A.V., Claeyssens, M. &
Coughlan, M.P. (1992) The stereochemical course of reactions
catalysed by the cellobiohydrolases produced by Talaromyces
emersonii. Biochem. J. 283, 31–34.
21. Tuohy, M.G., Walsh, D.J., Murray, P.G., Claeyssens, M.,
Cuffe, M.M., Savage, A.V. & Coughlan, M.P. (2002) Kinetic
parameters and mode of action of the cellobiohydrolases pro-
duced by Talaromyces emersonii. Biochim. Biophys. Acta 1596,
366–380.
22. Shoemaker, S., Schweickart, V., Ladner, M., Gelfand, D., Kwok,
S., Myambo, K. & Innis, M. (1983) Molecular cloning of exo-
cellobiohydrolase I derived from Trichoderma reesei strain L27.
Bio/Technology 1, 691–696.
23. Saloheimo, M., Lehtovaara, P., Penttila, M., Teeri, T.T.,
Stahlberg, J., Johansson, G., Pettersson, G., Claeyssens, M., Tomme, P.
& Knowles, J.K. (1988) EGIII, a new endoglucanase from
Trichoderma reesei: the characterization of both gene and enzyme.
Gene 63, 11–22.
24. Koch, A., Weigel, C.T. & Schulz, G. (1993) Cloning, sequencing,
and heterologous expression of a cellulase-encoding cDNA (cbh1)
from Penicillium janthinellum. Gene 124, 57–65.
25. Chikamatsu, G., Shirai, K., Kato, M., Kobayashi, T. &
Tsukagoshi, N. (1999) Structure and expression properties of the
endo-beta-1,4-glucanase A gene from the filamentous fungus Aspergillus
nidulans. FEMS Microbiol. Lett. 175, 239–245.
26. Takada, G., Kawaguchi, T., Sumitani, J. & Arai, M. (1998)
Expression of Aspergillus aculeatus, F-50 cellobiohydrolase I
(cbhI) and beta-glucosidase 1 (bgl1) genes by Saccharomyces
cerevisiae. Biosci. Biotechnol. Biochem. 62, 1615–1618.
27. Murray, P.G., Collins, C.M., Grassick, A. & Tuohy, M.G. (2003)
Molecular cloning, transcriptional, and expression analysis of the
first cellulase gene (cbh2), encoding cellobiohydrolase II, from the
moderately thermophilic fungus Talaromyces emersonii and
structure prediction of the gene product. Biochem. Biophys. Res.
Commun. 301, 280–286.
28. Moloney, A.P., McCrae, S.I., Wood, T.M. & Coughlan, M.P.
(1985) Isolation and characterization of the endoglucanases of
Talaromyces emersonii. Biochem. J. 225, 365–374.
29. McHale, A. & Coughlan, M. (1988) Purification of beta glucosi-
dases from Talaromyces emersonii. Methods Enzymol. 160, 437–
443.
30. Moloney, A., Considine, P.J. & Coughlan, M.P. (1983) Cellulose
hydrolysis by Talaromyces emersonii grown on different
substrates. Biotechnol. Bioeng. 25, 1169–1173.
31. Grassick, A., Birrane, G., Tuohy, M., Murray, P. & Higgins, T.
(2003) Crystallisation and preliminary crystallographic analysis of
the catalytic domain cellobiohydrolase IB from Talaromyces
emersonii. Acta Crystallogr. D 59, 1283–1284.
32. Panasik, N., Brenchley, J.E. & Farber, G.K. (2000) Distributions
of structural features contributing to thermostability in mesophilic
and thermophilic alpha/beta barrel glycosyl hydrolases. Biochim.
Biophys. Acta 1543, 189–201.
33. Raeder, U. & Broda, P. (1985) Rapid preparation of DNA from
filamentous fungi. Lett. Appl. Microbiol. 1, 17–20.
34. Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang,
Z., Miller, W. & Lipman, D.J. (1997) Gapped BLAST and
PSI-BLAST: a new generation of protein database search programs.
Nucleic Acids Res. 25, 3389–3402.
35. Sambrook, J., Fritsch, E.F. & Maniatis, T. (1989) Molecular
Cloning: a Laboratory Manual, 2nd edn. Cold Spring Harbor
Laboratory Press, New York.
36. Navaza, J. (2001) Implementation of molecular replacement in
AMORE. Acta Crystallogr. D Biol. Crystallogr. 57, 1367–1372.
37. Murshudov, G.N., Vagin, A. & Dodson, E. (1997) Refinement of
macromolecular structures by the maximum-likelihood method.
Acta Crystallogr. D 53, 240–255.
38. Collaborative Computational Project, N. (1994) The CCP4 suite:
programs for protein crystallography. Acta Crystallogr. D 50,
760–763.
39. Lamzin, V.S. & Wilson, K.S. (1993) Automated refinement of
protein models. Acta Crystallogr. D 49, 129–149.
40. Laskowski, R.A., Moss, D.S. & Thornton, J.M. (1993)
Main-chain bond lengths and bond angles in protein structures. J. Mol.
Biol. 231, 1049–1067.
41. Higgins, D.G. (1994) CLUSTAL W: multiple alignment of DNA and
protein sequences. Methods Mol. Biol. 25, 307–318.
42. Pocas-Fonseca, M.J., Silva-Pereira, I., Rocha, B.B. & Azevedo,
M.D.O. (2000) Substrate-dependent differential expression of
Humicola grisea var. thermoidea cellobiohydrolase genes. Can. J.
Microbiol. 46, 749–752.
43. Fagerstam, L.G. & Pettersson, L.G. (1980) The 1,4-β-glucan
cellobiohydrolase from the fungus Trichoderma reesei QM 9414.
FEBS Lett. 167, 309–315.
44. Vanden Wymelenberg, A., Covert, S. & Cullen, D. (1993)
Identification of the gene encoding the major cellobiohydrolase of the
white rot fungus Phanerochaete chrysosporium. Appl. Environ.
Microbiol. 59, 3492–3494.
45. Gilkes, N.R., Henrissat, B., Kilburn, D.G., Miller, R.C. Jr &
Warren, R.A. (1991) Domains in microbial beta-1,4-glycanases:
sequence conservation, function, and enzyme families. Microbiol.
Rev. 55, 303–315.
46. Littlejohn, T.G. & Hynes, M.J. (1992) Analysis of the site of action
of the amdR product for regulation of the amdS gene of
Aspergillus nidulans. Mol. Gen. Genet. 235, 81–88.
47. Johnson, P.F. & McKnight, S.L. (1989) Eukaryotic
transcriptional regulatory proteins. Annu. Rev. Biochem. 58, 799–839.
48. Saloheimo, A., Aro, N., Ilmen, M. & Penttila, M. (2000) Isolation
of the ace1 gene encoding a Cys(2)-His(2) transcription factor
involved in regulation of activity of the cellulase promoter cbh1 of
Trichoderma reesei. J. Biol. Chem. 275, 5817–5825.
49. Aro, N., Ilmen, M., Saloheimo, A. & Penttila, M. (2003) ACEI of
Trichoderma reesei is a repressor of cellulase and xylanase
expression. Appl. Environ. Microbiol. 69, 56–65.
50. Strauss, J., Mach, R.L., Zeilinger, S., Stoffler, G. & Wolschek, M.
(1995) Cre1, the carbon catabolite repressor protein from
Trichoderma reesei. FEBS Lett. 376, 103–107.
51. Ilmen, M., Thrane, C. & Penttila, M. (1996) The glucose repressor
gene cre1 of Trichoderma: isolation and expression of a full-length
and a truncated mutant form. Mol. Gen. Genet. 251, 451–460.
52. Tempelaars, C.A., Birch, P.R., Sims, P.F. & Broda, P. (1994)
Isolation, characterization, and analysis of the expression of the
cbhII gene of Phanerochaete chrysosporium. Appl. Environ.
Microbiol. 60, 4387–4393.
53. Ilmen, M., Saloheimo, A., Onnela, M.L. & Penttila, M.E. (1997)
Regulation of cellulase gene expression in the filamentous fungus
Trichoderma reesei. Appl. Environ. Microbiol. 63, 1298–1306.
54. Moloney, A. & Coughlan, M.P. (1983) Sorption of Talaromyces
emersonii cellulase on cellulosic substrates. Biotechnol. Bioeng. 25,
271–280.
55. Kulmburg, P., Mathieu, M., Dowzer, C., Kelly, J. & Felenbok, B.
(1993) Specific binding sites in the alcR and alcA promoters of the
ethanol regulon for the CREA repressor mediating carbon
catabolite repression in Aspergillus nidulans. Mol. Microbiol. 7,
847–857.
56. Nehlin, J.O. & Ronne, H. (1990) Yeast MIG1 repressor is related
to the mammalian early growth response and Wilms’ tumour
finger proteins. EMBO J. 9, 2891–2898.
57. Hahn, M., Olsen, O., Politz, O., Borriss, R. & Heinemann, U.
(1995) Crystal structure and site-directed mutagenesis of Bacillus
macerans endo-1,3-1,4-beta-glucanase. J. Biol. Chem. 270,
3081–3088.
58. Sulzenbacher, G., Schulein, M. & Davies, G.J. (1997) Structure of
the endoglucanase I from Fusarium oxysporum: native, cellobiose,
and 3,4-epoxybutyl beta-D-cellobioside-inhibited forms, at 2.3 Å
resolution. Biochemistry 36, 5902–5911.
59. McCarter, J.D. & Withers, S.G. (1994) Mechanisms of enzymatic
glycoside hydrolysis. Curr. Opin. Struct. Biol. 4, 885–892.
60. Withers, S.G., Dombroski, D., Berven, L.A., Kilburn, D.G.,
Miller, R.C. Jr, Warren, R.A. & Gilkes, N.R. (1986) Direct 1H
n.m.r. determination of the stereochemical course of hydrolyses
catalysed by glucanase components of the cellulase complex.
Biochem. Biophys. Res. Commun. 139, 487–494.
61. Withers, S.G. & Aebersold, R. (1995) Approaches to labeling and
identification of active site residues in glycosidases. Protein Sci. 4,
361–372.
62. Henrissat, B. (1998) Glycosidase families. Biochem. Soc. Trans. 26,
153–156.
63. Nerinckx, W., Desmet, T. & Claeyssens, M. (2003) A hydrophobic
platform as a mechanistically relevant transition state stabilising
factor appears to be present in the active centre of all glycoside
hydrolases. FEBS Lett. 538, 1–7.
64. Moloney, A., Considine, P.J. & Coughlan, M.P. (1983) Cellulose
hydrolysis by the cellulases produced by Talaromyces emersonii
when grown on different inducing substrates. Biotechnol. Bioeng.
25, 1169–1173.
65. van Tilbeurgh, H., Tomme, P., Claeyssens, M., Bhikhabhai, R. &
Pettersson, G. (1986) Limited proteolysis of the cellobiohydrolase
I from Trichoderma reesei. Separation of functional domains.
FEBS Lett. 204, 223–227.
66. Tomme, P., Van Tilbeurgh, H., Pettersson, G., Van Damme, J.,
Vandekerckhove, J., Knowles, J., Teeri, T. & Claeyssens, M.
(1988) Studies of the cellulolytic system of Trichoderma reesei QM
9414. Analysis of domain function in two cellobiohydrolases by
limited proteolysis. Eur. J. Biochem. 170, 575–581.
67. Nutt, A., Sild, V., Pettersson, G. & Johansson, G. (1998) Progress
curves – a mean for functional classification of cellulases. Eur. J.
Biochem. 258, 200–206.
68. Takashima, S., Iikura, H., Nakamura, A., Hidaka, M., Masaki,
H. & Uozumi, T. (1998) Isolation of the gene and characterization
of the enzymatic properties of a major exoglucanase of Humicola
grisea without a cellulose-binding domain. J. Biochem. (Tokyo)
124, 717–725.
69. von Ossowski, I., Stahlberg, J., Koivula, A., Piens, K., Becker, D.,
Boer, H., Harle, R., Harris, M., Divne, C., Mahdi, S., Zhao, Y.,
Driguez, H., Claeyssens, M., Sinnott, M.L. & Teeri, T.T. (2003)
Engineering the exo-loop of Trichoderma reesei cellobiohydrolase,
Cel7A. A comparison with Phanerochaete chrysosporium Cel7D.
J. Mol. Biol. 333, 817–829.
