Biochemistry, 4th Edition P41 pps

12.2 What Is a DNA Library? 363
heterologous probes because they are not derived from the homologous (same)
organism.
Problems arise if a complete eukaryotic gene is the cloning target; eukaryotic
genes can be tens or even hundreds of kilobase pairs in size. Genes this size are frag-
mented in most cloning procedures. Thus, the DNA identified by the probe may
represent a clone that carries only part of the desired gene. However, most cloning
strategies are based on a partial digestion of the genomic DNA, a technique that
generates an overlapping set of genomic fragments. This being so, DNA segments
from the ends of the identified clone can now be used to probe the library for
clones carrying DNA sequences that flanked the original isolate in the genome. Re-
peating this process ultimately yields the complete gene among a subset of overlap-
ping clones.
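The walking procedure described above can be sketched in code. This is only an illustration under simplifying assumptions: each clone is modeled as a (start, end) coordinate interval on a made-up genome, and "probing the library with an end segment" is modeled as finding a clone that spans the current end of the assembled region. The function name `walk` and all coordinates are hypothetical.

```python
def walk(library, start_clone, gene):
    """Collect overlapping clones until the whole gene interval is covered.

    library: list of (start, end) intervals for cloned genomic fragments.
    start_clone: the clone first identified by the probe.
    gene: (start, end) interval of the target gene.
    """
    left, right = start_clone
    used = [start_clone]
    while left > gene[0] or right < gene[1]:
        if left > gene[0]:
            # Probe made from the left end: detects clones that span it
            # and extend further to the left.
            hits = [c for c in library if c[0] < left <= c[1]]
            pick = min(hits, key=lambda c: c[0], default=None)
        else:
            # Probe made from the right end.
            hits = [c for c in library if c[0] <= right < c[1]]
            pick = max(hits, key=lambda c: c[1], default=None)
        if pick is None:
            raise ValueError("no overlapping clone; the library has a gap")
        used.append(pick)
        left, right = min(left, pick[0]), max(right, pick[1])
    return sorted(used)


# Hypothetical library produced by partial digestion (overlapping fragments).
library = [(0, 40), (30, 75), (60, 110), (100, 150), (140, 190)]
print(walk(library, start_clone=(60, 110), gene=(35, 145)))
# [(30, 75), (60, 110), (100, 150)]
```

The sketch only works because the fragments overlap; with a complete (non-overlapping) digest there would be no clone spanning either end, and the walk would stop at the first step.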
cDNA Libraries Are DNA Libraries Prepared from mRNA
cDNAs are DNA molecules copied from mRNA templates. cDNA libraries are con-
structed by synthesizing cDNA from purified cellular mRNA. These libraries pre-
sent an alternative strategy for gene isolation, especially eukaryotic genes. Because
most eukaryotic mRNAs carry 3′-poly(A) tails, mRNA can be selectively isolated
from preparations of total cellular RNA by oligo(dT)-cellulose chromatography
(Figure 12.9). DNA copies of the purified mRNAs are synthesized by first anneal-
ing short oligo(dT) chains to the poly(A) tails. These oligo(dT) chains serve as
primers for reverse transcriptase–driven synthesis of DNA (Figure 12.10). [Ran-
dom oligonucleotides can also be used as primers, with the advantages being less
dependency on poly(A) tracts and increased likelihood of creating clones
representing the 5′-ends of mRNAs.] Reverse transcriptase is an enzyme that
synthesizes a DNA strand, copying RNA as the template. DNA polymerase is then used
to copy the DNA strand and form a double-stranded (duplex DNA) molecule.
Linkers are then added to the DNA duplexes rendered from the mRNA
Known amino acid sequence:   Phe   Met   Glu   Trp   His   Lys   Asn
Possible mRNA sequence:      UUU   AUG   GAA   UGG   CAU   AAA   AAU
                             UUC         GAG         CAC   AAG   AAC
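The number of mRNA sequences that could encode this peptide follows directly from the codon choices at each position. The short sketch below enumerates them; it assumes the standard genetic code (in which Lys is AAA or AAG), and the names `CODONS` and `possible_mrnas` are made up for the example.

```python
from itertools import product

# Codons for each residue of the peptide (standard genetic code).
CODONS = {
    "Phe": ["UUU", "UUC"],
    "Met": ["AUG"],
    "Glu": ["GAA", "GAG"],
    "Trp": ["UGG"],
    "His": ["CAU", "CAC"],
    "Lys": ["AAA", "AAG"],
    "Asn": ["AAU", "AAC"],
}


def possible_mrnas(peptide):
    """Return every mRNA sequence that could encode the peptide."""
    choices = [CODONS[residue] for residue in peptide]
    return ["".join(combo) for combo in product(*choices)]


peptide = ["Phe", "Met", "Glu", "Trp", "His", "Lys", "Asn"]
mrnas = possible_mrnas(peptide)
print(len(mrnas))  # 2 * 1 * 2 * 1 * 2 * 2 * 2 = 32
print(mrnas[0])    # UUUAUGGAAUGGCAUAAAAAU
```

Met and Trp each have a single codon, which is why peptide stretches rich in these residues keep the degeneracy of a probe mixture low.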
(a) Add solution of total RNA in 0.5 M NaCl to a chromatography column packed with a
cellulose matrix carrying covalently attached oligo(dT) chains. Eukaryotic mRNA with
poly(A) tails hybridizes to the oligo(dT) chains on the cellulose; rRNA and tRNA pass
right through the column.
(b) Wash with 0.5 M NaCl to remove residual rRNA and tRNA.
(c) Elute mRNA from the column with H2O; collect and evaluate the mRNA solution.
ANIMATED FIGURE 12.9 Isolation of eukaryotic mRNA via oligo(dT)-cellulose chromatography.
(a) In the presence of 0.5 M NaCl, the poly(A) tails of eukaryotic mRNA anneal with short oligo(dT) chains
covalently attached to an insoluble chromatographic matrix such as cellulose. Other RNAs, such as rRNA (green), pass
right through the chromatography column. (b) The column is washed with more 0.5 M NaCl to remove residual
contaminants. (c) Then the poly(A) mRNA (red) is recovered by washing the column with water because the
base pairs formed between the poly(A) tails of the mRNA and the oligo(dT) chains are unstable in solutions of
low ionic strength. See this figure animated at www.cengage.com/login.

364 Chapter 12 Recombinant DNA: Cloning and Creation of Chimeric Genes
CRITICAL DEVELOPMENTS IN BIOCHEMISTRY
Identifying Specific DNA Sequences by Southern Blotting (Southern Hybridization)
Any given DNA fragment is unique solely by virtue of its specific
nucleotide sequence. The only practical way to find one particu-
lar DNA segment among a vast population of different DNA frag-
ments (such as you might find in genomic DNA preparations) is
to exploit its sequence specificity to identify it. In 1975, E. M.
Southern invented a technique capable of doing just that.
Electrophoresis
Southern first fractionated a population of DNA fragments
according to size by gel electrophoresis (see step 2 in figure). The
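electrophoretic mobility of a nucleic acid decreases as its molecular mass
increases; within the resolving range of a gel, migration distance falls off
roughly linearly with the logarithm of fragment length. Polyacrylamide gels are
suitable for separation of nucleic acids of 25 to 2000 bp. Agarose gels are better if the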
DNA fragments range up to 10 times this size. Most preparations
of genomic DNA show a broad spectrum of sizes, from less than 1
kbp to more than 20 kbp. Typically, no discrete-size fragments are
evident following electrophoresis, just a “smear” of DNA through-
out the gel.
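In practice, fragment sizes are usually estimated by running size markers alongside the sample. The sketch below assumes that, within the resolving range of the gel, migration distance is linear in the logarithm of fragment length; the marker distances and sizes are invented for illustration, and `fit_log_linear` and `estimate_size` are hypothetical helper names.

```python
import math


def fit_log_linear(markers):
    """Least-squares fit of log10(size_bp) = a + b * distance_mm."""
    xs = [distance for distance, _ in markers]
    ys = [math.log10(size) for _, size in markers]
    n = len(markers)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    return a, b


def estimate_size(distance_mm, a, b):
    """Interpolate a fragment length (bp) from its migration distance."""
    return 10 ** (a + b * distance_mm)


# Hypothetical markers: (migration distance in mm, fragment length in bp).
markers = [(10, 10000), (20, 3162), (30, 1000), (40, 316)]
a, b = fit_log_linear(markers)
print(b < 0)                            # larger fragments migrate less far
print(round(estimate_size(25, a, b)))   # size of a band found at 25 mm
```

The estimate is only meaningful between the smallest and largest markers; outside that range the log-linear assumption breaks down.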
Blotting
Once the fragments have been separated by electrophoresis
(step 3), the gel is soaked in a solution of NaOH. Alkali dena-
tures duplex DNA, converting it to single-stranded DNA. After
the pH of the gel is adjusted to neutrality with buffer, a sheet of
absorbent material soaked in a concentrated salt solution is then
placed over the gel, and salt solution is drawn through the gel in
a direction perpendicular to the direction of electrophoresis
(step 4). The salt solution is pulled through the gel in one of
three ways: capillary action (blotting), suction (vacuum blotting), or
electrophoresis (electroblotting). The movement of salt solution
through the gel carries the DNA to the absorbent sheet, which
binds the single-stranded DNA molecules very tightly, effectively
immobilizing them in place on the sheet. Note that the distribu-
tion pattern of the electrophoretically separated DNA is main-
tained when the single-stranded DNA molecules bind to the ab-
sorbent sheet (step 5 in figure). The sheet is then dried. Next, in
the prehybridization step, the sheet is incubated with a solution
containing protein (serum albumin, for example) and/or a
detergent such as sodium dodecylsulfate. The protein and
detergent molecules saturate any remaining binding sites
for DNA on the absorbent sheet, so no more DNA can bind
nonspecifically.
Hybridization
To detect a particular DNA within the electrophoretic smear of
countless DNA fragments, the prehybridized sheet is incubated in
a sealed plastic bag with a solution of specific probe molecules
(step 6 in figure). A probe is usually a single-stranded DNA of defined
sequence that is distinctively labeled, either with a radioactive
isotope (such as ³²P) or some other easily detectable tag. The
nucleotide sequence of the probe is designed to be complemen-
tary to the sought-for or target DNA fragment. The single-stranded
probe DNA anneals with the single-stranded target DNA bound
to the sheet through specific base pairing to form a DNA duplex.
This annealing, or hybridization as it is usually called, labels the
target DNA, revealing its position on the sheet. For example, if
the probe is ³²P-labeled, its location can be detected by
autoradiographic exposure of a piece of X-ray film laid over the sheet
(step 7 in figure).
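The base-pairing logic of hybridization can be illustrated with a few lines of code. In this sketch, a probe "anneals" to a membrane-bound strand only if that strand contains the probe's reverse complement. Real hybridization tolerates some mismatches depending on stringency, which is ignored here; all sequences and names are made up.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the strand that base-pairs with seq, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def hybridizing_fragments(fragments, probe):
    """Names of single-stranded fragments that contain the probe's target."""
    target = reverse_complement(probe)
    return [name for name, seq in fragments.items() if target in seq]


# Hypothetical denatured fragments bound to the sheet (written 5'->3').
fragments = {
    "band_1": "GGATCCATTGACCGTA",
    "band_2": "TTTACGGTCAATGGCC",
    "band_3": "CCCCGGGGAAAATTTT",
}
probe = "TACGGTCAAT"  # hypothetical labeled probe, 5'->3'
print(hybridizing_fragments(fragments, probe))  # ['band_1']
```

Note that band_2 carries the probe sequence itself rather than its complement, so the probe does not pair with it.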
Southern’s procedure has been extended to the identification of
specific RNA and protein molecules. In a play on Southern’s name,
the identification of particular RNAs following separation by gel
electrophoresis, blotting, and probe hybridization is called North-
ern blotting. The analogous technique for identifying protein mol-
ecules is termed Western blotting. In Western blotting, the probe of
choice is usually an antibody specific for the target protein.

The Southern blotting technique involves the transfer of electrophoretically
separated DNA fragments to an absorbent sheet and subsequent detection of
specific DNA sequences. A preparation of DNA fragments [typically a restriction
digest, (1)] is separated according to size by gel electrophoresis (2). The separa-
tion pattern can be visualized by soaking the gel in ethidium bromide to stain
the DNA and then illuminating the gel with UV light (3). Ethidium bromide
molecules intercalated between the hydrophobic bases of DNA are fluorescent
under UV light. The gel is soaked in strong alkali to denature the DNA and then
neutralized in buffer. Next, the gel is placed on a sheet of DNA-binding material
and concentrated salt solution is passed through the gel (4) to carry the DNA
fragments out of the gel where they are bound tightly to the sheet (5).
Incubation of the sheet with a solution of labeled, single-stranded probe DNA
(6) allows the probe to hybridize with target DNA sequences complementary to
it. The location of these target sequences is then revealed by an appropriate
means of detection, such as autoradiography (7).
(1) DNA: digest DNA with restriction endonucleases to give DNA restriction fragments.
(2) Perform agarose gel electrophoresis on the DNA fragments from different digests
(labels: buffer solution, agarose gel, longer DNA fragments, shorter DNA fragments).
(3) DNA fragments fractionated by size (visible under UV light if gel is soaked in
ethidium bromide).
(4) Soak gel in NaOH, neutralize. Transfer (blot) gel to absorbent sheet using Southern
blot technique (labels: weight, absorbent paper, sheet of DNA-absorbing material, gel,
wick, buffer).
(5) DNA fragments are bound to the sheet in positions identical to those on the gel.
(6) Hybridize sheet with radioactively labeled probe (radioactive probe solution).
(7) Expose sheet to X-ray film; resulting autoradiograph shows hybridized DNA fragments.
templates, and the cDNA is cloned into a suitable vector. Once a cDNA derived
from a particular gene has been identified, the cDNA becomes an effective probe
for screening genomic libraries for isolation of the gene itself.
Because different cell types in eukaryotic organisms express selected subsets of
genes, RNA preparations from cells or tissues in which genes of interest are selec-
tively transcribed are enriched for the desired mRNAs. cDNA libraries prepared
from such mRNA are representative of the pattern and extent of gene expression
that uniquely define particular kinds of differentiated cells. cDNA libraries of many
normal and diseased human cell types are commercially available, including cDNA
libraries of many tumor cells. Comparison of normal and abnormal cDNA libraries,
in conjunction with two-dimensional gel electrophoretic analysis (see Appendix to
Chapter 5) of the proteins produced in normal and abnormal cells, is a promising
new strategy in clinical medicine to understand disease mechanisms.
Expressed Sequence Tags. When a cDNA library is prepared from the mRNAs
synthesized in a particular cell type under certain conditions, these cDNAs represent
the nucleotide sequences (genes) that have been expressed in this cell type under
these conditions. Expressed sequence tags (ESTs) are relatively short (~200
nucleotides or so) sequences obtained by determining a portion of the nucleotide
sequence for each insert in randomly selected cDNAs. An EST represents part of a
gene that is being expressed. Probes derived from ESTs can be labeled, radioactively
or otherwise, and used in hybridization experiments to identify which genes in a ge-
nomic library are being expressed in the cell. For example, labeled ESTs can be hy-
bridized to a gene chip (see following discussion).
(a) First-strand cDNA synthesis: anneal oligo(dT)12–18 primers to the mRNA, then add
reverse transcriptase and substrates dATP, dTTP, dGTP, dCTP. The product is an
mRNA:cDNA heteroduplex.
(b) Add RNase H, DNA polymerase, and dATP, dTTP, dGTP, dCTP; mRNA degraded by RNase H.
(c) DNA polymerase copies first-strand cDNA using RNA segments as primer.
(d) DNA fragments joined by DNA ligase, giving the cDNA duplex.
(e) EcoRI linkers, T4 DNA ligase: EcoRI-ended cDNA duplexes for cloning.
ACTIVE FIGURE 12.10 Reverse
transcriptase–driven synthesis of cDNA from oligo(dT)
primers annealed to the poly(A) tails of purified eukary-
otic mRNA. (a) Oligo(dT) chains serve as primers for
synthesis of a DNA copy of the mRNA by reverse tran-
scriptase. Following completion of first-strand cDNA
synthesis by reverse transcriptase, RNase H and DNA
polymerase are added (b). RNase H specifically digests
RNA strands in DNA:RNA hybrid duplexes. DNA polymerase
copies the first-strand cDNA, using as primers the
residual RNA segments after RNase H has created nicks
and gaps (c). DNA polymerase has a 5′→3′ exonuclease
activity that removes the residual RNA as it fills in with
DNA. The nicks remaining in the second-strand DNA are
sealed by DNA ligase (d), yielding duplex cDNA. EcoRI
adapters with 5′-overhangs are then ligated onto the
cDNA duplexes (e) using phage T4 DNA ligase to create
EcoRI-ended cDNA for insertion into a cloning vector.
Test yourself on the concepts in this figure at
www.cengage.com/login.
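The strand relationships in cDNA synthesis can be made concrete with a small sequence-level sketch. It treats reverse transcription and second-strand synthesis purely as complement-and-reverse operations on strings and ignores priming, RNase H, and ligation; the mRNA sequence and function names are hypothetical.

```python
RNA_TO_DNA = str.maketrans("ACGU", "TGCA")  # RNA base -> pairing DNA base
DNA_TO_DNA = str.maketrans("ACGT", "TGCA")  # DNA base -> pairing DNA base


def first_strand(mrna):
    """First-strand cDNA: DNA complementary to the mRNA, written 5'->3'."""
    return mrna.translate(RNA_TO_DNA)[::-1]


def second_strand(cdna):
    """Second strand: DNA complementary to the first strand, written 5'->3'."""
    return cdna.translate(DNA_TO_DNA)[::-1]


mrna = "AUGGCUUACGGAAAAAAAA"  # hypothetical mRNA with a short poly(A) tail
cdna1 = first_strand(mrna)
cdna2 = second_strand(cdna1)
print(cdna1)  # TTTTTTTTCCGTAAGCCAT (begins with the oligo(dT) primer region)
print(cdna2)  # ATGGCTTACGGAAAAAAAA (the mRNA sequence with T in place of U)
```

The second strand reproduces the mRNA sequence in DNA form, which is why a cDNA clone carries the coding information of the original transcript.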
DNA Microarrays (Gene Chips) Are Arrays of Different Oligonucleotides
Immobilized on a Chip
Robotic methods can be used to synthesize combinatorial libraries of DNA
oligonucleotides directly on a solid support, such that the completed library is a
two-dimensional array of different oligonucleotides (see the Critical Developments
in Biochemistry box on combinatorial libraries, page 361). Synthesis is performed
by phosphoramidite chemistry (Figure 11.29) adapted into a photochemical
HUMAN BIOCHEMISTRY
The Human Genome Project
Completed in 2003, the Human Genome Project was a 13-year col-
laborative international, government- and private-sponsored effort
to map and sequence the entire human genome, some 3 billion
base pairs distributed among the two sex chromosomes (X and Y)
and 22 autosomes (chromosomes that are not sex chromosomes).
A primary goal was to identify and map at least 3000 genetic mark-
ers (genes or other recognizable loci on the DNA), which were
evenly distributed throughout the chromosomes at roughly 100-kb
intervals. At the same time, determination of the entire nucleotide
sequence of the human genome was undertaken. J. Craig Venter
and colleagues working at Celera, a private corporation, took an
alternative approach based on computer alignment of sequenced
human DNA fragments. A working draft of the human genome
was completed in June 2000 and published in February 2001. An
ancillary part of the project has focused on sequencing the
genomes of other species (such as yeast, Drosophila melanogaster
[the fruit fly], mice, and Arabidopsis thaliana [a plant]) to reveal
comparative aspects of genetic and sequence organization (Table
12.1). Information about whole genome sequences of organisms
has created a new branch of science called bioinformatics: the
study of the nature and organization of biological information.
Bioinformatics includes such approaches as functional genomics
and proteomics. Functional genomics addresses global issues of gene
expression, such as looking at all the genes that are activated dur-
ing major metabolic shifts (as from
growth under aerobic to
growth under anaerobic conditions) or during embryogenesis and
development of organisms. Transcriptome is the word used in
functional genomics to define the entire set of genes expressed (as
mRNAs transcribed from DNA) in a particular cell or tissue under
defined conditions. Functional genomics also provides new in-
sights into evolutionary relationships between organisms. Pro-
teomics is the study of all the proteins expressed by a certain cell or
tissue under specified conditions. Typically, this set of proteins is
revealed by running two-dimensional polyacrylamide gel elec-
trophoresis on a cellular extract or by coupling protein separation
techniques to mass spectrometric analysis.
The Human Genome Project has proven to be very beneficial
to medicine. Many human diseases have been traced to genetic de-
fects whose position within the human genome has been identi-
fied. As of 2007, the Human Gene Mutation Database (HGMD)
listed more than 56,000 mutations in more than 2100 nuclear
genes associated with human disease. Among these are
cystic fibrosis gene
the breast cancer genes, BRCA1 and BRCA2
Duchenne muscular dystrophy gene* (at 2.4 megabases, one of the
largest known genes in any organism)
Huntington’s disease gene
neurofibromatosis gene
neuroblastoma gene (a cancer of immature nerve cells)
amyotrophic lateral sclerosis gene (Lou Gehrig’s disease)
melanocortin-4 receptor gene (obesity and binge eating)
fragile X-linked mental retardation gene*
as well as genes associated with the development of diabetes, a
variety of other cancers, and affective disorders such as schizophre-
nia and bipolar affective disorder (manic depression).
*X-chromosome–linked gene. As of 2007, more than 295 disease-related
genes have been mapped to the X chromosome (source: the GeneCards
website at the Weizmann Institute of Science, Israel.)
TABLE 12.1
Completed Genome Nucleotide Sequences¹

Genome                                                   Genome Size²   Year Completed
Bacteriophage φX174                                      0.0054         1977
Bacteriophage λ                                          0.048          1982
Marchantia³ chloroplast genome                           0.187          1986
Vaccinia virus                                           0.192          1990
Cytomegalovirus (CMV)                                    0.229          1991
Marchantia³ mitochondrial genome                         0.187          1992
Variola (smallpox) virus                                 0.186          1993
Haemophilus influenzae⁴ (Gram-negative bacterium)        1.830          1995
Mycoplasma genitalium (mycoplasma)                       0.58           1995
Escherichia coli (Gram-negative bacterium)               4.64           1996
Saccharomyces cerevisiae (yeast)                         12.1           1996
Methanococcus jannaschii (archaeon)                      1.66           1998
Arabidopsis thaliana (green plant)                       115            2000
Caenorhabditis elegans (simple animal: nematode worm)    88             1998
Drosophila melanogaster (fruit fly)                      117            2000
Homo sapiens (human)                                     3038           2001
Pan troglodytes (chimpanzee)                             3109           2005

¹Data available from the National Center for Biotechnology Information at the National
Library of Medicine.
²Genome size is given as millions of base pairs (Mb).
³Marchantia is a bryophyte (a nonvascular green plant).
⁴The first complete sequence for the genome of a free-living organism.

process that can be controlled by light. Computer-controlled masking of the light
allows chemistry to take place at some spots in the two-dimensional array of grow-
ing oligonucleotides and not at others, so each spot on the array is a population of
identical oligonucleotides of unique sequence. The final products of such proce-
dures are referred to as “gene chips” because the oligonucleotide sequences syn-
thesized upon the chip represent the sequences of chosen genes. Typically, the
oligonucleotides are up to 25 nucleotides long (there are more than 10¹⁵ possible
sequence arrangements for 25-mers made from four bases), and as many as
100,000 different oligonucleotides can be arrayed on a chip 1 cm square. The
oligonucleotides on such gene chips are used as the probes in a hybridization ex-
periment to reveal gene expression patterns. Figure 12.11 shows one design for
gene chip analysis of gene expression.
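The figure of more than 10¹⁵ possible 25-mers is simply 4²⁵, since each of the 25 positions can hold any of four bases. A quick check, with the 100,000-spot chip from the text used for scale:

```python
n_sequences = 4 ** 25              # four choices at each of 25 positions
print(n_sequences)                 # 1125899906842624
print(n_sequences > 10 ** 15)      # True

spots_per_chip = 100_000           # figure quoted in the text
# Fraction of all possible 25-mers that one chip can display.
print(spots_per_chip / n_sequences)
```

Even a densely packed chip therefore samples a vanishingly small fraction of sequence space, which is why the oligonucleotides are chosen to represent particular genes rather than synthesized exhaustively.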
Figure labels: (1) ESTs or other DNA clones, PCR amplification, purification; OR
(2) robotic synthesis of oligonucleotide arrays; gene chip. Test and reference samples:
reverse transcription, label with fluor dyes, hybridize target to microarray; excitation
(laser 1, laser 2), emission, computer analysis. Panels (a) and (b).
FIGURE 12.11 Gene chips (DNA microarrays) in the analysis of gene expression. Here
is one of many analytical possibilities based on DNA microarray technology: (1) Gene
segments (for example, ESTs) are isolated and amplified by PCR (see Figure 12.18),
and the PCR products are robotically printed onto coated glass microscope slides to
create a gene chip. The gene chip usually is considered the “probe” in a “target:probe”
screening experiment. (2) Target preparation: Total RNA from two sets of cell
treatments (control and test treatment) is isolated, and cDNA is produced from the two
batches of RNA via reverse transcriptase. During cDNA production, the control is
labeled with a specific fluorescent marker (green, for example) and the test treatment
is labeled with a different fluorescent marker (red, for example), so the wavelength of
fluorescence allows discrimination between the two different sets of cDNAs. The two
batches of labeled cDNA are pooled and hybridized to the gene chip. Laser excitation
of the hybridized gene chip with light of appropriate wavelength allows collection of
data indicating the intensities of fluorescence, and hence the degree of hybridization
of the two different labeled cDNA targets with the gene chip. Because the location of
genes on the gene chip is known, which genes are expressed (or not) and the degree to
which they are expressed is revealed by the fluorescent patterns. (Adapted from Figure 1 in
Duggan, D. J., et al., 1999. Expression profiling using cDNA microarrays. Nature Genetics 21
supplement:10–14.)
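Two-color data of this kind are commonly summarized as a log ratio of the two fluorescence intensities at each spot. The sketch below shows that calculation under simplifying assumptions: the intensities are already background-corrected, red is the test sample and green the control as in the caption, and no normalization between channels is applied. The gene names, intensity values, and the `floor` parameter are invented for illustration.

```python
import math


def log2_ratios(spots, floor=1.0):
    """Return log2(test / control) for each spot.

    spots: {gene: (red_test_intensity, green_control_intensity)}.
    floor: smallest intensity allowed, to avoid dividing by zero.
    """
    return {
        gene: math.log2(max(red, floor) / max(green, floor))
        for gene, (red, green) in spots.items()
    }


# Hypothetical spot intensities (arbitrary fluorescence units).
spots = {
    "gene_A": (8000, 1000),   # brighter in the test sample
    "gene_B": (500, 2000),    # brighter in the control
    "gene_C": (1500, 1500),   # unchanged
}
for gene, ratio in log2_ratios(spots).items():
    print(gene, ratio)
# gene_A 3.0
# gene_B -2.0
# gene_C 0.0
```

On this scale, positive values indicate higher expression in the test treatment, negative values lower expression, and zero no change.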
12.3 Can the Cloned Genes in Libraries Be Expressed?

Expression Vectors Are Engineered So That the RNA or Protein Products
of Cloned Genes Can Be Expressed
Expression vectors are engineered so that any cloned insert can be transcribed into
RNA, and, in many instances, even translated into protein. cDNA expression li-
braries can be constructed in specially designed vectors. Proteins encoded by the
various cDNA clones within such expression libraries can be synthesized in the host
cells, and if suitable assays are available to identify a particular protein, its corre-
sponding cDNA clone can be identified and isolated. Expression vectors designed
for RNA expression or protein expression, or both, are available.
RNA Expression A vector for in vitro expression of DNA inserts as RNA transcripts
can be constructed by putting a highly efficient promoter adjacent to a versatile
cloning site. Figure 12.12 depicts such an expression vector. Linearized recombi-
nant vector DNA is transcribed in vitro using SP6 RNA polymerase. Large amounts
of RNA product can be obtained in this manner; if radioactive or fluorescent-
labeled ribonucleotides are used as substrates, labeled RNA molecules useful as
probes are made.
Protein Expression Because cDNAs are DNA copies of mRNAs, cDNAs are unin-
terrupted copies of the exons of expressed genes. Because cDNAs lack introns, it
is feasible to express these cDNA versions of eukaryotic genes in prokaryotic hosts
that cannot process the complex primary transcripts of eukaryotic genes. To ex-
press a eukaryotic protein in E. coli, the eukaryotic cDNA must be cloned in an ex-
pression vector that contains regulatory signals for both transcription and transla-
tion. Accordingly, a promoter where RNA polymerase initiates transcription as well
as a ribosome-binding site to facilitate translation are engineered into the vector just
upstream from the restriction site for inserting foreign DNA. The AUG initiation
codon that specifies the first amino acid in the protein (the translation start site) is
contributed by the insert (Figure 12.13).
Strong promoters have been constructed that drive the synthesis of foreign
proteins to levels equal to 30% or more of total E. coli cellular protein. An example is the
hybrid promoter, p_tac, which was created by fusing part of the promoter for the E. coli
genes encoding the enzymes of lactose metabolism (the lac promoter) with part of
the promoter for the genes encoding the enzymes of tryptophan biosynthesis (the
trp promoter) (Figure 12.14). In cells carrying p_tac expression vectors, the p_tac
promoter is not induced to drive transcription of the foreign insert until the cells are
exposed to inducers that lead to its activation. Analogs of lactose (a β-galactoside) such
as isopropyl-β-thiogalactoside, or IPTG, are excellent inducers of p_tac. Thus, expression
of the foreign protein is easily controlled. (See Chapter 29 for detailed discussions
of inducible gene expression.)
Perhaps the most widely used protein expression system is based on the pET plas-
mid. Transcription of the cloned gene insert is under the control of the bacterio-
phage T7 RNA polymerase promoter in pET. This promoter is not recognized by
the E. coli RNA polymerase, so transcription can only occur if the T7 RNA po-
lymerase is present in host cells. Host E. coli cells are engineered so that the T7 RNA
polymerase gene is inserted in the host chromosome under the control of the lac
promoter. IPTG induction triggers T7 RNA polymerase production and subsequent
transcription and translation of the pET insert. The bacteriophage T7 RNA po-
lymerase is so active that most of the host cell’s resources are directed into protein
expression and levels of expressed protein approach 50% of total cellular protein.
The bacterial production of valuable eukaryotic proteins represents one of the
most important uses of recombinant DNA technology. For example, human insulin
for the clinical treatment of diabetes is now produced in bacteria.
(1) Insert foreign DNA at the polylinker cloning site, next to the SP6 promoter.
(2) Linearize.
(3) RNA transcription by SP6 RNA polymerase yields the runoff SP6 RNA transcript.
ANIMATED FIGURE 12.12 Expression
vectors carrying the promoter recognized by the RNA
polymerase of bacteriophage SP6 are useful for the pro-
duction of multiple RNA copies of any DNA inserted at
the polylinker. Before transcription is initiated, the circu-
lar expression vector is linearized by a single cleavage at
or near the end of the insert so that transcription termi-
nates at a fixed point. See this figure animated at
www.cengage.com/login.
Analogous systems for expression of foreign genes in eukaryotic cells include
vectors carrying promoter elements derived from mammalian viruses, such as simian
virus 40 (SV40), the Epstein–Barr virus, and the human cytomegalovirus (CMV). A
system for high-level expression of foreign genes uses insect cells infected with the
baculovirus expression vector. Baculoviruses infect lepidopteran insects (butterflies and
moths). In engineered baculovirus vectors, the foreign gene is cloned downstream
of the promoter for polyhedrin, a major viral-encoded structural protein, and the
recombinant vector is incorporated into insect cells grown in culture. Expression
from the polyhedrin promoter can lead to accumulation of the foreign gene prod-
uct to levels as high as 500 mg/L.
Technologies for the expression of recombinant proteins in mammalian cell cul-
tures are commercially available. These technologies have the advantage that the
unique post-translational modifications of proteins (such as glycosylation; see Chap-
ter 31) seen in mammalian cells take place in vivo so that the expressed protein is
produced in its naturally occurring form.
Screening cDNA Expression Libraries with Antibodies Antibodies that specifically
cross-react with a particular protein of interest are often available. If so, these anti-
bodies can be used to screen a cDNA expression library to identify and isolate cDNA
clones encoding the protein. The cDNA library is introduced into host bacteria,
which are plated out and grown overnight, as in the colony hybridization scheme pre-
viously described. DNA-binding nylon membranes are placed on the plates to obtain
a replica of the bacterial colonies. The nylon membrane is then incubated under con-
ditions that induce protein synthesis from the cloned cDNA inserts, and the cells are
treated to release the synthesized protein. The synthesized protein binds tightly to the
nylon membrane, which can then be incubated with the specific antibody. Binding of
the antibody to its target protein product reveals the position of any cDNA clones ex-
pressing the protein, and these clones can be recovered from the original plate. Like
other libraries, expression libraries can be screened with oligonucleotide probes, too.
Fusion Protein Expression Some expression vectors carry cDNA inserts cloned
directly into the coding sequence of a vector-borne protein-coding gene (Figure
12.15). Translation of the recombinant sequence leads to synthesis of a hybrid
protein or fusion protein. The N-terminal region of the fused protein represents amino
acid sequences encoded in the vector, whereas the remainder of the protein is
encoded by the foreign insert. Keep in mind that the triplet codon sequence within
the cloned insert must be in phase with codons contributed by the vector se-
quences to make the right protein. The N-terminal protein sequence contributed
by the vector can be chosen to suit purposes. Furthermore, adding an N-terminal
Vector map labels: p_tac, ori, amp^r, polylinker cloning site; restriction sites HindIII,
EcoRI (three sites), PstI, BglI; pUR278, 5.2 kbp.
ANIMATED FIGURE 12.14 A p_tac protein expression vector contains the hybrid
promoter p_tac derived from fusion of the lac and trp promoters.
Isopropyl-β-D-thiogalactoside, or IPTG, induces expression from p_tac.
See this figure animated at www.cengage.com/login.
signal sequence that targets the hybrid protein for secretion from the cell simpli-
fies recovery of the fusion protein. A variety of gene fusion systems have been de-
veloped to facilitate isolation of a specific protein encoded by a cloned insert. The
isolation procedures are based on affinity chromatography purification of the fu-
sion protein through exploitation of the unique ligand-binding properties of the
vector-encoded protein (Table 12.2).
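The requirement that insert codons be in phase with the vector's codons can be checked by translating across a junction in different reading frames. The sketch below uses the pUR278 cloning-site sequence printed with Figure 12.15 and the standard genetic code; the helper name `translate` and the compact codon-table construction are choices made for this example.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Cloning-site sequence as printed with Figure 12.15, grouped by codon.
SITE = "TGT CAA AAA GGG GAT CCG TCG ACT CTA GAA AGC TTA TCG ATG".replace(" ", "")


def translate(dna, frame=0):
    """Translate dna starting at offset `frame` ('*' marks a stop codon)."""
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(frame, len(dna) - 2, 3)
    )


print(translate(SITE, frame=0))  # CQKGDPSTLESLSM (matches the figure)
print(translate(SITE, frame=1))  # VKKGIRRL*KAYR (shifted by one base)
```

Reading the same DNA one nucleotide out of phase changes every amino acid and here runs into a stop codon, which is why an insert joined in the wrong frame yields a truncated or meaningless fusion product.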
Reporter Gene Constructs Are Chimeric DNA Molecules Composed
of Gene Regulatory Sequences Positioned Next to an Easily
Expressible Gene Product
Potential regulatory regions of genes (such as promoters) can be investigated by
placing these regulatory sequences into plasmids upstream of a gene, called a
reporter gene, whose expression is easy to measure. Such chimeric plasmids are
Vector map labels: pUR278 (5.2 kbp); p_tac, lacZ, amp^r, ori; cloning site; restriction
sites EcoRI, ClaI, HindIII, XbaI, SalI, BamHI, EcoRI, PstI.
Codon: Cys Gln Lys Gly Asp Pro Ser Thr Leu Glu Ser Leu Ser Met
Cloning site: TGT CAA AAA GGG GAT CCG TCG ACT CTA GAA AGC TTA TCG ATG
BamHI SalI XbaI HindIII ClaI
ANIMATED FIGURE 12.15 A typical expression vector for the synthesis of a hybrid protein.
The cloning site is located at the end of the coding region for the protein β-galactosidase. Insertion of foreign
DNAs at this site fuses the foreign sequence to the β-galactosidase coding region (the lacZ gene). IPTG
induces the transcription of the lacZ gene from its promoter p_lac, causing expression of the fusion protein.
(Adapted from Figure 2, Rüther, U., and Müller-Hill, B., 1983. EMBO Journal 2:1791–1794.) See this figure animated at
www.cengage.com/login.
TABLE 12.2
Gene Fusion Systems for Isolation of Cloned Fusion Proteins

Fusion Protein                              Secreted?*   Affinity Ligand
β-Galactosidase                             No           p-Aminophenyl-β-D-thiogalactoside (APTG)
Protein A                                   Yes          Immunoglobulin G (IgG)
Chloramphenicol acetyltransferase (CAT)     Yes          Chloramphenicol
Streptavidin                                Yes          Biotin
Glutathione-S-transferase (GST)             No           Glutathione
Maltose-binding protein (MBP)               Yes          Starch
Hexahistidine tag                           No           Nickel or cobalt
Hemagglutinin (HA) peptide                  No           HA-peptide antibody

*This indicates whether combined secretion–fusion gene systems have led to secretion of the protein product from the
cells, which simplifies its isolation and purification.
then introduced into cells of choice (including eukaryotic cells) to assess the po-
tential function of the nucleotide sequence in regulation because expression of
the reporter gene serves as a report on the effectiveness of the regulatory ele-
ment. A number of different genes have been used as reporter genes. A reporter
gene with many inherent advantages is that encoding the green fluorescent pro-
tein (or GFP), described in Chapter 4. Unlike the protein expressed by other re-
porter gene systems, GFP does not require any substrate to measure its activity,
nor is it dependent on any cofactor or prosthetic group. Detection of GFP
requires only irradiation with near-UV or blue light (400-nm light is optimal), and
the green fluorescence (light of 500 nm) that results is easily observed with the
naked eye, although it can also be measured precisely with a fluorometer. Figure
12.16 demonstrates the use of GFP as a reporter gene. EGFP is an engineered ver-
sion of GFP that shows enhanced fluorescent properties.
Specific Protein–Protein Interactions Can Be Identified Using
the Yeast Two-Hybrid System
Specific interactions between proteins (so-called protein–protein interactions) lie at
the heart of many essential biological processes. One method to identify specific
protein–protein interactions in vivo is through expression of a reporter gene whose
transcription is dependent on a functional transcriptional activator, the GAL4 pro-
tein. The GAL4 protein consists of two domains: a DNA-binding (or DB) domain and
a transcriptional activation (or TA) domain. Even if expressed as separate proteins,
these two domains will still work, provided they can be brought together. The method
depends on two separate plasmids encoding two hybrid proteins, one consisting of
the GAL4 DB domain fused to protein X and the other consisting of the GAL4 TA do-
main fused to protein Y (Figure 12.17a). If proteins X and Y interact in a specific
protein–protein interaction, the GAL4 DB and TA domains are brought together so
Figure labels: nerve cord, oviducts, ovaries.
FIGURE 12.16 Green fluorescent protein (GFP) as a reporter gene. In the experiment
here, GFP expression depends on the promoter for the Drosophila melanogaster
Tdc2 gene. Tdc2 encodes a neuronal tyrosine decarboxylase (TDC) whose expression
is necessary for egg laying in fruit flies. (Bottom) Green fluorescence
highlights neuronal projections expressing the Tdc2
gene. (Top) Diagram of a fly, its nervous system, and
ovaries. Note that Tdc2 neurons innervate the ovaries
and oviducts of flies. (See Cole, S. H., et al., 2005. Two functional
but noncomplementing Drosophila tyrosine decarboxylase
genes. Journal of Biological Chemistry 280:14948–14955. GFP
image courtesy of Shannon H. Cole and Jay Hirsh, the University
of Virginia. Fly image derived from the Atlas of Drosophila Development
by Volker Hartenstein,
http://flybase.bio.indiana.edu/allied-data/lk/interactive-fly/atlas/00contents.htm.)
Figure labels: panels (a) and (b); DB, TA, X, Y; lacZ reporter gene.
FIGURE 12.17 The yeast two-hybrid system for identify-
ing protein–protein interactions. If proteins X and Y
interact, the lacZ reporter gene is expressed. Cells
expressing lacZ exhibit β-galactosidase activity.