Semi-nested PCR analysis of unknown tags on serial
analysis of gene expression
Wang-Jie Xu¹, Qiao-Li Li¹, Chen-Jiang Yao¹, Zhao-Xia Wang¹, Yang-Xing Zhao¹ and Zhong-Dong Qiao¹,²
1 College of Life Science and Technology, Shanghai Jiao Tong University, Shanghai, China
2 Shanghai Institute of Medical Genetics, Shanghai Jiao Tong University, Shanghai, China
Keywords
modified lock-docking oligo(dT); mRNA; RACE; serial analysis of gene expression
(SAGE); two-step analysis of unknown SAGE tags (TSAT-PCR)
Correspondence
Z. Qiao, Shanghai Institute of Medical Genetics, Shanghai Jiao Tong University,
Shanghai, China
Fax: +86 21 54747330
Tel: +86 21 34204925
E-mail:
(Received 3 August 2008, revised 3 September 2008, accepted 5 September 2008)
doi:10.1111/j.1742-4658.2008.06671.x
Serial analysis of gene expression (SAGE) is a powerful technique for
studying gene expression at the genome level. However, the short length of
SAGE tags limits the further study of the related data. In this study, in order
to identify the genes behind such tags, we developed a semi-nested PCR-based
method called the two-step analysis of unknown SAGE tags (TSAT-PCR) to generate
longer 3′-end cDNA fragments from unknown SAGE tags. In the procedure, a
modified lock-docking oligo(dT) with two degenerate nucleotide positions at the
3′-end was used as a reverse primer to synthesize cDNAs. Afterwards, the
full-length cDNAs were amplified by PCR based on 5′-RACE and 3′-RACE. The
amplified cDNAs were then used for the subsequent two-step PCR of the TSAT-PCR
process. The first-step PCR was carried out at an appropriately low annealing
temperature; a SAGE tag-specific primer was used as the sense primer, and an
18-bp sequence (universal primer I) located at the 5′-end of the reverse primer
was used as the antisense primer. After 15–20 PCR cycles, the 3′-end cDNA
fragments containing the tag were enriched, and the PCR products could be used
as templates for the second-step PCR to obtain the specific products. The
second-step PCR was performed at a high annealing temperature with a SAGE
tag-specific primer and a 22-bp sequence (universal primer II) located upstream
of universal primer I at the 5′-end of the reverse primer. With the TSAT-PCR
method, we could easily obtain specific PCR products covering the SAGE tags from
the corresponding transcripts, especially low-abundance transcripts. It can be
used as a method to identify genes expressed in different cell types.
Abbreviations
GLGI, generation of longer cDNA fragments from serial analysis of gene expression tags for gene identification; PLF, primary library forward
primer; PLR, primary library reverse primer; RAST-PCR, rapid reverse transcription–PCR analysis of unknown serial analysis of gene
expression tags; rSAGE, reverse serial analysis of gene expression; SAGE, serial analysis of gene expression; TSAT-PCR, two-step analysis
of unknown SAGE tags; UP-I, universal primer I; UP-II, universal primer II.
5422 FEBS Journal 275 (2008) 5422–5428 © 2008 The Authors Journal compilation © 2008 FEBS
The serial analysis of gene expression (SAGE) technique allows the construction
of a comprehensive expression profile, in which each mRNA is defined by a
specific 14-mer [1–4]. By analyzing a short sequence tag for each transcript,
SAGE significantly decreases the overall scale of sequencing analysis and makes
it possible to analyze nearly all of the expressed transcripts from the genome,
a capability matched by no other currently available method [5]. Application of
the SAGE technique has provided valuable information in various biological
systems [6,7]. Recently, millions of short cDNA sequences called SAGE tags have
been collected from human tissues through the SAGE method [8,9]. It has
frequently been observed that a large number of SAGE tags do not match existing
expressed sequences upon analysis of the SAGE data [10,11]. It is possible,
then, that the unmatched SAGE tags originate from novel transcripts or novel
genes that have not yet been identified in the human genome.
We have constructed a SAGE library from human spermatozoa, in which we obtained
more than 2500 unique tags. Of these, 54 were high-frequency tags for which no
homology could be found in the GenBank database [12]. These tags might therefore
represent unidentified genes. However, a major problem arose when the SAGE tag
sequences were used for gene identification. Owing to the short length of SAGE
tags, it was difficult to produce longer 3′-end cDNA fragments, let alone whole
cDNA sequences, by PCR, which hampered further studies on the SAGE data.
Moreover, the short tag length has hindered the application of SAGE to the vast
majority of eukaryotes, for which sufficient expressed sequence tag and genome
sequence resources are not available [13].
In order to solve this problem, we have developed a technique called the
two-step analysis of unknown SAGE tags (TSAT-PCR) to generate longer 3′-end
cDNA fragments. The three key points of our method are as follows: first, a
modified lock-docking oligo(dT) primer, with two degenerate nucleotide positions
at the 3′-end, is used as the reverse primer to synthesize the first-strand
cDNA; second, the primary cDNAs are enriched by PCR and then serve as templates
for the subsequent TSAT-PCR experiment; and third, the two-step PCR is designed
on the semi-nested PCR principle in order to obtain the tag-specific 3′-end
cDNA fragments. So far, we have successfully used this procedure to test and
analyze 11 of the 54 unmatched SAGE tags.
Results and Discussion
Enrichment of cDNA template
Using RACE technology, we could amplify full-length cDNAs to generate enough
template for the subsequent PCR, including the few low-abundance cDNAs
(Fig. 1A). In this study, the cDNAs were amplified as follows. First, during
reverse transcription, the two degenerate nucleotide positions at the 3′-end of
the modified oligo(dT) primer position the primer at the start of the poly(A)+
tail, thereby eliminating the 3′-heterogeneity inherent in conventional
oligo(dT) priming [14]. As the PrimeScript Reverse Transcriptase exhibits
terminal transferase activity upon reaching the end of an RNA template, it adds
three to five residues (predominantly dC) to the 3′-end of the first-strand
cDNA. The 5′-cap oligonucleotide contains a terminal stretch of G residues that
anneals to the dC-rich cDNA tail and serves as an extended template for reverse
transcription. In the subsequent PCR, the reverse transcription product was
used as the template. The primary library forward primer (PLF) and primary
library reverse primer (PLR) paired with the 5′-end and 3′-end of all cDNAs,
respectively, and after 25 cycles, the whole cDNA population had been amplified
sufficiently for the next experiment.
Figure 2 shows the amplified cDNAs. As can be
seen, the length of the smear is distributed from about
Fig. 1. Detailed mechanism of the amplification of the whole cDNAs and the TSAT-PCR technique. (A) In this process, double-stranded
cDNAs synthesized by modified lock-docking oligo(dT) and 5′-cap oligonucleotides were used for PCR. During the PCR process, PLF and PLR
were used as sense primer and antisense primer, respectively, to amplify the cDNAs. (B) The procedure involved two PCR reactions. The first
PCR reaction was performed with a tag-specific primer containing a SAGE tag sequence and an 18-bp primer (UP-I) located at the 5′-end of
the reverse primer. The first PCR product was then used as the template for the second PCR reaction. The tag-specific primer and a 22-bp
primer (UP-II) located near UP-I at the 5′-end of the reverse primer were used as the sense primer and the antisense primer, respectively.
W J. Xu et al. New method of 3′-end amplification from SAGE tags
100 bp to over 2 kb, and is concentrated mainly in the 0.3–1 kb range. This
suggests that the high-abundance genes vary little in length, as most of them
fall within this narrow span (0.3–1 kb). Outside this range, there are a few
low-abundance genes that are either very long (50 kb) or short (50 bp). The
smear from these genes was not obvious, presumably because of their low
abundance, the short extension time in the PCR, or both.
TSAT-PCR general strategy
The amplified cDNAs served as primary templates for TSAT-PCR, as illustrated in
Fig. 1B. The antisense primers [PLR, universal primer I (UP-I) and universal
primer II (UP-II)] were all designed from the sequence of the modified oligo(dT)
primer. The three primers overlap with each other, and their lengths differ so
as to be consistent with their respective sense primers (Fig. 3). Both UP-I and
UP-II were used as nested primers in the TSAT-PCR reactions. The TSAT-PCR
technique was developed from the principle of nested PCR, and the procedure
comprises a two-step PCR. The first PCR was run for 15–20 cycles at an
appropriately low annealing temperature (about 55 °C) with a SAGE tag-specific
primer and UP-I. As a result, the 3′-end cDNA fragments containing the tag were
enriched, although some nonspecific products were generated at the same time.
These PCR products were then used as templates for the second-step PCR to
obtain the specific products. The second-step PCR was performed with the SAGE
tag-specific primer and a nested primer (UP-II) at a high annealing temperature
(≥ 60 °C), so that the specific products corresponding to the tags could be
amplified.
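The nesting of the antisense primers can be checked directly from the sequences given in Experimental procedures. The following is a minimal sketch, not part of the published protocol: the helper names (`locate`, `primer_map`) are illustrative assumptions, and the two degenerate 3′ positions of the reverse transcription primer are left out because none of the three primers reaches them.

```python
# Sequences as printed in Experimental procedures of this article.
ADAPTER = "CCAGACACTATGCTCATACGACGCAG"   # 5' part of the modified oligo(dT)
RT_PRIMER_CORE = ADAPTER + "T" * 16       # degenerate 3' VN positions omitted

ANTISENSE_PRIMERS = {
    "PLR":   "CCAGACACTATGCTCATACGACG",
    "UP-I":  "CCAGACACTATGCTCATA",
    "UP-II": "CACTATGCTCATACGACGCAGT",
}


def locate(primer: str, template: str = RT_PRIMER_CORE) -> tuple[int, int]:
    """Return (start, end) of the primer in the template; 0-based, end exclusive."""
    start = template.find(primer)
    if start < 0:
        raise ValueError(f"{primer} does not occur in the template")
    return start, start + len(primer)


def primer_map() -> dict[str, tuple[int, int]]:
    """Position of every antisense primer on the reverse transcription primer."""
    return {name: locate(seq) for name, seq in ANTISENSE_PRIMERS.items()}


if __name__ == "__main__":
    # Print each primer aligned under the reverse transcription primer.
    print(f"{'RT':6s} {RT_PRIMER_CORE}")
    for name, (start, _end) in primer_map().items():
        print(f"{name:6s} " + " " * start + ANTISENSE_PRIMERS[name])
```

In this layout UP-I and PLR both start at the 5′-end of the reverse transcription primer, whereas UP-II starts five nucleotides further in and extends one base into the T16 stretch, which is what makes the second PCR semi-nested: only the antisense primer moves inward, while the tag-specific sense primer stays the same.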
Amplification of longer sequences from
SAGE tags
To test the TSAT-PCR procedure, we chose five tags corresponding to known
genes, as well as 11 tags of different abundance corresponding to unknown
genes, all identified in the SAGE analysis of human spermatozoa (Table 1).
Among the 16 tags, tags 4, A and E were used as representatives of
low-frequency genes, to determine whether the procedure worked on low-frequency
tags. With the TSAT-PCR method, we obtained PCR products (Fig. 4) for all tags
tested using the standard PCR conditions (first PCR, 15 cycles of 94 °C for
30 s, 55 °C for 30 s and 72 °C for 30 s; second PCR, 25 cycles of 94 °C for
30 s, 60 °C for 30 s and 72 °C for 30 s). The PCR products were electrophoresed
through a 2.0% agarose gel, and cloned into a plasmid vector for sequencing
analysis. Compared with the others, tags 1, 2, 3, 4 and A displayed very weak
PCR bands in the agarose gel, especially the two low-frequency tags (Fig. 4).
In addition, there were two clear bands in the PCR product of no. 10. We
further optimized the PCR annealing temperature, as well as the cycle number,
for each of the weak-band tags, after which the bands were clearly stronger
than the previous ones (data not shown). We then verified whether each PCR
product indeed represented a sequence downstream of the most 3′ NlaIII site in
the full-length cDNA by analyzing the
Fig. 2. PCR amplification of the full-length cDNAs. The cDNAs
were amplified with PLF and PLR. M: λDNA/HindIII + EcoRI marker
(labelled bands: 3530, 1584, 947 and 564 bp). Lane 1: amplified full-length
cDNAs. Lane 2: glyceraldehyde-3-phosphate dehydrogenase (GAPDH). GAPDH was
used as control.
Fig. 3. The sequences and relationships of the primers [modified oligo(dT), PLR, UP-I and UP-II] discussed in this article.
sequences of the products. If the tag sequence was present at the predicted
location, no NlaIII site would be found in the sequence of the PCR product,
whereas the product would include the oligo(dT16) sequence. All PCR products
were cloned and sequenced successfully (Table S1). Through analysis of the
sequencing results, we identified 16 of 17 PCR products (Figs 4 and 5) that met
these criteria, indicating that these 16 PCR products represented sequences
downstream of the most 3′ NlaIII restriction site. The remaining PCR product
was the larger band of the no. 10 product, in which the sequences of UP-II and
oligo(dT16) were not found; only the tag-specific primer was found in its
sequence. This meant that this product had been amplified from a single primer
(the tag-specific primer for no. 10). Such a single-primer product could only
be recognized by sequencing. The sequencing results (Table 1) were analyzed
using the BLAST program of the NCBI server (http://www.ncbi.nlm.nih.gov/BLAST/).
Among the five fragments containing known tags (Table 1), the four sequences
corresponding to tags A, B, C and E matched the 3′-cDNA of the genes predicted
by Zhao et al. from the spermatozoa SAGE tags [12], whereas the sequence for
tag D did not match its predicted gene (Hs. 436980). On further investigation,
we found that tag D could not represent this gene, because seven NlaIII (CATG)
sites were present between the position of tag D and the poly(dA) tract in the
cDNA of Hs. 436980. The BLAST results of the other 11 sequences in the GenBank
Table 1. Overview of all tags analyzed with the TSAT-PCR technique. The sequences from nos. 1 and 7 matched a single sequence. No. 11
matched multiclusters. The rest of the sequences did not match any clusters.

Tag  Tag sequence  UniGene ID   Abundance  PCR product  Presence of  Presence of  BLAST results
                                           size (bp)    NlaIII site  oligo(dT)
A    ACTTACCTGC    Hs. 431668   6          89           No           Yes          Consistency
B    GCGTGCCTGC    Hs. 372658   302        211          No           Yes          Consistency
C    GCCCCTGCGC    Hs. 435464   214        217          No           Yes          Consistency
D    GTGACCACGG    Hs. 436980   126        189          No           Yes          Inconsistency
E    GTGGCACACG    Hs. 34114    5          192          No           Yes          Consistency
1    AACGAGGAAT    –            84         254          No           Yes          AK027322
2    GTAAGTGTAC    –            44         97           No           Yes          Unmatched
3    AGAGGTGTAG    –            30         232          No           Yes          Unmatched
4    TTGCCAACAC    –            4          94           No           Yes          Unmatched
5    GAAGTCGGAA    –            58         101          No           Yes          Unmatched
6    GCCGTTCTTA    –            21         198          No           Yes          Unmatched
7    ATTAAGAGGG    –            16         165          No           Yes          NR_003286
8    ATGCCTGTAG    –            16         182          No           Yes          Mismatch
9    GCCTTGTTCA    –            13         184          No           Yes          Unmatched
10   TTCTCAATGA    –            10         274          No           Yes          Unmatched
                                           317 (a)      No           No           –
11   CCCATCGTCC    –            9          123          No           Yes          BC010864, BC021246, BC013387,
                                                                                  AY211920, BC092442
(a) Single-primer PCR product.

Fig. 4. TSAT-PCR analysis of 16 tags. M: size marker (labelled bands: 100, 200, 300, 400 and 500 bp). Lanes 1–11 were unknown SAGE
tags corresponding to tags 1–11 in Table 1. Lanes a, b, c, d and e were known SAGE tags corresponding to tags A, B, C, D and E in
Table 1. TSAT-PCR was performed as described in Results and Discussion.
database (refseq_rna: reference mRNA sequences and expressed sequence tags)
fell into several categories (Table 1): match, multimatch, unmatched and
mismatch. The accession numbers of the matched and multimatched sequences are
given in Table 1. No. 8 was defined as a mismatch, because the BLAST result
showed that the tag site did not exactly match sequences in the GenBank
database, owing to nonspecific amplification. The genes corresponding to tags A
and E are Hs. 431668 (COX6B1, cytochrome c oxidase subunit VIb polypeptide 1)
and Hs. 34114 [ATP1A2, ATPase, Na+/K+-transporting, α2(+) polypeptide], which
are related to energy production for the motility of human spermatozoa.
Hs. 372658, corresponding to tag B, codes for spermatogenesis-related protein 7,
which could take part in spermatogenesis. The remaining sequences,
corresponding to tags C, 1 and 7, are Hs. 435464 (Homo sapiens neuritin
1-like), AK027322 (highly similar to signal recognition particle 68 kDa
protein) and NR_003286 (Homo sapiens 18S ribosomal RNA). As little is currently
known about the function of mRNAs in human spermatozoa, it was difficult to
judge whether these genes are related to the function of human spermatozoa or
are merely retained from spermatogenesis. For the unmatched and multimatched
sequences, 5′-RACE experiments should be carried out to obtain their
full-length cDNA sequences and to determine whether they represent new genes.
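The acceptance criteria applied to the sequenced products earlier in this section (the tag-specific primer is present, no NlaIII site lies downstream of it, and the oligo(dT16) stretch is present) can be expressed as a short check. This is only a sketch under stated assumptions: the function name `check_product` and the example sequences in the usage note are hypothetical, reads are reoriented to the sense strand so that oligo(dT16) appears as a run of 16 A residues, and the real sequencing data are in Table S1.

```python
NLAIII_SITE = "CATG"   # recognition sequence of NlaIII, as stated in the text
LINKER = "GGATCC"      # 5' extension of the tag-specific primer used here


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def check_product(seq: str, tag: str, dt_len: int = 16) -> dict[str, bool]:
    """Apply the three criteria to one cloned PCR product sequence."""
    primer = LINKER + tag.upper()
    seq = seq.upper()
    # Clones can be read in either orientation; work on the sense strand.
    if primer not in seq:
        flipped = reverse_complement(seq)
        if primer in flipped:
            seq = flipped
    start = seq.find(primer)
    if start < 0:
        return {"tag_primer": False, "nlaiii_free": False,
                "oligo_dt": False, "accepted": False}
    # Everything from the tag onwards must be free of CATG.
    downstream = seq[start + len(LINKER):]
    nlaiii_free = NLAIII_SITE not in downstream
    oligo_dt = ("A" * dt_len) in downstream
    return {"tag_primer": True, "nlaiii_free": nlaiii_free,
            "oligo_dt": oligo_dt, "accepted": nlaiii_free and oligo_dt}
```

For example, a made-up product such as `"GGATCC" + "ACTTACCTGC" + "GTTGACCTGAATAAAGTCTTGC" + "A" * 16 + "CTGCGTCGTATGAGCATAGTG"` would be accepted, whereas a product that carries the tag-specific primer but no A16 stretch, like the larger band of no. 10, would be rejected with `oligo_dt` set to `False`.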
During the course of our research on the SAGE data from human spermatozoa, we
found that other methods [rapid reverse transcription–PCR analysis of unknown
SAGE tags (RAST-PCR) [15], generation of longer cDNA fragments from SAGE tags
for gene identification (GLGI) [16] and reverse SAGE (rSAGE) [17]] could rarely
generate the 3′-fragment sequences of these unmatched tags. Although GLGI is
more effective than RAST-PCR [17], the antisense primer in GLGI is composed
only of oligo(dT), so rigorous PCR conditions, including the Mg2+
concentration, the number of PCR cycles and the annealing temperature, have to
be optimized for each SAGE tag. In our experiments, we often encountered
nonspecific amplification or multiple fragments, and had difficulty in
amplifying the products of low-frequency tags, because of the short antisense
primer. The rSAGE method was derived from SAGE, and many steps and reagents are
shared by the two protocols. However, step 4 (linker ligation) in the rSAGE
protocol does not prevent self-ligation of the cDNA, and self-ligation leads to
smearing in the subsequent PCR amplification. In addition, the method requires
more initial total RNA and poly(A)+ RNA than SAGE, because RNA is lost at each
step. This demand for RNA restricts the application of the method when little
total RNA is available, as each human spermatozoon is estimated to contain just
0.015 pg of total RNA [18], only 1/600 of the amount of total RNA in a somatic
cell.
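The practical weight of these figures is easier to see as arithmetic. The sketch below uses only the two numbers quoted above (0.015 pg per spermatozoon and the 1/600 ratio); the 1 µg target in the usage note is a hypothetical amount chosen for illustration, not a requirement stated for any of the protocols discussed.

```python
import math

RNA_PER_SPERM_PG = 0.015   # total RNA per spermatozoon, from the text [18]
SOMATIC_FOLD = 600         # spermatozoon holds 1/600 of the somatic amount


def somatic_rna_pg() -> float:
    """Total RNA per somatic cell implied by the two figures above."""
    return RNA_PER_SPERM_PG * SOMATIC_FOLD


def cells_needed(target_ug: float, rna_per_cell_pg: float) -> int:
    """Cells required to reach target_ug of total RNA, assuming lossless recovery."""
    return math.ceil(target_ug * 1e6 / rna_per_cell_pg)


if __name__ == "__main__":
    print(f"implied somatic content: {somatic_rna_pg():.1f} pg per cell")
    print("spermatozoa per 1 ug:", cells_needed(1.0, RNA_PER_SPERM_PG))
    print("somatic cells per 1 ug:", cells_needed(1.0, somatic_rna_pg()))
```

Under these assumptions, each microgram of total RNA corresponds to tens of millions of spermatozoa, which is why a method that loses RNA at every step is hard to apply to this material.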
To avoid these problems, we used semi-nested PCR to improve the specificity of
amplification, and developed the method called TSAT-PCR. Using the GLGI
conditions described previously [16], we compared the two methods with six tags
and obtained the results that we expected (Fig. 5). The bands obtained with
TSAT-PCR are clearly sharper than those obtained with GLGI; moreover, the
products of the tags (4, A and E) with low abundance (< 6) were all obtained
with TSAT-PCR.
In comparison with other methods, ours is able to amplify the target PCR
products from low-abundance transcripts. The method also needs a lower initial
amount of mRNA than the others. Furthermore, it has the advantages of being
simple, rapid, low in cost and highly efficient. We have demonstrated that,
with the PCR amplification method described above, we could obtain a clear band
of PCR product in each case, as well as enough full-length cDNA to serve as PCR
template for subsequent experiments.
Although the improved version of SAGE can gener-
ate tags with lengths of 21 bases [19] and 26 bases [13],
which theoretically can be uniquely assigned to a single
Fig. 5. Comparison between GLGI and TSAT-PCR. Lanes, left to right: M, marker (labelled bands: 100, 200, 300, 400 and 500 bp); tags E,
C, B, A, 10 and 4 with TSAT-PCR; tags E, C, B, A, 10 and 4 with GLGI. A set of six SAGE tags was chosen for the analysis. Among the six
tags, three tags (4, A and E) with low abundance (< 6) were examined. The same RNA from human spermatozoa and the same sense primers
were used for both methods. The conditions used for GLGI followed the procedures described in [16].
genomic position [20], a large body of earlier SAGE data was generated with the
conventional SAGE technique and consists of shorter tags (14 bp). Converting
short tags into longer 3′-end cDNA fragments is therefore a key step for
further studies on SAGE data. Our method should help SAGE to become a
high-throughput technique that can be widely applied to gene expression
studies.
In summary, the method could be applied to further analyses of SAGE data
gathered from humans and other eukaryotic species. Our approach has several
important advantages: (a) it can provide enough full-length cDNA template for
subsequent experiments, such as 5′-RACE, 3′-RACE and northern blotting; (b) it
can convert short SAGE tag sequences into longer 3′-end cDNA fragments; (c) it
can obtain full-length cDNA sequences containing specific tags from mRNA
transcripts, especially low-abundance mRNA transcripts, through the combined
application of TSAT-PCR and 5′-RACE; and (d) it can identify novel genes from
SAGE data and confirm the existence of exons predicted by bioinformatic tools
in genomic sequences.
Experimental procedures
Tag sequences
In our SAGE library generated from human spermatozoa, each tag was screened for
homology in the UniGene database to identify its respective match. We chose 16
SAGE tags: five tags corresponding to known genes, which served as positive
controls for this experiment, and 11 tags of different abundance from the 54
unmatched tags corresponding to unknown genes.

RNA samples and cDNA synthesis
Total RNA was extracted from purified spermatozoa using Trizol RNA isolation
reagent (Invitrogen, Carlsbad, CA, USA), according to the manufacturer's
protocol (http://www.invitrogen.com/content/sfs/manuals/10296010.pdf). The
quantity of extracted RNA was determined by UV absorption. cDNAs were generated
by a modified RACE method with PrimeScript Reverse Transcriptase (TaKaRa,
Dalian, China), following the manufacturer's instructions. Briefly, two primers
were added to the reverse transcription reaction: one was the modified
oligo(dT) primer (5′-CCAGACACTATGCTCATACGACGCAG-T16-VN-3′; N = A, C, G or T;
V = A, G or C), which was used as the reverse transcription primer to generate
the first-strand cDNA; the other was the 5′-cap oligonucleotide primer
(5′-AAGCAGTGGTATCAACGCAGAGTACGCGGG-3′), which annealed to the dC-rich cDNA
tail and served as an extended template for reverse transcription. The
resulting set of full-length cDNAs served as a primary library of spermatozoa
cDNAs for further studies.
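Because the modified oligo(dT) primer ends in the degenerate positions VN, it is in practice a mixture of defined sequences. A minimal sketch of that expansion follows; the function name `expand_degenerate` is an illustrative assumption, and the V and N definitions are the ones given in the paragraph above.

```python
from itertools import product

# Nucleotide codes used by the primer; V and N as defined in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "V": "ACG", "N": "ACGT"}

ADAPTER = "CCAGACACTATGCTCATACGACGCAG"            # 5' part of the primer
MODIFIED_OLIGO_DT = ADAPTER + "T" * 16 + "VN"     # full primer, 5' to 3'


def expand_degenerate(primer: str) -> list[str]:
    """List every concrete sequence encoded by a primer with degenerate codes."""
    choices = [IUPAC[base] for base in primer]
    return ["".join(combo) for combo in product(*choices)]


variants = expand_degenerate(MODIFIED_OLIGO_DT)
anchors = sorted({variant[-2:] for variant in variants})

if __name__ == "__main__":
    print(len(variants), "primer species")
    print("3' anchors:", " ".join(anchors))
```

The position next to the T16 stretch is never T, so the primer cannot pair further inside the poly(A) tail; this is the lock-docking behaviour that places it at the start of the tail.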
Amplification of primary library
The full-length spermatozoa cDNAs were amplified by PCR with TaKaRa Ex Taq Hot
Start Version (TaKaRa), with the primary library serving as the template.
Briefly, PLF (5′-AAGCAGTGGTATCAACGCAGAGT-3′), located at the 5′-end of all
cDNAs generated with the 5′-cap oligonucleotide primer, was used as the sense
primer. PLR (5′-CCAGACACTATGCTCATACGACG-3′), whose sequence is incorporated at
the 3′-ends of all cDNAs by the reverse transcription primer, was used as the
antisense primer. The PCR program consisted of 25 cycles of 94 °C for 30 s,
66 °C for 30 s and 72 °C for 3 min, followed by a final extension step at 72 °C
for 5 min. Ten microliters of the PCR product was checked by 1.2% agarose gel
electrophoresis.
TSAT-PCR
The amplified primary library was diluted 10³-fold with sterile H₂O for
TSAT-PCR analyses. A 1 µL aliquot was used directly as the template for the
first PCR amplification with the tag-specific primer (5′-GGATCCXXXXXXXXXX,
where X represents the tag sequence) and UP-I (5′-CCAGACACTATGCTCATA-3′). The
reaction was carried out for 15 cycles under the following conditions: 94 °C
for 30 s, 53–55 °C for 30 s and 72 °C for 30 s, with TaKaRa Ex Taq (TaKaRa) in
a Bio-Rad Cycler (Bio-Rad, Hercules, CA, USA). The resulting PCR product was
diluted 10³-fold with sterile H₂O, and a 1 µL aliquot was used as the template
for the second, nested PCR amplification with the tag-specific primer and UP-II
(5′-CACTATGCTCATACGACGCAGT-3′) under the following conditions: 25–30 cycles of
94 °C for 30 s, 60 °C for 30 s and 72 °C for 30 s, with TaKaRa Ex Taq (TaKaRa).
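The tag-specific primer design and the two cycling programs can be written down compactly. This is a sketch only: the tags are copied from Table 1, the programs use the standard conditions quoted in Results and Discussion (55 °C and 25 cycles, both within the ranges given above), and the names `tag_primer` and `cycling_seconds` are assumptions introduced for illustration.

```python
LINKER = "GGATCC"   # 5' extension of the tag-specific primer, from the text

# A few 10-nt tags copied from Table 1 of this article.
TAGS = {"A": "ACTTACCTGC", "E": "GTGGCACACG", "4": "TTGCCAACAC", "10": "TTCTCAATGA"}

# Standard conditions from Results and Discussion: (temperature in C, seconds).
FIRST_PCR = {"antisense": "UP-I", "cycles": 15,
             "steps": [(94, 30), (55, 30), (72, 30)]}
SECOND_PCR = {"antisense": "UP-II", "cycles": 25,
              "steps": [(94, 30), (60, 30), (72, 30)]}


def tag_primer(tag: str) -> str:
    """Build the 16-nt sense primer: linker followed by the 10-nt SAGE tag."""
    tag = tag.upper()
    if len(tag) != 10 or set(tag) - set("ACGT"):
        raise ValueError(f"expected a 10-nt tag of A/C/G/T, got {tag!r}")
    return LINKER + tag


def cycling_seconds(program: dict) -> int:
    """Total hold time of the cycles; ramping and any extra steps are ignored."""
    return program["cycles"] * sum(seconds for _, seconds in program["steps"])


if __name__ == "__main__":
    for name, tag in TAGS.items():
        print(f"tag {name:>2}: 5'-{tag_primer(tag)}-3'")
    for label, program in (("first", FIRST_PCR), ("second", SECOND_PCR)):
        print(f"{label} PCR with {program['antisense']}: "
              f"{cycling_seconds(program) / 60:.1f} min of cycling")
```

Keeping the sense primer identical in both rounds and changing only the antisense primer and the annealing temperature mirrors the semi-nested design described in the text.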
DNA cloning and sequencing
The PCR products were cloned into the pT19G-T vector (Generay Biotech,
Shanghai, China). Positive clones were screened by PCR with the M13 reverse and
M13 forward (220 bp) primers located in the vector; sequencing reactions were
performed by Sanny Bio-Tech (Shanghai, China).
Acknowledgements
This work was supported by Shanghai Leading
Academic Discipline Project (B205).
References
1 Velculescu VE, Zhang L, Vogelstein B & Kinzler KW
(1995) Serial analysis of gene expression. Science 270,
484–487.
2 Madden SL, Galella EA, Zhu J, Bertelsen AH & Beau-
dry GA (1997) SAGE transcript profiles for p53-depen-
dent growth regulation. Oncogene 15, 1079–1085.
3 Velculescu VE, Zhang L, Zhou W, Vogelstein J, Basrai
MA, Bassett DE Jr, Hieter P, Vogelstein B & Kinzler
KW (1997) Characterization of the yeast transcriptome.
Cell 88, 243–251.
4 Zhang L, Zhou W, Velculescu VE, Kern SE, Hruban
RH, Hamilton SR, Vogelstein B & Kinzler KW (1997)
Gene expression profiles in normal and cancer cells.
Science 276, 1268–1272.
5 Velculescu VE, Madden SL, Zhang L, Lash AE, Yu J,
Rago C, Lal A, Wang CJ, Beaudry GA, Ciriello KM
et al. (1999) Analysis of human transcriptomes. Nat
Genet 23, 387–388.
6 Hashimoto S, Suzuki T, Dong HY, Nagai S, Yamazaki
N & Matsushima K (1999) Serial analysis of gene
expression in human monocyte-derived dendritic cells.
Blood 94, 845–852.
7 Hibi K, Liu Q, Beaudry GA, Madden SL, Westra WH,
Wehage SL, Yang SC, Heitmiller RF, Bertelsen AH,
Sidransky D et al. (1998) Serial analysis of gene
expression in non-small cell lung cancer. Cancer Res 58,
5690–5694.
8 Lal A, Lash AE, Altschul SF, Velculescu V, Zhang L,
McLendon RE, Marra MA, Prange C, Morin PJ, Polyak K
et al. (1999) A public database for gene expression
in human cancers. Cancer Res 59, 5403–5407.
9 Boon K, Osorio EC, Greenhut SF, Schaefer CF, Shoe-
maker J, Polyak K, Morin PJ, Buetow KH, Strausberg
RL, De Souza SJ et al. (2003) An anatomy of normal
and malignant gene expression. Proc Natl Acad Sci
USA 99, 11287–11292.
10 Lee S, Zhou G, Clark T, Chen J, Rowley JD & Wang
SM (2001) The pattern of gene expression in human
CD15+ myeloid progenitor cells. Proc Natl Acad Sci
USA 98, 3340–3345.
11 Zhou G, Chen J, Lee S, Clark T, Rowley JD & Wang
SM (2001) The pattern of gene expression in human
CD34+ hematopoietic stem/progenitor cells. Proc Natl
Acad Sci USA 98, 13966–13971.
12 Zhao YX, Li QL, Yao CJ, Wang ZX, Zhou Y, Wang
YJ, Liu LM, Wang YF, Wang LY & Qiao ZD (2006)
Characterization and quantification of mRNA transcripts
in ejaculated spermatozoa of fertile men by serial
analysis of gene expression. Hum Reprod 21, 1583–1590.
13 Matsumura H, Reuter M, Krüger DH, Winter P, Kahl
G & Terauchi R (2008) SuperSAGE. Methods Mol Biol
387, 55–70.
14 Borson ND, Sato WL & Drewes LR (1992) A lock-docking
oligo(dT) primer for 5′ and 3′ RACE PCR.
PCR Methods Appl 2, 144–148.
15 van den Berg A, van der Leij J & Poppema S (1999)
Serial analysis of gene expression: rapid RT-PCR
analysis of unknown SAGE tags. Nucleic Acids Res 27,
e17.
16 Chen JJ, Rowley JD & Wang SM (2000) Generation of
longer cDNA fragments from serial analysis of gene
expression tags for gene identification. Proc Natl Acad
Sci USA 97, 349–353.
17 Richards M, Tan SP, Chan WK & Bongso A (2006)
Reverse serial analysis of gene expression (SAGE)
characterization of orphan SAGE tags from human
embryonic stem cells identifies the presence of novel
transcripts and antisense transcription of key
pluripotency genes. Stem Cells 24, 1162–1173.
18 Miller D, Ostermeier GC & Krawetz SA (2005) The
controversy, potential and roles of spermatozoal RNA.
Trends Mol Med 11, 156–163.
19 Wahl MB, Heinzmann U & Imai K (2005) Long SAGE
analysis significantly improves genome annotation:
identifications of novel genes and alternative transcripts in
the mouse. Bioinformatics 21, 1389–1392.
20 Saha S, Sparks AB, Rago C, Akmaev V, Wang CJ,
Vogelstein B, Kinzler KW & Velculescu VE (2002)
Using the transcriptome to annotate the genome.
Nat Biotechnol 20, 508–512.
Supporting information
The following supplementary material is available:
Table S1. The amplified longer cDNA sequences.
This supplementary material can be found in the
online version of this article.
Please note: Wiley-Blackwell is not responsible for
the content or functionality of any supplementary
materials supplied by the authors. Any queries (other
than missing material) should be directed to the
corresponding author for the article.