Genome-wide identification of glucosinolate synthesis
genes in Brassica rapa
Yun-Xiang Zang1,2,*, Hyun Uk Kim1,*, Jin A Kim1, Myung-Ho Lim1, Mina Jin1, Sang Choon Lee1, Soo-Jin Kwon1, Soo-In Lee1, Joon Ki Hong1, Tae-Ho Park1, Jeong-Hwan Mun1, Young-Joo Seol1, Seung-Beom Hong3 and Beom-Seok Park1
1 Genomics Division, Department of Agricultural Bio-resources, National Academy of Agricultural Science (NAAS), Rural Development
Administration (RDA), Suwon, Korea
2 School of Agricultural and Food Science, Zhejiang Forestry University, Lin’an, Hangzhou, China
3 Department of Biology, San Jacinto College, Houston, TX, USA
Keywords
bioinformatics; biosynthesis pathway;
Brassica rapa; gene identification;
glucosinolate
Correspondence
B. S. Park, Genomics Division, Department
of Agricultural Bio-resources, National
Academy of Agricultural Science (NAAS),
Rural Development Administration (RDA),
Suwon 441-707, Korea
Fax: +82 31 299 1672
Tel: +82 31 299 1671
E-mail:
*These authors contributed equally to this work
Database
The following have been deposited to the
GenBank database. Accession numbers are
shown in parenthesis: BrBCAT4 (FJ376036–
FJ376037), BrMAM (FJ376038–FJ376041),
BrBCAT3 (FJ376042–FJ376043), BrCYP79F1
(FJ376044), BrCYP79B2 (FJ376045–
FJ376046), BrCYP79B3 (FJ376047),
BrCYP79A2-1 (FJ376048), BrCYP83A1
(FJ376049–FJ376050), BrCYP83B1
(FJ376051), BrC-S lyase (FJ376052–

FJ376053), BrUGT74B1-1 (FJ376054),
BrUGT74C1 (FJ376055–FJ376057), BrST5a
(FJ376058–FJ376059), BrST5b (FJ376060–
FJ376068), BrST5c-1 (FJ376069),
BrFMO GS-OX1 (FJ376070), BrFMO GS-OX5 (FJ376071),
BrAOP2 (FJ376073), BrGSL-OH (FJ376074),
BrBZO1p (FJ376075), BrDof1.1 (FJ584284–
FJ584285), BrIQD1-1 (FJ584286), BrMYB28
(FJ584287–FJ584289), BrMYB29 (FJ584290–
FJ584292), BrMYB34 (FJ584293–FJ584295),
BrMYB51 (FJ584296–FJ584299),
BrMYB122-1 (FJ584300)
(Received 15 February 2009, revised 31
March 2009, accepted 24 April 2009)
doi:10.1111/j.1742-4658.2009.07076.x
Glucosinolates play important roles in plant defense against herbivores and
microbes, as well as in human nutrition. Some glucosinolate-derived isothi-
ocyanate and nitrile compounds have been clinically proven for their anti-
carcinogenic activity. To better understand glucosinolate biosynthesis in
Brassica rapa, we conducted a comparative genomics study with Arabidopsis
thaliana and identified a total of 56 putative biosynthetic and regulator genes.
This established a high colinearity in the glucosinolate biosynthesis path-
way between Arabidopsis and B. rapa. Glucosinolate genes in B. rapa share
72–94% nucleotide sequence identity with the Arabidopsis orthologs and
exist in different copy numbers. The exon ⁄ intron split pattern of B. rapa is
almost identical to that of Arabidopsis, although inversion, insertion, deletion
and intron size variations commonly occur. Four genes appear to be
nonfunctional as a result of a frame shift mutation or
retrotransposon insertion. At least 12 paralogs of desulfoglucosinolate
sulfotransferase were found in B. rapa, whereas only three were found in
Arabidopsis. The expression of those paralogs was not tissue-specific but
varied greatly depending on B. rapa tissue types. Expression was also
developmentally regulated in some paralogs but not in other paralogs.
Most of the regulator genes are present as triple copies. Accordingly, glucosinolate
synthesis and regulation in B. rapa appear to be more complex
than in Arabidopsis. With the isolation and further characterization of
the endogenous genes, health-beneficial vegetables or desirable animal feed
crops could be developed by metabolically engineering the glucosinolate
pathway.
Abbreviations
BAC, bacterial artificial chromosome; CDS, coding sequence; EST, expressed sequence tag; LTR, long terminal repeat; MAM,
methylthioalkylmalate synthase; NCBI, National Center for Biotechnology Information.
FEBS Journal 276 (2009) 3559–3574 © 2009 National Academy of Agricultural Science, RDA, Korea. Journal compilation © 2009 FEBS
Glucosinolates, a group of sulfur-rich secondary
metabolites, have received much attention because
their breakdown products display several potent bioactivities,
serving in plant defense and as anticarcinogenic
compounds in mammals [1–3]. Upon
tissue disruption, the enzyme myrosinase cleaves off
the glucose group from a glucosinolate, and the
remaining molecule then quickly converts to a bioac-
tive substance (i.e. an isothiocyanate, nitrile or thiocya-
nate). Among the isothiocyanates, sulforaphane, a
derivative of glucoraphanin, is known to be the most
promising anticancer agent because of its strong and
broad spectrum activity against several types of cancer
cells [3–10]. Indole-3-carbinol, a derivative of glucobrassicin,
is also a good anticarcinogen. Both
exhibit their effects by inducing phase II detoxification
enzymes, altering estrogen metabolism, blocking the
cell cycle or protecting against oxidative damage
[11–15]. Phenethyl isothiocyanate, a derivative of
gluconasturtiin, was reported to be effective for
chemoprotection [16–18], although it possesses geno-
toxic activity [19–21]. Crambene (1-cyano-2-hydroxy-
3-butene), an aliphatic nitrile derived from progoitrin,
upregulates the synthesis of glutathione S-transferase
in the liver and other organs [22].
Glucosinolates are classified into three major groups,
namely aliphatic, indolyl and aromatic glucosinolates,
based on the amino acids from which they are synthe-
sized [23]. Biosynthesis of aliphatic and aromatic gluc-
osinolates generally involves three steps (Fig. 1) and
begins with the elongation of methionine and phenyl-
alanine, respectively. The initial step of aliphatic
glucosinolate synthesis is catalyzed by methylthioalkyl-
malate synthase (MAM) to form the elongated
homologs [24,25]. The core structures are made via
oxidation by cytochrome P450 enzymes, CYP79 and
CYP83, followed by C-S cleavage, glucosylation and
sulfation. Finally, the side chains are modified by
oxidation, elimination, alkylation or esterification.
Some of the genes involved in this step, FMO GS-OX1–5,
AOP, GSL-OH and BZO1, have been isolated recently
[26–31].

Cruciferous vegetables, including broccoli, cabbage,
Chinese cabbage, cauliflower, and brussels sprouts, are
rich in glucosinolates. A high intake of cruciferous
vegetables was shown to significantly reduce the risk
of certain cancers and cardiovascular diseases [32–34].
Chinese cabbage (Brassica rapa ssp. pekinensis) is one
of the most highly consumed vegetable crops in Asia.
However, unlike broccoli, many Chinese cabbage culti-
vars do not produce detectable levels of glucoraphanin.
To date, most of the structural genes responsible for
the biosynthesis of the three groups of glucosinolates
have been identified and characterized in Arabidopsis
[23,35]. In addition, several regulators that control
glucosinolate biosynthesis have been identified recently
in Arabidopsis [36–43]. However, little is known about
the specific genes existing in Brassica crops, except for
the MAM and AOP genes in Brassica oleracea [44–46].
The glucosinolate profile is highly dependent on
genotype, although it is also affected by developmental
or environmental changes [47–49]. Previously, we
reported that the ectopic expression of Arabidopsis
glucosinolate synthesis genes altered the glucosinolate
profile in Chinese cabbage [50,51]. Because most of the
Arabidopsis genes encoding glucosinolate biosynthesis
pathways have been identified and Chinese cabbage is
a close relative of Arabidopsis, comparative genomic
studies will allow for the easy identification of relevant
genes in Brassicas. The identification and characteriza-
tion of glucosinolate synthesis genes in Chinese cab-
bage would pave the way for further improvement of

agronomic traits via genetic engineering. In the present
study, we report the genome-wide identification of
B. rapa glucosinolate synthesis (BrGS) and regulator
genes using our B. rapa genome sequence in conjunc-
tion with the available Arabidopsis sequence. We also
show that many BrGS genes exist in a small multigene
family and that at least 12 desulfoglucosinolate
sulfotransferase (BrST) paralogs are present and are
differentially expressed.
Results
BrGS gene identification from cDNA and bacterial
artificial chromosome (BAC) libraries
As part of the B. rapa genome sequencing project, we
produced 127 143 expressed sequence tags (ESTs) from
28 different cDNA libraries that were released to the
National Center for Biotechnology Information
(NCBI) database and a new B. rapa EST database,
BrEMD ( with
microarray data. Furthermore, we determined more
than 127 000 BAC end sequences, and approximately
589 seed BACs were sequenced and anchored in
Arabidopsis whole chromosomes. The 65.8 Mb seed
BAC sequence information covered approximately
75.3% of the Arabidopsis genome and 40% of the
B. rapa euchromatin region [52]. On the basis of these
databases, homologous genes were identified by a
blastn search using the Arabidopsis gene sequence as
query. All the ESTs that matched each query sequence
were aligned to remove the redundant clones, and
EST clones containing a start codon were resequenced

to generate the full-length cDNA sequence. Through
Fig. 1. Biosynthesis pathways of the three major groups of glucosinolates in B. rapa. The genes involved in each step are shown. Numbers
in parenthesis denote gene copy numbers.
this alignment, a total of 35 different genes was found
from ESTs. In the same way, blastn searches were per-
formed against the BAC sequence databases, yielding
44 different genes, among which 23 overlapped the EST
sequences. Thus, a total of 56 individual genes was
identified from both EST and BAC clones, of which
44 contained the full-length coding sequence (CDS)
(Fig. 1, Tables 1 and 2). They contain all the homologs
of Arabidopsis except for CYP79F2, FMO GS-OX2–4,
AOP3 and MYB76. In Arabidopsis, AOP2 and AOP3
are tandemly located on chromosome IV [29]; however,
only AOP2 was found in B. rapa. The same observation
was also made in B. oleracea [45]. This suggests that
duplication occurred in Arabidopsis after its divergence
from Brassica. Four genes, BrUGT74C1-1, BrST5b-6,
BrST5b-4 and BrMYB122-1, appear to be nonfunctional
as a result of frame shift or retrotransposon insertion
mutations (Fig. 2).
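The total of 56 genes reported in this paragraph follows from inclusion-exclusion over the EST and BAC search results. A minimal Python sketch, using only the counts stated in the text (the variable and function names are illustrative and not part of the original analysis):

```python
# Counts reported in the text; names are illustrative assumptions.
genes_from_est = 35   # distinct genes found among EST clones
genes_from_bac = 44   # distinct genes found in BAC sequences
genes_in_both = 23    # genes found in both sources

def union_size(a: int, b: int, overlap: int) -> int:
    """Size of the union of two sets, given their sizes and their overlap."""
    return a + b - overlap

total_genes = union_size(genes_from_est, genes_from_bac, genes_in_both)
print(total_genes)  # 56
```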
To estimate the total number of putative BrGS genes
in the whole genome of B. rapa, a genomic blot was
performed using the CYP79F1/F2, CYP79B2/B3,
CYP83A1 and CYP83B1 genes as probes (see Supporting
information, Fig. S1) [53]. This analysis predicted
the presence of a total of eight genes (two, three, two
Table 1. Comparison of putative BrGS biosynthetic genes with the Arabidopsis orthologs. The nucleotide sequence of the coding region
was used for comparison analysis; the BrGS gene sequence is from the partial- or full-length CDS; the single percentage indicates the single
B. rapa orthologous sequence that was available. Most of the genes are full length except those marked with an asterisk.
Glucosinolate pathway; B. rapa gene name; Corresponding AGI; No. of genes found; Corresponding clones (BAC, EST); % Identity, At and B. rapa
Amino acid side chain elongation
BrBCAT4 At3g19710 2 KBrH046K16 BR069190 83.8–83.9
h BR005855
78.4–87.0
BrMAM At5g23010 4 BR043724*
KBrB010E08F* h
At5g23020 h BR003821
KBrH121C04F* h
BrBCAT3 At3g49680 2 KBrH045F08R BR008244* 84.7–87.0
h BR080925*
Core structure formation step
BrCYP79F1 At1g16410 1 KBrB035G16 BR007081 85.4
BrCYP79B2 At4g39950 2 KBrB022O03 BR046183 89.0–89.3
KBrH106H11 BR1 20984
BrCYP79B3 At2g22330 1 h BR098069 89.5
BrCYP79A2 At5g05260 1 KBrS003K07 h 85.6
BrCYP83A1 At4g13770 2 KBrH086H05R BR058540 87.3–87.4
KBrH009l04 BR061092
BrCYP83B1 At4g31500 1 h BR091686 90.1
BrC-S lyase At2g20610 2 h BR087395 86.6–87.1
KBrH010M17 BR098736
BrUGT74B1 At1g24100 1 KBrH015M19 BR100939 84.5
h BR043439
BrUGT74C1 At2g31790 3 h BR043626 85.1–88.4
KBrH036G21R* h
BrST5a At1g74100 2 KBrB119E11F BR015379 86.6–86.6
KBrB056L1 BR082516
KBrB056L15 BR059286
KBrH096A10 h
BrST5b At1g74090 9 KBrB041J04 h 76.0–85.9
KBrB034H04 h
KBrB069A23 h
BrST5c At1g18590 1 h BR094491 85.7
Side chain modification
BrFMO GS-OX1 At1g65860 1 KBrS002F02F BR021429 83.3
BrFMO GS-OX5 At1g12140 1 KBrB032C15 BR116105 83.3
BrAOP2 At4g03060 1 KBrB002P01 BR067797 71.9
BrGSL-OH At2g25450 1 KBrH047C14 BR097443 85

BrBZO1p At1g65880 1 KBrB083K19 h 81.1
References [24], [25], [26], [27], [28], [29], [31], [57], [58], [59], [79]
and one copies for each gene, respectively). On the
other hand, a total of seven genes was found from our
database search for those genes, suggesting that the
percentage of BrGS genes identified in the present
study is approximately 87.5%.
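The 87.5% estimate is the ratio of genes recovered by database search to genes predicted by the genomic blot for the same probes. A short Python sketch of that arithmetic, using the copy numbers given in the text (the variable names are illustrative):

```python
# Copy numbers predicted by the genomic blot for CYP79F1/F2, CYP79B2/B3,
# CYP83A1 and CYP83B1, in that order (values taken from the text).
blot_predicted = [2, 3, 2, 1]
found_in_database = 7  # genes recovered for the same probes by database search

coverage = found_in_database / sum(blot_predicted) * 100
print(f"{coverage:.1f}%")  # 87.5%
```

This extrapolates from four probe sets to the whole pathway, so it is best read as a rough indication rather than a measured recovery rate.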
BrGS gene identity with Arabidopsis and other
Brassica orthologs
BrGS biosynthetic genes share 72–90% nucleotide
sequence identity with Arabidopsis orthologs and 28
genes exist in a small multigene family (Table 1). This
close relatedness is further substantiated by our phy-
logenetic tree analyses (Fig. 3; see also Supporting
information, Figs S2–S11). However, most of the
BrGS genes share more than 90% identity with other
Brassica orthologs (Table 3). This is consistent with
the notion that the Brassica species evolved after
divergence from the Arabidopsis lineage. Notably,
BrAOP2 has the lowest sequence identity with the
orthologs of Arabidopsis and B. oleracea. Identities
within the BrGS paralogs are usually higher than
those with Arabidopsis and other Brassica species. All
of the BrST5b paralogous genes except BrST5b-4
share more than 80% sequence identity with AtST5b
(Table 4). Identities between BrST5b and AtST5b
(76–86%) are comparable to those between tandem
BrST5b repeats (77–89%) and between nontandem
repeats BrST5b-6 and BrST5b-9 (88%) (Fig. 4,
Table 4). This suggests that duplication occurred after
a very recent divergence between Arabidopsis and
B. rapa. One putative benzoate-CoA ligase gene
BrBZO1p was identified (see also Supporting informa-
tion, Fig. S11). It has a similarity of 81% compared
to both BZO1 and At1g65890.
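The identity percentages discussed above are computed from pairwise alignments of coding sequences. The sketch below shows one common way to derive such a value from two sequences that have already been aligned. It is an illustration only: the function name, the gap handling and the toy sequences are assumptions, and the software used for the published values may apply a different denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical positions between two pre-aligned sequences.

    Columns where both sequences carry a gap are ignored; a gap opposite a
    base counts as a mismatch. Other tools may use a different denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared

# Toy aligned fragments (hypothetical, not real BrST5b or AtST5b sequence).
toy_at = "ATGGCACCTGAG"
toy_br = "ATGGCTCCT-AG"
print(round(percent_identity(toy_at, toy_br), 1))  # 83.3
```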
Similar to the biosynthetic genes, BrGS regulator
genes share 81–94% nucleotide sequence identity with
Arabidopsis orthologs and 15 genes exist in a small
multigene family (Table 2). Most of the genes are trip-
licated, indicating that regulator genes are mostly
retained after the Brassica genome triplication.
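The regulator gene counts can be checked against the copy numbers listed in Table 2. In the Python sketch below, the figure of 15 is reproduced by summing the copies of every gene that has more than one copy; that reading of "exist in a small multigene family" is an interpretation, not something the text states explicitly.

```python
# Copy numbers of regulator genes as listed in Table 2.
regulator_copies = {
    "BrDof1.1": 2,
    "BrIQD1-1": 1,
    "BrMYB28": 3,
    "BrMYB29": 3,
    "BrMYB34": 3,
    "BrMYB51": 4,
    "BrMYB122": 1,
}

total = sum(regulator_copies.values())
# Assumed interpretation: copies belonging to genes with more than one copy.
in_multigene_families = sum(n for n in regulator_copies.values() if n > 1)
triplicated = [gene for gene, n in regulator_copies.items() if n == 3]

print(total, in_multigene_families, triplicated)
```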
Structure of BrGS genes
Ordered assembly of the overlapping sequences of
BAC and EST clones yielded the overall gene structures
shown in Fig. 5. The exon and intron structures
of the BrGS genes were identical to those of the
Arabidopsis homologs. However, insertions, deletions and intron
size variations were commonly noted in BrGS genes.
One of the two BrC-S lyase genes had a 2 bp deletion
in the last exon, which resulted in a 3′ truncated protein
with a 16 amino acid deletion compared to the
Arabidopsis homolog. Truncation of the 3′ end exon
might alter either gene function or the expression pattern
in such a way as to change feedback regulation, as
previously proposed by Gao et al. [46]. Desulfoglucosinolate
sulfotransferase genes had no introns
in either Arabidopsis or B. rapa (Fig. 5A). The AOP2
structure of B. rapa was compared with that of B. oleracea
and Arabidopsis. All three species contained four
exons and three introns, along with considerable
changes in intron sizes (Fig. 5B). One of the two
BrST5a genes contained a 3 bp insertion (Fig. 5A),
which did not lead to a frame shift mutation.
Insertion or deletion often gives rise to a frame shift
mutation that causes the loss of gene function. This
type of mutation occurred in two BrGS genes with pre-
mature stop codons immediately after the deletion sites
(Fig. 2A). Among nine BrST5b paralogs, BrST5b-4
Fig. 2. Structures of the predicted nonfunctional BrGS genes. (A) Three of the four carried deletion mutations and (B) the fourth one carried a
putative retrotransposon insertion. The non-LTR retrotransposon insertion is marked as an approximately 6 kb insertion. Asterisks indicate the position
of a premature stop codon. Thick, thin and dotted lines denote the exon, intron and the gap between BrST5b-4 and BrST5b-x, respectively.
appears to be a pseudogene because it contains an
approximately 6 kb insert of a putative non-long ter-
minal repeat (LTR) retrotransposon that encodes a
reverse transcriptase (Fig. 2B). Transposon insertion
mutations in coding sequences or intergenic regions
were also previously observed in B. oleracea [45].
Another gene, BrST5b-x, with only a 150 bp 3′ end
partial sequence, was found to be located in the intergenic
region approximately 500 bp downstream of
BrST5b-4 (Fig. 2B). However, we did not consider this
to be another copy of BrST5b because only a small
remnant of the sequence is left as a result
of a massive deletion event. Pseudogenes are assumed
to arise frequently during genome evolution and are
often regarded as ‘molecular fossils’ in evolutionary
genomics [54]. Pseudogenes might be the result of
natural selection reducing functional redundancy.
However, the divergent copies of duplicated genes
may diversify further and evolve through neofunctionalization
or subfunctionalization [55,56].
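The contrast drawn in this section between the 3 bp insertion in BrST5a, which preserved the reading frame, and the deletions that introduced premature stop codons can be illustrated with a short Python sketch. The sequence below is a toy example invented for illustration and is not taken from any BrGS gene; the function names are likewise hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def shifts_frame(indel_length: int) -> bool:
    """An indel disrupts the reading frame unless its length is a multiple of 3."""
    return indel_length % 3 != 0

def first_stop_codon(cds: str):
    """Return the 0-based codon index of the first in-frame stop, or None."""
    cds = cds.upper()
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3
    return None

# Toy coding sequence (hypothetical): ATG GCA CCT GAG GGA TAA
toy_cds = "ATGGCACCTGAGGGATAA"
# Delete the 2 bp at positions 3-4 ("GC").
deleted_2bp = toy_cds[:3] + toy_cds[5:]
# Insert 3 bp ("AAA") directly after the start codon.
inserted_3bp = toy_cds[:3] + "AAA" + toy_cds[3:]

print(shifts_frame(2), first_stop_codon(deleted_2bp))   # True 2
print(shifts_frame(3), first_stop_codon(inserted_3bp))  # False 6
print(first_stop_codon(toy_cds))                        # 5
```

In the toy case the 2 bp deletion brings a TGA triplet into frame at the third codon, whereas the 3 bp insertion only moves the original stop codon one position downstream.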
Divergent duplication and differential expression
of BrST genes
In comparison with three orthologs in Arabidopsis,
desulfoglucosinolate sulfotransferase exists in a small
Table 2. Comparison of putative BrGS regulator genes with the Arabidopsis orthologs. The nucleotide sequence of the coding region was
used for comparison; the BrGS gene sequences used are either partial or full-length CDS. Most of the genes are full length except those
marked with an asterisk.
Transcription factors; B. rapa gene name; Corresponding AGI; No. of genes found; Corresponding clones (BAC, EST); % Identity, At and B. rapa
Nuclear-localized regulators BrDof1.1 At1g07640 2 KBrH010M08 h 81.7–81.9
KBrB056G08R* h
BrIQD1-1 At3g09710 1 KBrB055G10 h 81.4
R2R3-Myb transcription factors for aliphatic glucosinolates
BrMYB28 At5g61420 3 KBrB034G03 BR046041 84.1–85.4
KBrB051M06 h
KBrH005L20 BR078654
BrMYB29 At5g07690 3 KBrBOO1B07 BR005887 83.2–87.0
KBrS005P19F* h
KBrB132A06R* h
R2R3-Myb transcription factors for indole glucosinolates
BrMYB34 At5g60890 3 KBrB051M06 BR012922 82.1–94.3
KBrB118H07R & KBrH078K01R BR115967
h BR102850*
BrMYB51 At1g18570 4 KBrH092O19 BR104839 81.0–89.6
KBrB065D24 BR101256
h BR116816*
KBrE052E18F* h
BrMYB122 At1g74080 1 KBrB056L15 h 82.4
References [36], [37], [38], [39], [40],[41],[42],[43]
Fig. 3. Nonrooted neighbor-joining phylogenetic tree of B. rapa
desulfoglucosinolate sulfotransferases and Arabidopsis sulfotrans-
ferases. Coding sequences of AtST5b were used to identify the
orthologs between these two species because some of BrST5b are
pseudogenes. Bootstrap values with 500 replicates are denoted as
percentages.

Fig. 4. Comparative map of the five BACs
containing BrST paralogs and their counter-
parts in Arabidopsis. At Chr1, Arabidopsis
chromosome 1; Br R7 (Chr7), B. rapa link-
age group R7 (chromosome 7); Br R9
(Chr1), B. rapa linkage group R9 (chromo-
some 1); Mb, megabase; cM, centimorgan.
The loci of AtST5a,b (At1g74100 and
At1g74090) and BrST counterparts are indi-
cated by oval-shaped bars. The loci that cor-
respond to the five Brassica BACs are all
located on Arabidopsis chromosome 1 and
are marked by stick bars. Colinear and non-
colinear genes are indicated by dashed and
dotted lines, respectively. The location of
KBrB034H04 on the B. rapa chromosome
has not yet been established.
Table 3. Sequence similarities between BrGS genes and other Brassica orthologs. The nucleotide sequence of the coding region was used
for comparison analysis; the highest percentages are shown when a gene has several copies.
B. rapa gene; Gene of other Brassica species; % of homology; Sources
BrMAM BoGSL-Elong(L) 97.8 Brassica oleracea BAC clone B19N3 [80]
BrCYP79F1 BoCYP79F1 97.6 Brassica oleracea BAC clone B77C13 (accession EU579455, NCBI)
BrCYP79B2 SaCYP79B2 93.9 Sinapis alba cytochrome P450 CYP79B1 (accession AF069494) [81]
BnCYP79B2 99 Brassica napus cytochrome P450 CYP79B5 (accession AF453287) [82]
BrUGT74B1 BnUGT74B1 97.4 Brassica napus thiohydroximate S-glucosyltransferase

BoUGT74B1 97.3 Brassica oleracea BAC clone B16J1 (accession EU579454)
BrGSL-OH BoGSL-OH 84.3 Brassica oleracea BAC clone B67C16
BrAOP2 BoGSL-ALKa(b) 77 Brassica oleracea BAC clone B21H13 [45]
Table 4. Similarity and divergence among desulfoglucosinolate sulfotransferase genes of Arabidopsis and B. rapa. Values represent the per-
centage similarity in the upper triangle area and percentage divergence in the lower triangle area as demarcated by the diagonally aligned
black squares; full-length CDS was employed for the analyses using
DNASTAR software (DNASTAR Inc., Madison, WI, USA).
multigene family with 12 paralogs in which two EST
clones are not mapped on B. rapa (Table 1, Fig. 4).
Most of them are clustered in a tandem array, as
shown in the chromosomal loci of BAC clones
(Fig. 4). In addition, they are usually clustered
together in the phylogenetic tree (Fig. 3).
Two Arabidopsis desulfoglucosinolate sulfotrans-
ferases, AtST5a (At1g74100) and AtST5b (At1g74090),
are involved in the biosynthesis of indolyl and ali-
phatic glucosinolates, respectively [57]. Nevertheless,
they share 80% nucleotide identity and are tandemly
located on chromosome 1 (Fig. 4). Thus, we examined
the expression patterns of BrST genes in different tissues
by RT-PCR (Fig. 6). BrActin1, an actin gene of
B. rapa, was used as an internal control to adjust the
amount of cDNA template for PCR because it is constitutively
expressed in all types of tissues. Primers
were designed from the gene-specific untranslated
region (see Supporting information, Table S1). All of
the genes except BrST5b-4 were expressed in all six different
tissues, although the expression profiles were different.
Generally, BrST5b was expressed at higher
levels than BrST5c but at lower levels than BrST5a.
Specifically, BrST5b-6 and BrST5b-7 were expressed at
the lowest levels because their products were not
detected until after 40 cycles of PCR (Fig. 6). All of the
PCR products were sequenced and were matched to
individual gene sequences (data not shown). BrST5a-1
Fig. 5. Structures of representative BrGS genes. (A) Comparison
with Arabidopsis orthologs. (B) Structural comparison of AOP2
orthologs of Arabidopsis (At), B. rapa (Br) and B. oleracea (Bo).
Representative BrGS gene structures were composed based on
the full-length genomic, cDNA, or coding sequences of BAC and
EST clones. Arabidopsis gene structures were generated according
to NCBI sequence information. Each pair of genes was aligned in a
colinear form. Positions of introns are indicated by the triangles,
above which intron sizes (bp) are shown as numerals. The position
and size of the nucleotide insertion and deletion are also marked as
In ⁄ Del.
Fig. 6. RT-PCR analysis of BrST genes in different types of tissues.
L, mature leaf; R, mature root; FB, floral bud; SL, seedling; S, sta-
men; C, carpel. The PCR products of the BrST genes are approxi-
mately 1 kb; BrActin1 is approximately 450 bp, which serves as an
internal control.
was strongly expressed in all tissues except the stamen.
By contrast, BrST5a-2 was strongly expressed in the
stamen, but weakly in the floral bud and carpel. Overall,
BrST5b paralogs were expressed at a very low level
in the floral bud. However, some genes (i.e. BrST5b-1
and BrST5b-2) were expressed strongly in the carpel,
whereas others (i.e. BrST5b-2, BrST5b-8 and BrST5b-9)
were expressed strongly in the stamen. The results
obtained demonstrate that the expression of the
paralogs is not tissue-specific but varies greatly
depending on tissue type. In terms of the overall
expression level, mature leaf and root expressed BrST
paralogs at higher levels than other tissues, demonstra-
ting functional redundancy for differential expression.
In seedling tissue, BrST5a paralogs were more strongly
expressed than BrST5b paralogs. No significant differ-
ences in the expression levels of BrST5a paralogs were
noted between the seedling and mature leaf and root
tissues. On the other hand, significant differences
between those tissues were found for the expression
levels of BrST5b paralogs except BrST5b-1. Thus,
expression is developmentally regulated in some BrST5b
paralogs but not in BrST5a paralogs.
Discussion
Similarity between B. rapa and Arabidopsis
in the glucosinolate biosynthesis pathway
Our B. rapa genome sequence database searches identified
the counterparts of most Arabidopsis glucosinolate
synthesis genes, and they are present in various
copy numbers (Fig. 1). Only a few genes, those corresponding
to Arabidopsis CYP79F2, FMO GS-OX2–4 and
AOP3, were not found in B. rapa. Thus, a high colinearity
in the glucosinolate biosynthesis pathway exists
between Arabidopsis and B. rapa despite the difference
in gene copy numbers.
As the first step, two different genes, BCAT and
MAM, are known to be involved in the chain elongation
of Met-derived aliphatic glucosinolate biosynthesis.
BCAT4 and BCAT3 enzymes catalyze the deamination
and transamination, respectively [58,59]. B. rapa con-
tains two BCAT4 paralogs that have 92% nucleotide
sequence identity and are the same size. B. rapa also
carries two BCAT3 paralogs, one of which has a full-length
CDS. The MAM enzyme catalyzes the condensation
of acetyl-coenzyme A with a series of ω-methylthio-2-oxoalkanoic
acids. MAM1/MAM2, two tandem
paralogs found in some Arabidopsis ecotypes, are
responsible for the first two cycles of chain elongation
[24]. The MAM3 enzyme catalyzes all the different cycles of
Met chain elongation [25]. We identified four MAM
paralogs in the B. rapa genome that share approximately
78–87% identity with the Arabidopsis orthologs,
although we were unable to determine which
of these is individually equivalent to MAM1, 2 and 3.
Two of them are not identical in an approximately
200 bp region at the 3′ end. This is also the case in the
B. oleracea MAM (BoGSL-ELONG) gene family [46],
where it did not affect the enzymatic function, which is
equivalent to that of the Arabidopsis ortholog MAM1 [60]. In addition to
tissue-dependent differential expression, the members of the
BrMAM gene family may encode enzymes with different
biochemical properties with respect to chain elongation,
as do the Arabidopsis MAM orthologs. Two Arabidopsis
genes, IPMS1 and IPMS2, that encode isopropylmalate
synthase are similar to the MAM family genes, with
60% similarity in their amino acid sequence [46,61].
Nevertheless, they are not involved in Met chain
elongation but are involved in leucine biosynthesis. Phylogenetic
analysis indicates that the BrMAM genes do
not belong to the IPMS family but, instead, belong to
the MAM gene family. We were unable to identify
the genes responsible for phenylalanine chain elonga-
tion, an initial step of aromatic glucosinolate synthesis,
because the corresponding genes have not yet been
isolated in Arabidopsis and other Brassica species.
As the second step, the formation of the glucosino-
late core structure is initiated by the conversion of
amino acid to the corresponding aldoxime, and this is
catalyzed by the CYP79 enzymes [23]. Three groups of
CYP79 family genes, CYP79F1/F2, CYP79B2/B3 and
CYP79A2, are involved in aliphatic, indolyl and aromatic
glucosinolate biosynthesis, respectively. Our
database searches indicate two copies of the CYP79B2
and C-S lyase genes and three copies of UGT74C1 in
B. rapa, unlike the single copy genes in Arabidopsis.
Such duplication may provide functional redundancy
that permits tissue- or development-dependent differential
expression. Excluding two copies of nonfunctional
BrST5 carrying frameshift and transposon insertion
mutations, eight copies of BrST5 are actually involved
in aliphatic glucosinolate synthesis in B. rapa (Table 1,
Figs 2 and 4), whereas two copies of the orthologs
exist in Arabidopsis. The expression level of BrST5 is
not only developmentally regulated, but also highly
dependent on tissue type (Fig. 6). Because sulfonation
is the penultimate step of glucosinolate biosynthesis, the
expression of BrST5 may play a crucial role in the tissue-specific
and developmental accumulation of glucosinolates.
It remains to be determined whether BrST5
transcript levels are correlated with the accumulated
levels of indole and aliphatic glucosinolates.
The final step of glucosinolate synthesis is side
chain modification and, currently, this step is well
characterized only for aliphatic glucosinolates in
Arabidopsis. Glucoraphanin (4-methylsulfinylbutyl-glucosinolate)
is abundant in the Columbia ecotype but is absent in
the Landsberg ecotype of Arabidopsis. This difference is
attributed to the AOP2 gene, whose expression diverts
glucoraphanin into alkenyl-glucosinolate [29]. Our genome
database search revealed a single
copy of AOP2 in B. rapa. However, two AOP2 quantitative
trait loci, Ali-QTL3.1 and Ali-QTL9.1, were
recently reported to be involved in determining the type
and concentration of glucosinolates found in B. rapa
leaves [62]. Consistent with this finding, our Southern
blot analysis indicated the presence of two copies of
AOP2 in B. rapa (data not shown). The presence of
AOP2 explains why glucoraphanin was not detectable
in Chinese cabbage [50,51]. In B. oleracea, two tandemly
repeated copies of AOP2 contain a 2 bp deletion
in the third exon, which is responsible for the high
accumulation of glucoraphanin (Fig. 5B) [45]. Brassica
napus, a species resulting from interspecific hybridization
between B. rapa and B. oleracea, was reported to
lack glucoraphanin [63]. This is most likely a
result of the AOP2 gene introduced from B. rapa. The
content of glucoraphanin in B. rapa and B. napus could
be elevated by inhibiting AOP2 expression via antisense
or RNA interference approaches.
Similarity of the genes controlling glucosinolate
biosynthesis between B. rapa and Arabidopsis
B. rapa contains the orthologs of the Arabidopsis
glucosinolate regulators, except MYB76 (Table 2).
Unlike in Arabidopsis, these regulators are normally triplicated,
consistent with the Brassica genome triplication event. The
duplication and divergence of the regulators in a small
multigene family, along with multiple duplications of
their target biosynthesis genes, may result in phenotypic
variation. AOP2/AOP3 null accessions of Arabidopsis
were shown to accumulate an increased level of the
precursor methylsulfinylalkyl glucosinolate but a
considerably lower level of total aliphatic glucosinolates
than accessions with a functional AOP2 allele,
which has been explained by the differential feedback
regulation of the transcriptional regulators MYB28, MYB76 and
MYB29 by the side chain modification end products [29,30,64].
Similarly, epistatic interactions between AOP2 and the
transcriptional regulators MYB28 and MYB29 may exist in B. rapa.
BrGS gene duplication
The Brassica genome is believed to have triplicated
soon after its divergence from Arabidopsis [65–67]. The
genome sizes of B. rapa (550 Mb) and B. oleracea
(696 Mb) are more than four- and five-fold greater
than that of Arabidopsis (125 Mb), respectively [68,69].
This could be explained in part by the presence of big-
ger gene families as a result of genome diploidization,
segmentation or gene duplication. In B. oleracea,
genome rearrangement is commonly followed by gene
loss, fragmentation and dispersal [70]. Many gene
duplications arose as a result of the triplication event,
and those genes involved in signal transduction or tran-
scriptional control are more extensively retained than
others during the evolution process [70]. Some tandemly
duplicated genes in B. rapa and B. oleracea are likely to
be the result of an unequal crossover during the rear-
rangement process after Brassica genome triplication
[45,46]. Approximately 14% of the B. rapa genome is estimated
to consist of transposable elements, the majority
of which are retrotransposons [69]. It has been proposed
that gene duplication is also facilitated by retrotransposons
carrying an LTR [71]. BrST is a good example of a
multigene family with tandem arrays of genes in
B. rapa. The genes adjacent to two tandem repeats,
BrST5b-1 and BrST5b-2, were colinear with their
Arabidopsis counterparts, whereas all the other BrST genes
had moved to completely new positions (Fig. 4). BrST5b-1,
BrST5b-3, BrST5b-4 and BrST5b-5 were found to be flanked by
LTR Copia-like retrotransposons (data not shown). BrST5b-4 was
disrupted by insertion of a putative non-LTR retrotransposon
and shares 89% sequence identity with BrST5b-5
in a tandem array. Both genes are also tandemly arranged
with BrST5b-3 in the same BAC clone, but their sequence
identities with BrST5b-3 are lower than the identity between
the two of them. This suggests that two consecutive
duplication steps occurred at
the same locus.
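As a quick arithmetic check of the genome size comparison made at the start of this section, the fold differences can be recomputed from the quoted sizes. The sketch below uses only the three values cited in the text (125, 550 and 696 Mb); the function and variable names are illustrative and not part of the original analysis.

```python
# Genome sizes in Mb, as quoted in the text (refs 68 and 69).
GENOME_SIZE_MB = {"Arabidopsis": 125, "B. rapa": 550, "B. oleracea": 696}


def fold_relative_to(reference, sizes):
    """Return each genome size divided by the size of the reference genome."""
    ref_size = sizes[reference]
    return {
        name: size / ref_size
        for name, size in sizes.items()
        if name != reference
    }


folds = fold_relative_to("Arabidopsis", GENOME_SIZE_MB)
for species, fold in folds.items():
    print(f"{species}: {fold:.2f}-fold the Arabidopsis genome size")
```

This gives about 4.4-fold for B. rapa and about 5.6-fold for B. oleracea, in line with the "more than four- and five-fold" statement.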
Sequence comparison of glucosinolate synthesis
genes reflects evolution of Brassica lineage
Soon after divergence from the Arabidopsis lineage and
genome triplication, extensive interspersed gene loss or
gain events and large-scale chromosomal rearrange-
ments gave rise to three basic diploid species: B. rapa
(AA genome), Brassica nigra (BB genome) and B. oleracea
(CC genome) [66]. Our data support this presumptive
evolutionary order; BrGS sequence similarities
among the Brassicas (mostly > 90%) are normally
higher than those between Brassica and Arabidopsis
(mostly 80–90%). Individual tandem repeats or dispersed
duplication events are indicative of the self-rearrangements
occurring within each species. A
convincing example is that AOP3 is only present in
Arabidopsis and that AOP2 is tandemly duplicated in
B. oleracea but not in B. rapa. Even within B. oleracea,
AOP2 is nonfunctional in broccoli but is an
active, functional gene in collard [44], suggesting a very
recent evolutionary event.
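The similarity figures quoted above (mostly > 90% among the Brassicas, mostly 80–90% against Arabidopsis) are percent identities between aligned sequences. A minimal sketch of that calculation follows. It assumes the two sequences have already been aligned to equal length with "-" as the gap character, and it skips gapped columns; the text does not state how gaps were treated, so that choice is an assumption, and the toy sequences are invented for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gapped columns are excluded from the count
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned sequences (hypothetical, not taken from the BrGS data).
identity_no_gap = percent_identity("ATGGCTTCAA", "ATGGCATCAA")
identity_with_gap = percent_identity("ATG-CTTCAA", "ATGGCTTCAT")
print(f"{identity_no_gap:.1f}% and {identity_with_gap:.1f}%")
```

Counting gapped columns as mismatches instead would lower the values, so the gap convention should be stated whenever identities from different sources are compared.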
Perspective for metabolic engineering of
glucosinolate in B. rapa
Although a high colinearity in the glucosinolate bio-
synthetic pathway generally exists between Arabidopsis
and B. rapa (Fig. 1), the glucosinolate profiles of
B. rapa are quite different from those of Arabidopsis.
This could be explained in part by the greater number of BrGS
genes relative to their Arabidopsis counterparts, which may
affect the regulatory circuit controlling the expression of
multicopy genes. Currently, no information is available about the
rate-limiting step of metabolic flux or the regulation
of glucosinolate biosynthesis at the post-transcriptional
and post-translational levels. Nevertheless, assuming
that, overall, the glucosinolate biosynthetic pathway
and regulatory networks of B. rapa are analogous to
those of Arabidopsis, glucosinolate profiles could be
quantitatively and qualitatively changed by combining
at least three approaches. First, the different enzymes of the
CYP79 family are responsible for various types of
glucosinolates. Thus, altering the endogenous level of
CYP79s and introducing exogenous CYP79s may
make it possible to generate custom-designed glucosinolate
profiles. Second, the glucosinolate concentration could
be changed by altering the side chain modification step
because biosynthesis of glucosinolates has been shown to be
regulated via a feedback mechanism in Arabidopsis,
although this remains to be confirmed in B. rapa.
Third, altered expression of the BrGS regulators, including
MYB transcription factors, would provide not only
very efficient metabolic engineering tools to manipulate
both the content and composition of glucosinolates in
B. rapa vegetable plants, but also the genetic tools to
understand how such plants control the production of
these anticarcinogenic, antioxidative and antimicrobial
compounds.
Experimental procedures
Construction of the cDNA and BAC libraries
B. rapa cultivar ‘Chiifu’ was used for library construction
based on the agreement of the Multinational Brassica
Genome Project Consortium. Total RNA was prepared
from 28 different kinds of tissues and used to construct
individual cDNA libraries. Double-stranded cDNA was
synthesized from 5 µg of total RNA with the ZAP-cDNA
synthesis kit (Stratagene, La Jolla, CA, USA) in accordance
with the manufacturer’s instructions. Messenger
RNA was reverse-transcribed into cDNA with a hybrid
oligo(dT) linker-primer containing an XhoI restriction site.
In addition, the first-strand cDNA was hemimethylated
with 5-methyl dCTP to protect it from restriction enzyme
digestion. The double-stranded cDNA was ligated to an
EcoRI adapter at its 5′-end and was then digested with
XhoI and EcoRI. This final product was purified and
ligated into several different vectors, such as pBlueScript,
pDNR-LIB and pCNS-D2, for the subsequent sequencing
project. Three large-insert BAC libraries had been constructed
for the sequencing project [72]. Using high-information-content
fingerprinting, the first BAC-based
physical map was recently generated for B. rapa ssp.
pekinensis [52].
Sequencing of EST and BAC clones
EST clones were sequenced using the primers for the flank-
ing sites of cloning vectors, such as T3 and T7 primers of
pBlueScript vector. Most of the clones were subjected to
5′-end sequencing, which enabled us to quickly identify
full-length clones by comparison with Arabidopsis sequences.
Large BAC clones were cut into small fragments,
which were then inserted into the pCUGIblu31 vector as previously
described [73–75]. All the EST and shotgun clones
and both ends of most BAC clones were sequenced using an
ABI3730 automatic DNA sequencer and BigDye terminator
chemistry, version 3.1 (Applied Biosystems, Foster City,
CA, USA). Putative full-length cDNA clones were
sequenced again from both ends. Computational sequence assembly
of BAC contigs was conducted using phred/phrap/consed
software [76].
BLAST and data analyses for BrGS gene
identification
The CDS of Arabidopsis glucosinolate biosynthesis and regulatory
genes, obtained from the NCBI database, were used to search
for B. rapa homologs in the sequence databases of BAC and
cDNA clones. Target sequences showing 100% identity across
their overlapping region were counted as the same gene. BAC
and cDNA clones often yielded the same homologs.
In these cases, the exon/intron structure was identified by
sequence alignment. However, for the homologs found only
in BAC sequences, gene structures were predicted by
sequence comparison with the Arabidopsis CDS. Nucleotide
insertions, deletions and transposable elements in B. rapa
homologs were also confirmed by comparison with the corresponding
Arabidopsis genes. The unrooted neighbor-joining
phylogenetic tree was generated using the full-length CDS of
BrST5 genes and MEGA, version 4.1, software (Biodesign
Institute, A240, Arizona State University, Tempe, AZ,
USA).
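The rule that sequences sharing 100% identity across their overlapping region count as the same gene can be sketched as a small grouping step. This is only an illustration, not the authors' pipeline. It assumes the search results are available in the standard 12-column BLAST tabular layout (query, subject, percent identity, alignment length and so on), and the clone names in the example are hypothetical.

```python
from collections import defaultdict


def group_identical_clones(tabular_lines, min_identity=100.0):
    """Group clones linked by hits at or above min_identity percent identity.

    Assumed column order (standard BLAST tabular output):
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    parent = {}

    def find(name):
        # Union-find lookup with path halving.
        parent.setdefault(name, name)
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject, pident = fields[0], fields[1], float(fields[2])
        root_q, root_s = find(query), find(subject)  # registers both clones
        if pident >= min_identity and root_q != root_s:
            parent[root_s] = root_q  # merge the two groups

    groups = defaultdict(set)
    for clone in list(parent):
        groups[find(clone)].add(clone)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical hits: two ESTs match BAC_A perfectly, one EST matches BAC_B at 97.5%.
example_hits = [
    "\t".join(["EST_001", "BAC_A", "100.00", "850", "0", "0", "1", "850", "1200", "2049", "0.0", "1570"]),
    "\t".join(["EST_002", "BAC_A", "100.00", "600", "0", "0", "1", "600", "1300", "1899", "0.0", "1109"]),
    "\t".join(["EST_003", "BAC_B", "97.50", "800", "20", "0", "1", "800", "500", "1299", "0.0", "1300"]),
]
gene_groups = group_identical_clones(example_hits)
print(gene_groups)
```

The sketch ignores alignment length, so a very short perfect match would also merge two clones. A real analysis would probably add a minimum overlap length before treating two sequences as the same gene.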
Southern blot analysis
Genomic DNA was isolated from young leaf tissue of
Chinese cabbage cultivar ‘Jangwon’ using the cetyltrimethylammonium
bromide method [77]. Approximately
5 µg of genomic DNA was digested with BamHI, DraI,
EcoRI, HindIII, EcoRV, XbaI and ScaI, electrophoresed
in a 0.8% agarose gel at 30 V overnight, and transferred to a
nylon membrane (Hybond N-Filter; Amersham Pharmacia,
Piscataway, NJ, USA). The probe DNA comprised either
restriction enzyme fragments of EST clones or PCR products
of genomic DNA or BAC clones. All of the probes
were labeled with radioactive 32P using the Ladderman kit
(Takara Bio Inc., Shiga, Japan). Hybridization and detection
were carried out as described previously [78].
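When Southern blots are read for copy number, it is generally easier to interpret digests from enzymes that do not cut inside the probed region, because each hybridizing band then tends to correspond to one genomic copy. The sketch below checks a probe sequence for the recognition sites of the seven enzymes listed above. The text does not say that the authors screened their probes this way, and the probe sequence is an invented toy example. All seven sites are palindromic, so scanning one strand is sufficient.

```python
# Recognition sequences (5'->3') of the enzymes named in the text.
RECOGNITION_SITES = {
    "BamHI": "GGATCC",
    "DraI": "TTTAAA",
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "EcoRV": "GATATC",
    "XbaI": "TCTAGA",
    "ScaI": "AGTACT",
}


def site_positions(sequence, site):
    """Return 0-based start positions of every occurrence of site in sequence."""
    sequence, site = sequence.upper(), site.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)  # step by 1 to allow overlaps
    return positions


def non_cutting_enzymes(probe_sequence):
    """Names of enzymes whose recognition site is absent from the probe."""
    return sorted(
        name
        for name, site in RECOGNITION_SITES.items()
        if not site_positions(probe_sequence, site)
    )


# Hypothetical probe containing one EcoRI site and one HindIII site.
toy_probe = "ATGGAATTCCGTAAGCTTGGA"
print(site_positions(toy_probe, RECOGNITION_SITES["EcoRI"]))
print(non_cutting_enzymes(toy_probe))
```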
RT-PCR
Total RNA was prepared using the RNeasy mini kit (Qiagen
GmbH, Hilden, Germany) and treated with RNase-free
DNase to remove any genomic DNA contaminants. All
RNA samples were quantified using a NanoDrop ND-1000
spectrophotometer (Thermo Fisher Scientific Inc., Waltham,
MA, USA) and were adjusted to the same concentration
with diethylpyrocarbonate-treated water. The first-strand
cDNA was synthesized using 5 µg of total RNA and
Sprint™ PowerScript™ PrePrimed Single Shots (Clontech
Laboratories, Inc., a Takara Bio company, Mountain View,
CA, USA) with oligo(dT)18 primers in accordance with
the manufacturer’s instructions. PCR was then performed
using gene-specific primers and ExTaq DNA polymerase
(Takara, Shiga, Japan). The reaction was initiated by
predenaturation at 94 °C for 5 min, followed by 35 cycles
of denaturation (94 °C for 30 s), annealing (53 °C for 30 s)
and extension (72 °C for 1.5 min), and was terminated
with a final extension of 10 min at 72 °C. The amplification
products were analyzed by 1.2% agarose gel
electrophoresis.
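The cycling conditions above can be written down as a small data structure, which also makes it easy to total the programmed hold time. The sketch below encodes only the temperatures and durations given in the text. The class and function names are illustrative, and ramp times between temperatures are not included because they depend on the instrument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: float


# Values taken from the RT-PCR description in the text.
INITIAL = Step("predenaturation", 94, 5 * 60)
CYCLE = (
    Step("denaturation", 94, 30),
    Step("annealing", 53, 30),
    Step("extension", 72, 90),
)
FINAL = Step("final extension", 72, 10 * 60)
N_CYCLES = 35


def total_hold_seconds(initial, cycle, n_cycles, final):
    """Sum of programmed hold times, excluding temperature ramping."""
    per_cycle = sum(step.seconds for step in cycle)
    return initial.seconds + n_cycles * per_cycle + final.seconds


total_seconds = total_hold_seconds(INITIAL, CYCLE, N_CYCLES, FINAL)
print(f"Programmed hold time: {total_seconds / 60:.1f} min")
```

Each cycle holds for 150 s, so the 35 cycles account for 87.5 min of a programmed total of 102.5 min.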
Acknowledgement
This work was supported by grants from the National
Academy of Agricultural Science (Code numbers
200713906220 0001502, 200901FHT020710397 and
200901FHT020711430), Rural Development Administration,
Korea.
References
1 Brader G, Mikkelsen MD, Halkier BA & Palva ET
(2006) Altering glucosinolate profiles modulates disease
resistance in plants. Plant J 46, 758–767.
2 Kim JH, Lee BW, Schroeder F & Jander G (2008)
Identification of indole glucosinolate breakdown
products with antifeedant effects on Myzus persicae
(green peach aphid). Plant J 54, 1015–1026.
3 Zhang Y, Kensler TW, Cho C, Posner GH & Talalay P
(1994) Anticarcinogenic activities of sulforaphane and
structurally related synthetic norbornyl isothiocyanates.
Proc Natl Acad Sci USA 91, 3147–3150.
4 Fimognari C & Hrelia P (2007) Sulforaphane as a
promising molecule for fighting cancer. Mutat Res 635,
90–104.
5 Myzak MC & Dashwood RH (2006) Chemoprotection
by sulforaphane: keep one eye beyond keap. Cancer
Lett 233, 208–218.
6 Myzak MC, Dashwood WM, Orner GA, Ho E &
Dashwood RH (2006) Sulforaphane inhibits histone
deacetylase in vivo and suppresses tumorigenesis in
Apcmin mice. FASEB J 20, 506–508.
7 Myzak MC, Karplus PA, Chung F & Dashwood RH
(2004) A novel mechanism of chemoprevention by sul-
foraphane: inhibition of histone deacetylase. Cancer Res
64, 5767–5774.
8 Parnaud G, Li P, Cassar G, Rouimi P, Tulliez J,
Combaret L & Gamet-Payrastre L (2004) Mechanism of
sulforaphane-induced cell cycle arrest and apoptosis in
human colon cancer cells. Nutr Cancer 48, 198–206.
9 Pledgie-Tracy A, Sobolewski MD & Davidson NE
(2007) Sulforaphane induces cell type-specific apoptosis
in human breast cancer cell lines. Mol Cancer Ther 6,
1013–1021.
10 Solowiej E, Kaspizycka-Guttman T, Fiedor P & Rowin-
ski W (2003) Chemoprevention of cancerogenesis – the
role of sulforaphane. Acta Pol Pharm 60, 97–100.
11 Takahashi N, Dashwood RH, Bjeldanes LF, Williams
DE & Bailey GS (1995) Mechanisms of indole-3-carbinol
(I3C) anticarcinogenesis: inhibition of aflatoxin
B1-DNA adduction and mutagenesis by I3C acid
condensation products. Food Chem Toxicol 33,
851–857.
12 Bradlow HL, Sepkovic DW, Telang NT & Osborne MP
(1999) Multifunctional aspects of the action of indole-3-
carbinol as an antitumor agent. Ann NY Acad Sci 889,
204–213.
13 Chinni SR, Li Y, Upadhyay S, Koppolu PK & Sarkar
FH (2001) Indole-3-carbinol (I3C) induced cell growth
inhibition, G1 cell cycle arrest and apoptosis in prostate
cancer cells. Oncogene 20, 2927–2936.
14 Nho CW & Jeffery E (2001) The synergistic upregula-
tion of phase II detoxification enzymes by glucosinolate
breakdown products in cruciferous vegetables. Toxicol
Appl Pharmacol 174, 146–152.
15 Meng Q, Qi M, Chen DZ, Goldberg ID, Rosen EM,
Auborn K & Fan S (2000) Suppression of breast cancer
invasion and migration by indole-3-carbinol: associated
with up-regulation of BRCA1 and E-cadherin/catenin
complexes. J Mol Med 78, 155–165.
16 Chung FL, Conaway CC, Rao CV & Reddy VS (2000)
Chemoprevention of colonic aberrant crypt foci in
Fischer rats by sulforaphane and phenethyl isothiocya-
nate. Carcinogenesis 21, 2287–2291.
17 Huang C, Ma WY, Li J, Hecht SS & Dong Z (1998)
Essential role of p53 in phenethyl isothiocyanate-
induced apoptosis. Cancer Res 58, 4102–4106.
18 Yao S, Zhang Y & Li J (2006) c-jun/AP-1 activation
does not affect the antiproliferative activity of phen-
ethyl isothiocyanate, a cruciferous vegetable-derived
cancer chemopreventive agent. Mol Carcinog 45,
605–612.
19 Canistro D, Croce CD, Iori R, Barillari J, Bronzetti G,
Poi G, Cini M, Caltavuturo L, Perocco P & Paolini M
(2004) Genetic and metabolic effects of gluconasturtiin,
a glucosinolate derived from cruciferae. Mutat Res 545,
23–35.
20 Hirose M, Yamaguchi T, Kimoto N, Ogawa K,
Futakuchi M, Sano M & Shirai T (1998) Strong
promoting activity of phenylethyl isothiocyanate and
benzyl isothiocyanate on urinary bladder carcinogenesis
in F344 male rats. Int J Cancer 77, 773–777.
21 Kassie F & Knasmüller S (2000) Genotoxic effects of
allyl isothiocyanate (AITC) and phenethyl isothiocya-
nate (PEITC). Chem Biol Interact 127, 163–180.
22 March TH, Jeffery EH & Wallig MA (1998) The
cruciferous nitrile, crambene, induces rat hepatic and
pancreatic glutathione S-transferases. Toxicol Sci 42,
82–90.
23 Halkier BA & Gershenzon J (2006) Biology and
biochemistry of glucosinolates. Annu Rev Plant Biol 57,
303–333.
24 Kroymann J, Textor S, Tokuhisa JG, Falk KL,
Bartram S, Gershenzon J & Mitchell-Olds T (2001)
A gene controlling variation in Arabidopsis glucosinolate
composition is part of the methionine chain
elongation pathway. Plant Physiol 127, 1077–1088.
25 Textor S, de Kraker JW, Hause B, Gershenzon J &
Tokuhisa JG (2007) MAM3 catalyzes the formation of
all aliphatic glucosinolate chain lengths in Arabidopsis.
Plant Physiol 144, 60–71.
26 Hansen BG, Kerwin RE, Ober JA, Lambrix VM,
Mitchell-Olds T, Gershenzon J, Halkier BA &
Kliebenstein DJ (2008) A novel 2-oxoacid-dependent
dioxygenase involved in the formation of the goitero-
genic 2-hydroxybut-3-enyl glucosinolate and generalist
insect resistance in Arabidopsis. Plant Physiol 148,
2096–2108.
27 Hansen BG, Kliebenstein DJ & Halkier BA (2007)
Identification of a flavin-monooxygenase as the
S-oxygenating enzyme in aliphatic glucosinolate biosyn-
thesis in Arabidopsis. Plant J 50, 902–910.
28 Li J, Hansen BG, Ober JA, Kliebenstein DJ & Halkier
BA (2008) Subclade of flavin-monooxygenases involved
in aliphatic glucosinolate biosynthesis. Plant Physiol
148, 1721–1733.
29 Kliebenstein DJ, Lambrix VM, Reichelt M, Gershenzon
J & Mitchell-Olds T (2001) Gene duplication in the
diversification of secondary metabolism: tandem
2-oxoglutarate-dependent dioxygenases control
glucosinolate biosynthesis in Arabidopsis. Plant Cell 13,
681–693.
30 Kliebenstein DJ, Kroymann J, Brown P, Figuth A,
Pedersen D, Gershenzon J & Mitchell-Olds T (2001)
Genetic control of natural variation in Arabidopsis
glucosinolate accumulation. Plant Physiol 126, 811–
825.
31 Kliebenstein DJ, D’Auria JC, Behere AS, Kim JH,
Gunderson KL, Breen JN, Lee G, Gershenzon J, Last
RL & Jander G (2007) Characterization of seed-specific
benzoyloxyglucosinolate mutations in Arabidopsis thali-
ana. Plant J 51, 1062–1076.
32 Keck AS & Finley JW (2004) Cruciferous vegetables:
cancer protective mechanisms of glucosinolate
hydrolysis products and selenium. Integr Cancer Ther 3,
5–12.
33 Wu L, Ashraf MHN, Facci M, Wang R, Paterson PG,
Ferrie A & Juurlink BH (2004) Dietary approach to
attenuate oxidative stress, hypertension, and inflamma-
tion in the cardiovascular system. Proc Natl Acad Sci
USA 101, 7094–7099.
34 Tang L, Zirpoli GR, Guru K, Moysich KB, Zhang Y,
Ambrosone CB & McCann SE (2008) Consumption of
raw cruciferous vegetables is inversely associated with
bladder cancer risk. Cancer Epidemiol Biomarkers Prev
17, 938–944.
35 Grubb CD & Abel S (2006) Glucosinolate metabolism
and its control. Trends Plant Sci 11, 89–100.
36 Celenza JL, Quiel JA, Smolen GA, Merrikh H,
Silvestro AR, Normanly J & Bender J (2005) The
Arabidopsis ATR1 Myb transcription factor controls
indolic glucosinolate homeostasis. Plant Physiol 137,
253–262.
37 Levy M, Wang Q, Kaspi R, Parrella MP & Abel S
(2005) Arabidopsis IQD1, a novel calmodulin-binding
nuclear protein, stimulates glucosinolate accumulation
and plant defense. Plant J 43, 79–96.
38 Skirycz A, Reichelt M, Burow M, Birkemeyer C, Rolcik
J, Kopka J, Zanor MI, Gershenzon J, Strnad M, Szopa
J et al. (2006) DOF transcription factor AtDof1.1
(OBP2) is part of a regulatory network controlling
glucosinolate biosynthesis in Arabidopsis. Plant J 47,
10–24.
39 Gigolashvili T, Berger B, Mock HP, Müller C,
Weisshaar B & Flügge UI (2007) The transcription
factor HIG1/MYB51 regulates indolic glucosinolate
biosynthesis in Arabidopsis thaliana. Plant J 50,
886–901.
40 Gigolashvili T, Engqvist M, Yatusevich R, Müller C &
Flügge UI (2008) HAG2/MYB76 and HAG3/MYB29
exert a specific and coordinated control on the
regulation of aliphatic glucosinolate biosynthesis in
Arabidopsis thaliana. New Phytol 177, 627–642.
41 Gigolashvili T, Yatusevich R, Berger B, Müller C &
Flügge UI (2007) The R2R3-MYB transcription factor
HAG1/MYB28 is a regulator of methionine-derived
glucosinolate biosynthesis in Arabidopsis thaliana. Plant
J 51, 247–261.
42 Sønderby IE, Hansen BG, Bjarnholt N, Ticconi C,
Halkier BA & Kliebenstein DJ (2007) A systems
biology approach identifies a R2R3 MYB gene
subfamily with distinct and overlapping functions in
regulation of aliphatic glucosinolates. PLoS ONE 2,
e1322.
43 Hirai MY, Sugiyama K, Sawada Y, Tohge T, Obayashi
T, Suzuki A, Araki R, Sakurai N, Suzuki H, Aoki K
et al. (2007) Omics-based identification of Arabidopsis
Myb transcription factors regulating aliphatic glucosin-
olate biosynthesis. Proc Natl Acad Sci USA 104, 6478–
6483.
44 Li G & Quiros CF (2003) In planta side-chain glucosin-
olate modification in Arabidopsis by introduction of
dioxygenase Brassica homolog BoGSL-ALK. Theor
Appl Genet 106, 1116–1121.
45 Gao M, Li G, Yang B, McCombie WR & Quiros CF
(2004) Comparative analysis of a Brassica BAC clone
containing several major aliphatic glucosinolate genes
with its corresponding Arabidopsis sequence. Genome
47, 666–679.
46 Gao M, Li G, Potter D, McCombie WR & Quiros CF
(2006) Comparative analysis of methylthioalkylmalate
synthase (MAM) gene family and flanking DNA
sequences in Brassica oleracea and Arabidopsis thaliana.
Plant Cell Rep 25, 592–598.
47 Kang JY, Ibrahim KE, Juvik JA, Kim DH & Kang WJ
(2006) Genetic and environmental variation of
glucosinolate content in Chinese cabbage. HortSci 41,
1382–1385.
48 Engelen-Eigles G, Holden G, Cohen JD & Gardner G
(2006) The effect of temperature, photoperiod, and light
quality on gluconasturtiin concentration in watercress
(Nasturtium officinale R. Br.). J Agric Food Chem 54,
328–334.
49 Himanen SJ, Nissinen A, Auriola S, Poppy GM,
Stewart CN Jr, Holopainen JK & Nerg AM (2007)
Constitutive and herbivore-inducible glucosinolate
concentrations in oilseed rape (Brassica napus) leaves
are not affected by Bt Cry1Ac insertion but change
under elevated atmospheric CO2 and O3. Planta 227,
427–437.
50 Zang YX, Lim MH, Park BS, Hong SB & Kim DH
(2008) Metabolic engineering of the indole glucosino-
lates in Chinese cabbage plants expressing Arabidopsis
CYP79B2, CYP79B3, and CYP83B1. Mol Cells 25,
231–241.
51 Zang YX, Kim JH, Park YD, Kim DH & Hong SB
(2008) Metabolic engineering of the aliphatic glucosino-
lates in Chinese cabbage plants expressing Arabidopsis
MAM1, CYP79F1, and CYP83A1. BMB Rep 41,
472–478.
52 Mun JH, Kwon SJ, Yang TJ, Kim HS, Choi BS, Baek
S, Kim JS, Jin M, Kim JA, Lim MH et al. (2008) The
first generation of a BAC-based physical map of Bras-
sica rapa. BMC Genomics 9, 280.
53 Kim JS, Chung TY, King GJ, Jin M, Yang TJ, Jin
YM, Kim HI & Park BS (2006) A sequence-tagged link-
age map of Brassica rapa. Genetics 174, 29–39.
54 Lee JT (2003) Complicity of gene and pseudogene.
Nature 423, 26–28.
55 Force A, Lynch M, Pickett FB, Amores A, Yan YL &
Postlethwait J (1999) Preservation of duplicate genes by
complementary, degenerative mutations. Genetics 151,
1531–1545.
56 Blanc G & Wolfe KH (2004) Functional divergence of
duplicated genes formed by polyploidy during Arabidop-
sis evolution. Plant Cell 16, 1679–1691.
57 Piotrowski M, Schemenewitz A, Lopukhina A, Müller
A, Janowitz T, Weiler EW & Oecking C (2004) Desul-
foglucosinolate sulfotransferases from Arabidopsis
thaliana catalyzing the final step in biosynthesis of
the glucosinolate core structure. J Biol Chem 279,
50717–50725.
58 Schuster J, Knill T, Reichelt H, Gershenzon J & Binder
S (2006) Branched-chain aminotransferase4 is part of
the chain elongation pathway in the biosynthesis of
methionine-derived glucosinolates in Arabidopsis. Plant
Cell 18, 1–16.
59 Knill T, Schuster J, Reichelt M, Gershenzon J &
Binder S (2008) Arabidopsis thaliana branched-chain
aminotransferase 3 functions in both amino acid and
glucosinolate biosynthesis. Plant Physiol 146, 1028–
1039.
60 Li G & Quiros CF (2002) Genetic analysis, expression
and molecular characterization of BoGSL-ELONG,
a major gene involved in the aliphatic glucosinolate
pathway of Brassica species. Genetics 162, 1937–1943.
61 De Kraker JW, Luck K, Textor S, Tokuhisa J & Gershenzon
J (2007) Two Arabidopsis genes (IPMS1 and
IPMS2) encode isopropylmalate synthase, the branchpoint
step in the biosynthesis of leucine. Plant Physiol
143, 970–986.
62 Lou P, Zhao J, He H, Hanhart C, Pino Del Carpio D,
Verkerk R, Custers J, Koornneef M & Bonnema G
(2008) Quantitative trait loci for glucosinolate
accumulation in Brassica rapa leaves. New Phytol 179,
1017–1032.
63 Rangkadilok N, Nicolas ME, Bennett RN, Premier
RR, Eagling DR & Taylor PWJ (2002) Determination
of sinigrin and glucoraphanin in Brassica species using
a simple extraction method combined with ion-pair
HPLC analysis. Sci Hortic 96, 27–41.
64 Wentzell AM, Rowe HC, Hansen BG, Ticconi C,
Halkier BA & Kliebenstein DJ (2007) Linking
metabolic QTLs with network and cis-eQTLs
controlling biosynthetic pathways. PLoS Genet 3,
1687–1701.
65 Bowers JE, Chapman BA, Rong JK & Paterson AH
(2003) Unravelling angiosperm genome evolution by
phylogenetic analysis of chromosomal duplication
events. Nature 422, 433–438.
66 Lysak MA, Koch M, Pecinka A & Schubert I (2005)
Chromosome triplication found across the tribe Brassi-
ceae. Genome Res 15, 516–525.
67 Yang YW, Lai KN, Tai PY & Li WH (1999) Rates of
nucleotide substitution in angiosperm mitochondrial
DNA sequences and dates of divergence between
Brassica and other angiosperm lineages. J Mol Evol 48,
597–604.
68 The Arabidopsis Genome Initiative (2000) Analysis of
the genome sequence of the flowering plant Arabidop-
sis thaliana. Nature 408, 796–815.
69 Hong CP, Kwon SJ, Kim JS, Yang TJ, Park BS & Lim
YP (2008) Progress in understanding and sequencing
the genome of Brassica rapa. Int J Plant Genomics
2008, doi: 10.1155/2008/582837.
70 Town CD, Cheung F, Maiti R, Crabtree J, Haas BJ,
Wortman JR, Hine EE, Althoff R, Arbogast TS, Tallon
LJ et al. (2006) Comparative genomics of Brassica oler-
acea and Arabidopsis thaliana reveal gene loss, fragmen-
tation, and dispersal after polyploidy. Plant Cell 18,
1348–1359.
71 Xiao H, Jiang N, Schaffner E, Stockinger EJ & van der
Knaap E (2008) A retrotransposon-mediated gene
duplication underlies morphological variation of tomato
fruit. Science 319, 1527–1530.
72 Park J, Koo DH, Hong CP, Lee SJ, Jeon JW, Lee SH,
Yun PY, Park BS, Kim HR, Bang JW et al. (2005)
Physical mapping and microsynteny of Brassica rapa
ssp. pekinensis genome corresponding to a 222 kbp
gene-rich region of Arabidopsis chromosome 4 and par-
tially duplicated on chromosome 5. Mol Genet Genomics
274, 579–588.
73 Kim JA, Yang TJ, Kim JS, Park JY, Kwon SJ, Lim
MH, Jin M, Lee SC, Lee SI, Choi BS et al. (2007) Iso-
lation of circadian-associated genes in Brassica rapa by
comparative genomics with Arabidopsis thaliana. Mol
Cells 23, 145–153.
74 Yang TJ, Yu Y, Frisch DA, Lee S, Kim HR, Kwon SJ,
Park BS & Wing RA (2004) Construction of various
copy number plasmid vectors and their utility for gen-
ome sequencing. Genomics Inform 2, 174–179.
75 Yang TJ, Kim JS, Lim KB, Kwon SJ, Kim JA, Jin M,
Park JY, Lim MH, Kim HI, Kim SH et al. (2005) The
Korea Brassica Genome Project: a glimpse of the
Brassica genome based on comparative genome
analysis with Arabidopsis. Comp Funct Genomics 6,
138–146.
76 Gordon D, Abajian C & Green P (1998) Consed: a
graphical tool for sequence finishing. Genome Res 8,
195–202.
77 Murray MG & Thompson WF (1980) Rapid isolation
of high molecular weight plant DNA. Nucleic Acids Res
8, 4321–4325.
78 Cho YG, Eun MY, McCouch SR & Chae YA (1994)
The semidwarf gene, sd-1, of rice (Oryza sativa L.) II.
Molecular mapping and marker-assisted selection.
Theor Appl Genet 89, 54–59.
79 Zhao Y, Hull AK, Gupta N, Goss KA, Alonso J, Ecker
JR, Normanly J, Chory J & Celenza JL (2002) Trp-
dependent auxin biosynthesis in Arabidopsis: involve-
ment of cytochrome P450s CYP79B2 and CYP79B3.
Genes Dev 16, 3100–3112.
80 Gao M, Li G, McCombie WR & Quiros CF
(2005) Comparative analysis of a transposon-rich
Brassica oleracea BAC clone with its corresponding
sequence in A. thaliana. Theor Appl Genet 111, 949–
955.
81 Bak S, Nielsen HL & Halkier BA (1998) The presence
of CYP79 homologues in glucosinolate-producing
plants show evolutionary conservation of the enzymes
in the conversion of amino acid to aldoxime in the
biosynthesis of cyanogenic glucosides and glucosino-
lates. Plant Mol Biol 38, 725–734.
82 Naur P, Hansen CH, Bak S, Hansen BG, Jensen NB,
Nielsen HL & Halkier BA (2003) CYP79B1 from Sina-
pis alba converts tryptophan to indole-3-acetaldoxime.
Arch Biochem Biophys 409, 235–241.
Supporting information
The following supplementary material is available:
Fig. S1. Southern blot analysis for B. rapa CYP79F1/F2,
CYP79B2/B3, CYP83A1 and CYP83B1.
Fig. S2. Nonrooted neighbor-joining phylogenetic tree
of Branched chain aminotransferases.
Fig. S3. Nonrooted neighbor-joining phylogenetic tree
of MAM and IPMS family genes.
Fig. S4. Nonrooted neighbor-joining phylogenetic tree
of P450 CYP79 genes.
Fig. S5. Nonrooted neighbor-joining phylogenetic tree
of P450 CYP83 genes.
Fig. S6. Nonrooted neighbor-joining phylogenetic tree
of C-S lyases and close aminotransferases.
Fig. S7. Nonrooted neighbor-joining phylogenetic tree
of glucosyltransferase family genes.
Fig. S8. Nonrooted neighbor-joining phylogenetic tree
of flavin-containing monooxygenase family genes.
Fig. S9. Nonrooted neighbor-joining phylogenetic tree
of AOP family genes and close 2-oxoglutarate-depen-
dent dioxygenase family genes.
Fig. S10. Nonrooted neighbor-joining phylogenetic tree
of GSL-OH family genes and close 2-oxoglutarate-
dependent dioxygenase family genes.
Fig. S11. Nonrooted neighbor-joining phylogenetic tree
of benzoate-CoA ligase family genes and close acyl-
activating enzyme and AMP-dependent synthetase and
ligase family genes.
Table S1. Gene-specific primers for semi-quantitative
RT-PCR of BrST genes with PCR product sizes.
This supplementary material can be found in the
online version of this article.
Please note: Wiley-Blackwell is not responsible for
the content or functionality of any supplementary
materials supplied by the authors. Any queries (other
than missing material) should be directed to the
corresponding author for the article.