RESEARCH ARTICLE Open Access
A 48 SNP set for grapevine cultivar identification
José A Cabezas1,6†, Javier Ibáñez2†, Diego Lijavetzky1,7, Dolores Vélez3, Gema Bravo1, Virginia Rodríguez1, Iván Carreño4, Angelica M Jermakow5, Juan Carreño4, Leonor Ruiz-García4, Mark R Thomas5 and José M Martinez-Zapater1,2*
Abstract
Background: Rapid and consistent genotyping is an important requirement for cultivar identification in many crop species. Among them, grapevine cultivars have been the subject of multiple studies given the large number of synonyms and homonyms generated during many centuries of vegetative multiplication and exchange. Simple sequence repeat (SSR) markers have been preferred until now because of their high level of polymorphism, their codominant nature and their high profile repeatability. However, the rapid application of partial or complete genome sequencing approaches is identifying thousands of single nucleotide polymorphisms (SNP) that can be very useful for such purposes. Although SNP markers are bi-allelic, and therefore not as polymorphic as microsatellites, the high number of loci that can be multiplexed and the possibilities of automation as well as their highly repeatable results under any analytical procedure make them the future markers of choice for any type of genetic identification.
Results: We analyzed over 300 SNP in the genome of grapevine using a re-sequencing strategy in a selection of 11 genotypes. Among the identified polymorphisms, we selected 48 SNP spread across all grapevine chromosomes, with allele frequencies balanced enough to provide sufficient information content for genetic identification in grapevine while allowing for a good genotyping success rate. Marker stability was tested in repeated analyses of a selected group of cultivars obtained worldwide to demonstrate their usefulness in genetic identification.
Conclusions: We have selected a set of 48 stable SNP markers with a high discrimination power and a uniform genome distribution (2-3 markers/chromosome), which is proposed as a standard set for grapevine (Vitis vinifera L.) genotyping. Any previous problems derived from microsatellite allele confusion between labs or the need to run reference cultivars to identify allele sizes disappear using this type of marker. Furthermore, because SNP markers are bi-allelic, allele identification and genotype naming are extremely simple, and genotypes obtained with different equipment and by different laboratories are always fully comparable.
Background
Grapevine (Vitis vinifera L.) is one of the most valuable horticultural crops in the world. Many of the widely cultivated varieties are very ancient genotypes that have been vegetatively multiplied for centuries and spread worldwide. In many places the same genotypes were renamed, leading to synonyms (different names for the same variety) as well as homonyms (different varieties identified under the same name). Currently, there is a large but imprecise number of grapevine varieties in the world (several thousand [1]); this number could likely be reduced once all varieties are properly genotyped and compared.
When genetic identification is taken into account, two goals have to be fulfilled: i) the availability of a large enough number of polymorphic markers; and ii) the existence of public genotype databases allowing for comparisons with previously characterized genotypes. Markers should provide a high discrimination power and yield reproducible genotype data among different laboratories and detection platforms as well as over time. Markers should also be stable, meaning that they produce consistent and repeatable results after repeated propagation of the varieties. This is especially important in the case of grapevine, where many varieties have been under cultivation for centuries, and some molecular markers have been shown not to be fully stable in certain old varieties, due to somatic mutation [2]. In addition, genotyping methodologies should be easily accessible, low-cost and comparable, and genotype data should be easy to store in databases and publicly accessible.
* Correspondence:
† Contributed equally
1 Departamento de Genética Molecular de Plantas, Centro Nacional de Biotecnología, CSIC, C/Darwin 3, 28049 Madrid, Spain
Full list of author information is available at the end of the article
Cabezas et al. BMC Plant Biology 2011, 11:153
© 2011 Cabezas et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Grapevine genotyping is currently based on microsatellite markers or simple sequence repeats (SSR), which have been very useful not only for genetic identification [3] but also for parentage analysis [4]. These markers have some relevant advantages for research such as their co-dominance, multi-allelism and high levels of polymorphism [5]. However, there are a number of disadvantages in using SSR markers. The most important problem is related to allele binning, the process that converts raw allele lengths into allele classes normally expressed as integers [6]. Problems stemming from allele miscalling derive in part from the wide use of SSR based on di-nucleotide repeats and the frequent addition of one adenine nucleotide by the DNA polymerase, which gives rise to alleles very close in size and difficult to distinguish. This problem can be partially solved with the use of SSR with core repeats three to five nucleotides long, such as those recently developed based on the information provided by the whole genome sequence [7]. However, even if longer repeat length markers are used, it is also important to take into account the fact that different analytical systems (e.g. DNA sequencers of different brands) could produce different allele sizes and consequently different bins, increasing the difficulty of comparing genotype tables produced by different laboratories. To overcome these difficulties, standardization and exchange of information concerning grapevine genetic resources using reference varieties for certain microsatellite markers and alleles have been proposed [6] and discussed within European Projects such as GENRES 081 and Grapegen06, aiming at integrating genotypic information obtained by different laboratories.
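The binning problem can be illustrated with a short sketch. The fragment sizes and the 0.4 bp platform offset below are hypothetical values chosen only for illustration, and rounding to the nearest integer is just one simple binning rule, not the procedure of any particular laboratory.

```python
def bin_allele(raw_size_bp: float) -> int:
    """Convert a raw fragment length (bp) into an integer allele class."""
    return int(raw_size_bp + 0.5)  # round half up; sizes are positive

# Hypothetical raw sizes for the same two alleles of a di-nucleotide SSR.
platform_a = [181.3, 183.2]
# Assumed systematic sizing offset on a second instrument.
platform_b = [size + 0.4 for size in platform_a]

bins_a = [bin_allele(s) for s in platform_a]
bins_b = [bin_allele(s) for s in platform_b]
print(bins_a, bins_b)
```

Under these assumed values the same two alleles receive different integer names on the two platforms, which is why reference varieties are needed to align SSR tables between laboratories.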
In recent years, numerous sequencing projects have generated an abundance of sequence information and nucleotide polymorphisms. These belong to two basic types: single nucleotide polymorphisms (SNP) and insertions-deletions of different lengths (INDEL). Among them, SNP markers have the advantage that they are mostly bi-allelic and are very frequent in genomes. Although SNP polymorphism information content (PIC) is lower than that of SSR markers, tens, hundreds or even thousands of SNP can be easily used when required. SNP are highly reproducible among laboratories and detection techniques, since the different alleles are not distinguished on the basis of their size but on the basis of the nucleotide present at a given position. All these features and their unlimited availability are making SNP the markers of choice for the development of identification panels in many animal and plant species [8-12].
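The difference in information content between a bi-allelic SNP and a multi-allelic SSR can be made concrete with the usual definition of PIC, one minus the sum of squared allele frequencies minus a correction term for uninformative matings. The sketch below assumes that definition; the ten-allele SSR with equal frequencies is a hypothetical case, not a marker from this study.

```python
def pic(freqs):
    """Polymorphism information content for a list of allele frequencies."""
    homozygosity = sum(p * p for p in freqs)
    correction = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1 - homozygosity - correction

snp_pic = pic([0.5, 0.5])   # best case for a bi-allelic SNP
ssr_pic = pic([0.1] * 10)   # hypothetical SSR with ten equally frequent alleles
print(snp_pic, round(ssr_pic, 3))
```

A bi-allelic marker cannot exceed a PIC of 0.375, which is why a SNP panel has to compensate with a larger number of loci.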
In this work, we characterized the genetic features of 332 SNP to select a panel of 48 markers suitable for cultivar identification in grapevine. We show here that the panel has a discrimination power similar to that of a set of 15 SSR markers and can represent a very robust genetic identification system, free of allele miscalling problems among laboratories or detection technologies. We also demonstrate that the markers have a very low genotyping error rate and a low rate of appearance of new mutations when compared to SSR, and are amenable to easy storage in genotype databases. Given the state of revision and integration of genetic resources in grapevine, our SNP panel may become a rapid tool for genetic identification and genotype calling in the crop.
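One common way to express discrimination power is the probability of identity (PI), the chance that two unrelated individuals share the same genotype at a locus. The following sketch is an idealized illustration only: it assumes 48 independent loci, each with a minor allele frequency of 0.5, and unrelated individuals, none of which holds exactly for real grapevine cultivars or for the panel described here.

```python
def probability_of_identity(freqs):
    """Chance that two unrelated individuals share a genotype at one locus."""
    homozygotes = sum(p ** 4 for p in freqs)
    heterozygotes = sum(
        (2 * freqs[i] * freqs[j]) ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return homozygotes + heterozygotes

per_locus = probability_of_identity([0.5, 0.5])
# Assumption: loci are independent, so per-locus values multiply.
panel = per_locus ** 48
print(per_locus, panel)
```

Even though each SNP is individually weak, the product over many loci becomes very small, which is the intuition behind matching a smaller SSR set with a larger SNP set.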
Results and Discussion
Single Nucleotide Polymorphisms (SNP) Detection
Identification of SNP markers in the grapevine genome was carried out based on a re-sequencing strategy in a selected sample of grapevine genotypes as previously described [13]. The sample was chosen to include non-related wine and table grape cultivars of ancient origin as well as wild accessions. Based on the available information, cultivars corresponded to different genetic groups [14] and had chlorotypes belonging to the four major types described in grapevine [15]. A total of 270 SNP markers were identified in this way, to which we added 62 SNP validated at CSIRO across a range of genotypes. For the final 332 SNP we developed genotyping strategies based on SNPlex™. A first step to analyze the quality of these polymorphisms in grapevine and to estimate their allele frequencies was to genotype a sample of 300 accessions of grapevine including wine and table grape varieties as well as wild accessions (Additional file 1, Table S1). This approach allowed for discarding 61 SNP that did not work in the analyses and 33 that, although initially identified as polymorphic in sequence comparisons, either behaved as monomorphic in the analyzed sample or were genotyped as heterozygous in 100% of the samples, suggesting the existence of duplicated loci. As a result, only 238 SNP markers were considered for further analyses (Additional file 1, Table S3).
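The filtering logic described above can be sketched as a small classifier. The function name, the category labels and the two-letter genotype encoding are assumptions made for the example; the actual SNPlex™ output format is not described in the text.

```python
def classify_snp(calls):
    """Classify one SNP from its genotype calls across samples.

    `calls` holds two-letter genotypes such as "AG", or None for a failed call.
    """
    scored = [c for c in calls if c is not None]
    if not scored:
        return "failed"
    if len(set("".join(scored))) == 1:
        return "monomorphic"
    if all(c[0] != c[1] for c in scored):
        return "always_heterozygous"  # may indicate a duplicated locus
    return "usable"

# Counts quoted in the text: 332 assayed, 61 failed, 33 uninformative.
remaining = 332 - 61 - 33
print(classify_snp(["AA", "AG", "GG", None]), remaining)
```

The final subtraction simply checks that the figures quoted in the text are consistent with the 238 SNP retained.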
Genomic Location of SNP markers
Genotyping of four grapevine segregating progeny populations with the seven SNPlex™ sets allowed us to genetically map most of the 238 polymorphic SNP, which were heterozygous in one or both parents in at least one of the progeny populations (Additional file 1, Tables S4 and S5). On average, the use of the seven SNPlex™ sets allowed for including 114 markers in the consensus map of any given mapping population: 42 for each progenitor (segregation types aaxab and abxaa) and 29 common markers (abxab).
The integrated map developed for the eight parental cultivars included 168 microsatellites and 202 SNP (85% of the polymorphic SNP), allowing for identifying the relative positions of markers not segregating in the same progeny population (Figure 1, Additional file 1, Table S3). Three additional segregating SNP could not be mapped due to inconsistencies in linkage analyses (Additional file 1, Table S3). Molecular markers were distributed along all 19 chromosomes with an average distance between adjacent markers of 3.4 cM (5.7 when considering only SNP). The integrated map had a total size of 1204 cM (Additional file 1, Table S4), similar to other complete linkage maps published for Vitis vinifera [16-19]. Because the integrated map was based on mean recombination frequencies [20] and a total of 313 progeny individuals was considered, it should provide a good estimation of genetic distances. However, the accuracy of the genetic position assigned to each marker is limited by the number of progenies in which it is segregating, the segregation types in each progeny, the presence of markers with distorted segregations and the possible existence of differences in recombination rates among the progenitor cultivars. Sixty-seven percent of the 202 SNP markers mapped were segregating in more than one mapping population (25%, 27% and 15% in two, three and four, respectively) and only 11 SNP showed the less informative segregation type <abxab>. Finally, distorted segregation rates were low in Dominga × Autumn Seedless, Monastrell × Cabernet Sauvignon and Muscat Hamburg × Sugraone crosses (ranging between 7 and 12%), but higher in Ruby × Moscatuel (23%), which is likely due to the smaller size of the progeny (Additional file 1, Table S4).
Figure 1 SNP genetic and physical position. For each chromosome, the map on the left (gray bars) shows the physical position of studied SNP markers on the 12X grapevine sequence of the PN40024 near homozygous line [40], indicated in kilobases; and the map on the right (empty bars) shows the genetic position, indicated in centiMorgans, of microsatellites (between brackets) and SNP genetically mapped using the four segregating progenies. Markers with known position in only one of these maps are indicated in bold: in the map on the left, the SNP with known physical position that could not be mapped genetically; and in the map on the right, SNP mapped genetically but with unknown or uncertain physical position.
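The summary figures given for the integrated map can be cross-checked with simple arithmetic. The sketch below assumes that the average spacing was obtained by dividing the total map length by the number of intervals between adjacent markers (one fewer than the marker count on each of the 19 chromosomes); the text does not state how the value was computed, so this is only one plausible reading.

```python
total_cm = 1204          # total size of the integrated map
n_ssr, n_snp = 168, 202  # markers placed on the integrated map
n_chromosomes = 19
polymorphic_snp = 238

# Assumption: one fewer interval than markers on each chromosome.
intervals = n_ssr + n_snp - n_chromosomes
mean_spacing_cm = total_cm / intervals
mapped_fraction = n_snp / polymorphic_snp
multi_population_pct = 25 + 27 + 15  # SNP segregating in 2, 3 and 4 progenies

print(round(mean_spacing_cm, 1), round(mapped_fraction * 100), multi_population_pct)
```

Under that assumption the values agree with the 3.4 cM spacing, the 85% of polymorphic SNP mapped and the sixty-seven percent of SNP segregating in more than one population.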
Sequence searches for the SNP surrounding sequences (Additional file 1, Table S3) within the 12× genomic sequence of Vitis vinifera />externe/GenomeBrowser/Vitis/ allowed for physically positioning most of the studied SNP (Figure 1, Additional file 1, Table S3). Two hundred and twenty-five out of the 238 polymorphic SNP could be positioned on the physical map with an average of 12 SNP per chromosome (from 7 SNP on linkage groups 10, 11, 16 and 17, to 21 SNP on linkage group 8). The average distance among physically mapped SNP was 1.76 Mb. Thirteen SNP could not be physically located. This could be due either to the lack of significant matches with the 12× genomic sequence (VV5629 and SNP575_128), the identification of different locations with the same likelihood (SNP241_201 and SNP1495_148) or their localization on unlinked chromosome scaffolds. Linkage mapping allowed for localizing 12 out of the 13 SNP that could not be positioned in the physical map (Additional file 1, Table S3, Figure 1). The only marker that could not be mapped either physically or genetically (SNP575_128) corresponds to one of the two SNP for which adjacent sequences could not be found in the search on the 12× genome sequence.
Marker order was generally conserved between physical and genetic maps, although discrepancies were found on chromosomes 1, 3, 10 and 12 involving differences of up to 7.6 Mb and 12 cM. In addition, small local marker inversions, involving < 1.5 Mb and < 6 cM distances, were observed for chromosomes 1, 2, 4, 5, 6, 7, 8, 13 and 19 (Figure 1). Most of these discrepancies could be attributed to some of the previously mentioned factors affecting the accuracy of the genetic position assigned to each marker. However, none of these factors were present in the most important differences (chromosomes 3 and 10), which points to problems in the current physical map of those regions that may be related to genome rearrangements or assembly errors on the 12× grapevine sequence of the PN40024 near homozygous line />GenomeBrowser/Vitis/. For example, marker SNP425_205 (one of the two SNP markers on chromosome 3 included in the SNP set for varietal identification) showed significant discrepancies between physical and genetic distances with the surrounding markers, leading to differences in marker order for this region (Figure 1, Additional file 1, Table S3). In the current 12× version of the genomic sequence of Vitis vinifera, this marker is at 1.4 Mb from SNP613_315 (the second marker included in the 48 SNP set for varietal identification for this chromosome). However, marker order on the genetic map aligns with marker order in the version of the genomic sequence (8× at NCBI, data not shown) in which both SNP are separated by 4.4 Mb, as well as with marker order in the Pinot Noir sequence http://genomics.research.iasma.it/gb2/gbrowse/grape/.
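A comparison of marker order between the two maps can be sketched as follows. The function is a hypothetical helper, not the procedure used in the study; the chromosome 18 positions are those listed in Table 1, whereas the second data set is invented to show what a local inversion looks like.

```python
def order_conflicts(markers):
    """Return adjacent marker pairs whose genetic order contradicts physical order.

    `markers` is a list of (name, physical_bp, genetic_cM) on one chromosome.
    """
    by_physical = sorted(markers, key=lambda m: m[1])
    return [
        (a[0], b[0])
        for a, b in zip(by_physical, by_physical[1:])
        if b[2] < a[2]
    ]

# Chromosome 18 markers as listed in Table 1.
chr18 = [
    ("SNP1003_336", 3829207, 16.7),
    ("VV1617", 6487636, 27.9),
    ("VV9920", 11138668, 48.5),
]
# Hypothetical chromosome with one local inversion, for illustration only.
toy = [("m1", 100_000, 0.0), ("m2", 900_000, 12.0), ("m3", 1_400_000, 9.5)]

print(order_conflicts(chr18), order_conflicts(toy))
```

Only adjacent pairs are compared here, which is enough to flag small local inversions; larger rearrangements would need a comparison of full rank orders.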
Selection of the SNP Set for Genetic Identification
Currently, intra-laboratory genetic identification of grapevine varieties does not represent a major problem given the large number of microsatellite and SNP markers that have become available over the years [6,7,21-23]. However, it is very important to develop a system that is efficient, rapid and cheap for identifying the several thousand cultivars currently available in grapevine. This requires the careful design of a set of highly polymorphic and stable markers with proven quality and reproducibility that allow for constructing databases that are easy to share among different laboratories. In order to develop such a system based on SNP markers, three selection criteria were considered: high frequency of genotyping success, high minor allele frequency (MAF) to provide higher PIC, and good chromosomal distribution, to end up with a total of 48 SNP distributed at a rate of 2-3 SNP per chromosome. When these criteria were applied to the available SNP (Additional file 1, Table S3 and Figure 1), a selection that was used for the design of a 48 SNP set (Table 1) was obtained. A completely new design with only the selected 48 SNP was built, and their stability and quality for genetic identification was thoroughly evaluated.
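The three selection criteria can be combined in a simple filter-and-rank procedure. The sketch below is only an illustration: the thresholds (call rate of at least 0.95, MAF of at least 0.2), the candidate names and their values are all hypothetical, and the text does not say that the panel was chosen by this exact rule.

```python
from collections import defaultdict

def select_panel(candidates, min_call_rate=0.95, min_maf=0.2, per_chromosome=3):
    """Pick the highest-MAF SNP on each chromosome among those passing the filters.

    `candidates` is a list of dicts with keys name, chrom, call_rate and maf.
    """
    by_chrom = defaultdict(list)
    for snp in candidates:
        if snp["call_rate"] >= min_call_rate and snp["maf"] >= min_maf:
            by_chrom[snp["chrom"]].append(snp)
    panel = []
    for chrom in sorted(by_chrom):
        ranked = sorted(by_chrom[chrom], key=lambda s: s["maf"], reverse=True)
        panel.extend(s["name"] for s in ranked[:per_chromosome])
    return panel

# Hypothetical candidates; names and values are invented for the example.
candidates = [
    {"name": "snpA", "chrom": 1, "call_rate": 0.99, "maf": 0.45},
    {"name": "snpB", "chrom": 1, "call_rate": 0.90, "maf": 0.50},
    {"name": "snpC", "chrom": 1, "call_rate": 0.98, "maf": 0.10},
    {"name": "snpD", "chrom": 2, "call_rate": 0.97, "maf": 0.30},
    {"name": "snpE", "chrom": 2, "call_rate": 0.96, "maf": 0.48},
]
panel = select_panel(candidates)
print(panel)
```

Capping the number of markers per chromosome is what keeps the panel evenly distributed rather than concentrated where the most informative SNP happen to cluster.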
Evaluation of the Stability of the SNP Set for Genetic Identification
Stability of the 48 SNP markers was evaluated through the analysis of the genotypes obtained for an average of 85 plants for each of the 15 cultivars (Additional file 1, Table S2). This study also allowed for scoring the rate of genotyping success. The 15 cultivars represent a large phenotypic diversity for important traits in grapevine regarding their use (wine, table, and raisin), berry colour (black, red and white), maturity time (early, medium and late), presence of seeds (seeded and seedless) and other traits [24]. In addition to their diverse geographical origin (France, Spain, Near East, Middle East), the 15 cultivars exhibit age differences as well: from very ancient cultivars, likely more than a thousand years old (e.g. ‘Muscat of Alexandria’, ‘Thompson Seedless’), to cultivars originating only a few centuries ago (e.g. ‘Cabernet Sauvignon’) and those bred in the 20th century (e.g. ‘Cardinal’, ‘Crimson Seedless’).
A total of 1342 plants were analyzed with the newly designed 48 SNP set. Table 2 shows the genotypes obtained for each variety. No genotype could be established in any of the plants for SNP VV1617, which was therefore excluded from the analysis. Nevertheless, this
Table 1 Main features of the 48 SNP Set
Physical position Genetic position
SNP Polymorphism Chromosome Nucleotide LG cM
SNP1003_336 A/C 18 3829207 18 16.7
SNP1015_67 A/G unknown 8839239 7 40.4
SNP1027_69 C/T 5 1785979 5 3.3
SNP1035_226 C/T 14 29590769 14 63.4
SNP1079_58 A/G 16 13454358 16 18.9
SNP1119_176 A/C 12 22228357 12 67.4
SNP1127_70 G/T 19 17751334 19 53.6
SNP1157_64 A/T 1 22828604 1 60.6
SNP1215_138 C/T 12 739916 12 21.9
SNP1229_219 G/C 2 17198115 2 52.7
SNP1323_155 A/C 8 13401437 8 36.6
SNP1347_100 A/G 7 1388822 7 0
SNP1349_174 A/G 16 21202286 16 50.1
SNP1399_81 A/G 4 21849155 4 66.5
SNP1411_565 A/T 14 23135445 14 38.1
SNP1445_218 A/G 7 18046355 7 81.7
SNP1453_40 A/G 1 729514 1 0.7
SNP1471_179 C/T 5 5773320 5 26

SNP1513_153 C/T 4 680574 4 0
SNP191_100 C/T 4 6409234 4 24
SNP197_82 A/C 11 311765 11 0
SNP227_191 A/C 15 15145042 15 21.3
SNP259_199 A/T 13 21618145 13 48.7
SNP269_308 A/G 1 5948674 1 29.3
SNP325_65 A/T 14 5687725 14 15.2
SNP425_205 A/C 3 3676120 3 29.9
SNP447_244 C/T 10 5489212 10 37.5
SNP555_132 A/C 15 18031506 15 34
SNP579_187 C/T 17 6000914 17 38.5
SNP581_114 A/G 2 5141894 2 24.6
SNP593_149 C/T 8 3320936 8 7.9
SNP613_315 C/T 3 1348328 3 0
SNP697_296 A/G 13 5613947 unknown unknown
SNP819_210 A/T 19 7217380 19 42.1
SNP829_281 A/G 2 415342 2 0
SNP873_244 C/T 6 4258638 6 14
SNP879_308 A/G 17 12206201 17 64
SNP895_382 A/T 6 17593092 6 56.7
SNP945_88 A/G 6 327200 6 0
SNP947_288 A/G unknown 9111477 10 4.8
VV10113 A/G 5 6744629 5 25
VV10329 C/T 9 21409416 9 53.1
VV10353 G/A 11 19390306 11 64.1
VV10992 A/T 9 3123999 9 14.1
VV12882 T/C 12 7768973 12 40.5
VV1617 A/C 18 6487636 18 27.9
VV9227 T/A 2 6474327 2 37.4
VV9920 A/G 18 11138668 18 48.5

Cabezas et al. BMC Plant Biology 2011, 11:153, http://www.biomedcentral.com/1471-2229/11/153
Table 2 Genotypes for the 48 SNP set in the cultivars used for the stability study
AIR CBS CAR CRI FLA MER MON MOA NAP OHA PAL REG SAU TEM THO
N° plants with complete genotype 70 56 79 80 55 77 75 86 82 84 81 64 64 58 54
SNP1003_336 AC AA AA AC AA CC AC AC AC AC AC AC AC CC AC
SNP1015_67 GG GG GG AA GG GG GG GG GG AG GG GG AG GG AG
SNP1027_69 CT CC CT CT CT CC CT CC CT CT TT CC CC CC CT
SNP1035_226 CT TT TT CC CT TT TT CT CC CT TT CT TT CT CT
SNP1079_58 AA AG AG AA AA AA AG AG AG AA AG GG GG AG AG
SNP1119_176 AA CC* CC AA CC AA CC* CC* AA AA CC* CC* CC CC CC
SNP1127_70 GG GT GG GG GG GT GG GT GG GG GG GT GT GG GG
SNP1157_64 TT AT AT AT TT TT TT TT AT AT TT TT AT TT TT
SNP1215_138 CC CC CT CC CC TT CT CT CT CT CT CT CT CC CT
SNP1229_219 CC CC CC CC CC CC CC CC CC CC CC CC CC CC CG
SNP1323_155 CC CC CC AA AA AC AA AC AA AC CC CC AC AC AC
SNP1347_100 AG AG AG AG AG AG AA GG GG GG AG AG AG GG AG
SNP1349_174 GG AG AA AG AA AG GG AA AG AG AG GG AA AA AG
SNP1399_81 AA AG AA AA AA AA AA AA AG AA AA AA AG AA AA
SNP1411_565 TT TT TT TT TT AA AT AT TT TT AT AA TT AT AT
SNP1445_218 AG AA GG AG GG AA AA GG GG GG GG AG AG GG AG
SNP1453_40 AA AA AG AG AA AG AA AG AG AA AG AG AG GG AA
SNP1471_179 TT CT CT TT TT TT TT CT CT TT TT TT CT CT TT
SNP1513_153 TT CT CT TT CT CC CT TT CT CT CT TT CC CC CT
SNP191_100 CC CT CC CC CC CC CC CC CC CC CC CC CT CC CC
SNP197_82 CC AC CC AC CC AC AA CC CC CC CC CC AC AA CC
SNP227_191 AA AC AC AA AA AC AC AA AA AA AA AC AA AC CC

SNP259_199 AT AT TT AT TT TT AT AT TT AT TT AA TT AA AA
SNP269_308 GG GG AG AA AG AG AG AA AA AA AG AG AG GG AA
SNP325_65 TT AT AT AA AA AA AT AA AA AA AT AA AA TT AA
SNP425_205 AA AC AA AA AA CC AA AA AA AA CC AA AC AA AA
SNP447_244 CT CT TT CT TT CT TT CT CT CT CC CT CT CC CT
SNP555_132 AA AA AC AC AA AC AC CC AA AA AA AA AC AC AA
SNP579_187 TT TT TT CT TT TT CT TT TT TT CT TT TT TT TT
SNP581_114 AG AG AA AG AG AG AG AG AG GG AG AA GG AG AG
SNP593_149 CT CT CT CT TT CT TT CT TT TT CT TT CT TT CT
SNP613_315 CT CC CC CC CC CC CT CC CC CT CT CC CC CC CT
SNP697_296 AG AA AA AA AA AA AA AA AA AA AA AA AA AA AA
SNP819_210 AT TT AT TT TT AT AA AT TT AA TT TT TT AA TT
SNP829_281 AG AG AG AG AG AA AA AG GG AG AG AG AG AA GG
SNP873_244 CT TT CC CC CC CC CT CT CT CC TT CC CT TT CT
SNP879_308 GG AG AG AG AA AA AG AA AA AG GG AA GG AA AA
SNP895_382 AT TT AT AT AT TT TT AA AA AT AT AT TT AA AT
SNP945_88 AA AG AA AG AA AG AA AG AG AG AG AG GG AG AG
SNP947_288 AG AG AG GG GG AG GG AG AG AG AG GG AG AG AG
VV10113 AA AG AA AG AA AA AA AA AA AG AG AA AA AA AG
VV10329 CT CT TT TT TT CC CT CT CC CC CT CT TT CT TT
VV10353 GG AG GG GG GG AG GG AG GG GG GG AG GG GG AG
VV10992 TT AT AT AT TT AT AT AT AT AT AT TT AA TT TT
VV12882 TT CT TT TT TT TT CC TT TT TT TT TT CC TT TT
VV1617
VV9227 AT AT AT TT TT TT - TT TT TT AT TT AT TT TT
VV9920 GG AG AG GG GG AG GG AA AA GG GG GG GG AA GG
AIR: Airén; CBS: Cabernet Sauvignon; CAR: Cardinal; CRI: Crimson Seedless; FLA: Flame Seedless; ITA: Italia; MER: Merlot; MON: Monastrell; MOA: Muscat of Alexandria; NAP: Napoleon; OHA: Ohanes; PAL: Palomino; REG: Red Globe; SAU: Sauvignon Blanc; TEM: Tempranillo; THO: Thompson Seedless.
*The correct genotype is AC, according to data obtained later (see text)
SNP worked regularly in other genotyping analyses and was included in further tests. In addition, genotyping for SNP325_65 and VV9227 failed completely in the ‘Monastrell’ cultivar. The genotype for SNP325_65 could be obtained for this cultivar after several analyses, but this was not the case for VV9227 (data not shown). The existence of a homozygous null allele in this cultivar for VV9227 was ruled out because it presented an A/T genotype for this SNP in the previous genotyping with the 332 SNP set.
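Each row of Table 2 can also be summarized as allele frequencies and observed heterozygosity. The sketch below does this for the SNP1003_336 row; note that these 15 cultivars are a small, deliberately diverse sample, so the values are not the allele frequencies estimated from the 300 accessions and are shown only to illustrate the calculation.

```python
from collections import Counter

# Genotypes of SNP1003_336 in the 15 cultivars, copied from Table 2.
row = "AC AA AA AC AA CC AC AC AC AC AC AC AC CC AC".split()

allele_counts = Counter("".join(row))   # count every allele copy
total = sum(allele_counts.values())
maf = min(allele_counts.values()) / total
heterozygosity = sum(g[0] != g[1] for g in row) / len(row)

print(dict(allele_counts), round(maf, 3), round(heterozygosity, 3))
```

Because SNP alleles are named by nucleotide, this counting needs no size calibration or reference cultivars.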
A complete genotype (47 SNP) was obtained for 990 plants, corresponding to an average of 66 plants per variety with a range from 54 to 86 plants (Table 2, Table 3), excluding ‘Monastrell’. No genotype could be established for 65 plants. This could be due to a low DNA concentration in a number of cases (17 DNAs were below a concentration of 4 ng/µl) but, in most cases, failures were probably due to the presence of contaminants that prevented amplification. Leaving aside the one SNP that failed in every plant and the 65 plants in which no SNP could be genotyped, the average genotyping rate was 97.1% (Table 3). Marker SNP697_296 presented the highest genotyping success rate and only failed in two plants. Ten SNP markers presented a genotyping success rate above 0.99, and 40 SNP above 0.95.
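The plant counts quoted above can be reproduced from the first row of Table 2. The sketch below copies those counts and excludes ‘Monastrell’ as the text does; dividing by 15 cultivars is an assumption about how the reported averages were obtained.

```python
# Plants with a complete genotype per cultivar, copied from Table 2.
complete = {
    "AIR": 70, "CBS": 56, "CAR": 79, "CRI": 80, "FLA": 55,
    "MER": 77, "MON": 75, "MOA": 86, "NAP": 82, "OHA": 84,
    "PAL": 81, "REG": 64, "SAU": 64, "TEM": 58, "THO": 54,
}
without_mon = {k: v for k, v in complete.items() if k != "MON"}

total_complete = sum(without_mon.values())
low, high = min(without_mon.values()), max(without_mon.values())
plants_scored = 1342 - 65  # plants with at least one genotype call

print(total_complete, low, high, round(plants_scored / 15))
```

The total of 990, the range of 54 to 86 and the rounded average of 85 scored plants per cultivar all match the values given in the text.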
Regarding the stability analysis, 99.4% of all the genotyped plants showed the genotype expected for the cultivar. Only three SNP showed a different genotype in plants of the same cultivar: SNP1119_176 and SNP581_114 (in one ‘Ohanes’ plant), and SNP1347_100 (in one ‘Flame Seedless’ plant). To determine if these variations were due to mutations (lack of stability) or genotyping errors, the analyses were repeated using the same DNA extraction as well as independent DNA extractions for each plant. The results indicate that all discrepancies corresponded to genotyping errors. In summary, no mutation could be found in the 58251 individual SNP genotypes established for the 15 varieties studied and, therefore, the SNP marker set could be considered highly stable.
Evaluation of the SNP Set for Genetic Identification
Purposes
A total of 200 grapevine accessions were genotyped with the 48 SNP set, including a sample from each of the varieties studied in the stability analysis. Some of the accessions resulted in identical genotypes, but these results always agreed with the expectations, since they corresponded either to synonymous cultivars or to sports (phenotypically different cultivars generated by spontaneous somatic mutations and later propagated through cuttings). Sports are not expected to be distinguishable from their original cultivar using molecular markers. This was confirmed for several sports: ‘Chasselas Apyrene’, a seedless sport, did not differ from ‘Chasselas Blanc’. Within the Pinot group, ‘Pinot Blanc’ showed an identical genotype for the 48 SNP set to ‘Pinot Noir’, and ‘Pinot Meunier’, a genetic chimera [25], also showed the same genotype. Nevertheless, ‘Pinot Gris’, another colour sport, presented a homozygous genotype CC for SNP1229_219, while the other cultivars of the group were heterozygous CG. This is not surprising since the ‘Pinot’ group has the largest intra-varietal variation measured with microsatellite markers [26-29].
Another one-allele difference was observed when genotypes obtained in this study were compared with those obtained for the same varieties in the stability analysis (see above) but, while in the case of ‘Pinot Gris’ the difference was consistent and could be considered a genetic mutation, in the latter cases the differences were shown to be due to genotyping errors. The difference was observed in 5 varieties for SNP1119_176 (Table 2). In all cases a mistaken homozygous genotype (CC) was assigned to plants studied in the stability analysis, while the correct one was heterozygous (AC). These SNP genotyping mistakes are more frequent when most samples in the plate have the same genotype, since reference genotype clouds corresponding to the three possible genotypes per SNP locus are more difficult to establish. In fact, when some of these wrongly genotyped samples were re-analyzed with samples from other plates, they were assigned the correct heterozygous (AC) genotype.
A non-redundant genotype sample was built to evaluate genetic parameters related to the discrimination power of the SNP set for grapevine cultivars. Of 200 accessions studied, 49 genotypes, corresponding to synonym cultivars, sports and wild plants, were discarded. In the resulting sample containing 151 non-redundant cultivars (Additional file 1, Table S1), allelic frequencies and several genetic parameters were determined.

Table 3 Genotyping efficiency and reliability of the 48 SNP set

                                       N° plants                     Rate
Genotyped*                             1277
Complete genotype                      990                           0.775
> 95% genotype                         1155                          0.904
SNP highest genotyping success rate    1275 (SNP697_296)             0.998
SNP lowest genotyping success rate     1139 (SNP325_65)              0.892

                                       N° individual SNP genotypes   Rate
Total                                  60019
Obtained                               58256                         0.971
N° mistaken genotypes                  3                             0.000051

* Excluding 1 SNP that did not work in this experiment and 65 plants for
which no SNP genotype could be established.
MAF is a measure of the discriminating ability of the markers. In the case of bi-allelic markers, the closer MAF is to 0.5, the better. In the study, 19 SNP showed a MAF between 0.4 and 0.5, while only three SNP had a MAF below 0.1. The unbiased expected heterozygosity (He) was 0.404, ranging from 0.107 (SNP1399_81) to 0.501 (SNP581_114, SNP829_281 and VV10992) (Table 4). Only three SNP showed PIC values below 0.2; the remaining ones ranged between 0.2 and 0.4. These values indicate that the whole SNP set has a very high discriminating capacity for grapevine varieties, which is supported by the very low global probability of identity (PI): 1.4·10^-17. This value is much smaller than those obtained with the 6 SSR markers approved as descriptors by the International Organisation of Vine and Wine (OIV) in the analysis of 57 unique Spanish genotypes (10^-7 [30]), and with 9 microsatellites in the analysis of 164 European cultivars (10^-9 [31]) or of 991 grapevine accessions (7·10^-12 [23]). In contrast, the PI obtained for the 48 SNP set is larger than the values obtained with 18 microsatellites in 2,739 grapevine accessions (10^-22 [21]) or with 34 microsatellites in 745 accessions (10^-27 [32]). These representative examples show that, on average, the probability of identity per microsatellite marker is between 0.06 and 0.16, while the average in the SNP set used here is 0.445 per marker. Therefore, 3-4 SNP loci would be needed to provide the discriminating power of one microsatellite locus in grapevine. Correspondingly, the 48 SNP set would give a similar identification power as 14-16 microsatellites.
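The per-marker figures quoted above can be reproduced as a geometric mean, i.e. the n-th root of the global PI for a set of n markers. The following is a minimal sketch under that assumption; the function name and the dictionary labels are illustrative and are not taken from the software used in the study.

```python
def per_marker_pi(global_pi: float, n_markers: int) -> float:
    """Geometric-mean probability of identity per marker.

    Assumes the global PI is the product of the per-locus PI values,
    so the average per marker is the n-th root of the global value.
    """
    return global_pi ** (1.0 / n_markers)


# Global PI values and marker counts as quoted in the text.
snp_set = per_marker_pi(1.4e-17, 48)

microsatellite_sets = {
    "6 SSR, 57 Spanish genotypes [30]": per_marker_pi(1e-7, 6),
    "9 SSR, 164 European cultivars [31]": per_marker_pi(1e-9, 9),
    "9 SSR, 991 accessions [23]": per_marker_pi(7e-12, 9),
    "18 SSR, 2,739 accessions [21]": per_marker_pi(1e-22, 18),
    "34 SSR, 745 accessions [32]": per_marker_pi(1e-27, 34),
}

if __name__ == "__main__":
    print(f"48 SNP set: {snp_set:.3f} per marker")
    for label, value in microsatellite_sets.items():
        print(f"{label}: {value:.3f} per marker")
```

Under this reading, the 48 SNP set gives about 0.445 per marker, and the five microsatellite sets fall roughly between 0.06 and 0.16, in line with the range stated in the text.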
The task of cultivar characterization is often related to legal issues. In many countries, any variety has to pass a technical test to be authorized for cultivation, and distinctness is the most important issue to be established in such tests: a variety is considered distinct if it can be clearly distinguished from all the varieties of common knowledge (Act of the International Union for the Protection of New Varieties of Plants (UPOV) Convention, 1991;
/>act1991 .htm). The key concept for establishing distinct-
ness is the minimum distance between varieties, which
is currently established on a species by species basis,
using morphological descriptors. In recent years, some
efforts have been directed to incorporate molecular mar-
kers [23]. In the present study, the minimum distance
among the varieties with non-redundant genotypes was determined through their pair-wise comparison and measured by the number of different alleles (Figure 2). The average difference between analyzed cultivars was 30 alleles from a total of 96, while the most different samples differed in 54 alleles. The closest cultivars found were ‘Jaén Negra’ and ‘Zalema’, which differed in 9 alleles out of the 90 that could be compared between them. These two cultivars have genotypes that are compatible with being parent/offspring, both based on microsatellites [33] and on the SNP markers used in this study. The next closest cultivars found were ‘Ciruela Roja’ and ‘Colgar Roja’, which differed in 10 out of the 96 alleles studied. These two cultivars have recently been described as siblings of the same cross: ‘Ohanes’ × ‘Ragol’ [34]. The same occurs with ‘Chardonnay’ and ‘Melon’, which matched for 86 alleles and have microsatellite genotypes consistent with being the progeny of a single pair of parents, ‘Pinot’ and ‘Gouais blanc’ [35]. Hence, the cultivars studied, even those that are genetically close, differ in a large number of alleles.
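The pair-wise comparison described above can be sketched as follows. The exact counting rule is not spelled out in the text, so this sketch assumes that, at each locus, the number of different alleles is two minus the number of alleles the two diploid genotypes share, and that loci with a missing genotype in either sample are skipped (which would explain comparisons over fewer than 96 alleles). All names are illustrative.

```python
from collections import Counter
from itertools import combinations


def allele_differences(genotype_a, genotype_b):
    """Count differing alleles between two multi-locus SNP genotypes.

    Each genotype is a list with one two-letter string per locus
    (e.g. "AC"), or None where genotyping failed. Returns a tuple
    (different_alleles, compared_alleles).
    """
    different = 0
    compared = 0
    for locus_a, locus_b in zip(genotype_a, genotype_b):
        if locus_a is None or locus_b is None:
            continue  # assumed rule: skip loci missing in either sample
        # Multiset intersection gives the number of shared alleles (0-2).
        shared = sum((Counter(locus_a) & Counter(locus_b)).values())
        different += 2 - shared
        compared += 2
    return different, compared


def pairwise_distances(samples):
    """Distances for every unordered pair in a dict of name -> genotype."""
    return {
        (name_a, name_b): allele_differences(samples[name_a], samples[name_b])
        for name_a, name_b in combinations(sorted(samples), 2)
    }
```

With 151 non-redundant genotypes this yields 151 × 150 / 2 = 11,325 comparisons, the number given in the Figure 2 legend. Under the assumed rule, a CC versus CG genotype at a single locus counts as one different allele, as in the ‘Pinot Gris’ case.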
From the data, a very clear border exists between the highest intra-varietal variability (including here the sports), with 1 different allele, and the lowest inter-varietal distance, with 9 different alleles. Thus, there should not be any difficulty in establishing a minimum distance between 2 and 9 alleles for the 48 SNP set, a gap large enough to be considered conclusive for establishing distinctness in grapevine cultivars (excluding that of sports). Still, a more extensive diversity study would be needed to find a more reliable minimum distance, since it could be shorter in full siblings derived from closely related progenitors such as those used in current table grape breeding.
The Mendelian genetic inheritance of these 48 SNP markers has been confirmed in several previously described mapping populations. This feature also permits the genetic examination of pedigrees and parent/offspring relationships. Using the selected 48 SNP set, the total exclusion probability of paternity found for the set of 151 cultivars was high (0.9997), but the number of markers is far too small for a reliable pedigree analysis. Logarithm of odds (LOD) scores obtained for several trios ranged from 17 to 23, which are not large enough to reach final conclusions.
Table 4 Genetic parameters estimated for SNP within the 48 SNP set

       Min                    Max                    Average
He     0.107 (SNP1399_81)     0.501 (SNP829_281)     0.404
Ho     0.060 (SNP425_205)     0.765 (SNP581_114)     0.397
PIC    0.101 (SNP1399_81)     0.375 (SNP829_281)     0.315
PI     0.375 (SNP829_281)     0.804 (SNP1399_81)     0.457

He: Expected heterozygosity; Ho: Observed heterozygosity; PIC: Polymorphism
information content; PI: Probability of identity.
Conclusions
A set of 48 single nucleotide polymorphisms (SNP), well distributed throughout the grapevine genome, has been selected and tested for genetic identification purposes. The selected markers have proven to be highly stable and repeatable and also have a high discriminating power for grapevine cultivars. SNP data do not require any allele binning and allow for direct databasing and direct comparison of data arising from different laboratories. All these characteristics make our set of markers very suitable for building a worldwide publicly available genotype database for grapevine cultivars.
Methods
Plant Material and DNA Extraction
Three different cultivar sample sets and four segregating populations were used in this study. For the determination of genetic parameters concerning the 332 SNP markers under study, a sample of 300 accessions including 91 wild accessions as well as wine- and table-grape cultivars (Additional file 1, Table S1) was used. These accessions are mostly maintained at the germplasm collection of “Finca El Encín” (IMIDRA, Alcalá de Henares, Madrid, Spain).
Determination of chromosomal positions of SNP markers was carried out both genetically and physically. For genetic determination, four different segregating populations developed and maintained at the IMIDA (Murcia, Spain) were used: Dominga × Autumn Seedless [36], Monastrell × Cabernet Sauvignon, Ruby Seedless × Moscatuel and Muscat Hamburg × Sugraone. These mapping populations included 82, 85, 71 and 75 individuals, respectively.
The stability analysis for the selected 48 SNP set for genetic identification was conducted using fifteen cultivars, representing a high amount of variation in the cultivated Vitis vinifera species. Leaf material from a total of 1277 plants belonging to those cultivars was collected in 154 different plots in 7 different countries (Additional file 1, Table S2).
Analysis of genetic diversity for the selected 48 SNP
set in terms of genetic identification was carried out on
200 accessions most of which came from the collection
of grape varieties of the IMIDRA at ‘El Encín’ and the
others from the CSIRO collection (Glen Osmond, Aus-
tralia) (Additional file 1, Table S1).
Total DNA was extracted from frozen young leaves of
each sample according to Lijavetzky et al. [37] and
stored at -20°C.
SNP Identification and Initial Genotyping
SNP discovery was approached as described by Lijavetzky et al. [13]. SNP genotyping was carried out at the Centro Nacional de Genotipado (http://www.cegen.org) using the SNPlex™ technology (Applied Biosystems [38]). The usefulness of the 332 SNP was studied using seven 48 SNP sets on the 300 accessions sample set. After this initial genotyping, SNP markers with a low genotyping success rate and monomorphic SNP were discarded, while the remaining ones were classified according to their minor allele frequencies.
Figure 2 Representation of the genetic distances among varieties. The distances are measured in number of different alleles for the 11,325
pair-wise comparisons among the 151 non-redundant genotypes with 48 SNP. The small window is a zoom of the smallest distance zone.
Determination of SNP Positions

SNP genomic locations were determined based on both genetic and physical information. Genetic positions were established using four mapping populations following a two-stage strategy. First, SNP markers were positioned on the consensus framework map developed for each cross using microsatellite markers. Molecular marker and linkage analyses were carried out according to Cabezas et al. 2006 [36] using a two-way pseudo-testcross strategy [39] and the Joinmap 3.0 software [20]. Under this strategy, SNP markers can only be mapped in segregating progenies in which they segregate as aaxab, abxaa or abxab. Second, an integrated map for all progenies was built chromosome by chromosome using microsatellites as anchor markers and including all SNP segregating in at least one progeny. The integrated map was constructed using the “combine groups for map integration” function of Joinmap 3.0 [20]. Values of 3.5 for recombination frequency and 3 for LOD were used as initial mapping thresholds. For chromosomes with regions showing a low number of markers in common between the different linkage maps, values were moved up to 5.0 and down to 0, respectively, allowing for map integration. For SNP showing important discrepancies in their position in the linkage maps of the different progenies, physical mapping information and the “fixed order” function [20] were used to establish marker order. SNP whose inclusion led to large distortions in marker order were discarded. Chromosome names were assigned following the IGGP (International Grapevine Genome Program) recommendations.
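The segregation classes mentioned above follow directly from the parental genotypes. A minimal sketch, assuming bi-allelic SNP genotypes written as two-letter strings and using an illustrative function name, could classify each marker per cross as follows.

```python
from typing import Optional


def segregation_type(parent1: str, parent2: str) -> Optional[str]:
    """Classify a bi-allelic SNP by how it segregates in a cross.

    Returns "abxaa" (heterozygous in the first parent only), "aaxab"
    (heterozygous in the second parent only), "abxab" (heterozygous in
    both), or None when both parents are homozygous, in which case the
    marker does not segregate in the progeny and cannot be mapped.
    """
    het1 = parent1[0] != parent1[1]
    het2 = parent2[0] != parent2[1]
    if het1 and het2:
        return "abxab"
    if het1:
        return "abxaa"
    if het2:
        return "aaxab"
    return None


def mappable_crosses(parental_genotypes):
    """Map cross name -> segregation type, keeping only segregating crosses.

    `parental_genotypes` is a dict of cross name -> (parent1, parent2).
    """
    result = {}
    for cross, (p1, p2) in parental_genotypes.items():
        kind = segregation_type(p1, p2)
        if kind is not None:
            result[cross] = kind
    return result
```

A marker that returns a segregation type in at least one of the four crosses would be a candidate for the integrated map; a marker that returns None in all of them could only be placed physically.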

Physical positions of SNP markers were determined by
Blat searching for their adjacent sequences on the 12×
grapevine genomic sequence of the near homozygous
Pinot line PN40024 [40] and .
fr/externe/GenomeBrowser/Vitis/. Location of markers
involved in important discrepancies between genetic and
physical positions was also checked on the Pinot noir
genomic sequence />gbrowse/grape/[41].
Selection and Evaluation of a 48 SNP Set for Genetic
Identification
Over 48 SNP markers were selected from the previously developed 332 according to their genotyping success rate and MAF, as well as their genetic and physical positions. The last step of selection of the set for genetic identification was based on the technical requirements needed for the design of a plex for the SNPlex™ platform.
The experimental design of the stability test for the selected 48 SNP set included the analysis of 85 plants from 10 different plots (on average) for each of 15 varieties. Plots had been planted in different years and locations in 7 different countries (Additional file 1, Table S2). Because grapevine varieties are clones, if the markers used are stable, one expects to obtain the same alleles for each SNP in every plant analyzed for the same variety, independently of origin, age and location.
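The expectation stated above suggests a simple screening step: for each cultivar and SNP, take the most frequent genotype as a provisional reference and flag every plant that deviates from it. The sketch below is an illustration under that assumption and not the procedure used in the study; the function name and the record layout are hypothetical.

```python
from collections import Counter, defaultdict


def discordant_calls(calls):
    """Flag genotype calls that differ from the per-cultivar majority.

    `calls` is an iterable of (cultivar, plant_id, snp, genotype) tuples,
    with genotype set to None when genotyping failed. Returns a list of
    (cultivar, plant_id, snp, genotype, majority_genotype) tuples.
    """
    observed = defaultdict(list)
    for cultivar, plant_id, snp, genotype in calls:
        if genotype is not None:
            observed[(cultivar, snp)].append((plant_id, genotype))

    flagged = []
    for (cultivar, snp), records in observed.items():
        majority, _ = Counter(g for _, g in records).most_common(1)[0]
        for plant_id, genotype in records:
            if genotype != majority:
                flagged.append((cultivar, plant_id, snp, genotype, majority))
    return flagged
```

A flagged call is only a candidate: it may reflect a somatic mutation or a genotyping error, and the majority genotype itself can be wrong when most samples on a plate share the same miscalled genotype. Flagged plants would therefore need to be re-genotyped from the same and from independent DNA extractions.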
The discriminating power of the selected 48 SNP set
for grapevine cultivar identification was evaluated with a
200 accessions sample.
Genotyping and genetic parameters were estimated from these tests. For each SNP the rate of genotyping success was calculated after excluding DNA samples that failed in the amplification of all SNP. Genotyping error was calculated based on the results obtained in different analyses: by genotyping different DNA extractions of the same plant; by genotyping different plants belonging to the same cultivar; or by studying known sports of a given genotype such as those of the Pinot family. Genetic parameters were estimated on non-redundant genotypes. Minor allele frequency (MAF), observed heterozygosity (Ho), expected heterozygosity (He) and probability of identity (PI) were calculated using the IDENTITY 1.0 tool [42] and the Excel Microsatellite Toolkit [43]. Pedigree relationships were analysed with the Cervus 3.0 software [44]. LOD scores were obtained by taking the natural log (log to base e) of the overall likelihood ratios for the father-mother-offspring trios, as implemented in Cervus 3.0 [44].
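For a bi-allelic marker these parameters reduce to short closed-form expressions in the allele frequency. The sketch below uses commonly cited textbook definitions (expected heterozygosity with a small-sample correction, PIC, and the PI of two unrelated individuals); it is an assumption that the tools cited above implement exactly these estimators, and the function name is illustrative.

```python
def biallelic_parameters(p: float, n_individuals: int) -> dict:
    """Per-locus parameters for a bi-allelic SNP with allele frequency p.

    Assumed formulas (not confirmed against IDENTITY 1.0 or the Excel
    Microsatellite Toolkit):
      He (unbiased) = 2n / (2n - 1) * (1 - p^2 - q^2)
      PIC           = 2pq - 2 p^2 q^2
      PI            = p^4 + q^4 + (2pq)^2
    """
    q = 1.0 - p
    he = 2.0 * p * q
    return {
        "MAF": min(p, q),
        "He": he,
        "He_unbiased": he * (2 * n_individuals) / (2 * n_individuals - 1),
        "PIC": he - 2.0 * (p * q) ** 2,
        "PI": p ** 4 + q ** 4 + he ** 2,
    }


if __name__ == "__main__":
    # A perfectly balanced SNP (p = 0.5) scored in 151 genotypes.
    for name, value in biallelic_parameters(0.5, 151).items():
        print(f"{name}: {value:.3f}")
```

Under these formulas a perfectly balanced SNP gives PIC = 0.375 and PI = 0.375, which are the extreme values reported in Table 4, and an unbiased He just above 0.5. This illustrates why a MAF close to 0.5 is the most informative case for a bi-allelic marker.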
Additional material
Additional file 1: Supplementary Tables S1 to S5. Table S1: Plant
samples analyzed. Table S2: Plant samples used for the stability studies of
the 48 SNP set. Table S3: Basic information on the 238 SNP analyzed.
Table S4: Genetic maps features. Table S5: Number of progenies with
heterozygous markers in at least one progenitor.
List of abbreviations used
cM: centimorgan; Ho: Observed heterozygosity; He: Expected heterozygosity;
IGGP: International Grapevine Genome Program; INDEL: Insertion-deletion;
LOD: Logarithm of odds; MAF: Minor allele frequency; Mb: Megabase; NCBI:
National Center for Biotechnology Information; OIV: International
Organisation of Vine and Wine; PI: Probability of identity; PIC: Polymorphism
information content; SNP: Single nucleotide polymorphism; SSR: Simple
sequence repeat; UPOV: International Union for the Protection of New
Varieties of Plants.
Acknowledgements
This study was financially supported by Grapegen and the 14322 Agreement
Projects from Genoma España as well as the VIN01-025 Project from the
Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria from
MICINN (Spanish Ministry for Science and Innovation) and in part by CSIRO
Plant Industry and the Grape and Wine Research and Development
Corporation (GWRDC). We also thank MICINN for a bilateral collaborative
grant with Argentina (AR2009-0021), Applied Biosystems for their support in
the design of the 48 SNPlex set and the Centro Nacional de Genotipado
for SNPlex genotyping. The research group
participates in COST Action FA1003. We are very grateful to the Spanish
National Grapevine Germplasm Collection at “El Encín”, IMIDRA, Madrid, for
its plant materials. We also thank Enrique Ritter and Mónica Hernández
(Neiker, Vitoria, Spain) for sharing with us their data on the progeny MnxCS
and José Antonio Machín (Genoma España) for manuscript revision. M.D.
Vélez was funded by a pre-doctoral fellowship from the Instituto Madrileño
de Investigación y Desarrollo Rural, Agrario y Alimentario (IMIDRA). The large

sampling needed in this study for the stability tests was made possible
thanks to the collaboration of numerous people and public and private
institutions in Spain and abroad. We are grateful to all of them. Specifically,
plant material from wine varieties was obtained thanks to the collaboration
of numerous regulator councils of wine origin denominations: Almansa,
Calatayud, Campo de Borja, Cariñena, Tarragona, Condado de Huelva,
Costers del Segre, Jerez, Jumilla, La Mancha, Méntrida, Monterrei, Montilla-
Moriles, Navarra, Penedés, Ribeira Sacra, Ribeiro, Ribera de Duero, Ribera del
Guadiana, Rueda, Rioja, Málaga, Utiel-Requena, Valdeorras, Valdepeñas,
Valencia, Vinos de Madrid and Yecla. Among the people who personally
contributed to the sampling are: José María Hurtado (Superior Frutícola S.A.,
Murcia, Spain); Edit Hajdu (FVM Szölészeti és Borászati Kutató Intézete,
Hungary); Patricio Hinrichsen (Centro experimental La Platina-INIA, Chile); Tim
Sheehan (Sheehan Genetics, USA); Jean Satterwhite (National Clonal
Germoplasm Repository for Nut Crops, USA); Erika Maul (Institute for
Grapevine Breeding, Germany); Jorge Zerolo (Agrovolcán, Tenerife, Spain);
Nuria Cid (Estación de Viticultura y Enología de Galicia, Orense, Spain); Peter
Allderman (Top Fruit, RSA); Thierry Lacombe (DGPC-Diversité et Génomes
des Plantes Cultivées, France); Miguel Lara (CIFA-Centro de Investigación y
Formación Agraria, Jerez de la Frontera, Spain); Joaquín Borrego, Paz
Fernández, Maite de Andrés, Carlos González, Alba Vargas, Gregorio Muñoz,
Cristina Rubio and Mariano Cabellos (IMIDRA, Spain). We apologize for any
non-deliberate omission in this list.
Author details
1 Departamento de Genética Molecular de Plantas, Centro Nacional de Biotecnología, CSIC, C/Darwin 3, 28049 Madrid, Spain.
2 Instituto de Ciencias de la Vid y del Vino (CSIC-Universidad de La Rioja-Gobierno de La Rioja). Complejo Científico Tecnológico. C/Madre de Dios 51. 26006 Logroño. Spain.
3 Instituto Madrileño de Investigación y Desarrollo Rural, Agrario y Alimentario (IMIDRA). Finca “El Encín”. Ctra A2, Km 38.200. 28800 Alcalá de Henares. Madrid. Spain.
4 Instituto Murciano de Investigación y Desarrollo Agrario y Alimentario (IMIDA). Estación Sericícola. C/Mayor, s/n. 30150 La Alberca. Murcia. Spain.
5 CSIRO Plant Industry, PO Box 350, Glen Osmond, SA 5064, Australia.
6 Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria. Ctra de A Coruña, Km 7. 28040. Madrid. Spain.
7 Instituto de Biología Agrícola de Mendoza, Facultad de Ciencias Agrarias, CONYCET-Universidad Nacional de Cuyo, Almirante Brown 500, M5528AHB Chacras de Coria, Argentina.
Authors’ contributions
JAC carried out the physical and genetic mapping of the SNP, participated
in the SNP selection and drafted part of the manuscript. JI carried out the
stability and genetic diversity analyses and drafted part of the manuscript.
DL participated in the SNP selection and characterization. MDV
participated in the stability analyses. GB, VR, IC and LRG contributed in the
genotyping of cultivars and progenies. AMJ participated in the SNP
selection. JC generated and maintained most of the progenies. MRT assisted
in the SNP selection and helped draft the manuscript. JMZ conceived the
study, partook in its design and coordination and helped draft the

manuscript. All authors read and approved the final manuscript.
Received: 26 July 2011 Accepted: 8 November 2011
Published: 8 November 2011
References
1. This P, Lacombe T, Thomas MR: Historical origins and genetic diversity of
wine grapes. Trends Genet 2006, 22(9):511-519.
2. Regner F, Hack R, Santiago JL: Highly variable Vitis microsatellite loci for
the identification of Pinot Noir clones. Vitis 2006, 45(2):85-91.
3. Thomas MR, Cain P, Scott NS: DNA typing of grapevines: A universal
methodology and database for describing cultivars and evaluating
genetic relatedness. Plant Mol Biol 1994, 25:939-949.
4. Bowers JE, Meredith CP: The parentage of a classic wine grape, Cabernet
Sauvignon. Nat Genet 1997, 16(1):84-87.
5. Thomas MR, Scott NS: Microsatellite repeats in grapevine reveal DNA
polymorphisms when analysed as sequence-tagged sites (STSs). Theor
Appl Genet 1993, 86:985-990.
6. This P, Jung A, Boccacci P, Borrego J, Botta R, Costantini L, Crespan M,
Dangl GS, Eisenheld C, Ferreira-Monteiro F, Grando S, Ibáñez J, Lacombe T,
Laucou V, Magalhaes R, Meredith CP, Milani N, Peterlunger E, Regner F,
Zulini L, Maul E: Development of a standard set of microsatellite
reference alleles for identification of grape cultivars. Theor Appl Genet
2004, 109(7):1448-1458.
7. Cipriani G, Marrazzo MT, Di Gaspero G, Pfeiffer A, Morgante M, Testolin R: A
set of microsatellite markers with long core repeat optimized for grape
(Vitis spp.) genotyping - art. no. 127. BMC Plant Biol 2008, 8:127-127.
8. Allen AR, Taylor M, McKeown B, Curry AI, Lavery JF, Mitchell A,
Hartshorne D, Fries R, Skuce RA: Compilation of a panel of informative
single nucleotide polymorphisms for bovine identification in the
Northern Irish cattle population. Bmc Genetics 2010, 11.
9. Deleu W, Esteras C, Roig C, Gonzalez-To M, Fernandez-Silva I, Gonzalez-Ibeas D,
Blanca J, Aranda MA, Arus P, Nuez F, Monforte AJ, Pico MB, Garcia-Mas J:
A set of EST-SNPs for map saturation and cultivar identification in
melon. BMC Plant Biol 2009, 9.
10. Ganal MW, Altmann T, Roder MS: SNP identification in crop plants. Current
Opinion in Plant Biology 2009, 12(2):211-217.
11. Glover KA, Hansen MM, Lien S, Als TD, Hoyheim B, Skaala O: A comparison
of SNP and STR loci for delineating population structure and performing
individual genetic assignment. Bmc Genetics 2010, 11.
12. Hayden MJ, Tabone TL, Nguyen TM, Coventry S, Keiper FJ, Fox RL,
Chalmers KJ, Mather DE, Eglinton JK: An informative set of SNP markers
for molecular characterisation of Australian barley germplasm. Crop &
Pasture Science 2010, 61(1):70-83.
13. Lijavetzky D, Cabezas JA, Ibáñez A, Rodriguez V, Martínez-Zapater JM: High
throughput SNP discovery and genotyping in grapevine (Vitis vinifera L.)
by combining a re-sequencing approach and SNPlex technology. BMC
Genomics 2007, 8:424.
14. Aradhya MK, Dangl GS, Prins BH, Boursiquot JM, Walker MA, Meredith CP,
Simon CJ: Genetic structure and differentiation in cultivated grape, Vitis
vinifera L. Genetical Research 2003, 81(3):179-192.
15. Arroyo-García R, Ruiz-Garcia L, Bolling L, Ocete R, Lopez MA, Arnold C,
Ergul A, Soylemezoglu G, Uzun HI, Cabello F, Ibáñez J, Aradhya MK,
Atanassov A, Atanassov I, Balint S, Cenis JL, Costantini L, Goris-Lavets S,
Grando MS, Klein BY, McGovern PE, Merdinoglu D, Pejic I, Pelsy F,
Primikirios N, Risovannaya V, Roubelakis-Angelakis KA, Snoussi H, Sotiri P,
Tamhankar S, et al: Multiple origins of cultivated grapevine (Vitis vinifera
L. ssp sativa) based on chloroplast DNA polymorphisms. Mol Ecol 2006,
15(12):3707-3714.
16. Vezzulli S, Troggio M, Coppola G, Jermakow A, Cartwright D, Zharkikh A,
Stefanini M, Grando MS, Viola R, Adam-Blondon AF, Thomas M, This P,
Velasco R: A reference integrated map for cultivated grapevine (Vitis
vinifera L.) from three crosses, based on 283 SSR and 501 SNP-based
markers. Theor Appl Genet 2008, 117(4):499-511.
17. Zhang JK, Hausmann L, Eibach R, Welter LJ, Topfer R, Zyprian EM: A
framework map from grapevine V3125 (Vitis vinifera ‘Schiava grossa’ ×
‘Riesling’) × rootstock cultivar ‘Borner’ (Vitis riparia × Vitis cinerea) to
localize genetic determinants of phylloxera root resistance. Theor Appl
Genet 2009, 119(6):1039-1051.
18. Troggio M, Malacarne G, Coppola G, Segala C, Cartwright DA, Pindo M,
Stefanini M, Mank R, Moroldo M, Morgante M, Grando MS, Velasco R: A
dense single-nucleotide polymorphism-based genetic linkage map of
grapevine (Vitis vinifera L.) anchoring pinot noir bacterial artificial
chromosome contigs. Genetics 2007, 176(4):2637-2650.
19. Lowe KM, Walker MA: Genetic linkage map of the interspecific grape
rootstock cross Ramsey (Vitis champinii) × Riparia Gloire (Vitis riparia).
Theor Appl Genet 2006, 112(8):1582-1592.
20. van Ooijen JW, Voorrips RE: JoinMap® 3.0, Software for the calculation of
genetic linkage maps. Wageningen: Plant Research International; 2001.
21. Laucou V, Lacombe T, Dechesne F, Siret R, Bruno JP, Dessup M, Dessup T,
Ortigosa P, Parra P, Roux C, Santoni S, Varès D, Péros JP, Boursiquot JM,
This P: High throughput analysis of grape genetic diversity as a tool for
germplasm collection management. TAG Theoretical and Applied Genetics
2011, 1-13.
22. Myles S, Boyko AR, Owens CL, Brown PJ, Grassi F, Aradhya MK, Prins B,
Reynolds A, Chia J-M, Ware D, Bustamante CD, Buckler ES: Genetic
structure and domestication history of the grape. Proc Nat Acad Sci USA
2011, 108(9):3457-3458.
23. Ibáñez J, Vélez M, de Andrés MT, Borrego J: Molecular markers for
establishing distinctness in vegetatively propagated crops: a case study
in grapevine. Theor Appl Genet 2009, 119(7):1213-1222.
24. Galet P: Dictionnaire Encyclopédique des Cépages. Paris: Hachette; 2000.
25. Franks TR, Botta R, Thomas MR, Franks J: Chimerism in grapevines:
implications for cultivar identity, ancestry and genetic improvement.
Theor Appl Genet 2002, 104(2-3):192-199.
26. Blaich R, Konradi J, Ruhl E, Forneck A: Assessing genetic variation among
Pinot noir (Vitis vinifera L.) clones with AFLP markers. Am J Enol Vitic
2007, 58:526-529.
27. Konradi J, Blaich R, Forneck A: Genetic variation among clones and sports
of ‘Pinot noir’ (Vitis vinifera L.). European Journal of Horticultural Science
2007, 72(6):275-279.
28. Regner F, Stadlbauer A, Eisenheld C, Kaserer H: Genetic relationships
among Pinots and related cultivars. Am J Enol Vitic 2000, 51(1):7-14.
29. Stenkamp SHG, Becker MS, Hill BHE, Blaich R, Forneck A: Clonal variation
and stability assay of chimeric Pinot Meunier (Vitis vinifera L.) and
descending sports. Euphytica 2009, 165(1):197-209.
30. Martín JP, Borrego J, Cabello F, Ortiz JM: Characterization of Spanish
grapevine cultivar diversity using sequence-tagged microsatellite site
markers. Genome 2003, 46:10-18.
31. Sefc KM, Lopes MS, Lefort F, Botta R, Roubelakis-Angelakis KA, Ibáñez J,
Pejic I, Wagner HW, Glössl J, Steinkellner H: Microsatellite variability in
grapevine cultivars from different European regions and evaluation of
assignment testing to assess the geographic origin of cultivars. Theor
Appl Genet 2000, 100:498-505.
32. Cipriani G, Spadotto A, Jurman I, Di Gaspero G, Crespan M, Meneghetti S,
Frare E, Vignani R, Cresti M, Morgante M, Pezzotti M, Pe E, Policriti A,
Testolin R: The SSR-based molecular profile of 1005 grapevine (Vitis
vinifera L.) accessions uncovers new synonymy and parentages, and
reveals a large admixture amongst varieties of different geographic
origin. Theor Appl Genet 2010, 1-17.
33. Ibáñez J, de Andrés MT, Molino A, Borrego J: Genetic study of key Spanish
grapevine varieties using microsatellite analysis. Am J Enol Vitic 2003,
54(1):22-30.
34. Vargas AM, de Andrés MT, Borrego J, Ibáñez J: Pedigrees of fifty table
grape cultivars. Am J Enol Vitic 2009, 60(4):525-532.
35. Bowers JE, Boursiquot JM, This P, Chu K, Johansson H, Meredith CP:
Historical genetics: the parentage of Chardonnay, Gamay, and other
wine grapes of northeastern France. Science 1999, 285:1562-1565.
36. Cabezas JA, Cervera MT, Ruiz-Garcia L, Carreno J, Martinez-Zapater JM: A
genetic analysis of seed and berry weight in grapevine. Genome 2006,
49(12):1572-1585.
37. Lijavetzky D, Ruiz-Garcia L, Cabezas JA, De Andres MT, Bravo G, Ibáñez A,
Carreño J, Cabello F, Ibáñez J, Martínez-Zapater JM: Molecular genetics of
berry colour variation in table grape. Molecular Genetics and Genomics
2006, 276(5):427-435.
38. De La Vega FA, Lazaruk KD, Rhodes MD, Wenz MH: Assessment of two
flexible and compatible SNP genotyping platforms: TaqMan (R) SNP
genotyping assays and the SNPlex (TM) genotyping system. Mutation
Research-Fundamental and Molecular Mechanisms of Mutagenesis 2005,
573(1-2):111-135.
39. Grattapaglia D, Sederoff R: Genetic-linkage maps of Eucalyptus-grandis
and Eucalyptus-urophylla using a pseudo-testcross - mapping strategy
and RAPD markers. Genetics 1994, 137(4):1121-1137.
40. Jaillon O, Aury JM, Noel B, Policriti A, Clepet C, Casagrande A, Choisne N,
Aubourg S, Vitulo N, Jubin C, Vezzi A, Legeai F, Hugueney P, Dasilva C,
Horner D, Mica E, Jublot D, Poulain J, Bruyere C, Billault A, Segurens B,
Gouyvenoux M, Ugarte E, Cattonaro F, Anthouard V, Vico V, Del Fabbro C,
Alaux M, Di Gaspero G, Dumas V, et al: The grapevine genome sequence
suggests ancestral hexaploidization in major angiosperm phyla. Nature
2007, 449(7161):463-467.
41. Velasco R, Zharkikh A, Troggio M, Cartwright DA, Cestaro A, Pruss D,
Pindo M, Fitzgerald LM, Vezzulli S, Reid J, Malacarne G, Iliev D, Coppola G,
Wardell B, Micheletti D, Macalma T, Facci M, Mitchell JT, Perazzolli M,
Eldredge G, Gatto P, Oyzerski R, Moretto M, Gutin N, Stefanini M, Chen Y,
Segala C, Davenport C, Dematte L, Mraz A, et al: A high quality draft
consensus sequence of the genome of a heterozygous grapevine
variety. PLoS ONE 2007, 2(12):e1326.
42. Wagner HW, Sefc KM: Identity 1.0. Vienna; 1999.
43. Park SDE: Trypanotolerance in West African Cattle and the Population
Genetic Effects of Selection. Dublin: University of Dublin; 2001.
44. Kalinowski ST, Taper ML, Marshall TC: Revising how the computer
program CERVUS accommodates genotyping error increases success in
paternity assignment. Mol Ecol 2007, 16(5):1099-1106.
doi:10.1186/1471-2229-11-153
Cite this article as: Cabezas et al.: A 48 SNP set for grapevine cultivar
identification. BMC Plant Biology 2011 11:153.