walnut genome analysis - university of …walnutresearch.ucdavis.edu/2009/2009_35.pdf · walnut...
TRANSCRIPT
WALNUT GENOME ANALYSIS Jan Dvorak, Ming-Cheng Luo, Mallikarjuna Aradhya, Dianne Velasco, Charles A. Leslie, Sandie L. Uratsu, Monica T. Britton, Russell L. Reagan, Gale H. McGranahan and Abhaya M. Dandekar ABSTRACT The goal of this project is to build a set of comprehensive genomic tools for walnut. These will facilitate a more precise evaluation of breeding populations and will accelerate development of improved walnut cultivars to address the needs of both California growers and the consumers of this important agricultural commodity. Development of these tools includes (1) construction of a physical map of the walnut genome, (2) a detailed survey of walnut gene expression, and (3) fine-scale genetic and association mapping of economically important traits. Two bacterial artificial chromosome (BAC) libraries comprising a total of 129,024 clones (64,512 each) were constructed from Persian walnut (Juglans regia cv. Chandler) DNA. Average insert lengths were 135 kb (HindIII) and 120 kb (MboI) for the two libraries respectively, providing approximately 20x genome coverage. To date 124,890 BAC clones have been fingerprinted using the five-color SNaPshot HICF technology. The fingerprints have been edited, and 113,073 could be used for contig assembly with the FPC program. A total of 916 contigs and 4,830 singletons were obtained. A total of 54,912 BAC end sequences (BES) have also been produced, one BES per BAC. A mapping population of 265 progeny of the cross Chandler’ x ‘Idaho that segregates for all commercially important walnut traits was evaluated with 15 SSRs to verify the parentage of F1 progeny. In addition, the genetic structure of 399 trees among 204 diverse accessions, including 62 elite germplasm used for breeding, was employed in the development of a population for association mapping of walnut traits. A total of 21 cDNA libraries were constructed for the characterization of the walnut transcriptome using next generation DNA sequencing. These genomic tools will significantly strengthen ongoing California walnut breeding efforts by facilitating marker-assisted selection strategies. The use of well-defined markers will significantly increase selection efficiency, the discovery of new genes, and rapid integration of these genes into genetic backgrounds adapted to California environmental conditions, thus accelerating the development of improved walnut cultivars. PROJECT OBJECTIVES
1. Physical mapping of the walnut genome 2. Genetic and association mapping of economically important walnut traits 3. Functional mapping of the walnut genome 4. Development of a ‘Walnut Genome Resource (WGR)’, a web-based knowledge base of
walnut genomic information
California Walnut Board 35 Walnut Research Reports 2009
PROCEDURES Objective 1: Physical mapping of the walnut genome A physical map of the walnut genome will be built concurrent with development of the genetic map. To accomplish the construction of a physical map, walnut genomic DNA fragments were cloned in a bacterial artificial chromosome (BAC) vector. Each BAC clone was fragmented with different restriction enzymes and ordered into contiguous sequences based on the overlap of fragment patterns. Ends of the BAC clones were then sequenced using Sanger DNA sequencing technology. Each of the BAC end sequences (BES) generated by this process is collinear with the BAC segments and thus corresponds to the sequence of nucleotides along a walnut chromosome. The presence of gene sequence tags (GSTs) within the BES will be confirmed through their expression in walnut tissues (Objective 3). The physical map also provides a scaffold upon which to assemble the complete walnut genomic sequence when such sequencing is performed. Objective 2: Genetic and association mapping of economic traits in walnut. Two different approaches were proposed for walnut genome mapping: (1) Linkage analysis of a conventional mapping population derived from a cross between parents that differ for traits under consideration; and (2) Association genetic analysis of a natural population such as a germplasm collection with genotypes of unknown or mixed ancestry that represent a common gene pool. Extensive DNA libraries of both mapping and association mapping populations have been developed. We are currently developing genotypic data using microsatellite polymorphisms to validate the full-sib nature of the mapping population and to assess the genetic structure of the association mapping population. Objective 3: Functional mapping of the walnut genome Genetic and physical maps describe the structure of the genome, but it is also essential to precisely document gene expression and to link specific traits (Objective 2) and GSTs (Objective 1) to underlying metabolic and biochemical processes. A key step toward this is gene transcript sequencing to identify expressed genes. Twenty tissue-specific gene transcript libraries have been constructed using tissue samples collected over the 2008 growing season. These transcripts were copied into DNA, cloned and high-throughput Illumina Solexa sequencing is in progress to generate millions of Expressed Sequence Tags (ESTs). The ESTs will be deposited in public databases, compiled, and annotated with computer analysis to identify genes involved in important metabolic pathways. The ESTs will be analyzed to validate GSTs through their expression pattern in different walnut tissues. Objective 4: Development of a ‘Walnut Genome Resource (WGR)’, a web-based knowledge base of walnut genomic information. A web-based browser is being developed as data from Objectives 1, 2 and 3 begins to accumulate. The database will contain all physical and linkage mapping information as well as all EST sequences and their integration with the walnut physical and genetic maps to better inform the walnut research community and to provide access to these genomic resources.
California Walnut Board 36 Walnut Research Reports 2009
RESULTS AND DISCUSSION Objective 1: Physical mapping of the walnut genome Physical mapping consists of cloning large genomic DNA fragments in a suitable vector such as a BAC and ordering the fragments so that their sequence reflects the order of nucleotides in a chromosome. The first step in the construction of physical maps is the construction of BAC libraries. Two bacterial artificial chromosome (BAC) libraries were constructed from DNA isolated from in vitro grown shoots of Persian walnut (Juglans regia cv. Chandler). Walnut genomic DNA was fragmented with either HindIII or MboI restriction endonucleases. A total of 129,024 clones, 64,512 per BAC library, were arrayed in 336 384-well plates. The average insert size was around 135 kb and 120 kb for the HindIII and MboI libraries, respectively. Assuming the walnut genome is approximately 800 Mb, these two BAC libraries represent approximately 20x genome equivalents. Each BAC library has been stored in triplicate. BAC fingerprinting and BAC end sequencing were initiated in 2008. We are using a previously developed fluorescence-based, high-throughput BAC DNA fingerprinting technique (Luo et al. 2003) that sizes DNA fragments from each BAC clone by capillary electrophoresis, creating a unique fragment profile or BAC clone fingerprint for each BAC clone (Fig. 1). The “SNaPshot” fragment pattern for each BAC clone, using a different restriction endonuclease, is analyzed using a 96-capillary ABI3730XL robotic DNA sequence analyzer. These BAC fingerprints are rapidly edited using a previously developed computer software (GenoProfiler; You et al. 2006). Another computer program, FPC (Soderlund et al. 2000), searches for overlaps between BAC fingerprints, creating contiguous sequences of BAC clones (contigs). These contigs (Fig. 2) reflect the sequences of nucleotides along individual chromosomes. As of November 20, 2009, 124,890 BAC clones have been fingerprinted with the 5-dye SNaPshot fingerprinting technique. The fingerprints have been edited, and 113,073 could be used for contig assembly with the FPC program. A total of 916 contigs and 4,830 singletons were obtained. A total of 54,912 BAC end sequences (BES) have also been produced, one BES per BAC. Objective 2: Genetic and association mapping of economically important traits in walnut Genotyping of mapping populations to confirm their full-sib origin: The F1 mapping population consisting of 265 individuals of the cross “Chandler x Idaho” has been genotyped and parentage confirmed. We have augmented the DNA and it will be available for the SNP genotyping when the platform is ready in the first quarter of 2010. The phenotypic data on traits of breeding value: (1) Lateral vs. terminal bearing; (2) Leafing and harvesting dates; (3) Nut size; (4) Shell thickness; (5) Shell seal; (6) Plumpness; (7) Fill (nut/kernel ratio); (6) Kernel color and; (7) Yield have been recorded for the second year. Association mapping using the walnut germplasm collection: We have increased the association mapping population of English walnut to 452 trees and they have been genotyped using 21 microsatellite loci. A cluster analysis (CA) of the microsatellite data indicated mild genetic structure within cultivated walnut (Fig. 3). The elite germplasm consisting of cultivars, breeders’ selections, and germplasm frequently used in breeding programs showed moderate differentiation. The Chinese germplasm appeared to be unique and exhibited considerable
California Walnut Board 37 Walnut Research Reports 2009
divergence and closely allied with SE Asian germplasm. Walnuts from South Asia are found to be the most diverse and there was further evidence of differentiation within this group. A small group of West Asian germplasm is nested within this group. Both South and West Asian germplasm have been extensively used in the California breeding program. Overall, the cultivated walnut (Juglans regia) shows a marginal differentiation and is suitable for association genetic analysis. The principle component analysis (PCA) of the walnut microsatellite data basically confirmed the results of CA and however, the genetic differentiation among the groups is more amplified and elucidated lot clearly (Fig. 4). All the six groups illustrated in the CA were evident in the PCA, but there was significant overlapping among clusters and also the minimum spanning tree superimposed on the PC plot suggested that the there is still significant genetic affinity within and among the groups (Fig. 5). The data has been further analyzed to quantify and describe the genetic diversity in the collection. There is significant deficiency of heterozygotes at all loci as compared to Hardy-Weinberg expected frequencies (Table 1) indicating some level of population substructuring within the collection as illustrated in the cluster analysis. All the six groups exhibited in the CA and PCA showed significant inbreeding within groups suggesting a complex but still mild population structure within the walnut germplasm collection (Table 2). Orthogonal partitioning of genetic variation indicated that the most variation is locked up within groups among individuals (~87%) and only a small proportion of the total variation is explained by genetic differentiation among groups (13%) (Table 3). However, the FST indicated that there is still significant differentiation among groups. Objective 3: Functional mapping of the walnut genome Functional mapping represents a composite profile the transcriptome of walnut. The transcriptome represents the transcribed RNA of any particular organism. In plants, gene expression is both temporally and spatially regulated in different tissues and organ systems. A profile of the transcriptome in a plant organ such as fruit, for example, would sample all mRNA expressed in this organ, and thus represents all genes expressed in fruit. Since only the functional part of the genome is observed, this provides a rapid and direct way to analyze genes that regulate all of the different fruit traits expressed. The first step is the isolation of total RNA from the tissue. Messenger RNA that represents the transcripts of individual genes is about 1-3% of total RNA is first separated using magnetic beads with poly oligo-dT nucleotides that bind to the poly-A tails on mRNA molecules. The mRNA is then chemically fragmented and the single stranded mRNAs are converted to double-stranded cDNA via in vitro cDNA synthesis. Each mRNA population isolated from a particular walnut tissue and converted into a corresponding population of complimentary DNA (cDNA library) represents the mRNA population of a particular plant and tissue at a particular time during development. These cDNA libraries are ligated to adapters so that they can be PCR amplified to create sufficient quantities of molecules that then can be sequenced using the next generation of DNA sequencers that can profile an entire transcriptome in a single run using Solexa technology on an Illumina Genome Analyzer (http://genomecenter.ucdavis.edu/dna_technologies/uhtsequencing.html). Such new technologies are able to generate millions of sequences corresponding to a greater diversity of the mRNA
California Walnut Board 38 Walnut Research Reports 2009
population at a lower cost than traditional Sanger sequencing. A single library is loaded onto one lane in a GAII sequencer, which generates 10-20 million sequencing reads between 20-80 nucleotides in length from both ends of each cDNA molecule. The resulting sequences will be compiled to generate a reference walnut Unigene set, representing a high percentage of the walnut gene space. For this project, 16 samples of walnut tissue were gathered from Chandler trees in the Stuke block at UC Davis between April and October 2008 (Table 4). Four additional samples were taken from Chandler plant material maintained in the lab of Gale McGranahan. RNA isolation and construction of cDNA libraries for each sample has nearly been completed, with just two libraries remaining to be constructed (Table 4). Each cDNA sample will be sequenced separately so that the profile of genes expressed in each tissue can be determined. Paired-end (85bp x 85bp) sequencing is underway on an Illumina Genome Analyzer using Solexa technology. The sequencing of three libraries has just been completed, yielding 10-20 million paired-end sequences of 20-85 bp in length from each cDNA library. Bioinformatics for sequencing processing will be conducted at the UC Davis Genome Center Bioinformatics Core, where these sequences will be compiled with the 18,000 existing walnut ESTs to generate a Unigene set, which is expected to describe most of the expressed walnut genome. This set of genes will be annotated using Blast2GO (http://blast2go.bioinfo.cipf.es/; Götz et al. 2008). This software package was recently used to annotate the set of 8622 walnut genes compiled in our walnut-nematode project. There, Gene Ontology (GO) categories were assigned to 57% of the sequences, including 18% which were also matched to Enzyme Commission (EC) identifiers. The GO categories and EC identifiers can be used to map the associated proteins onto metabolic pathways. These data can then be used to determine which metabolic pathways are active in each tissue and at various time points in the growing season which is a functional map of the gene activities in walnut. Objective 4: Development of a ‘Walnut Genome Resource (WGR)’, a web-based knowledge base of walnut genomic information A genome resource or knowledgebase is a database that provides access to genetic, physical, and functional mapping data generated in this project. The resource, which is now web accessible (http://walnutgenome.ucdavis.edu/) will have two distinct components: one for visualizing the genetic map and one for visualizing the physical map. The database will provide access to all of the fingerprinting data along with the BAC end sequences (Fig. 1). Tools are available to integrate and represent this information as a physical map showing individual contigs (Fig. 2). The physical map is a scaffold on which we will integrate genetic mapping data of walnut phenotypic information, molecular markers, and expressed genes as that information becomes available. The main objective of functional mapping is gene annotation and functional categorization of ESTs, the first step toward linking specific traits and conditions to metabolic and biochemical processes. Every EST enters a functional mapping pipeline using Blast2GO to assign sequences to metabolic pathways, and in some cases, regulatory roles in quality traits, pathogen resistance and stress response. This data pipeline includes gene set enrichment, a statistical analysis to determine which pathways and functional classes are expressed at higher or lower levels,
California Walnut Board 39 Walnut Research Reports 2009
compared to the walnut Unigene set, in each tissue sample. Patterns in gene expression are visualized by unsupervised clustering, to reveal patterns in expression alone across several samples, and by functional visualizations. The Pathway Tools Omics Viewer (http://www.plantcyc.org; Zhang et al. 2005), and MapMan (http://mapman.gabipd.org; Thimm et al. 2004) visualization platforms can display high throughput expression data of individual genes and whole pathways in combination with standard pathway maps and graphical classifications of gene functions. We have used MapMan to visualize expression data from experiments on tomato, apple, citrus, and grape, in addition to walnut. REFERENCES Götz S, García-Gómez J.M., Terol J., Williams T.D., Nagaraj S.H., Nueda M.J., Robles M., Talón M.,
Dopazo J., Conesa A. (2008). High-throughput functional annotation and data mining with the Blast2GO suite. Nucleic Acids Res. 36:3420-3435.
Luo M.C., Thomas C., You F.M., Hsiao J., Ouyang S., Buell C.R., Malandro M., McGuire P.E., Anderson O.D., Dvorak J. (2003) High-throughput fingerprinting of bacterial artificial chromosomes using the SNaPshot labeling kit and sizing of restriction fragments by capillary electrophoresis. Genomics 82:378-389.
Soderlund C., Humphray S., Dunham A., French L. (2000) Contigs built with fingerprints, markers, and FPCV4.7. Genome Research 10:1772-1787.
Thimm, O., Blasing, O., Gibon, Y., Nagel, A., Meyer, S., Kruger, P., Selbig, J., Muller, L.A., Rhee, S.Y. Stitt, M. (2004). MAPMAN: a user-driven tool to display genomics data sets onto diagrams of metabolic pathways and other biological processes. The Plant Journal 37, 914-939.
Yu, J., Pressoir, G., Briggs, W.H., Bi, I.V., Yamasaki, M., Doebley, J.F., McMullen, M.D., Gaut, B.S., Nielsen, D.M., Holland, J.B., Kresovich, S. and Buckler, E.S. (2006). A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nat. Genet., 38: 203-208.
Zhang, P., Foerster, H., Tissier, C., Mueller, L., Paley, S., Karp, P., Rhee, S.Y. (2005). MetaCyc and AraCyc. Metabolic Pathway Databases for Plant Research. Plant Physiology 138: 27-37.
California Walnut Board 40 Walnut Research Reports 2009
Table 1. Locus-wide genetic variability in walnut germplasm
Locus #Genotype H(o) H(e) P
WGA001 450 0.62222 0.80766 0.00000
WGA202 437 0.61327 0.83625 0.00000 WGA384 427 0.45902 0.58113 0.00000 WGA331 445 0.58652 0.65681 0.01330 WGA332 451 0.54767 0.64695 0.00011 WGA321 451 0.60310 0.72423 0.00000 WGA009 424 0.65566 0.77891 0.00000 WGA118 449 0.64588 0.79499 0.00000 WGA225 432 0.38889 0.49816 0.00000 WGA004 447 0.58166 0.68289 0.00000 WGA069 451 0.55876 0.81452 0.00000 WGA089 450 0.47778 0.66620 0.00000 WGA349 414 0.28986 0.82987 0.00000 WGA338 450 0.52000 0.55639 0.00975 WGA178 451 0.67184 0.74688 0.00657 WGA318 399 0.34837 0.82008 0.00000 WGA106 450 0.33556 0.42703 0.00011 WGA237 446 0.33632 0.59402 0.00000 WGA071 450 0.34222 0.38149 0.00010 WGA242 450 0.49556 0.65027 0.00000 WGA223 425 0.53412 0.82727 0.00000
California Walnut Board 41 Walnut Research Reports 2009
Table 2. Group specific fixation indices (inbreeding coefficient)
Group FIS P
China 0.08503 0.000000 SE Asia 0.14087 0.003910 Breeders’ gene pool 0.12687 0.000000 Bred 0.05423 0.076246 W Asia 0.09642 0.001955 S Asia 0.10898 0.000000
Table 3. Orthogonal partitioning of genetic variation within and among groups
Source of Sum of Variance Percentage Fixation variation d.f. squares components of variation index
Among groups 5 523.378 0.70793 13.37** 0.134** Within groups 898 4118.848 4.58669 86.63**
Total 903 4642.226 5.29462
California Walnut Board 42 Walnut Research Reports 2009
Table 4: Functional mapping of the walnut genome (Objective 3), walnut tissues gathered April to Nov 2008 for cDNA library construction and EST analysis Sample No
Tissue Source
Genotype Developmental Stage
Source Harvest Date
Code cDNA Library
RNA-seq done
1 Vegetative Bud
Chandler Vegetative Tree 4/1/08 VB Yes Done
2 Leaf - Young
Chandler Vegetative Tree 4/15/08 LY Yes Done
3 Root Chandler Vegetative Pot 8/27/08 RT Yes Done 4 Callus
Interior Chandler Vegetative In
Vitro 10/14/08 CI Yes Waiting
5 Callus Exterior
Chandler Vegetative In Vitro
10/14/08 CE Yes Done
6 Pistillate Flower
Chandler Vegetative Tree 4/17/08 FL Yes Done
7 Catkins Chandler Immature Tree 4/1/08 CK Yes Done 8 Somatic
Embryo Chandler Immature In
Vitro 10/14/08 SE Yes Done
9 Leaf- mature
Chandler Vegetative Tree 6/5/08 LM Yes Waiting
10 Leaves Chandler Vegetative Tree 4/17/08 LE Yes Waiting 11 Fruit
immature Mixed Immature Tree 6/5/08 IF Yes Waiting
12 Hull immature
Chandler Immature Tree 6/5/08 HL Yes Waiting
13 Packing Tissue
Chandler Immature Tree 6/5/08 PT Yes Waiting
14 Hull Peel Chandler Mature Tree 9/4/08 HP Yes Waiting 15 Hull
Cortex Chandler Mature Tree 9/4/08 HC Yes Waiting
16 Packing Tissue
Chandler Mature Tree 9/4/08 PK Yes Waiting
17 Pellicle Chandler Mature Tree 9/4/08 PL No No 18 Embryo Mixed Mature Tree 9/4/08 EM Yes Done 19 Leaf – late Chandler Senescent Tree 10/15/08 LS No No 20 Hull –
dehiscing Chandler Senescent Tree 10/15/08 HU Yes Waiting
21 Transition wood
J.nigra Transition Zone
Tree 04/02/08 KW Yes Waiting
California Walnut Board 43 Walnut Research Reports 2009
Figure 1. Example of a single fingerprint for a walnut BAC clone. To date 65,280 BACs have been fingerprinted with 92% being of high quality suitable for contig assembly.
California Walnut Board 44 Walnut Research Reports 2009
Figure 2. Example of an assembled contig, containing 557 BAC clones, ca. 4.7 Mb in length.
California Walnut Board 45 Walnut Research Reports 2009
0361 01 CHINA 0370 04 CHINA 0362 10 CHINA 0366 01 CHINA 0370 10 CHINA 0466 Xinjiang7 CHINA 0370 01 CHINA 0370 07 CHINA 0370 02 CHINA 0370 03 CHINA 0360 08 CHINA 0371 01 CHINA 0372 01 CHINA0348 01 CHINA 0348 04 CHINA 0349 01 CHINA
0350 04 CHINA0362 01 CHINA
0313 01 USSR
0377 03 CHINA0315 03 USSR 0315 04 USSR
0361 06 CHINA0361 09 CHINA 0361 03 CHINA
0373 03 CHINA 0373 05 CHINA
0375 10 CHINA
0383 02 CHINA0383 05 CHINA 0377 02 CHINA
0379 03 CHINA0379 06 CHINA
0379 02 CHINA
0379 04 CHINA0148 USSR
0334 04 CHINA
0162 unk
0338 02 CHINA85 8
Sexton
0351 01 CHINA
0381 07 CHINA
0382 02 CHINA
0373 01 CHINA
0381 01 CHINA
0350 05 CHINA
0350 06 CHINA 0364 02 CHINA
0381 05 CHINA
0363 03 CHINA
0345 01 CHINA
0347 03 CHINA
0361 07 CHINA
0347 01 CHINA
0347
04 CHINA01
22 04 P
AKISTAN
0347
02 CHINA
0347
07 CHIN
A
0355
03 CHIN
A
0377
12 C
HINA
0377
05 C
HINA
0332
01 C
HINA
0342
04 C
HINA
0375
07 C
HINA
0370
06 C
HINA
0375
02 C
HINA
0370
09 C
HINA
0375
03 C
HINA
0350
02 C
HINA
0376
05 C
HINA
0372
02 C
HINA
0381
04 C
HINA
0377
11 C
HINA
0373
04 C
HINA
0373
06
CHIN
A
0362
08
CHIN
A
0375
08
CHIN
A 0375
01
CHIN
A
0384
06
CHIN
A
0340
01
CHIN
A
0340
05
CHIN
A
0340
04
CHIN
A
0340
06
CHIN
A
0340
02
CHIN
A
0360
01
CHIN
A
0363
04
CHIN
A
0384
08
CHIN
A
0355
09
CHIN
A 0384
02
CHIN
A
0384
19
CHIN
A
0384
03
CHIN
A
0384
04
CHIN
A
0384
11
CHIN
A
0338
04
CHIN
A
0359
02
CHIN
A
0338
07
CHIN
A
0342
07
CHIN
A
0384
14
CHIN
A
0339
04
CHIN
A
0384
01
CHI
NA03
40 0
3 CH
INA
0342
02
CHI
NA
0378
05
CH
INA
0378
07
CH
INA
0378
08
CH
INA
0378
01
CH
INA
0378
04
CH
INA
0339
02
CH
INA
0343
02
CH
INA
0379
01
CH
INA
0378
02
CH
INA
0378
03
CH
INA 02
61 0
1 PA
KIS
TAN
0262
02
PAK
ISTA
N
0353
02
CH
INA
0353
11
CH
INA
0353
10
CH
INA
0354
05
CH
INA
0353
03
CH
INA
035 3
08
CH
INA
0354
03
CH
I NA
0354
06
CH
INA
0354
02
CH
INA
0 354
09
CH
INA
0353
09
CH
INA
0352
01
CH
INA
0355
01
CH
INA
0354
07
CH
INA
0353
01
CH
INA
0355
05
CH
INA
0355
06
CH
INA
0246
01
PAK
ISTA
N
0246
03
PAK
ISTA
N02
46 0
4 PA
KIS
TAN
0356
01
CH
INA
0356
03
CH
INA
0356
02
CHI
NA
0356
04
CHIN
A02
62 0
3 PA
KIS
TAN
0423
CH
INA
0464
Sha
nXi4
CHI
NA
0334
01
CHIN
A
0334
02
CHIN
A
0468
Yun
an1
CHIN
A
0469
Yna
n2 C
HINA
0338
06
CHIN
A
Man
regi
an01
01 D
18
6 CH
INA
0139
USS
R
0480
Cho
nwon
14 K
orea
0126
unk
0146
KO
REA
0151
KO
REA
0478
Yon
gdon
g6 K
ORE
A
0479
Yon
gdon
g24
KORE
A
0481
Yon
gdon
g lo
cal K
orea
0123
USS
R
0137
unk
0314
02
USSR
0141
unk
0153
unk02
57 0
3 PA
KIST
AN02
58 0
1 PA
KIST
AN
0256
01 P
AKIS
TAN
0260
05 P
AKIS
TAN
0240
01 P
AKIST
AN
0262
01 P
AKISTA
N
0383
06 C
HINA
0239
02 P
AKISTA
N
0259
02 P
AKISTA
N
0250
03 P
AKISTA
N
0250
05 P
AKISTA
N
0244
02 P
AKISTA
N
0253
04 P
AKISTA
N
0274
01 P
AKISTA
N
0270
02 PAKISTAN
0270
03 PAKISTAN
0133
USSR
0270
06 PAKISTAN
0086
01 U
SSR
0086
02 USSR
0315
01B USSR 02
72 05
PAKISTAN
0272
07 PAKISTAN
0085 01 NEPAL
0267 03 PAKISTAN
0267 06 PAKISTAN
0267 07 PAKISTAN
0267 02 PAKISTAN
0267 01 PAKISTAN
0267 04 PAKISTAN
0264 01 PAKISTAN
0267 05 PAKISTAN
0267 08 PAKISTAN
0269 04 PAKISTAN
0272 06 PAKISTAN
0272 03 PAKISTAN
0272 09 PAKISTAN0263 04 PAKISTAN
0263 06 PAKISTAN
0269 07 PAKISTAN
0269 08 PAKISTAN
0269 03 PAKISTAN
0269 06 PAKISTAN
0239 03 PAKISTAN
0487 Seri3 INDIA
0490 Seri2 INDIA
Earliest
0269 02 PAKISTAN
0258 02 PAKISTAN
0258 03 PAKISTAN
0275 01 PAKISTAN
0275 02 PAKISTAN
0275 05 PAKISTAN
0275 06 PAKISTAN
0275 03 PAKISTAN
0275 04 PAKISTAN
0253 01 PAKISTAN
0253 02 PAKISTAN
0255 02 PAKISTAN0255 03 PAKISTAN
0269 01 PAKISTAN
0269 05 PAKISTAN
0274 03 PAKISTAN
0274 05 PAKISTAN
0264 05 PAKISTAN
0264 03 PAKISTAN0272 08 PAKISTAN
0489 Tandari5sdlg2 INDIA
0122 23 PAKISTAN0122 26 PAKISTAN
0122 13 PAKISTAN0122 17 PAKISTAN
0122 19 PAKISTAN0266 03 PAKISTAN
0270 04 PAKISTAN0266 01 PAKISTAN
0266 04 PAKISTAN0266 02 PAKISTAN0266 05 PAKISTAN0250 04 PAKISTAN
0250 06 PAKISTAN0140 USSR
0251 01 PAKISTAN
0251 03 PAKISTAN0226 0 13 1048 AFGHANISTAN
0334 03 CHINA
0336 03 CHINA0085 02 NEPAL
0254 03 PAKISTAN0337 01 CHINA
0270 01 PAKISTAN0270 05 PAKISTAN
0088 01 USSR
0263 03 PAKISTAN
0246 02 PAKISTAN
0250 01 PAKISTAN
0244 01 PAKISTAN
0316 01 USSR
0252 01 PAKISTAN0247 04 PAKISTAN
0260 04 PAKISTAN
0247 01 PAKISTAN
0247 03 PAKISTAN
0247 02 PAKISTAN
0247 05 PAKISTAN
0268 02 PAKISTAN
0268 03 PAKISTAN
0259 01 PAKISTAN
0259 04 PAKISTAN
0257 02 PAKISTAN
0259 03 PAKISTAN
0257 01 PAKISTAN
0268 01 PAKISTAN
0265 01 PAKISTAN
0265 02 PAKISTAN
0265 03 PAKISTAN
0485 NagaBagh INDIA
0243 05 PAKISTAN
0255 01 PAKISTAN
0241 01 PAKISTAN
0243 01 PAKISTAN
0243 02 PAKISTAN
0243 04 PAKISTAN
0243 03 PAKISTAN
0243 07 PAKISTAN0243 06 PAKISTAN
0243 08 PAKISTAN
0191 03 USSR
0191 08 USSR
0191 01 USSR
0191 02 USSR0191
04 USSR
0191 06
USSR
0191
07 USSR
0191
05 USSR
0191
09 U
SSR
0224
0 38
1207
AFGHANISTAN
0192
05 U
SSR01
92 08
USSR
0192
03 U
SSR0192
01 U
SSR
0192
04 U
SSR
0192
06 U
SSR
0192
02 U
SSR
0192
07 U
SSR
0482
Tan
dari6
sdlg1
INDIA
0488
Tan
dari5
sdlg1
INDIA
0483
Tan
dari6
sdlg
4 INDIA
0274
02 P
AKISTA
N02
74 04
PAKI
STAN
0102
Indi
a IND
IA02
63 0
2 PAK
ISTA
N
0486
Lan
gTha
cha
INDI
A
0272
01
PAKI
STAN
0272
11
PAKI
STAN
0272
04
PAKI
STAN
0272
10
PAKI
STAN
0475
Hen
zal P
AKIS
TAN
0315
02
USSR
0189
01
USSR
0189
04
USSR
0239
01
PAKI
STAN
0333
01
CHIN
A02
73 0
1 PA
KIST
AN0273
03
PAKI
STAN
0471
Pak
ista
n231
PAK
ISTA
N
0472
Tile
PAK
ISTA
N
0416
Cha
seD9
USA
Chas
eD9
Adam
s10
Shar
key
Bada
joz
New
One
Casc
ade
Fern
ette
61 2
5Lara
56 2
24H
owar
d76
80
Cha
ndle
rFo
rde
95 2
6 22
Chi
coGill
ette
0254
02
PAK
ISTA
N03
38 0
8 C
HIN
AA
mig
o04
61 1
8256
E6 U
SATr
iplo
id03
16 0
2 U
SSR
0407
Geo
agiu
44 4
360
RO
MA
NIA
0147
unk
0491
Che
inov
o sd
BU
LGA
RIA
Che
inov
o sd
0410
Bul
garia
3 B
ULG
AR
IA00
98 S
inen
sis7
JAP
ANSi
n ens
is5
0143
USS
R0 4
1 3 N
n88G
ody n
PO
LAN
D01
45 U
SSR
0154
unk
0165
0 7
0 13
07 A
FGH
AN
ISTA
N0
20 1
072
0134
67
103
USA
67 1
3Tw
iste
rTu
lare
SirB
onSe
rr
0099
PI1
5956
8 A
FGH
AN
ISTA
N59
124
Sunl
and
0188
01
USS
R
0190
01
USS
R
0492
Kru
snsk
y sd
BU
LGAR
IA
0138
HU
NGAR
Y
0158
HU
NGAR
Y01
59 u
nk
0163
HUN
GAR
Y
0160
HUN
GAR
Y
0171
Idah
o US
A01
29 U
SSR
0150
HUN
GAR
Y
0152
USS
R
0125
USS
R01
32 u
nk
0144
USS
R0155
USS
R
0161
USS
R
0164
unk
Also
szen
t
0411
RVi
ii6 P
OLAN
D
0412
RXi
ii4 P
OLAN
D021
4 01
MEX
ICO
0214
02
MEX
ICO
0214
04
MEX
ICO
0214
03
MEX
ICO
0190
02
USSR
0190
03
USSR
0470
Jpur
pure
a GER
MANY
Red
Zing
er
Wee
per
Loze
ronn
e77 12
Ronde
de M
ont
0189
02 U
SSR
0189
03 U
SSR
0111
74 26
6 USA
0315
01C U
SSR
Gustin
e
March
ettiLo
mpocAsh
ley
Payne
49 46
Vina68 1
04
0170
Howe USA
0106
74 25
6 USA
0178 7
4 256
USA74 259
59 165
0108 74 253 USA
0109 74 245 USA
0228 53 128 unk
EarlyEhrhardt
Placentia
0180 Hartley CHINA
Klopping
PedroCisco
ConwayMayette
CarmelloPioneer
0229 61 12 unkPoe
0414 Verdot FRANCEConcha
0415 Meylannaise FRANCEMeylanXXXMayette
MoyerFranquetteFernor
0477 Elit YUGOSLAVIA
0173 ScharschFranquette USA
C0106 Franquette FRANCE0166 73 16 unk
0177 WFranquette FRANCE
GravesFranquette
0.02
Figure 3. An unrooted tree depicting the genetic structure and differentiation within the walnut (Juglans regia) germplasm collection based on 21 microsatellite loci.
California Walnut Board 46 Walnut Research Reports 2009
Figure 4. 3D projection of walnut germplasm along the first three principle axes.
California Walnut Board 47 Walnut Research Reports 2009