large-scale gwas in sorghum reveals common genetic control of … · 50 introduction cereal...
TRANSCRIPT
-
Large-scale GWAS in sorghum reveals common genetic control of grain 1
size among cereals 2
Yongfu Tao1*, Xianrong Zhao1, Xuemin Wang1, Adrian Hathorn1, Colleen Hunt1,2, Alan 3
W.Cruickshank1,2 Erik J. van Oosterom3, Ian D. Godwin3, Emma S. Mace1,2* , David R. 4
Jordan1* 5
6
1 Queensland Alliance for Agriculture and Food Innovation (QAAFI), The University of 7
Queensland, Hermitage Research Facility, Warwick, QLD 4370, Australia 8
2 Agri-Science Queensland, Department of Agriculture and Fisheries (DAF), Hermitage 9
Research Facility, Warwick, QLD 4370, Australia. 10
3Queensland Alliance for Agriculture and Food Innovation (QAAFI), The University of 11
Queensland, Brisbane, QLD 4072, Australia. 12
4School of Agriculture and Food Sciences, University of Queensland, Brisbane, QLD, 13
Australia 14
15
*Corresponding authors: 16
Yongfu Tao: [email protected] 17
Emma Mace: [email protected] 18
David Jordan: [email protected] 19
ORCID ID: 20
Yongfu Tao: 0000-0001-9096-7407 21
Xuemin Wang: 0000-0002-2038-8829 22
Emma S. Mace: 0000-0002-5337-8168 23
David R. Jordan: 0000-0002-8128-1304 24
Ian D. Godwin: 0000-0002-4006-4426 25
Erik J van Oosterom: 0000-0003-4886-4038 26
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
mailto:[email protected]:[email protected]://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
Summary: 27
Grain size is a key yield component of cereal crops and a major quality attribute. It is 28
determined by a genotype’s genetic potential and its capacity to fill the grains. 29
30
This study aims to dissect the genetic architecture of grain size in sorghum via an 31
integrated genome wide association study (GWAS) using a diversity panel of 837 32
individuals and a BC-NAM population of 1,421 individuals. 33
34
In order to isolate genetic effects associated with grain size, rather than the genotype’s 35
capacity to fill grain, a field treatment of removing half of the panicle during 36
flowering was imposed. Extensive variation in grain size with high heritability was 37
observed in both populations across 5 field trials. Subsequent GWAS analyses 38
uncovered 92 grain size QTL, which were significantly enriched for orthologues of 39
known grain size genes in rice and maize. Significant overlap between the 92 QTL 40
and grain size QTL in rice and maize was also found, supporting common genetic 41
control of this trait among cereals. Further analysis found grain size genes with 42
opposite effect on grain number were less likely to overlap with the grain size QTL 43
from this study, indicating the treatment facilitated identification of genetic regions 44
related to the genetic potential of grain size rather than the capacity to fill the grain. 45
46
These results enhance understanding of the genetic architecture of grain size in cereal, 47
and pave the way for exploration of underlying molecular mechanisms in cereal crops 48
and manipulation of this trait in breeding practices. 49
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
https://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
Introduction 50
Cereal crops, including maize, rice, wheat, barley, and sorghum, supply more than 75% of the 51
calories consumed by humans (Sands et al., 2009). They are critical to address global food 52
security, which is under threat of population expansion and climate change. Among cereals, 53
sorghum is an important source of food, fibre, feed, and biofuel, and provides staple food for 54
over 500 million people in the semi-arid tropics of Africa and Asia. Cereal crops share a 55
close ancestry, and strong gene co-linearity among them supports comparative genomics 56
approaches for exploring the genetic bases of agronomical traits at a cross-species level 57
(Bolot et al., 2009). Grain size is a key yield component of grain yield in cereal crops, crops 58
and is a major quality attribute affecting planting, harvesting, and processing activities (Tao 59
et al., 2017). A better understanding of the genetic basis of grain size, including underlying 60
molecular mechanisms, will provide new targets for improving yield and grain quality in 61
cereal breeding. 62
Grain size in cereals is a function of both the potential maximum grain size and the capacity 63
of the plant to fill the grain. The weight of individual grains is determined by the rate and 64
duration of grain filling. In sorghum, the grain filling rate is highly correlated with the ovary 65
volume at anthesis, which in turn is associated with the size of the meristematic dome during 66
early floret development (Yang et al., 2009). While the potential maximum grain size is 67
largely genetically pre-determined, the capacity to fill the grain is strongly affected by 68
assimilate availability (Gambín & Borrás, 2010). Grains are the key sink for carbon demand 69
post-anthesis, and the amount of available assimilate per grain is determined by grain number 70
and assimilate supply, which are associated with a range of genetic and environmental factors 71
(Gambín & Borrás, 2010; Wardlaw, 1990). Negative correlations between grain number and 72
grain size are commonly observed in cereal crops (Acreche & Slafer, 2006; Peltonen-Sainio 73
et al., 2007; Sadras, 2007; Tao et al., 2018). In sorghum, grain number reduction at early 74
stages of flowering has been observed to result in a larger grain weight (Sharma et al., 2002). 75
This indicates that demand for carbohydrate to fill grains exceeds the available supply. 76
In cereals, the genetic architecture of grain size is complex and involves numerous genes. In 77
rice and maize, around 110 genes affecting grain size have been cloned (Tao et al., 2017). 78
Given the conservation of gene function among closely related cereal species, this presents 79
opportunities for comparative genomics studies to investigate the genetics of grain size in 80
other cereal crops using knowledge gained from gene cloning studies in maize and rice. The 81
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
https://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
sorghum QTL atlas of Mace et al. (2018) details around 100 grain weight QTL in sorghum 82
that have been identified through linkage analysis in 18 studies (Paterson et al., 1995; 83
Tuinstra et al., 1997; Rami et al., 1998; Brown et al., 2006; Feltus et al., 2006; Murray et al., 84
2008; Srinivas et al., 2009; Phuong et al., 2013; Rajkumar et al., 2013; Reddy et al., 2013; 85
Han et al., 2015; Mocoeur et al., 2015; Shehzad & Okuno, 2015; Gelli et al., 2016; Spagnolli 86
et al., 2016; Bai et al., 2017; Boyles et al., 2017;Tao et al., 2018). A further 12 markers 87
associated with grain weight have been identified in three GWAS studies (Upadhyaya et al., 88
2012; Zhang et al., 2015; Boyles et al., 2016). Given the number of genomic regions 89
identified using standard genetic linkage mapping approaches, the limited number of marker-90
trait associations identified in the GWAS studies is likely due to the small population sizes 91
employed in these studies (n = 242 - 390). All of the previous genetic studies on grain size in 92
sorghum have focused on the dissection of the genetic basis of grain size under natural 93
conditions, which also include environmental impacts and Genotype × Environment (G×E) 94
interactions. However, minimisation of variation in grain-filling capacity could limit the 95
environment effects on variation in grain size, and thereby facilitate identification of genetic 96
factors underlying variation of this trait. 97
This study aims to 1) investigate the genetic architecture of grain size in sorghum by 98
conducting an integrated GWAS in two large populations, a diversity panel of 837 99
individuals and a backcross nested association mapping population (BC-NAM) population of 100
1421 individuals, and 2) compare genetic loci affecting grain size identified in this study to 101
those reported in rice and maize. To minimize variation in grain size caused by assimilate 102
variability, a field treatment of removing half of the panicle during flowering time was 103
imposed in this study in order to maximise assimilate availability for the remaining seeds 104
during grain filling. 105
106
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
https://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
Materials and methods 107
Plant material: 108
Two populations, together comprising over 2000 individuals, were used in this study: a 109
diversity panel (DP, n=837) (Table S1) and a BC-NAM consisting of 30 interrelated families 110
(n=1421) (Figure S1A; Table S2). The diversity panel has around 225 genotypes in common 111
with the US sorghum association panel. It consists predominantly of lines developed by the 112
sorghum conversion program conducted by Texas Agricultural Experiment Station, which 113
took diverse sorghum lines from the world collection and converted tall, late, or photoperiod 114
sensitive sorghums from the tropics into short, early, photoperiod insensitive types that could 115
be used by breeders in temperate regions (Rosenow et al., 1997). The program involved 116
repeated backcrossing to the exotic line combined with selection for height and maturity in 117
temperate environments. The resulting material has been reported to contain >4% genome 118
introgression from the temperate donor with the remainder from the exotic parent and has a 119
much narrower range of height and maturity than the original exotic lines (Thurber et al., 120
2013). 121
The BC-NAM population was previously developed by the Department of Agriculture and 122
Fisheries (DAF), Queensland, Australia and described in (Jordan et al., 2011). In summary, 123
each BC-NAM family was produced by crossing a single elite parent (R931945-2-2 or 124
R986087-2-4-1) with a diverse, exotic line (the non-recurrent parent) and backcrossing the 125
resulting F1 to the elite parent to produce a large BC1F1 population. Variable numbers of 126
plants were selected from each BC1F1 family and using single-seed descent >5 generations of 127
self-pollination were generated to produce BC1F6 and beyond. During generation advance, 128
selection was imposed for height and maturity of the recurrent parental type. 129
The 30 BC-NAM families used in the current study are listed in Table S2. In total 27 unique 130
non-recurrent parental lines and 2 elite recurrent parental lines were used for population 131
development. These included 10 diverse lines from advanced breeding programs and 17 132
landraces, 7 of which were converted to temperate adaptation through the Sorghum 133
Conversion Program (Stephens et al., 1967). Fourteen of the 27 non-recurrent parental lines 134
were also included in the diversity panel (Figure S1B). 135
Field Trials and phenotypic evaluation 136
A total of 5 field trials were planted at Hermitage Research Facility (HER), Warwick, 137
Queensland, Australia (28°12ʹS, 152°5ʹE, 470 m above sea level) and Gatton Research 138
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
https://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
Facility (GAT), Gatton, Queensland, Australia (27°33ʹS, 152°20ʹE, 94 m above sea level) 139
during the Australian summer cropping seasons of 2014/15 and 2015/16. Two trials of the 140
diversity panel were grown in the 2015/16 season at Hermitage (DPHER16) and Gatton 141
(DPGAT16). The BC-NAM were planted in three trials at Hermitage in 2014/15 142
(NAMHER15), and Gatton in 2014/15 (NAMGAT15) and 2015/16 (NAMGAT16). All trials 143
were planted between November and February, using a row column design with partial 144
replication. Each plot consisted of two 6 m rows with a row spacing of 0.75 m. Differences in 145
plant height within a plot was minimal. Different numbers of genotypes were grown in each 146
trial due to seed availability (Table S3). Standard agronomic practices and pest-control 147
practices were applied. 148
A treatment of removing half of each of two panicles in each plot was imposed when each 149
panicle commenced flowering, before significant grain development had occurred (Figure 150
1A). Once physiological maturity was reached, the remaining half panicle (hereafter, referred 151
as half head) was hand harvested and threshed using a mechanical threshing machine. In 152
DPGAT16, a full head panicle in each plot was also harvested and threshed for comparison 153
with the half head panicles from the same plot. An aspirator was used to remove any debris 154
from each sample before measurements were taken. Although sorghum grain is typically 155
tending toward spherical, considerable phenotypic variation in length, width, and thinkness 156
does exist. Therefore, grain size parameters, including thousand kernel weight (TKW), and 157
the length, width, thickness, and volume of grains were measured using Seedcount SC5000 158
(Next Instruments, Condell Park, NSW, Australia) and a digital balance. 159
Statistical analysis 160
Due to the large number of genotypes included in this study, partial replication was used in 161
all trial designs (Cullis et al., 2006), with 30 % of the genotypes replicated two or more times 162
and the remaining 70 % represented by single plots. The total number of plots in each trial 163
ranged from 880 to 1521, with the total number of genotypes planted in each trial ranging 164
from 658 to 1164 (Table S3). A customized design was used to minimise spatial error effects 165
within each trial. The concurrence of genotypes and populations across the two seasons 166
allowed the DP and BC-NAM trials to be analysed as two multi-environment trials (METs) 167
for each of the five measured grain size parameters, comprising of two trials for the DP and 168
three for the BC-NAM. Each MET was analysed by fitting a linear mixed model using the 169
package ASReml (Butler et al., 2009) and the R statistical software. Each model consisted of 170
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
https://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
a fixed effect for the targeted trait at each trial, random effects for genotype within trial, and 171
spatial error for each individual trial (Smith et al., 2001). The G×E interaction was analysed 172
by fitting a third-order factor analytic structure to the trial × genotype interaction, in this case 173
the structure also modelled the 3-way interaction between site, trait, and genotype. The 174
analysis resulted in genetic variances for each trial and loading values representing factor 175
analytic loadings. 176
Generalised repeatability estimates were calculated for grain size parameters using the 177
method proposed by Cullis et al. (2006). High correlations were observed among grain size 178
parameters, and hence a principal components analysis was conducted to further investigate 179
the interrelationships among TKW and grain length, width, and thickness (Figure S3 and S4). 180
Because it is possible that a constructed trait such as a principal component may explain the 181
data more effectively, the main principal components were used as additional traits in the 182
association analysis. 183
Genotyping and imputation 184
The 2 populations were genotyped using medium to high density genome-wide SNPs 185
provided by Diversity Arrays Technology Pty Ltd (www.diversityarrays.com). DNA was 186
extracted from bulked young leaves of five plants from a plot of each genotype in the 2 187
populations using a previously described CTAB method (Doyle, 1987). The DNA samples 188
were then genotyped following an integrated DArT and genotyping-by-sequencing (GBS) 189
methodology, which involves complexity reduction of the genomic DNA to remove repetitive 190
sequences using methylation sensitive restriction enzymes prior to sequencing on Next 191
Generation sequencing platforms. The sequence data generated were then aligned to version 192
v3.1.1 of the sorghum reference genome sequence (McCormick et al., 2018) to identify SNPs 193
(Single Nucleotide Polymorphisms). 194
In total, 111,089 SNPs were identified in the diversity panel and 31,478 SNPs in the BC-195
NAM population. The overall proportion of missing data reported in the raw genotypic data 196
sets was approximately 10% and 5% for the diversity panel and BC-NAM respectively. 197
Individual SNP markers with >50% missing data were removed from further analysis and the 198
remaining missing values were phased and imputed using Beagle v4.1 (Browning & 199
Browning, 2016). An average imputation accuracy of 96% was achieved across both 200
populations. 201
Population statistics, GWAS analysis and QTL identification 202
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
http://www.diversityarrays.com/https://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
Pairwise linkage disequilibrium (LD) (r2) was calculated using PopLDdecay (Zhang et al., 203
2018) for the diversity (Figure S4A) and the BC-NAM (Figure S1C). The population 204
structure in the diversity panel was analysed used LEA in R (François, 2016). The structure 205
analysis identified five groups that corresponded to 4 racial groups (kafir, caudatum, guinea, 206
durras of Asian origin, durras of East African origin) (Figure S4B) based on previous racial 207
information and geographical origin for a subset of lines. Each individual line with ≥95 % of 208
genetic identity from one of the five identified groups was designated as a representative of 209
the corresponding racial group (Figure S4C, Table S4). The remaining lines were considered 210
as admixtures of multiple groups. 211
For the GWAS analysis, imputed SNP data sets for both populations were filtered for minor 212
allele frequency (MAF) >0.005. The final number of SNPs used for GWAS was 56,412 for 213
the diversity panel (Figure S5) and 26,926 for the BC-NAM population (Figure S6). For the 214
diversity panel, a principal component analysis (PCA) was conducted using a pruned SNP 215
data set to control for population structure. Marker pruning was conducted using PLINK 216
(Purcell et al., 2007) with a sliding window of 100 SNPs , a step size of 50 SNPs and r2 of 217
0.5. For the BC-NAM population, pedigree was used to control population structure. GWAS 218
was performed using FarmCPU (Liu et al., 2016) for both populations for the five measured 219
grain size parameters and derived PCs. Thresholds for defining significant and suggestive 220
associations were set using the Bonferroni correction based on the number of independent 221
tests, calculated using the GEC software package (Li et al., 2012). Marker-trait associations 222
were then identified using a two-step process. Firstly, SNPs were identified that were 223
significantly associated with grain size parameters and derived PCs in both populations 224
separately (p
-
population structure accounted for by PCs for the diversity panel and by pedigrees in the BC-236
NAM population. 237
A candidate region based approach was used to further investigate associations between 238
population specific QTL and grain size parameters in the alternative population. SNPs within 239
a 0.5 cM window of a population specific QTL were extracted from the alternative 240
population. To examine associations between SNPs in the candidate regions and the grain 241
size parameters, GLM in TASSEL 5.0 was run, using either PCA in the diversity panel or 242
pedigree in the BC-NAM population as a covariate to control for population structure 243
(Bradbury et al., 2007). The Bonferroni correction (0.05/number of markers) was used to 244
control for false positives. 245
Using the BC-NAM population, allele effects of the exotic parent were estimated compared 246
to the recurrent parent. At each QTL, the numbers of exotic alleles with larger and smaller 247
values than the recurrent parent were determined, and a test conducted to assess if the 248
numbers of positive and negative alleles relative to the recurrent parent varied. Families with 249
20 cM were excluded during this step. 255
Candidate gene analysis 256
Candidate genes for grain size in sorghum were identified using the methods described by 257
Tao et al. (2017). Firstly, genes functionally determined to control grain size in rice, maize, 258
and sorghum were collated through a literature search (Table S6). Subsequently, 259
corresponding sorghum orthologous of grain size genes in rice and maize were identified 260
using a combination of syntenic and bidirectional best hit (BBH) approaches. Syntenic gene 261
sets among rice, maize and sorghum were downloaded from PGDD 262
(http://chibba.agtec.uga.edu/duplication/) to identify sytenic orthologues of known grain size 263
genes from rice and maize. Local blast was performed to identify best BLAST hits of pairs of 264
genes from two genomes, using BLASTP. 265
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
http://chibba.agtec.uga.edu/duplication/https://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
Results 266
Minimizing environmental impact on grain size through the removal of half of the 267
panicle during flowering time 268
In the DPGAT16 trial, a significant increase in average grain size of 8.54% was observed in 269
half heads compared to full heads (Figure 1B). The magnitude of this increase varied 270
significantly among genotypes, indicating the presence of genotypic variation in the degree of 271
source limitation for grain filling in the full head (Figure 1C and 1D). Nonetheless, TKW of 272
half-head and full-head treatments were highly correlated (r2~0.70) (Figure S7). Since the 273
greater average TKW of the half head treatment compared with the full head treatment 274
indicated that this treatment was effective in removing some of the variation in grain size due 275
to restrictions in grain-filling capacity, data collected from plants with half head phenotypes 276
were used in this study. 277
Phenotypic variation of grain size 278
Substantial variation in TKW and the volume, length, width, and thickness of grains was 279
observed in half head samples across the five trials and two populations (Tables 1 and 2, 280
Figure 2). The diversity panel exhibited much broader variation in these grain size parameters 281
than the BC-NAM population, reflecting its larger genetic diversity. For example, in the trials 282
conducted at Gatton in 2016, the range in TKW was 6.5-58.4g in the diversity panel, 283
compared to 16.9-44.5g in the BC-NAM population. All parameters had low levels of G×E 284
interaction and were highly correlated across trials in both the diversity panel (r>0.82, 285
average r=0.85) and the BC-NAM population (r>0.79, average r=0.89) (Tables 1 and 2). 286
Hence, the combined cross-trial BLUPs were used for subsequent analyses of these 287
parameters. The high cross-trial correlations of the individual grain size parameters were 288
consistent with their medium to high heritability (Table 1 and 2). 289
All five grain size parameters displayed near normal distributions in both populations (Figure 290
2), which in combination with the medium to high heritability and low G×E observed, 291
indicated that variation in the parameters was controlled by multiple genetic loci. High 292
correlations among the five grain size parameters were also observed in both populations, 293
implying a high degree of shared genetic control among them (Figure 2). The principal 294
components analysis determined that 3 PCs accounted for the majority (>85%) of the 295
variation in the grain size parameters observed across multiple trials in both populations 296
(Figure S2 and S3). The direction of the correlations between traits was positive in all 297
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
https://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
contrasts with the exception of grain thickness, which displayed negative correlations with 298
the other four grain size parameters in the BC-NAM population (r=-0.39), as opposed to 299
positive correlations (r=+0.44) in the diversity panel (Figure 2). 300
To investigate possible correlations between racial groups and grain size parameters in 301
sorghum, structure analysis of the diversity panel was conducted, which revealed 5 groups 302
that corresponded to different racial groups of sorghum (Figure S4B, Table S4). All grain size 303
parameters showed significant differences among these racial groups (ANOVA, p-value 304
80%) of these 54 QTL were 312
significantly associated with multiple grain size parameters (Table S9). However, phenotypic 313
variation explained by individual QTL was generally small and the majority of QTL 314
explained < 5% of phenotypic variation each. In the BC-NAM population, 131 significant 315
marker-trait associations were identified with 106 SNPs associated with between 1 - 4 grain 316
size parameters (Figure 4, Figure S9, and Table S10). Because the extent of LD was greater 317
in the BC-NAM than in the diversity panel (Figure S1C), a 2 cM window was used to cluster 318
these associated SNPs into 66 QTL regions. Similar to observations for the diversity panel, 319
the vast majority (~85%) of the QTL were significantly associated with multiple grain size 320
parameters and phenotypic variation explained by individual QTL was generally small, with 321
the majority of QTL explaining < 5% of the phenotypic variation of a given grain size 322
parameter (Table S11). Within the BC-NAM, it was possible to observe the distribution of 323
allelic effects of the exotic parents compared to the recurrent parent (Figure 5). In general, 324
alleles were observed that were both larger and smaller than the elite parent, which was 325
supportive of the presence of multiple alleles at the majority of QTL. 326
A comparison of the QTL identified initially in both populations separately revealed 28 QTL 327
in common, with 26 QTL being unique to the diversity panel and 38 QTL unique to the BC-328
NAM population (Figure 6), resulting in 92 QTL identified in total across populations. These 329
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
https://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
92 QTL were distributed across all 10 sorghum chromosomes (Table 3). Further examination 330
revealed that the majority of the population specific QTL (22 out of 26 QTL unique to the 331
diversity panel and 34 out of 38 QTL specific to BC-NAM population) were significantly 332
associated with grain size parameters in the alternative population using the candidate region 333
based association analysis (Table S12). 334
The genetic basis of grain size has been the focus of 18 previous studies in sorghum, which 335
used 21 bi-parental populations and reported 85 grain weight QTL (Table S5). Nearly three 336
quarters (62) of these previously reported QTL co-located with QTL identified in this study. 337
A significant overlap between the QTL identified in this study and grain size SNPs identified 338
in rice (Huang et al., 2012) was observed, with 60% of the rice grain size SNPs identified 339
within a 1cM window of the sorghum QTL identified in this study (p-value
-
reported to have an opposite effect on grain number and grain size were underrepresented in 362
the overlap with the QTL from this study (Table S15). 363
One of the candidate genes, SbGS3, the orthologue of the rice grain size gene, GS3 (Mao et 364
al., 2010), co-located with qGS1.9, a QTL identified in both the diversity and BC-NAM 365
populations, for grain length, weight, volume, width, and thickness. In the diversity panel, 366
one SNP was identified that caused a C to A change in the fifth exon of SbGS3, turning a 367
cysteine codon (TGC) to a premature stop codon (TGA), resulting in the 198 amino acid 368
protein being truncated at the 140th amino acid. However, the most significant SNP identified 369
for qGS1.9 was 11.48kb (0.01cM) away from SbGS3, rather than the large-effect SNP 370
causing truncation at the 140th amino acid. This may have been due to the low MAF (22 out 371
of 837 individuals carried the premature stop codon allele) and the similar genetic 372
background for the individuals carrying the premature stop codon allele. Transcriptomics 373
expression data (Makita et al., 2015) showed that GS3 had comparable expression patterns in 374
sorghum and rice, with high expression levels observed in the panicle and minimal 375
expression in other tissue types covering a range of temporal and spatial variation (Figure 376
S10A). Additional transgenic experiments, using RNAi, have demonstrated that inhibiting the 377
expression of SbGS3 increased grain weight by 9.4% on average in sorghum (Trusov et al., 378
pers. comm.). Both positive and negative allelic effects from non-recurrent parental lines 379
were observed in the BC-NAM compared with the recurrent parental line, R931945-2-2, 380
indicating the existence of multiple alleles of SbGS3 in the population (Figure S10 B). 381
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
https://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
Discussion 382
Grain size is one of the most critical traits of cereal crops due to its direct contribution as a 383
yield component, its importance as a quality attribute, and its contribution to fitness 384
stemming from its impact on reproductive rates, emergence and establishment (Tao et al., 385
2017; Westoby et al., 1992). This study represents one of the largest and most comprehensive 386
investigations of the genetic basis underling this trait in any cereal via a joint GWAS analysis 387
of grain size in sorghum using a diversity panel (n=837) and a BC-NAM population 388
(n=1421). The population size is the largest of all GWAS studies on grain size in any cereal 389
crop, >5 times larger than the second largest GWAS studies in sorghum. As a result, a total 390
number of 92 grain size QTL were identified. Enrichment of grain size candidate genes 391
within these QTL was observed, and significant overlap between these 92 QTL and SNPs 392
associated with grain size in rice and maize was found, highlighting common genetic control 393
of this trait among cereals. In addition, the treatment of removing half the panicle during 394
flowering facilitated minimisation of variation of grain size due to assimilate variability, and 395
identification of genetic regions related to the genetic potential of grain size rather than the 396
capacity to fill the grain. Therefore, these results provide an enhanced understanding of the 397
genetic basis underlying grain size in sorghum and offer opportunities for breeders in 398
sorghum and other cereal species to manipulate grain size to improve crop productivity and 399
quality. 400
The relationship between grain size and grain number in sorghum is mainly driven by 401
genes acting before grain filling 402
A negative correlation between grain size and number is widely observed in cereals The 403
nature of this correlation is complex (Sadras 2007) and is partially driven by the fact that 404
potential maximum grain size and grain number are established before grain filling and by the 405
availability of assimilates to enable seeds to reach their genetic potential (Yang et al., 2009; 406
Gambín & Borrás, 2010). In this study, the half-head treatment resulted in an average 407
increase in TKW of 8.54%, with significant variation among genotypes. The average change 408
in seed weight is in line with previous estimates of sorghum’s capacity to compensate for loss 409
of seeds (Sharma et al., 2002). Seed weight in the full-head treatment explained ~70% of the 410
genetic variation in grain size in the half-head treatment, indicating that while assimilate 411
availability is a significant factor influencing grain size, most of the genetic factors driving 412
the correlation between grain size and number in sorghum are acting prior to grain filling. 413
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
https://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
Future yield gains may require that this association is better understood to determine if the 414
negative association can be broken at some or all QTL. 415
Shared genetic control of grain size dimensions 416
Sorghum grains vary in terms of length, thickness, width, volume and weight. Substantial 417
genetic variation was observed for all of these parameters in both populations. Cross 418
environment correlations of these grain size parameters within the same population were very 419
high and the heritability of the traits was also moderate to high, indicating strong genetic 420
control as previously observed (Prado et al., 2014; Tao et al., 2018). These findings, 421
combined with a normal distribution of the phenotypes, indicate that all of these parameters 422
are likely to be controlled by many genes with small to moderate effects. In addition, strong 423
correlations were observed between the different grain size parameters in the same 424
population, indicating they have a high degree of common genetic control. This was expected 425
for parameters such as volume and weight and their correlations with the various measured 426
dimensions, since the traits are linked mathematically. However, strong correlations between 427
most of these parameters in both of the two populations indicate shared genetic control. In 428
contrast to positive correlations between other grain size parameters in both populations, the 429
grain thickness trait was negatively correlated with the other 4 parameters in the BC-NAM 430
population but positively correlated with these parameters in the diversity panel. This is likely 431
due to the different genetic backgrounds of the two populations. The diversity panel has a 432
good representation of sorghum’s racial groups (Thurber et al., 2013). Although the BC-433
NAM also provides a good representation of the genetic variation of the racial groups of 434
sorghum, because of the population structure (backcross derived based on a single reference 435
parent), the genetic composition of each line is strongly biased towards the genetic 436
background of the elite recurrent parent, which is primarily of caudatum origin. 437
High-confidence genetic architecture of grain size revealed through independent multi-438
population GWAS 439
Population size is a key factor affecting the power of a GWAS study, as is the need to control 440
for type I and type II errors (Visscher et al., 2017). The two populations used in this study, 441
taken individually, are among the largest published to date in cereal crop on grain size. This 442
size, coupled with the capacity to independently verify putative QTL in the alternative 443
population, has provided us with a substantial advantage over previously published studies. 444
The diversity panel contains representatives of most of the racial groups in the sorghum gene 445
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
https://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
pool and is characterised by its fast LD decay. In contrast, the BC-NAM population consists 446
of BC1 derived lines sharing ~80% of their genomes with the elite reference parent (Jordan et 447
al., 2011), which provides an opportunity to assess exotic alleles in the genetic background of 448
an elite inbred. The BC-NAM population has lower genetic diversity, a more balanced 449
structure and greater LD and therefore potentially more power to detect QTL. The joint use of 450
these complementary populations has also provided a more comprehensive understanding of 451
the genetic architecture of grain size, and ready-to-use information on the effect of QTL 452
alleles in elite breeding material. 453
The high number (92) of QTL identified in this study and their relatively small individual 454
effects highlight the genetic complexity of grain size. Over half of the QTL identified in this 455
study (49) were co-located with QTL identified in previous sorghum studies (Table 3) with < 456
30% of the previous QTL not detected in this study (Table S5). Care must be taken with 457
interpreting this result however, since the small population sizes of the previous studies 458
limited their power to detect QTL and resulted in large confidence intervals to the extent that 459
they often encompassed a number of the QTL detected in this study. In addition, two of the 460
previous studies identified QTL from crosses between wild and domestic sorghum (Paterson 461
et al., 1995; Tao et al., 2018), which may not have been detected in our study as wild species 462
were not included in either population, plus none of the other studies attempted to minimise 463
variation in the capacity to fill grain. Hence, it is likely that some of the QTL previously 464
detected may not be segregating in our study. 465
Our study was internally consistent, with 83% of the QTL detected being associated with 466
multiple traits, in line with the correlations observed between the phenotypes. In addition, 467
91% of the QTL were supported by evidence from the two independent populations either via 468
co-located QTL or with support from the candidate region analysis, resulting in only 8 469
population-specific QTL being identified. Given that the two populations were independent 470
and differed in both genetic composition, power and LD, the high correspondence between 471
the two studies suggests that we identified the majority of the loci for grain size in cultivated 472
sorghum. 473
Significant allelic diversity exists in grain size QTL 474
The BC-NAM population provided us with the opportunity to investigate variation in the 475
effects of different QTL alleles in an elite genetic background. Our results suggest the 476
presence of multiple alleles at the majority of the QTL with effects of the non-recurrent 477
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
https://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
parental alleles that were both greater and smaller that the recurrent parent allele observed at 478
most loci (Figure 5). The distribution of the allele effects provides interesting insights into 479
selection for the grain size trait in an elite breeding program. There was weak tendency for 480
the elite parental alleles to contribute to larger grain size compared to the exotic parent. This 481
is similar to an earlier study of flowering time in the same population where positive and 482
negative allele effects relative to the recurrent parental allele were similarly distributed (Mace 483
et al., 2013). The lack of bias observed in the current study suggests modern sorghum 484
breeding with its focus on increasing grain yield has not selected for grain size. The lack of 485
strong selection for grain size by modern breeders is interesting and suggests a potential 486
physiological restriction to increasing grain yield by selection for larger grain. This is 487
consistent with the observation that much of the progress in enhancing grain yield in modern 488
breeding of cereals has been achieved by selection for increased grain number rather than 489
grain size. Clearly, further research is required before breeders attempt to increase grain yield 490
by introducing alleles for larger grain size. However, given the range of allele effects, it is 491
likely some opportunities exist to simultaneously improve grain yield and grain size, eg 492
through manipulation of the duration of grain filling (Yang et al., 2010). 493
Correspondence of loci controlling grain size among cereals 494
Sorghum shares close ancestry with maize and rice, with orthologous genes from these 495
species commonly sharing the same function (Bolot et al., 2009; Paterson et al., 1995). A 496
significant association was found between the locations of the QTL in this study and GWAS 497
signals for grain size in rice (Huang et al., 2012) and maize (Liu et al., 2019), and 498
orthologous of cloned grain size genes from rice and maize. Both findings strongly support a 499
common genetic architecture underlying this trait across these cereals. This provides 500
opportunities for breeders to exploit information from other cereals, as genes identified from 501
large scale GWAS in one species can be targeted for allele mining or diversity creation via 502
gene editing in another species. 503
As an example, the gene SbGS3, the orthologue of the cloned gene Grain Size 3 in rice, was 504
found to control variation of grain size in sorghum. In addition to the presence of a large-505
effect SNP in the gene causing a premature stop codon, our GWAS identified a SNP that was 506
11.48 kb away from this gene as significantly associated with variation of grain size. SbGS3 507
showed a similar expression pattern as GS3 in rice with high expression in early 508
inflorescence, indicating the same role in grain development. Further evidence from 509
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
https://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
transgenic experiments has shown that inhibiting the expression of SbGS3 increased grain 510
weight by 9.4% on average in sorghum (Trusov et al., pers. comm.). It has also been reported 511
that the orthologue of GS3 in maize, ZmGS3, was involved in kernel development (Li et al., 512
2010). GS3 contains four domains: an OSR domain in the N terminus, a transmembrane 513
domain, a TNFR/NGFR family cysteine-rich domain, and a VWFC in the C terminus (Mao et 514
al., 2010). In rice, overexpression of a truncated cDNA sequence of GS3 where the VWFC 515
domain is deleted, produced shorter grains (Mao et al., 2010). The premature stop change of 516
SbGS3 caused a truncation of the VWFC domain. Further investigation is required to see 517
whether comparable effects caused by the truncation occur in sorghum. 518
519
Conclusion: 520
Grain size is a key quantitative trait of cereal crops. This study presents a comprehensive 521
genetic dissection of grain size in sorghum using two large and complementary populations, 522
resulting in the identification of a total of 92 QTL related to grain size. Significant overlap 523
was found between the QTL identified in this study and grain size GWAS signals in rice and 524
maize, and orthologues of cloned grain size genes from rice and maize, supporting a common 525
genetic architecture underlying this trait across these cereals. These findings provide a better 526
understanding of the genetic architecture of grain size in sorghum, and pave the way for 527
exploration of its underlying molecular and physiological mechanisms in cereal crops and its 528
manipulation in breeding practices. 529
530
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
https://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
Acknowledgments 531
We acknowledge access to background IP from the Grains Research and Development 532
Corporation and support from the University of Queensland and the Department of 533
Agriculture and Fisheries Queensland. 534
535
Author contributions 536
D.J., E.M., and I.G. conceived and designed the experiments: Y.T., X.Z., X.W. and A.C. 537
collected data; Y.T., C.H., A.H., E.O., D.J., and E.M. analysed data; Y.T. wrote the 538
manuscript. E.M., E.O., I.G., and D.J. revised the manuscript. All authors read and approved 539
the final manuscript. 540
541
Funding 542
This work was supported by the Australian Research Council (ARC) Discovery project 543
DP14010250. 544
545
Conflict of Interest Statement 546
The authors declare that the research was conducted in the absence of any commercial or 547
financial relationships that could be construed as a potential conflict of interest. 548
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
https://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
Reference 549
550 Acreche MM, Slafer GA. 2006. Grain weight response to increases in number of grains in wheat in a 551
Mediterranean area. Field Crops Research 98(1): 52-59. 552 Bai CM, Wang CY, Wang P, Zhu ZX, Cong L, Li D, Liu YF, Zheng WJ, Lu XC. 2017. QTL 553
mapping of agronomically important traits in sorghum (Sorghum bicolor L.). Euphytica 554 213(12). 555
Bolot S, Abrouk M, Masood-Quraishi U, Stein N, Messing J, Feuillet C, Salse J. 2009. The 'inner 556 circle' of the cereal genomes. Current opinion in plant biology 12(2): 119-125. 557
Boyles RE, Cooper EA, Myers MT, Brenton Z, Rauh BL, Morris GP, Kresovich S. 2016. 558 Genome-Wide Association Studies of Grain Yield Components in Diverse Sorghum 559 Germplasm. The Plant Genome 9(2). 560
Boyles RE, Pfeiffer BK, Cooper EA, Zielinski KJ, Myers MT, Rooney WL, Kresovich S. 2017. 561 Quantitative Trait Loci Mapping of Agronomic and Yield Traits in Two Grain Sorghum 562 Biparental Families. Crop Science 57(5): 2443-2456. 563
Butler D, Cullis BR, Gilmour A, Gogel BJ. 2009. ASReml-R reference manual. Department of 564 Primary Industries and Fisheries, Brisbane, Australia. 565
Bradbury PJ, Zhang Z, Kroon DE, Casstevens TM, Ramdoss Y, Buckler ES. 2007. TASSEL: 566 software for association mapping of complex traits in diverse samples. Bioinformatics 23(19): 567 2633-2635. 568
Brown PJ, Klein PE, Bortiri E, Acharya CB, Rooney WL, Kresovich S. 2006. Inheritance of 569 inflorescence architecture in sorghum. Theoretical and applied genetics 113(5): 931-942. 570
Browning BL, Browning SR. 2016. Genotype Imputation with Millions of Reference Samples. 571 American Journal of Human Genetics 98(1): 116-126. 572
Cullis BR, Smith AB, Coombes NE. 2006. On the design of early generation variety trials with 573 correlated data. Journal of Agricultural Biological and Environmental Statistics 11(4): 381-574 393. 575
Doyle JJ. 1987. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. 576 Phytochemical Bulletin 19: 11-15. 577
Feltus FA, Hart GE, Schertz KF, Casa AM, Kresovich S, Abraham S, Klein PE, Brown PJ, 578 Paterson AH. 2006. Alignment of genetic maps and QTLs between inter- and intra-specific 579 sorghum populations. Theoretical and applied genetics 112(7): 1295-1305. 580
François OJRTiPGUG-A. 2016. Running structure-like population genetic analyses with R. 1-9. 581 Gambín B, Borrás L. 2010. Resource distribution and the trade‐off between seed number and seed 582
weight: a comparison across crop species. J Annals of Applied Biology 156(1): 91-102. 583 Gelli M, Mitchell SE, Liu K, Clemente TE, Weeks DP, Zhang C, Holding DR, Dweikat IM. 584
2016. Mapping QTLs and association of differentially expressed gene transcripts for multiple 585 agronomic traits under different nitrogen levels in sorghum. BMC plant biology 16. 586
Han LJ, Chen J, Mace ES, Liu YS, Zhu MJ, Yuyama N, Jordan DR, Cai HW. 2015. Fine 587 mapping of qGW1, a major QTL for grain weight in sorghum. Theoretical and applied 588 genetics 128(9): 1813-1825. 589
Huang XH, Zhao Y, Wei XH, Li CY, Wang A, Zhao Q, Li WJ, Guo YL, Deng LW, Zhu CR, et 590 al. 2012. Genome-wide association study of flowering time and grain yield traits in a 591 worldwide collection of rice germplasm. Nature genetics 44(1): 32-U53. 592
Jordan DR, Mace ES, Cruickshank AW, Hunt CH, Henzell RG. 2011. Exploring and Exploiting 593 Genetic Variation from Unadapted Sorghum Germplasm in a Breeding Program. Crop 594 Science 51(4): 1444-1457. 595
Li MX, Yeung JM, Cherny SS, Sham PC. 2012. Evaluating the effective numbers of independent 596 tests and significant p-value thresholds in commercial genotyping arrays and public 597 imputation reference datasets. Human Genetics 131(5): 747-756. 598
Li Q, Yang XH, Bai GH, Warburton ML, Mahuku G, Gore M, Dai JR, Li JS, Yan JB. 2010. 599 Cloning and characterization of a putative GS3 ortholog involved in maize kernel 600 development. Theoretical and applied genetics 120(4): 753-763. 601
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
https://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
Liu M, Tan X, Yang Y, Liu P, Zhang X, Zhang Y, Wang L, Hu Y, Ma L, Li Z, et al. 2019. 602 Analysis of the genetic architecture of maize kernel size traits by combined linkage and 603 association mapping. Plant Biotechnol J. doi: 10.1111/pbi.13188 604
Liu X, Huang M, Fan B, Buckler ES, Zhang Z. 2016. Iterative Usage of Fixed and Random Effect 605 Models for Powerful and Efficient Genome-Wide Association Studies. PLoS Genetics 12(2): 606 e1005767. 607
Mace E, Innes D, Hunt C, Wang X, Tao Y, Baxter J, Hassall M, Hathorn A, Jordan D. 2018. 608 The Sorghum QTL Atlas: a powerful tool for trait dissection, comparative genomics and crop 609 improvement. Theoretical and applied genetics. 610
Mace E, Innes D, Hunt C, Wang XM, Tao YF, Baxter J, Hassall M, Hathorn A, Jordan D. 2019. 611 The Sorghum QTL Atlas: a powerful tool for trait dissection, comparative genomics and crop 612 improvement. Theoretical and applied genetics 132(3): 751-766. 613
Mace ES, Hunt CH, Jordan DR. 2013. Supermodels: sorghum and maize provide mutual insight 614 into the genetics of flowering time. Theoretical and applied genetics 126(5): 1377-1395. 615
Mace ES, Rami JF, Bouchet S, Klein PE, Klein RR, Kilian A, Wenzl P, Xia L, Halloran K, 616 Jordan DR. 2009. A consensus genetic map of sorghum that integrates multiple component 617 maps and high-throughput Diversity Array Technology (DArT) markers. BMC plant biology 618 9. 619
Makita Y, Shimada S, Kawashima M, Kondou-Kuriyama T, Toyoda T, Matsui M. 2015. 620 MOROKOSHI: Transcriptome Database in Sorghum bicolor. Plant and cell physiology 56(1). 621
Mao H, Sun S, Yao J, Wang C, Yu S, Xu C, Li X, Zhang Q. 2010. Linking differential domain 622 functions of the GS3 protein to natural variation of grain size in rice. Proceedings of the 623 National Academy of Sciences 107(45): 19579-19584. 624
McCormick RF, Truong SK, Sreedasyam A, Jenkins J, Shu SQ, Sims D, Kennedy M, 625 Amirebrahimi M, Weers BD, McKinley B, et al. 2018. The Sorghum bicolor reference 626 genome: improved assembly, gene annotations, a transcriptome atlas, and signatures of 627 genome organization. Plant Journal 93(2): 338-354. 628
Mocoeur A, Zhang YM, Liu ZQ, Shen X, Zhang LM, Rasmussen SK, Jing HC. 2015. Stability 629 and genetic control of morphological, biomass and biofuel traits under temperate maritime 630 and continental conditions in sweet sorghum (Sorghum bicolour). Theoretical and applied 631 genetics 128(9): 1685-1701. 632
Murray SC, Sharma A, Rooney WL, Klein PE, Mullet JE, Mitchell SE, Kresovich S. 2008. 633 Genetic Improvement of Sorghum as a Biofuel Feedstock: I. QTL for Stem Sugar and Grain 634 Nonstructural Carbohydrates. Crop Science 48(6): 2165-2179. 635
Paterson AH, Lin YR, Li Z, Schertz KF. 1995. Convergent domestication of cereal crops by 636 independent mutations at corresponding genetic loci. Science 269(5231): 1714. 637
Peltonen-Sainio P, Kangas A, Salo Y, Jauhiainen L. 2007. Grain number dominates grain weight in 638 temperate cereal yield determination: Evidence based on 30 years of multi-location trials. 639 Field Crops Research 100(2-3): 179-188. 640
Phuong N, Stützel H, Uptmoor R. 2013. Quantitative trait loci associated to agronomic traits and 641 yield components in a Sorghum bicolor L. Moench RIL population cultivated under pre-642 flowering drought and well-watered conditions. Agricultural Sciences 4(12): 781-791. 643
Prado SA, Sadras VO, Borras L. 2014. Independent genetic control of maize (Zea mays L.) kernel 644 weight determination and its phenotypic plasticity. Journal of Experimental Botany 65(15): 645 4479-4487. 646
Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, Maller J, Sklar P, de 647 Bakker PI, Daly MJ, et al. 2007. PLINK: a tool set for whole-genome association and 648 population-based linkage analyses. American Journal of Human Genetics 81(3): 559-575. 649
Rami JF, Dufour P, Trouche G, Fliedel G, Mestres C, Davrieux F, Blanchard P, Hamon P. 1998. 650 Quantitative trait loci for grain quality, productivity, morphological and agronomical traits in 651 sorghum (Sorghum bicolor L. Moench). Theoretical and applied genetics 97(4): 605-616. 652
Rajkumar, Fakrudin B, Kavil SP, Girma Y, Arun SS, Dadakhalandar D, Gurusiddesh BH, Patil 653 AM, Thudi M, Bhairappanavar SB, et al. 2013. Molecular mapping of genomic regions 654 harbouring QTLs for root and yield traits in sorghum (Sorghum bicolor L. Moench). Physiol 655 Mol Biol Plants 19(3): 409-419. 656
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
https://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
Reddy RN, Madhusudhana R, Mohan SM, Chakravarthi DVN, Mehtre SP, Seetharama N, Patil 657 JV. 2013. Mapping QTL for grain yield and other agronomic traits in post-rainy sorghum 658 [Sorghum bicolor (L.) Moench]. Theoretical and applied genetics 126(8): 1921-1939. 659
Rosenow D, Dahlberg J, Stephens J, Miller F, Barnes D, Peterson G, Johnson J, Schertz KJCs. 660 1997. Registration of 63 converted sorghum germplasm lines from the sorghum conversion 661 program. 37(4): 1399-1400. 662
Sadras VO. 2007. Evolutionary aspects of the trade-off between seed size and number in crops. Field 663 Crops Research 100(2): 125-138. 664
Sands DC, Morris CE, Dratz EA, Pilgeram AL. 2009. Elevating optimal human nutrition to a 665 central goal of plant breeding and production of plant-based foods. Plant Science 177(5): 377-666 389. 667
Sharma HC, Abraham CV, Stenhouse JW. 2002. Compensation in grain weight and volume in 668 sorghum is associated with expression of resistance to sorghum midge, Stenodiplosis 669 sorghicola. Euphytica 125(2): 245-254. 670
Shehzad T, Okuno K. 2015. QTL mapping for yield and yield-contributing traits in sorghum 671 (Sorghum bicolor (L.) Moench) with genome-based SSR markers. Euphytica 203(1): 17-31. 672
Smith A, Cullis B, Thompson R. 2001. Analyzing variety by environment data using multiplicative 673 mixed models and adjustments for spatial field trend. Biometrics 57(4): 1138-1147. 674
Spagnolli FC, Mace E, Jordan D, Borras L, Gambin BL. 2016. Quantitative Trait Loci of Plant 675 Attributes Related to Sorghum Grain Number Determination. Crop Science 56(6): 3046-3054. 676
Srinivas G, Satish K, Madhusudhana R, Reddy RN, Mohan SM, Seetharama N. 2009. 677 Identification of quantitative trait loci for agronomically important traits and their association 678 with genic-microsatellite markers in sorghum. Theoretical and applied genetics 118(8): 1439-679 1454. 680
Stephens J, Miller F, Rosenow D. 1967. Conversion of Alien Sorghums to Early Combine 681 Genotypes 1. Crop Science 7(4): 396-396. 682
Tao Y, Mace ES, Tai S, Cruickshank A, Campbell BC, Zhao X, Van Oosterom EJ, Godwin ID, 683 Botella JR, Jordan DR. 2017. Whole-Genome Analysis of Candidate genes Associated with 684 Seed Size and Weight in Sorghum bicolor Reveals Signatures of Artificial Selection and 685 Insights into Parallel Domestication in Cereal Crops. Frontiers in plant science 8. 686
Tao YF, Mace E, George-Jaeggli B, Hunt C, Cruickshank A, Henzell R, Jordan D. 2018. Novel 687 Grain Weight Loci Revealed in a Cross between Cultivated and Wild Sorghum. The Plant 688 Genome 11(2). 689
Thurber CS, Ma JM, Higgins RH, Brown PJ. 2013. Retrospective genomic analysis of sorghum 690 adaptation to temperate-zone grain production. Genome biology 14(6). 691
Tuinstra MR, Grote EM, Goldsbrough PB, Ejeta G. 1997. Genetic analysis of post-flowering 692 drought tolerance and components of grain development in Sorghum bicolor (L.) Moench. 693 Molecular Breeding 3(6): 439-448. 694
Upadhyaya HD, Wang YH, Sharma S, Singh S, Hasenstein KH. 2012. SSR markers linked to 695 kernel weight and tiller number in sorghum identified by association mapping. Euphytica 696 187(3): 401-410. 697
Visscher PM, Wray NR, Zhang Q, Sklar P, McCarthy MI, Brown MA, Yang J. 2017. 10 Years 698 of GWAS Discovery: Biology, Function, and Translation. American Journal of Human 699 Genetics 101(1): 5-22. 700
Wardlaw IF. 1990. The control of carbon partitioning in plants. New Phytologist 116(3): 341-381. 701 Yang Z, Van Oosterom EJ, Jordan DR, Hammer GL. 2009. Pre-anthesis ovary development 702
determines genotypic differences in potential kernel weight in sorghum. Journal of 703 Experimental Botany: erp019. 704
Yang ZJ, van Oosterom EJ, Jordan DR, Doherty A, Hammer GL. 2010. Genetic Variation in 705 Potential Kernel Size Affects Kernel Growth and Yield of Sorghum. Crop Science 50(2): 706 685-695. 707
Zhang C, Dong SS, Xu JY, He WM, Yang TL. 2018. PopLDdecay: a fast and effective tool for 708 linkage disequilibrium decay analysis based on variant call format files. Bioinformatics. 709
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
https://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
Zhang D, Li J, Compton RO, Robertson J, Goff VH, Epps E, Kong W, Kim C, Paterson AH. 710 2015. Comparative genetics of seed size traits in divergent cereal lineages represented by 711 sorghum (Panicoidae) and rice (Oryzoidae). G3: Genes| Genomes| Genetics 5(6): 1117-1128. 712
713
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
https://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
Table 1 Summary of phenotypic data collected from half head samples across trials in the 714
diversity panel 715
Trait Trial Trait Mean Range Hsq Correlation between Sites
TKW (g) DPGAT16 28.8 6.5 - 58.4 0.79 0.87
TKW (g) DPHER16 28.7 6.7 - 56.8 0.80
Volume (mm3) DPGAT16 26.2 8.4 - 55.0 0.67 0.85
Volume (mm3) DPHER16 26.1 9.9 - 55.6 0.70
Length (mm) DPGAT16 4.41 2.93 - 5.85 0.68 0.88
Length (mm) DPHER16 4.38 3.07 -5.88 0.70
Width (mm) DPGAT16 3.50 2.29 -4.74 0.65 0.84
Width (mm) DPHER16 3.46 2.34 -4.56 0.67
Thickness (mm) DPGAT16 3.22 2.4 - 4.05 0.67 0.82
Thickness (mm) DPHER16 3.25 2.64 - 3.96 0.75
716
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
https://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
Table 2 Summary of the phenotypic data collected from half head samples across trials in the 717
BC-NAM population 718
Trait Trial Mean Range Hsq Correlations across trials
NAMGAT15 NAMGAT16 NAMHER15
TKW (g) NAMGAT15 28.2 15.6 - 41.2 0.77 1 0.89 0.87
TKW (g) NAMGAT16 29.9 16.9 - 44.5 0.53 0.89 1 0.98
TKW (g) NAMHER15 30.8 13.9 - 45.6 0.66 0.87 0.98 1
Volume (mm3) NAMGAT15 25.5 14.6 - 40.5 0.36 1 0.91 0.89
Volume (mm3) NAMGAT16 25.3 15.0 - 38.7 0.37 0.91 1 0.81
Volume (mm3) NAMHER15 25.7 15.4 - 36.4 0.41 0.89 0.81 1
Length (mm) NAMGAT15 4.18 3.20 - 5.29 0.41 1 0.92 0.91
Length (mm) NAMGAT16 4.17 3.29 - 5.31 0.45 0.92 1 0.85
Length (mm) NAMHER15 4.18 3.31 - 5.07 0.40 0.91 0.85 1
Width (mm) NAMGAT15 3.47 2.58 - 4.32 0.29 1 0.91 0.87
Width (mm) NAMGAT16 3.47 2.65 - 4.26 0.39 0.91 1 0.79
Width (mm) NAMHER15 3.48 2.76 - 4.15 0.29 0.87 0.79 1
Thickness (mm) NAMGAT15 3.32 2.71 - 3.72 0.54 1 0.93 0.89
Thickness (mm) NAMGAT16 3.31 2.85 - 3.69 0.41 0.93 1 0.86
Thickness (mm) NAMHER15 3.35 2.9 - 3.78 0.59 0.89 0.86 1
719
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
https://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
Table 3 Summary of the grain size QTL identified in the diversity and BC-NAM populations 720
QTL ID Chr P-start P-end G-start G-end Traits affected
Overlap DP BC-NAM
qGS1.1 1 1197434 1197434 3.18 3.18 _VL__ TVL__ Y
qGS1.2 1 3551580 5918355 10.35 13.66 TVLWH TVLWH Y
qGS1.3 1 6907712 7312784 16.89 18.27 TVLWH TVLWH Y
qGS1.4 1 9747355 9819344 25.45 25.48 TVLWH _____ Y
qGS1.5 1 12613546 13193518 31.25 33.54 _V___ TVLWH Y
qGS1.6 1 17427801 17638035 43.49 43.60 TVLW_ TV__H Y
qGS1.7 1 21347462 21457138 46.68 46.73 TVLW_ __L_H N
qGS1.8 1 53201999 53936261 59.15 60.07 TV__H TVLWH N
qGS1.9 1 62886564 62899303 94.75 94.76 TVLW_ TVLWH N
qGS1.10 1 64100851 64100851 98.57 98.57 TVLWH TVLW_ N
qGS1.11 1 67340797 67575811 115.03 116.85 T_L__ _V__H Y
qGS1.12 1 68155790 68372488 123.06 125.38 ____H TVLWH Y
qGS1.13 1 69803971 69803971 129.89 129.89 ____H __L_H N
qGS1.14 1 75731963 75731963 156.66 156.66 T_LWH TVLW_ Y
qGS2.1 2 1044805 1044805 0.10 0.10 TV_WH _VLW_ N
qGS2.2 2 2902001 2977700 13.86 13.93 TV_WH TVLWH N
qGS2.3 2 10134643 10134643 56.44 56.44 ____H TV__H Y
qGS2.4 2 10821637 10821637 59.55 59.55 TVLWH TVLWH Y
qGS2.5 2 57031598 57031598 88.94 88.94 TV_WH ____H N
qGS2.6 2 57512343 57512343 93.17 93.17 TV_WH TV_W_ N
qGS2.7 2 59194799 59194799 106.68 106.68 _VL__ TVLWH N
qGS2.8 2 60589218 61587359 121.99 123.92 TVLWH TVLWH N
qGS2.9 2 62351052 62351052 134.53 134.53 T____ __LWH N
qGS2.10 2 66339464 66573617 148.73 149.23 ___W_ ____H Y
qGS2.11 2 67288493 67673759 152.32 152.63 __L__ TVLW_ Y
qGS2.12 2 69049000 69049000 159.86 159.86 TV_WH TVLW_ N
qGS2.13 2 70379875 70379875 166.28 166.28 TVLW_ TVLW_ Y
qGS2.14 2 73307187 74155649 180.56 182.44 TVLWH __LWH Y
qGS2.15 2 75213858 75310814 188.18 189.09 TV_WH TVLWH Y
qGS3.1 3 617297 617297 0.68 0.68 _____ TVLWH N
qGS3.2 3 1678176 1681138 4.04 4.05 _____ TVLW_ N
qGS3.3 3 2972919 2972919 9.19 9.19 TV_W_ T___H Y
qGS3.4 3 3597488 3597488 11.06 11.06 TVLWH TVLWH Y
qGS3.5 3 7459268 7459268 31.12 31.12 TV_W_ TVL_H N
qGS3.6 3 11842474 13099669 46.52 48.31 TVLW_ TVLWH N
qGS3.7 3 27806456 52373180 56.79 58.13 TV_WH T_LWH Y
qGS3.8 3 54907952 55299538 70.31 72.25 TVLWH TVLW_ Y
qGS3.9 3 58595834 58595834 102.06 102.06 T__WH T_L_H Y
qGS3.10 3 70185356 70185356 149.33 149.33 TV_WH TV_WH Y
qGS3.11 3 70935875 70935875 151.89 151.89 TVLWH T___H Y
qGS3.12 3 71518340 71802981 154.39 155.80 TVLWH TVLWH Y
qGS4.1 4 4223889 4229830 27.71 27.75 TV_WH TVL__ N
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
https://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
qGS4.2 4 4559844 4559844 29.99 29.99 TVLWH _____ N
qGS4.3 4 7553755 7553755 55.81 55.81 ____H _____ Y
qGS4.4 4 13834090 49255608 71.37 73.60 TVLW_ TVLWH Y
qGS4.5 4 51365085 52404369 82.14 84.01 ____H TVLWH Y
qGS4.6 4 54000479 54337837 94.49 94.71 TVLW_ TVLWH Y
qGS4.7 4 57705140 58486661 106.83 106.99 TV_WH TVLW_ Y
qGS4.8 4 61396481 61772353 121.78 123.63 TVLW_ TVLWH Y
qGS4.9 4 66127254 66127254 152.27 152.27 TVLW_ __L__ N
qGS5.1 5 3301226 3575220 22.87 23.25 TVLWH TVLW_ Y
qGS5.2 5 38549008 54107776 64.58 65.76 _VLW_ TVLW_ Y
qGS5.3 5 61218249 61218249 70.70 70.70 _VL__ _VL_H N
qGS5.4 5 62745873 63339491 73.38 73.89 TVLWH TVLW_ Y
qGS5.5 5 70963740 71211419 118.00 119.00 TVLW_ T____ N
qGS6.1 6 2325384 2491386 28.20 30.03 _____ TVLWH N
qGS6.2 6 3407678 3594201 32.68 33.68 T___H TVLWH Y
qGS6.3 6 39213130 43198259 50.50 52.75 ___W_ TVLWH Y
qGS6.4 6 49661049 49672525 89.27 89.28 __L__ ____H Y
qGS6.5 6 54391413 55406078 125.43 125.64 __LWH ____H N
qGS6.6 6 58659162 58659162 157.96 157.96 ___WH __LW_ Y
qGS6.7 6 59994049 59994049 162.94 162.94 T__WH TVLWH Y
qGS7.1 7 1463261 1463261 16.76 16.76 TVLW_ _____ N
qGS7.2 7 4810838 5140654 58.13 58.36 TVLWH _VLW_ N
qGS7.3 7 14300295 38985351 72.21 72.30 TVLWH TV_W_ Y
qGS7.4 7 52822136 54307603 74.35 75.24 __LW_ TV_W_ Y
qGS7.5 7 55675278 55675278 78.67 78.67 TVLWH TVL__ N
qGS7.6 7 60263942 60263942 106.63 106.63 TVLW_ ___W_ Y
qGS8.1 8 5474894 5997557 58.65 59.32 TVLWH TVLWH Y
qGS8.2 8 10712689 10712689 67.33 67.33 TV_WH _V___ Y
qGS8.3 8 58069299 58069299 87.75 87.75 ____H _VLW_ N
qGS8.4 8 59873105 59873105 99.35 99.35 TV__H TV_WH N
qGS8.5 8 60306779 60394256 102.58 103.12 TVLW_ TVLWH N
qGS8.6 8 60911363 60911363 106.38 106.38 T____ TVLWH N
qGS8.7 8 61382971 61382971 109.44 109.44 ____H T____ N
qGS8.8 8 61773861 61868131 111.98 112.60 _____ TVLWH N
qGS8.9 8 61977263 61977263 118.53 118.53 ____H TVL_H N
qGS8.10 8 62329242 62329242 127.16 127.16 _____ TV__H N
qGS9.1 9 1188287 1188287 14.45 14.45 TV_WH T____ N
qGS9.2 9 8955692 15253529 67.33 69.39 TVLW_ TVLWH Y
qGS9.3 9 42995635 45347175 75.64 76.17 TVLWH TVLWH Y
qGS9.4 9 51441730 51441730 94.86 94.86 TV_WH __L_H Y
qGS9.5 9 52757861 52795444 104.82 104.93 T___H T___H N
qGS9.6 9 53947680 53959901 108.19 108.23 T__W_ TVLWH N
qGS9.7 9 56099631 56099631 113.99 113.99 _VL__ TVLWH N
qGS9.8 9 58170657 58198133 128.11 128.33 TV_WH TVLW_ N
qGS10.1 10 777170 777170 13.22 13.22 _V__H T_L_H N
qGS10.2 10 3158490 3445290 31.94 32.55 TVLWH T___H Y
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
https://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
qGS10.3 10 9752387 12131845 53.93 56.00 TVLWH TVLWH Y
qGS10.4 10 18074205 40511327 57.96 58.58 TVLWH TVLWH Y
qGS10.5 10 56447108 56447108 89.00 89.00 ____H T___H N
qGS10.6 10 58653903 60172861 102.30 103.56 TVLWH TVLW_ N
721
Chr indicates the chromosome on which the QTL is located. P-start provides the physical coordinate, 722
based on v3.1.1 of the sorghum genome assembly, of the start of the QTL CI. P-end provides the 723
physical coordinate of the end of the QTL CI. G-start provides the predicted cM genetic linkage 724
position of the start of the QTL CI based on the sorghum consensus map (Mace et al 2009). G-end 725
provides the predicted cM genetic linkage position of the end of the QTL CI. The traits affected 726
columns indicate the traits significantly associated with the QTL in diversity panel (DP) and BC-727
NAM population respectively. The five traits were recorded as TVLWH, where T represents TKW, V 728
represents volume, L represents length, W represents width and H represents thickness. The inclusion 729
of the letters for a particular QTL means that the corresponding traits are affected by the QTL in the 730
corresponding population. Missing letters in the order of TVLWH (represented as “_”) mean that the 731
corresponding traits are not affected by the QTL in corresponding population. The Overlap column 732
indicates whether the QTL overlaps with previously published grain size QTL in sorghum. N means 733
the QTL is not overlapped with previously published grain size QTL, while Y mean the QTL is 734
overlapped with previously published QTL. 735
736
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
https://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
Figure legends 737
Figure 1 Field treatment and its effect on grain size. A) Field treatment of removing half of 738
panicle imposed during flowering time. B) Significant increase of TKW and volume in half 739
heads compared to full heads in DPGAT16. C) Varying increase percentage in TKW of half 740
heads observed across genotypes. D) box-plots show variations of increase in TKW and 741
volume in half heads across genotypes in DPGAT16. 742
Figure 2: Phenotypic distribution and correlations of 5 grain size parameters across the 743
diversity panel and BC-NAM population. The plots on the diagonal show the phenotypic 744
distribution of each trait. The values above the diagonal are pairwise correlation coefficients 745
between traits, and the plots below the diagonal are scatter plots of compared traits. 746
Figure 3: GWAS of 5 grain size parameters in the diversity panel. GWAS results were 747
displayed in Manhattan plots and Q-Q plots. Significant threshold (solid black line) and 748
suggest threshold (dash black line) were 1.45E-06 and 2.91E-05, respectively, according to 749
GEC test. 750
Figure 4: GWAS of 5 grain size parameters in the BC-NAM population. GWAS results were 751
displayed in Manhattan plots and Q-Q plots. Significant threshold (solid black line) and 752
suggest threshold (dash black line) were 5.83E-06 and 1.17E-04, respectively, according to 753
GEC test. 754
Figure 5: Effects of landraces alleles of grain size QTL relative to elite recurrent parents in 755
the BC-NAM population. Each box plot illustrates the effects of exotic alleles of grain size 756
QTL across 17 BC-NAM families derived from crosses between landraces and elite recurrent 757
parental lines. Within each family, effects of a QTL was calculated as difference between 758
average of the exotic allele and average of allele of the recurrent parental line. Sixty-six QTL 759
identified in BC-NAM were ordered based on the median effect of the exotic alleles. 760
Correspondence between numbers on x-axis and QTL identity is provided in Table S16. 761
Figure 6: The overlap of grain size QTL identified in the diversity panel and the BC-NAM 762
population. 763
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
https://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
Supplementary materials 764
Supplementary Table S1 List of the diversity panel. a, country and region information was 765
obtained from USDA GRIN (https://npgsweb.ars-grin.gov/gringlobal/search.aspx) and 766
personal communication with Jeffery A. Dahlberg. b, race was classified based on population 767
structure analysis, as explained in materials and methods. c records overlap between diversity 768
panel and USSAP according to GRIN. 769
Supplementary Table S2 Summary of the BC-NAM 770
Supplementary Table S3 Summary of field experiments 771
Supplementary Table S4 Racial group breakdown of the diversity panel 772
Supplementary Table S5 List of reported grain size QTL in previous studies 773
Supplementary Table S6 Lis of reported grain size genes in rice and maize 774
Supplementary Table S7 Effect of population structure on grain size. Statistical significance 775
was assessed using one-way analyses of variance (ANOVAs) followed by Tukey’s HSD tests 776
for multiple comparisons. 777
Supplementary Table S8 Marker Trait associations in the diversity panel. Physical position of 778
SNPs were according to Sorghum bicolor v3.1.1. 779
Supplementary Table S9 Effects of QTL in the diversity panel 780
Supplementary Table S10 Marker Trait associations in BC-NAM. Physical position of SNPs 781
were according to Sorghum bicolor v3.1.1. 782
Supplementary Table S11 Effects of QTL in BC-NAM 783
Supplementary Table S12 Tests of associations between population-specific QTL with grain 784
size parameters in the alternative population 785
Supplementary Table S13 Comparison of grain size regions identified in rice and maize with 786
grain size QTL identified in this study 787
Supplementary Table S14 List of grain size candidate genes in sorghum 788
Supplementary Table S15 Summary of candidate genes within close vicinities of grain size 789
QTL 790
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
https://npgsweb.ars-grin.gov/gringlobal/search.aspxhttps://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
791
Supplementary Table S16 Correspondence between numbers on x-axis of Figure 5 and 792
identities of QTL identified in BC-NAM. 793
Supplementary Figure S1 PCA and LD of BC-NAM. A, PCA of BC-NAM. B, PCA of 794
diversity panel shows the BC-NAM parental lines capture a good proportion of diversity in 795
the diversity panel. Dots in red colour represent non-recurrent parental lines. Dots in green 796
represent recurrent parental lines. C, LD decay in the BC-NAM. 797
Supplementary Figure S2 Bi-plots show PCA analysis of grain size parameters measured in 798
the diversity panel. 799
Supplementary Figure S3 Bi-plots show PCA analysis of grain size parameters traits 800
measured in BC-NAM. 801
Supplementary Figure S4 LD decay, Population structure and PCA of the diversity panel. A, 802
LD decay of the diversity panel. B, population structure of the diversity panel. C, PCA of the 803
diversity panel, dots in colour indicate representatives for each racial group. 804
Supplementary Figure S5 Distribution of SNPs across sorghum genome in the diversity 805
panel. 806
Supplementary Figure S6 Distribution of SNPs across sorghum genome in BC-NAM. 807
Supplementary Figure S7. Correlation of grain size between HH and FH. A, correlation of 808
TKW between HH and FH. B, correlation of volume between HH and FH. 809
Supplementary Figure S8 Manhattan plots and Q-Q plots show GWAS analysis of PCs 810
derived from grain size parameters in the diversity panel. 811
Supplementary Figure S9 Manhattan plots and Q-Q plots show GWAS analysis of PCs 812
derived from grain size parameters in BC-NAM. 813
Supplementary Figure 10: SbGS3 (Sobic.001G341700) controls grain size in sorghum. A) 814
Expression pattern of sbGS3 in sorghum. B) Varying effects of sbGS3 alleles in non-recurrent 815
parental lines compared with the recurrent parental line. 816
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
https://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
Vo
lum
e (
mm
3)
Th
ou
sand
ke
rne
l w
eig
ht (g
)
P-value
-
Figure 2
BC-NAMDP
(g)
(mm3)(mm3)
(g)
(mm)
(mm)
(mm)
(mm)
(mm)
(mm)
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
https://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
TKW
Volume
Length
Width
Thickness
Figure 3 GWAS in DP
1 2 3 4 5 6 7 8 9 10
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
https://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
Figure 4 GWAS in BC-NAM
TKW
Volume
Length
Width
Thickness
1 2 3 4 5 6 7 8 9 10
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
https://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
Figure 5TK
W(g
)
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
https://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/
-
Figure 6
DP NAM
2826 38
.CC-BY-NC-ND 4.0 International licensenot certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which wasthis version posted July 22, 2019. . https://doi.org/10.1101/710459doi: bioRxiv preprint
https://doi.org/10.1101/710459http://creativecommons.org/licenses/by-nc-nd/4.0/