1. Introduction
Sorghum [Sorghum bicolor (L.) Moench] ranks as the fifth most important cereal grain in terms of global production and cultivated area [1]. It uses less water and withstands changing climatic conditions better than other cereals. In light of climate change and rising global temperatures, sorghum could be a feasible option for growers [2]. Owing to its high nutritional content, drought tolerance, minimal input requirements, and remarkable environmental adaptability, sorghum is a crucial crop for food security [3,4,5]. Sorghum is cultivated in over 100 countries, particularly in hot, arid regions [6]. The largest sorghum producers are the United States of America, Nigeria, Sudan, Mexico, Ethiopia, and India [7]. In the United States, it is mainly grown under rainfed conditions on ultisol and mollisol soils in the dry region known as the "Sorghum Belt", which covers Kansas, Texas, Colorado, Oklahoma, and South Dakota [8,9].
Sorghum provides essential nutrients and phytochemicals. It contains protein, dietary fiber, and important minerals [10]. Fatty acids such as linoleic, oleic, palmitic, linolenic, and stearic acids can be found in sorghum. Vitamins, particularly those from the B group and the fat-soluble vitamins (A, D, E, and K), are abundant in sorghum. Furthermore, sorghum is a rich source of macro- and microelements and of secondary plant metabolites such as phenolic acids, flavonoids, sterols, policosanols, and antioxidants [11,12]. Compared to other primary cereal grains, sorghum is distinguished by its higher content of resistant starch and slowly digestible starch, which helps reduce spikes in blood sugar levels after meals [13]. Sorghum's diverse bioactive polyphenols can lower the risk of nutrition-linked chronic diseases. Additionally, its high molecular weight tannins are known to alter the functionality of proteins and starch, offering the potential for developing novel bioactive ingredients and enhancing food quality [14]. Together, these attributes make sorghum a rare climate-resilient crop that can address food and nutrition security.
Sorghum is a multipurpose crop used in biofuel production, forage, ethanol production, and fodder preservation. Sorghum, particularly sweet sorghum, is a promising crop for biofuel production since it is a C4 crop that requires little input, has a high sugar content, and yields sugars that are easy to extract [15]. After human consumption, the remainder of sorghum is mainly utilized for animal feed [16]. Its favorable mineral and fatty acid balance and its suitability as a protein source have recently increased the popularity of sorghum as an aquafeed ingredient [17].
The International Crops Research Institute for the Semi-Arid Tropics (ICRISAT) GeneBank holds almost 37,000 sorghum accessions, 2,247 of which were selected to form a smaller group of germplasm known as the core collection. However, this core collection was still too large to evaluate readily. The core collection was therefore evaluated for 11 qualitative and 10 quantitative traits, yielding 21 hierarchical groupings. From each cluster, about 10% of the accessions, or at least one accession, were selected to create a mini core of 242 accessions [18]. The sorghum mini core thus contains about 10% of the core's accessions, or about 1% of the entire collection, while remaining homogeneous with the core for geographical origin, biological races, qualitative features, means, variances, phenotypic diversity indices, and phenotypic correlations. As a result, it is widely used in current genomic studies to evaluate agronomic traits and resistance to biotic and abiotic stresses [18,19,20].
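The "about 10% or at least one accession per cluster" rule can be sketched in a few lines. This is only an illustration under stated assumptions: the cluster names, the cluster sizes, and the accession labels are hypothetical, and the random choice within a cluster is an assumption, since the text does not say how accessions were picked inside each cluster.

```python
import random


def mini_core_sample(clusters, fraction=0.10, seed=0):
    """Pick about `fraction` of each cluster, but never fewer than one accession."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    selected = {}
    for name, accessions in clusters.items():
        # "about 10% or at least one accession" from every cluster
        n_pick = max(1, round(len(accessions) * fraction))
        # Assumption: members are drawn at random within a cluster
        selected[name] = rng.sample(accessions, n_pick)
    return selected


# Hypothetical clusters with made-up accession labels (not real data)
clusters = {
    "cluster_A": [f"A{i}" for i in range(4)],
    "cluster_B": [f"B{i}" for i in range(38)],
    "cluster_C": [f"C{i}" for i in range(120)],
}

picked = mini_core_sample(clusters)
counts = {name: len(members) for name, members in picked.items()}
print(counts)
```

The `max(1, ...)` guard is what keeps very small clusters represented, which is the point of sampling per cluster rather than from the pooled collection.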
Sorghum germplasm lines from Senegal have emerged as a valuable source of essential agronomic traits, particularly resistance to biotic stresses such as fungal diseases [21]. Extensive genome-wide association studies (GWAS) have dissected resistance to various fungal pathogens in this germplasm [21,22,23]. However, other agronomically important traits, such as seed morphology, have received limited attention.
Morphological variation in seed traits includes variation in seed size and shape. Seed morphology is a crucial agricultural characteristic because it reflects a combination of genetic, physiological, and environmental factors, all of which significantly affect crop yield, quality, and market value [24]. Apart from market value, seed morphology has proved useful in determining taxonomic relationships within plant families. As a result, both seed shape and size are relevant parameters for assessing plant biodiversity [24]. In addition, investigating the biodiversity of seeds can help characterize intra- and inter-species variation, genotypic discrimination, and correlation, all of which are important for breeding to achieve the target levels of seed yield and quality [24,25].
Wang et al. [26] evaluated the sorghum mini core panel in multiple locations with 6,094,317 single nucleotide polymorphism (SNP) markers and identified one locus for recurving peduncles and eight loci for panicle length, width, and compactness. Sakamoto et al. [27] used multi-trait GWAS to analyze 329 sorghum germplasms from different origins and found SNPs that may be related to seed morphology, such as SNP loci S01_50413644, S04_59021202, and S05_9112888. A GWAS of the 300 diverse accessions of the sorghum association panel (SAP) with 265,487 SNPs identified 30 SNPs that were strongly associated with traits measured at the seedling stage under cold stress and 12 SNPs that were significantly associated with seedling traits under heat stress [28]. Our previous study evaluated 162 Senegalese germplasms for seed area size, length, width, length-to-width ratio, perimeter, circularity, the distance between the intersection of length and width (IS) and the center of gravity (CG), and seed darkness and brightness with 193,727 publicly available SNPs and identified multiple candidate genes potentially associated with seed morphology [29].
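Two of the shape traits listed above are simple functions of the size measurements. The sketch below assumes the standard image-analysis definition of circularity, 4πA/P², which the text does not state explicitly; the function names are illustrative. The example values are the IS11473 means reported in Table 1, so the result is only a rough check, because a ratio of means is not the same as a mean of per-seed ratios.

```python
import math


def circularity(area_mm2, perimeter_mm):
    """Circularity on a 0-1 scale: 4*pi*A / P^2 (assumed standard definition).

    A perfect circle gives 1; elongated or irregular outlines give lower values.
    """
    return 4.0 * math.pi * area_mm2 / perimeter_mm ** 2


def length_to_width_ratio(length_mm, width_mm):
    """Length-to-width ratio (LWR); values near 1 indicate a round seed."""
    return length_mm / width_mm


# IS11473 mean values from Table 1 (area, perimeter, length, width)
circ = circularity(25.60, 20.57)
lwr = length_to_width_ratio(6.32, 5.59)
print(f"circularity ~ {circ:.2f}, LWR ~ {lwr:.2f}")
```

Under this assumed definition, the value computed from the IS11473 means falls close to the circularity listed for that accession in Table 1.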
This study investigated seed morphology in a diverse panel of sorghum accessions, encompassing a subset of the mini core collection (115 lines, including IS19975, which originated from Senegal) and germplasms from Senegal (130 lines, excluding IS19975). Eight key quantitative traits related to seed size, shape, and color were evaluated in over 24,000 seeds. The accessions were selected for the public availability of their SNP data, which made it possible to use GWAS to map the genetic determinants of the observed phenotypic variation. Using statistical analyses, the study explored potential connections between these traits and resistance to three major sorghum diseases (anthracnose, head smut, and downy mildew) within the mini core lines. Lastly, the study conducted GWAS with the Genome Association and Prediction Integrated Tool (GAPIT) R package, using phenotypic data from the seeds and over 290,000 publicly available SNPs. This analysis identified significant SNPs associated with various seed morphology traits and mapped them onto the sorghum reference genome.
4. Discussion
Seed morphology plays a key role in biological and ecological processes such as seed dormancy, germination, dispersal, persistence, evolution, and adaptation [37]. Despite the versatility, high stress tolerance, and diverse applications of sorghum as grain, forage, and biomass [38], its seed morphology remains relatively unexplored. Correlation analysis of the mini core and Senegalese accessions identified significant correlations among the traits, consistent with the patterns observed in our previous study of Senegalese germplasm [29]. Both the PCA plots and the partial contribution analyses yielded highly similar results, reinforcing the consistency of these findings [29]. The consistency in correlation patterns across the two studies could be attributed to the overlap of some Senegalese accessions. However, analyzing only the mini core accessions in this study yielded nearly identical results, suggesting that these findings generalize more broadly (data available in Supplementary Data S1).
Furthermore, recent studies have pointed to potential links between sorghum seed morphology traits and host resistance against fungal pathogens. One study identified significant negative correlations between grain mold severity and seed weight in sorghum [39]. Similarly, Ahn et al. [29] identified correlations between seed morphology traits (circularity and the distance between IS and CG) and the formation of spots on seedling leaves. These spots appeared when seedlings were inoculated with Sporisorium reilianum, the causal agent of head smut, and submerged under water [40]. Although spotted plants are considered susceptible, the cause of the spots is unclear. They might be a direct result of fungal infection or, alternatively, a defense mechanism triggered by the seedlings. Regardless of their origin, the association between spot appearance rate and seed morphology traits is notable. In this study, apart from the distance between IS and CG, no seed morphology trait was significantly linked to anthracnose or downy mildew susceptibility, whereas five of the eight tested traits were associated with head smut susceptibility. The head smut data used in this study were obtained by syringe needle inoculation (hypodermic injection), with resistance or susceptibility confirmed by the absence or presence of infected heads in mature plants [19]. The observed correlations between seed morphology and head smut resistance might be rooted in the distinct infection process of S. reilianum. Unlike anthracnose caused by Colletotrichum sublineola, which involves direct contact infection by conidia, head smut relies on systemic fungal growth originating from soilborne spores that infect plants during seed germination and seedling emergence [41]. This suggests that certain seed morphological traits may influence plant structures or defenses that affect internal fungal spread, but the precise mechanism remains unknown.
Our GWAS revealed over 100 candidate genes linked to seed morphology traits (Table S2). Intriguingly, several genes with similar functions appeared as top candidates for multiple traits, pointing to shared genetic influences, as the correlation analysis also indicated. For example, UDP-glycosyltransferases ranked among the top hits for area size, circularity, and distance between IS and CG, indicating their potential impact on seed size and shape. In rice, UDP-glucosyltransferase regulates grain size and abiotic stress tolerance associated with metabolic flux redirection [42]. Genes associated with zinc finger motifs emerged as candidates for length and LWR, indicating their potential influence on grain size and shape. This is further supported by the C2H2 zinc-finger protein LACKING RUDIMENTARY GLUME 1 (LRG1) in rice, which directly regulates spikelet formation and consequently affects grain size and yield [43]. Likewise, the F-box genes associated with LWR and brightness are consistent with findings in rice, where a network of the F-box protein FBX206 and OVATE family proteins modulates brassinosteroid biosynthesis to control grain size [44]. Furthermore, the leucine-rich repeat protein genes linked to length and brightness and the cytochrome P450 superfamily genes associated with area size and circularity are consistent with the known roles of these families in plant development, stress responses, and metabolism [29,45,46,47,48]. Notably, GW10, a P450 subfamily member, regulates grain size and number in rice [49]. These examples, alongside the entire candidate gene list in Table S2, offer valuable resources for future research and potential candidates for breeding programs aiming to improve sorghum seed morphology and grain yield. Multiple genes previously identified as top candidates in our earlier work [29] resurfaced as key genes in this study. This repeated association supports their genuine involvement in shaping seed morphology traits, and these genes warrant particular attention in functional validation studies of their roles in determining seed morphology.
Figure 1.
A comparison of the area sizes for IS11473 (PI329738) and IS12697 (PI302116). The seed of (a) IS11473 has one of the largest areas among the seeds compared, while the seed of (b) IS12697 has one of the smallest areas. The scale bars on the bottom right corner indicate 1 cm applied to both (a) and (b).
Figure 2.
A comparison of the seed colors for IS9108 (PI682465) and IS7987 (PI685210). The seed of (a) IS9108 has one of the darkest colors among the mini core and Senegalese germplasms, while the seed of (b) IS7987 has one of the brightest colors. The scale bar indicates 1 cm applied to both (a) and (b).
Figure 3.
Scatter plots displaying correlations between the traits based on Pearson’s r. The correlations are additionally shown with a heatmap and fit lines.
Figure 4.
The principal component analysis of all seed morphology-related traits from tested sorghum germplasms. The plot displays PC1 vs PC2.
Figure 5.
The partial contributions of variables to seed morphology traits in sorghum mini core and Senegalese lines are shown in the plot. The partial contributions toward PC1 (red), PC2 (green), and PC3 (blue) are displayed for each trait.
Figure 6.
Manhattan plots of GWAS results showing the significant SNPs associated with eight phenotypic traits across the genome. The traits included: (A) Area size; (B) Brightness; (C) Circularity; (D) Distance between IS and CG; (E) Length; (F) Length to width ratio; (G) Perimeter length; (H) Width. The dots in the Manhattan plot represent SNP markers. The green line corresponds to a Bonferroni-corrected p-value threshold of 1.7E-7 (−log10(p) = 6.8).
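The Bonferroni threshold quoted in the Figure 6 caption can be reproduced with a short calculation. This sketch assumes a family-wise error rate of 0.05 and a marker count of exactly 290,000; the text only states "over 290,000" SNPs, so the marker count used here is an approximation.

```python
import math


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


n_snps = 290_000  # assumption: the text reports "over 290,000" SNPs
threshold = bonferroni_threshold(n_snps)
neg_log10 = -math.log10(threshold)

print(f"threshold = {threshold:.1e}, -log10(p) = {neg_log10:.1f}")
```

With these assumed inputs the cutoff rounds to 1.7E-7, or about 6.8 on the −log10 scale, which agrees with the green line described in the caption.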
Table 1.
Accessions with the five highest and five lowest mean values for each seed morphology trait.
| Largest area size (mm²) | | Smallest area size (mm²) | |
| Accession | Mean ± S.D. | Accession | Mean ± S.D. |
| IS11473 | 25.60 ± 1.60 | IS12697 | 4.92 ± 0.83 |
| PI514404 | 21.61 ± 3.75 | IS13264 | 7.76 ± 1.26 |
| IS7987 | 21.24 ± 2.70 | PI514394 | 7.93 ± 1.00 |
| PI253986 | 20.04 ± 3.63 | PI514308 | 8.07 ± 0.89 |
| IS28141 | 19.32 ± 2.75 | PI514474 | 8.18 ± 0.78 |
| Longest perimeter (mm) | | Shortest perimeter (mm) | |
| IS11473 | 20.57 ± 0.65 | IS12697 | 8.54 ± 0.75 |
| PI514404 | 18.16 ± 1.70 | PI514394 | 11.04 ± 0.72 |
| IS7987 | 17.96 ± 1.21 | IS13264 | 11.08 ± 1.20 |
| PI253986 | 17.28 ± 1.54 | PI514308 | 11.13 ± 0.63 |
| IS28141 | 17.15 ± 1.21 | PI514474 | 11.25 ± 0.57 |
| Longest length (mm) | | Shortest length (mm) | |
| IS11473 | 6.32 ± 0.24 | IS12697 | 3.03 ± 0.29 |
| IS7987 | 5.73 ± 0.37 | PI514434 | 3.74 ± 0.23 |
| PI514404 | 5.59 ± 0.49 | PI514394 | 3.89 ± 0.24 |
| IS28141 | 5.55 ± 0.41 | PI514308 | 3.90 ± 0.20 |
| IS12804 | 5.46 ± 0.41 | IS9108 | 3.91 ± 0.28 |
| Longest width (mm) | | Shortest width (mm) | |
| IS11473 | 5.59 ± 0.23 | IS12697 | 2.17 ± 0.18 |
| PI514404 | 5.19 ± 0.48 | IS13264 | 2.61 ± 0.18 |
| IS7987 | 5.01 ± 0.35 | IS3121 | 2.76 ± 0.26 |
| IS28141 | 5.00 ± 0.39 | PI514394 | 2.79 ± 0.20 |
| IS11026 | 4.99 ± 0.29 | PI514474 | 2.82 ± 0.15 |
| Highest LWR | | Lowest LWR | |
| IS12804 | 1.73 ± 0.17 | IS10302 | 1.06 ± 0.03 |
| IS1233 | 1.59 ± 0.08 | IS11026 | 1.07 ± 0.05 |
| IS13264 | 1.58 ± 0.21 | PI514323 | 1.07 ± 0.04 |
| IS3121 | 1.47 ± 0.11 | PI514283 | 1.08 ± 0.05 |
| PI514471 | 1.46 ± 0.08 | PI514288 | 1.08 ± 0.04 |
| Highest circularity (0-1 scale) | | Lowest circularity (0-1 scale) | |
| IS13294 | 0.87 ± 0.01 | IS12804 | 0.73 ± 0.05 |
| IS2872 | 0.87 ± 0.01 | IS11473 | 0.76 ± 0.03 |
| IS13893 | 0.86 ± 0.01 | IS1233 | 0.78 ± 0.02 |
| IS9108 | 0.86 ± 0.01 | IS14090 | 0.79 ± 0.03 |
| IS12937 | 0.85 ± 0.01 | IS27034 | 0.79 ± 0.03 |
| Longest distance between IS and CG (mm) | | Shortest distance between IS and CG (mm) | |
| IS27034 | 0.40 ± 0.20 | PI514394 | 0.18 ± 0.12 |
| IS11026 | 0.39 ± 0.19 | IS12697 | 0.18 ± 0.10 |
| PI514288 | 0.37 ± 0.24 | PI514434 | 0.18 ± 0.10 |
| IS11473 | 0.37 ± 0.19 | IS2872 | 0.19 ± 0.11 |
| IS14090 | 0.36 ± 0.18 | PI514468 | 0.19 ± 0.12 |
| Brightest (0-255 scale) | | Darkest (0-255 scale) | |
| IS7987 | 237.14 ± 6.64 | IS9108 | 82.80 ± 10.00 |
| IS32439 | 234.96 ± 11.57 | IS11619 | 87.62 ± 13.74 |
| IS32349 | 234.14 ± 12.62 | IS9177 | 87.68 ± 9.00 |
| IS7305 | 232.78 ± 17.78 | IS13264 | 93.12 ± 22.34 |
| PI514446 | 230.68 ± 12.47 | PI11374 | 93.30 ± 12.03 |
Table 2.
Detailed correlations among the eight seed morphology-related traits. *** = P < 0.0001, ** = P < 0.001, and * = P < 0.01.
| | Area size | Perimeter | Length | Width | LWR | Circularity | IS and CG | Brightness |
| Area size (mm²) | 1.00*** | 0.99*** | 0.91*** | 0.96*** | -0.57*** | -0.12 | 0.61*** | 0.22** |
| Perimeter (mm) | 0.99*** | 1.00*** | 0.94*** | 0.95*** | -0.52*** | -0.19* | 0.62*** | 0.24** |
| Length (mm) | 0.91*** | 0.94*** | 1.00*** | 0.79*** | -0.21** | -0.34*** | 0.56*** | 0.26*** |
| Width (mm) | 0.96*** | 0.95*** | 0.79*** | 1.00*** | -0.75*** | 0.01 | 0.60*** | 0.17** |
| LWR | -0.57*** | -0.52*** | -0.21** | -0.75*** | 1.00*** | -0.44*** | -0.31*** | -0.04 |
| Circularity | -0.12 | -0.19* | -0.34*** | 0.01 | -0.44*** | 1.00*** | -0.42*** | -0.08 |
| IS and CG | 0.61*** | 0.62*** | 0.56*** | 0.60*** | -0.31*** | -0.42*** | 1.00*** | -0.07 |
| Brightness | 0.22** | 0.24** | 0.26*** | 0.17** | -0.04 | -0.08 | -0.07 | 1.00*** |
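For readers who want to see how a pairwise coefficient of this kind is obtained, the following sketch computes Pearson's r for area size and perimeter. It is an illustration only: the input is limited to the ten accession means that appear in both the area and perimeter sections of Table 1, not the full panel, so the result is not expected to reproduce the tabulated coefficient exactly. Using Pearson's r here is an assumption based on the Figure 3 caption.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# (area mm^2, perimeter mm) means taken from Table 1; extremes only, not the full panel
table1_means = {
    "IS11473": (25.60, 20.57),
    "PI514404": (21.61, 18.16),
    "IS7987": (21.24, 17.96),
    "PI253986": (20.04, 17.28),
    "IS28141": (19.32, 17.15),
    "IS12697": (4.92, 8.54),
    "IS13264": (7.76, 11.08),
    "PI514394": (7.93, 11.04),
    "PI514308": (8.07, 11.13),
    "PI514474": (8.18, 11.25),
}

areas = [area for area, _ in table1_means.values()]
perimeters = [perimeter for _, perimeter in table1_means.values()]
r = pearson_r(areas, perimeters)
print(f"r(area, perimeter) on the Table 1 extremes: {r:.3f}")
```

Because only the largest and smallest seeds are included, the coefficient is pushed toward 1; the direction of the relationship, not the exact value, is what carries over to the full data set.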
Table 3.
Cluster variables analysis among the seed morphology-related traits. Three clusters were formed based on seed characteristics: size (cluster 1), shape (cluster 2), and color (cluster 3).
| Cluster | Members | R² with its own cluster | R² with the next closest | 1-R² ratio |
| 1 | Perimeter length | 0.98 | 0.06 | 0.02 |
| | Area size | 0.97 | 0.07 | 0.03 |
| | Width | 0.90 | 0.20 | 0.12 |
| | Length | 0.86 | 0.07 | 0.15 |
| | Distance between IS and CG | 0.52 | 0.01 | 0.49 |
| 2 | Circularity | 0.72 | 0.05 | 0.29 |
| | Length-to-width ratio | 0.72 | 0.27 | 0.38 |
| 3 | Brightness | 1.00 | 0.04 | 0.00 |
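The last column of Table 3 appears to follow the usual variable-clustering ratio, (1 − R² with own cluster) / (1 − R² with next closest cluster), where small values indicate a variable that sits firmly in its cluster. That reading is an assumption inferred from the tabulated numbers, not something the text states. The sketch below recomputes the ratio from the two R² columns and compares it with the reported value; small differences are expected because the inputs are already rounded to two decimals.

```python
def one_minus_r2_ratio(r2_own, r2_next):
    """(1 - R^2 own cluster) / (1 - R^2 next closest cluster); lower is better."""
    return (1.0 - r2_own) / (1.0 - r2_next)


# (trait, R^2 own, R^2 next closest, reported ratio) as listed in Table 3
table3 = [
    ("Perimeter length", 0.98, 0.06, 0.02),
    ("Area size", 0.97, 0.07, 0.03),
    ("Width", 0.90, 0.20, 0.12),
    ("Length", 0.86, 0.07, 0.15),
    ("Distance between IS and CG", 0.52, 0.01, 0.49),
    ("Circularity", 0.72, 0.05, 0.29),
    ("Length-to-width ratio", 0.72, 0.27, 0.38),
    ("Brightness", 1.00, 0.04, 0.00),
]

max_gap = 0.0
for trait, r2_own, r2_next, reported in table3:
    ratio = one_minus_r2_ratio(r2_own, r2_next)
    max_gap = max(max_gap, abs(ratio - reported))
    print(f"{trait}: recomputed {ratio:.3f}, reported {reported:.2f}")
```

Read this way, the distance between IS and CG is the loosest member of the size cluster, since nearly half of its variance is left unexplained by that cluster.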
Table 4.
Comparison of seed morphology traits in sorghum mini core lines between head smut resistant and susceptible groups. * = P < 0.05, ND = no significant difference.
| | Area size (mm²) | Perimeter (mm) | Length (mm) | Width (mm) | LWR | Circularity (0-1 scale) | IS and CG (mm) | Brightness (0-255 scale) |
| Resistant | 14.1 | 14.51 | 4.73 | 4 | 1.19 | 0.83 | 0.26 | 152.12 |
| Susceptible | 12.76 | 13.82 | 4.59 | 3.75 | 1.24 | 0.82 | 0.25 | 160 |
| Significance based on p-value | * | * | ND | * | * | * | ND | ND |
Table 5.
Logistic regression analysis of seed morphology traits and disease resistance in sorghum mini core lines. * = P < 0.05 (Chi-square test), NS = no significant association.
| | Area size (mm²) | Perimeter (mm) | Length (mm) | Width (mm) | LWR | Circularity (0-1 scale) | IS and CG (mm) | Brightness (0-255 scale) |
| Anthracnose | NS | NS | NS | NS | NS | NS | * | NS |
| Head smut | * | * | NS | * | * | * | NS | NS |
| Downy mildew | NS | NS | NS | NS | NS | NS | * | NS |