c Indian Academy of Sciences
RESEARCH ARTICLE
Genetic diversity, population structure and marker trait associations
for seed quality traits in cotton (Gossypium hirsutum)
ASHOK BADIGANNAVAR1,2 ∗ and GERALD O. MYERS1
1
Louisiana State University Agricultural Center, School of Plant, Environmental, and Soil Sciences, 104 M. B. Sturgis Hall,
Baton Rouge, LA 70803, USA
2
Present address: Nuclear Agriculture and Biotechnology Division, Bhabha Atomic Research Center, Trombay,
Mumbai 400 085, India
Abstract
Cottonseed contains 16% seed oil and 23% seed protein by weight. High levels of palmitic acid provides a degree of stability
to the oil, while the presence of bound gossypol in proteins considerably changes their properties, including their biological
value. This study uses genetic principles to identify genomic regions associated with seed oil, protein and fibre content in
upland cotton cultivars. Cotton association mapping panel representing the US germplasm were genotyped using amplified
fragment length polymorphism markers, yielding 234 polymorphic DNA fragments. Phenotypic analysis showed high genetic
variability for the seed traits, seed oil range from 6.47–25.16%, protein from 1.85–28.45% and fibre content from 15.88–
37.12%. There were negative correlations between seed oil and protein content. With reference to genetic diversity, the average
estimate of FST was 8.852 indicating a low level of genetic differentiation among subpopulations. The AMOVA test revealed
that variation was 94% within and 6% among subpopulations. Bayesian population structure identified five subpopulations
and was in agreement with their geographical distribution. Among the mixed models analysed, mixed linear model (MLM)
identified 21 quantitative trait loci for lint percentage and seed quality traits, such as seed protein and oil. Establishing genetic
diversity, population structure and marker trait associations for the seed quality traits could be valuable in understanding the
genetic relationships and their utilization in breeding programmes.
[Badigannavar A. and Myers G. O. 2015 Genetic diversity, population structure and marker trait associations for seed quality traits in cotton
(Gossypium hirsutum). J. Genet. 94, 87–94]
Introduction
Cotton is grown worldwide for its natural fibre and source
of oil, and the cottonseed meal is used as feed for ruminant
livestock (Wallace et al. 2009). Cottonseed oil is a versatile vegetable oil derived from the seeds of the cotton plant
after the cotton lint has been removed, and comprises about
16% of a seed by weight. It is typically composed of about
26% palmitic acid (C16:0), 15% oleic acid (C18:1) and 58%
linoleic acid (C18:2). The cottonseed meal is the byproduct
after oil extraction and used as a source of fodder protein
in the livestock industry, but its use in agriculture is limited.
Constituting nearly half of a seed’s weight, the meal contains
23% high biological-value protein. Edible cottonseed contains 64 g of protein per 100 g of edible cottonseed and a
source of nine essential amino acids, potassium and complex
carbohydrates (Heuzé et al. 2013). To balance the oil, protein
∗ For correspondence. E-mail: ashokmb1@gmail.com.
and fibre content in the existing germplasm / cultivars, there
is a need to survey the genome to identify genes/controlling
elements responsible for these metabolic pathways. Improving the overall productivity of cotton and value of cottonseed
by manipulation of quality of cotton seed oil, protein content
and removal of toxic gossypol may contribute to increasing
the value of cotton both as a fibre and food crop (Mansoor
and Paterson 2012).
Protein and oil concentration, kernel index and percentage in cotton are controlled by multiple genes (Singh et al.
1985; Dani and Kohel 1989; Ye et al. 2003), and are strongly
influenced by the environment (Kohel and Cherry 1983).
Seed traits may be simultaneously controlled by seed nuclear
genes, cytoplasmic genes and maternal nuclear genes (Ye
et al. 2003). Previous studies have shown significant negative associations between oil and protein content (Kohel and
Cherry 1983; Chen et al. 1986; Sun et al. 1987). Such factors
may hinder progress in the simultaneous improvement of
Keywords. seed oil; protein; AFLP; upland cotton; association mapping.
Journal of Genetics, Vol. 94, No. 1, March 2015
87
Ashok Badigannavar and Gerald O. Myers
these traits in conventional cotton breeding programmes.
Genetic mapping provides a useful tool to understand the
architecture of quantitative traits at the molecular level. DNA
markers linked to QTL controlling seed protein content have
been identified in soybean (Chung et al. 2003; Panthee et al.
2005), rice (Tan et al. 2001), barley (See et al. 2002) and
field pea (Tar’an et al. 2004). DNA markers associated with
loci controlling seed oil content or fatty acid composition
have been identified in soybean (Kianian et al. 1999), rapeseed (Zhao et al. 2006), sunflower (Bert et al. 2003), oilseed
mustard (Gupta et al. 2004) and canola (Hu et al. 2006). In
cotton, 11 single QTLs have been associated with oil and
protein content (Song and Zhang 2007). Amino acid-specific
epistatic QTLs have also been detected, which explain
4.43–9.55% of the phenotypic variation. Using chromosome
substitution lines, chromosome 4 (from the G. barbadense
genotype 3-79, introgressed into a G. hirsutum TM-1
background) was associated with seed oil, protein and fibre
percentage (Wu et al. 2009). A backcross inbred line (BIL)
population involving G. hirsutum (as recurrent parent) and G.
barbadense identified 17 QTLs for oil content, 22 for protein
and three for gossypol content (Yu et al. 2012). Most of the
QTL studies on cotton, using traditional mapping methods,
are unique to a specific genetic background (biparental cross
populations). Association mapping (AM) identifies QTLs
that are detected in random set of genotypes from a diverse
genetic background (Gupta et al. 2005). The concept of AM
has been known for many years, but the increased availability
of molecular markers and the refinement of statistical tools
have created renewed interest in this approach (Achleitner
et al. 2008). Cotton provides a good platform for genomewide AM to catalogue genes for fibre traits owing to its vast
genetic variation. Few recent reports demonstrated the feasibility of conducting linkage disequilibrium (LD) based AM
for fibre traits in tetraploid (Abdurakhmonov et al. 2008;
Zeng et al. 2009) and diploid (Kantartzi and Stewart 2008)
accessions. This study was conducted to identify and map
genomic regions associated with seed protein, seed oil and
fibre content in diverse collection of upland cultivars using
AM principles.
Materials and methods
Plant material
The plant material consist of 75 upland cotton germplasm
lines derived from different geographical regions, namely,
Louisiana (25), Arkansas (17), South Eastern (SE) (22),
Delta (4), and Texas/southwest (SW) (7) (table 1 in electronic supplementary material at http://www.ias.ac.in/jgenet.
Most of the genotypes were selected from advanced breeding lines tested in the Regional Breeder’s Trial Network (RBTN), a multistate testing programme of public
breeding lines covering different cotton producing regions
(http://www.cottonrbtn.com). Plants were field grown as
per the Louisiana Cooperative Extension Service guidelines at the Dean Lee Research Station in Alexandria, LA.
Leaf samples from the representative plants were collected
and bulked for DNA extraction. Phenotypic data on yield
were obtained from the RBTN trial website (http://www.
cottonrbtn.com) of the USA. Replicate data on lint percentage was averaged to calculate variances using SAS
2009 (SAS 9.1.3, SAS Institute, Cary, USA). Deltapine,
DP 393 (PVP 200400266) was considered as the check
variety and all the comparisons were made in relation
with the performance of this cultivar. LP values of other
cultivars in the panel were adjusted based on the relative
performance of the check variety, DP 393.
From remnant planting seeds, 10 g of acid delinted seeds
for each cultivar were sent to the Department of Agricultural
Chemistry, LSU AgCenter, Baton Rouge, Louisiana, to determine total oil, protein and fibre contents. The seed quality
traits were determined following the modified American Oil
Chemist’s Society (AOCS) methods of analysis protocols.
Seed protein was estimated using the nitrogen combustion
method (AOAC 990.03) (AOAC 1999); crude fat/oil content
by petroleum ether as solvent using Soxtec System HT6;
and crude fibre content by AOCS 962.09. Two replications
were run and averaged over each cultivar. Correlation analysis between LP and seed traits was performed using PROC
CORR in SAS.
Genotyping with amplified fragment length polymorphism
(AFLP) markers
Sixty-four primer combinations were used to generate AFLP
data (see table 1 in electronic supplementary material) following the procedure given by Vos et al. (1995) with minor
modifications. Sample DNA was digested with EcoRI and
MseI restriction enzymes and oligonucleotide adapters specific to these restriction sites were ligated to the resulting
fragments through incubation (37◦ C for 180 min) with DNA
ligase. Preamplifications were done using EcoR I+A and
Table 1. Univariate analysis of LP and seed quality traits in upland cotton germplasm lines.
Trait
Min.
Max.
Mean
SE
Variance
SD
Median
Protein
Oil
Fibre
LP
18.05
6.47
15.88
35.67
28.45
25.16
26.53
57.35
23.80
18.09
19.83
42.97
0.31
0.32
0.23
0.57
7.28
7.80
4.20
22.67
2.69
2.79
2.05
4.76
24.30
18.02
19.37
41.53
SE, standard error; SD, standard deviation; LP, lint percentage.
88
Journal of Genetics, Vol. 94, No. 1, March 2015
Association mapping in cotton
Mse I+C oligo primers and selective amplification was carried out using the IR dye labelled EcoRI+ANN oligo primers
(MWG Biotech, Ebersberg, Germany). Touchdown polymerase chain reaction (PCR) was used for selective amplifications using the following profile: initial denaturing step
at 94◦ C for 2 min followed by initial 12 cycles at 94◦ C for
30 s, 65◦ C for 30 s (with 0.7◦ C decrement at every cycle)
and 72◦ C for 1 min, then followed by 23 cycles at 94◦ C for
30 s, 56◦ C for 30 s, and 72◦ C for 1 min with a final extension step at 72◦ C for 2 min. The PCR amplified products
were run on a LI-COR 4300 Sequencer (LI-COR, Lincoln,
USA) and scored. The nomenclature of AFLP loci was followed according to Lacape et al. (2003), Myers et al. (2009)
and Badigannavar et al. (2012), indicating the enzyme primer
combinations with band size.
Molecular diversity and cluster analysis
For each marker used, subpopulationwise diversity statistics
including the number of observed and effective alleles, Nei’s
genetic distances (Nei and Li 1979), expected heterozygosity
and Shannon’s information index were calculated using
GenAlEx 6.5 software (Peakall and Smouse 2012). Allelic
diversity at a given locus can be determined by polymorphic information content
(PIC) and was calculated using the
formula, ‘PIC= 1 − f i2 , where fi is the frequency of the
ith allele (Weir 1996). PROC ALLELE was used to calculate
PIC values and frequency estimate was done using PROC
FREQ (SAS 9.1.3, SAS Institute, Cary, USA). Genetic
differentiation among the subpopulation was estimated
using hierarchical analysis of molecular variance (AMOVA;
Excoffier et al. 2005) in GenAlEx 6.2. Dice similarity coefficient was calculated using the formula D = 2a/(2a + b + c),
where a is the number of fragments present in both accessions, b and c are the numbers of fragments that are present in
either accession, respectively (Sneath and Sokal 1973). From
the similarity data, genetic distance was calculated for each
pair of germplasm lines and dendrogram was generated using
the neighbour joining (NJ) analysis in MEGA 4.0 (Kumar
et al. 2004) software.
Population structure and association analysis
A Bayesian model based clustering was performed using
the software program ‘STRUCTURE’ 2.3.2 (Pritchard et al.
2000). The admixture model was selected in the software
and allele frequencies among populations were assumed to
be correlated. Each run was carried out using 10,000 iterations and 10,000 replications. A total of 2–10 k clusters
were evaluated and the optimum number of cluster was
determined by LnP(D) probabilities (Evanno et al. 2005).
A graphical display (bar plot) of the population structure
was generated using DISTRUCT software (Rosenberg et al.
2002). The pairwise kinship (K matrix) was calculated
using SPAGeDi software (Hardy and Vekemans 2002). The
K matrices and Q matrix describing the assignment of each
genotype to specific clusters were used in mixed linear model
association analyses. The mean phenotypic data was used
for association analysis. Significant marker trait associations
were tested using two different models, a general linear
model (GLM) and a mixed linear model (MLM) in TASSEL
2.1 software (Bradbury et al. 2007). In GLM model, population substructure of cotton mapping panel was incorporated as covariates. In MLM, association was estimated by
simultaneous accounting of multiple levels of population
structure (Q matrix), relative kinship among the individuals (K matrix) and eigenvectors of PCoA as described
by Yu et al. (2006).
Results
Phenotypic analyses
Of the upland cotton germplasm lines studied, 69 were
developed by public breeding programmes and six by commercial seed companies which originate from five relatively
distinct geographical regions. Among the traits analysed, LP
varied from 35.67 to 57.35% (average 42.97%) (table 1).
Among the seed traits, seed protein content ranged from
18.05 to 28.45% (average 23.8%), oil content ranged from
6.47 to 25.16% (average 18.09%), and fibre content varied
from 15.88 to 26.53% (average 19.83%). LP showed greater
genetic variability followed by protein as compared to other
traits. High heritability values (up to 90%) were observed for
quality parameters, while LP showed moderate heritability
(49%).
The correlations among the yield and quality traits are
presented in table 2. There were significant negative correlations between fibre content with oil and protein percentage
(P = 0.0001). While not significant, protein and oil percentages were negatively correlated (−0.224) which complicates
Table 2. Correlation coefficients among LP and seed quality traits.
Trait
LP
Protein
Oil
Fibre content
LP
Protein
Oil
1
−0.240
0.027
0.033
1
−0.224
−0.340∗
1
−0.61∗∗
* Significant at P ≤ 0.05; ** significant at P ≤ 0.01.
Journal of Genetics, Vol. 94, No. 1, March 2015
89
Ashok Badigannavar and Gerald O. Myers
Table 3. Genetic diversity parameters among cotton subgroups.
Subgroup
LA
ARK
SE
DELTA
SW/T
Na ∗
Ne
I
He
UHe
1.885
1.829
1.966
1.560
1.910
1.357
1.395
1.517
1.436
1.557
0.369
0.386
0.477
0.372
0.492
0.231
0.248
0.313
0.244
0.329
0.240
0.269
0.328
0.325
0.362
*Na , no. of different alleles; Ne , no. of effective alleles; I, Shannon’s index; He , expected
heterozygosity; UHe , unbiased expected heterozygosity; LA, Louisiana; ARK, Arkansas; SE,
South Eastern; SW/T, southwest/Texas.
their simultaneous improvement into a single cultivar. All
other correlations, particularly those between LP with seed
quality traits were not significant.
Genetic analyses
A total of 234 polymorphic loci were obtained when screened
across 75 cotton germplasm lines. The Shannon index, a
measurement used to compare diversity between two or more
subpopulations ranged between 0.35 and 0.49 (table 3). The
number of effective alleles was highest for SW/T (1.57)
while lowest for LA genotypes (1.36). The heterozygosity for
the AFLP markers ranged from 0.23 (LA) to 0.32 (SW/T).
The frequency distribution values for relative kinship
revealed that the genetic relatedness ranged from 0 to 0.9
(figure 1). Although 60% of the pairwise kinship estimates
were below 0.5, there were moderate peaks around 0.7
and 0.8. Genetic relatedness is often prominent among elite
genotypes, as they share common genotypes in their
breeding programmes. The PIC measures how different
Figure 1. Dendrogram representing genetic relationship among upland cotton germplasm lines generated by NJ analysis.
90
Journal of Genetics, Vol. 94, No. 1, March 2015
Association mapping in cotton
Figure 2. Bar plot representing population structure of upland cotton lines grouped into five
subpopulations. Each individual genotype is represented by a line partitioned in five coloured
segments that represent the estimated membership fractions to each one of the five subgroups.
The bar plot was generated using Structure (Pritchard et al. 2000) software following the
admixture model.
populations are distinguished based on probability of randomly chosen alleles. The frequency distribution for PIC
using AFLP markers ranged from 0 to 0.40 with more than
90% of them falling between 0.16 and 0.40.
Population structure and genetic diversity study
Using the entire upland cotton AM panel, population structure was analysed using software ‘STRUCTURE’. The
parameters used for this analysis produced the highest loglikelihood score when K was 5. The identified subgroups
highly correspond to the five geographical regions from
where these lines have been derived. The bar plot indicated
LA genotypes showing uniformity with fewer admixtures,
mainly from Delta, SW/T and Ark ancestral genes (figure 2).
Substantial amount of admixture was seen to occur among
the clusters.
To estimate genetic diversity within and among the predefined subpopulations, Wright’s FST index (Wright 1951) was
calculated (table 4). Based on the pairwise FST estimates,
SW/T (southwest/Texas) and Delta were closely related
(0.0078), while Delta and LA group were highly diverse
(0.141). The average estimate of FST was 0.052 indicating a
low level of genetic differentiation among the groups. Estimates of molecular variation present in the upland genotypes revealed that although 94% of the genetic diversity
was attributable to differences within populations, still there
was 6% variation among groups (P = 0.001) (table 5). NJ
analysis of genotypic data for the upland germplasm lines
identified five major clusters as per their geographical
distribution (figure 1). To correlate STRUCTURE results and
that of NJ analysis, we compared the assignment of each
of these germplasm lines into a defined cluster. Except for
few cases, overall, there was good agreement for the phylogenetic relationships between the two estimates. Germplasm
lines originated from Louisiana formed one major cluster,
while Arkansas formed two major clusters. SE also formed a
major cluster and SW, Texas and Delta germplasm lines were
highly diverse and were found scattered.
Association analyses
AM was performed for LP and seed quality traits using
GLM and MLM models using TASSEL software. The
effectiveness of these models in controlling false positives
was determined by monitoring the partial R2 values. GLM
was tested to identify single marker effects on quantitative traits. Partial R2 was least for GLM models across all
the traits. But models using PCA eigenvectors explained
more variation (16–30%) than the models with structure (12–
25%). The naïve MLM model, which included the kinship
matrix, explained more genetic variation (up to 50%) compared to naïve GLM model (15%). LP showed high amount
of partial R2 across all the models compared to seed quality traits. Using MLM (Q+K model) 21 significant QTLs
were found associated with four traits (table 6). Five each
QTLs were associated with LP, protein and fibre content,
while six QTLs were associated with seed oil. E3M6_260
and E4M4_242 markers were found to be significantly associated with seed oil and fibre content. The partial R2 values ranged from 28.47 to 88.90%. Seed protein was significantly associated with five QTLs, among them, E6M2_640
recorded the high partial R2 value (91.8%).
Table 4. Pairwise FST values estimated for cotton subgroups.
FST
LA*
ARK
SE
DELTA
ARK
SE
DELTA
SW/T
0.0823
0.0909
0.141
0.0983
0
0.017
0.0492
0.0212
0
0.0174
0.0095
0
0.0078
* FST , Wright’s fixation index.
Journal of Genetics, Vol. 94, No. 1, March 2015
91
Ashok Badigannavar and Gerald O. Myers
Table 5. Analysis of AMOVA for upland cotton germplasm lines between and within five subgroups.
Source of variation
df
Sum of squares
Estimated variance
% variance
P value
Among pops
Within pops
Total
4
70
74
268.47
2409.77
2678.41
2.33
34.42
36.75
6
94
0.001
0.001
Discussion
An appropriate association mapping panel should encompass
diverse genetic background such that efficient marker system
could be employed to infer true associations (Flint-Garcia
et al. 2005). In this study, phenotypic data on LP and seed
quality traits suggested wide variability for protein, oil and
fibre content. Previously G.hirsutum and G. arboreum cotton accessions also reported wide variability for oil and seed
weight (Kohel 1978; Song and Zhang 2007). We noticed
6.47% of seed oil in G. herbaceum and 25.65% in G.hirsutum
accessions.
Seed quality traits are directly influenced by the lint
percentage, seed cotton yield, seed number, seed weight, seed
coat content, moisture level and environmental factors. We
observed positive correlations between LP and oil, while
negative correlations between LP and protein content were
also noticed. Typically, high yielding plant has a high LP
which is most easily achieved by decreasing seed size. In
this study, fibre content was determined from hulled seeds.
The hull is expected to be higher in fibre than the embryo,
such that seed size decreases with increase in per cent fibres.
Similarily, since a majority of seed protein is in the embryo,
with the increase in lint percentage (smaller seed), protein
percentage is expected to decrease. Simultaneous improvement of oil and protein is complicated, owing to their negative correlation. According to Kohel et al. (1985) and Gotmare et al. (2004), the relationship between percentage of
protein and oil are significantly negative. Oil and protein
percentages in seed also decrease with harvest date, but the
greatest change is in the amount of oil (Kohel and Cherry
1983). Several studies have been conducted to understand the
inheritance pattern and gene action governing quality traits
(Mert et al. 2004). Seed index was found to be predominantly
under the control of genes acting additively. This trait could
easily be manipulated through selection for the production
of pure line varieties. Oil content is governed by dominant
genes (Singh et al. 1985), while significant epistatic interaction was observed for oil percentage and seed index (Dani
and Kohel 1989). Although the effects of environment and
genotype on oil and protein content are well documented
and relationships between yield, seed quality and fibre properties in cotton have been identified, studies on the inheritance and genetic factors governing these traits have not been
widely addressed. This may be due to the lack of understanding of the complex pathways and multiple genes interacting
in an epistatic manner controlling these traits.
Table 6. Significant QTL (P < 0.05) for LP and seed quality traits in upland cotton identified using MLM in TASSEL.
Trait
LP
Oil
Protein
Fibre
AFLP marker
P value
R2∗
E3M6_260
E4M1_365
E4M4_242
E6M4_249
E5M3_65
E3M2_145
E5M7_180
E5M7_195
E4M3_214
E6M4_358
E3M6_260
E6M2_640
E4M1_382
E4M4_217
E4M1_353
E6M3_190
E3M6_260
E5M5_415
E3M2_145
E6M4_303
E3M8_125
0.0009
0.0032
0.0001
0.004
0.016
0.002
0.004
0.005
0.013
0.015
0.017
0.005
0.010
0.011
0.013
0.013
4.42 ×10−4
0.0013
0.0052
0.0074
0.0086
58.53
53.53
46.12
34.50
39.10
28.47
34.90
35.03
36.00
30.45
45.89
81.18
88.90
84.09
88.22
88.33
60.03
39.55
69.27
54.31
62.11
*Adjusted R2 , indicates the percentage of explained variation.
92
Journal of Genetics, Vol. 94, No. 1, March 2015
Association mapping in cotton
Genomewide AM is successful only if appropriate methods are implemented to control the effect of population
structure. Inclusion of population structure would minimize
type I errors due to the spurious associations between nonlinked loci. The six models used in this study, accounting for
Q (population structure) or for K (kinship estimates) or PCA
(eigenvectors of PCA) primarily aimed at reducing the type I
error. For all the traits under study, the models controlling
relative kinship performed better than the model controlling
population structure. Similarly, model controlling structure
is better than PCA in explaining the phenotypic variation. In
addition, MLM models identified 21 QTL for LP and seed
quality traits. Among all, E3M6_260 was significant for LP,
seed oil and fibre content, while E3M2_245 was associated
with seed oil and fibre content. The number of QTLs also
decreased drastically with high partial R2 value when population structure was included in the MLM model. Although
inclusion of PCA values did change R2 values substantially,
but Q+K MLM models recorded higher partial R2 across all
the traits.
During the recent years, molecular marker technology
was successfully applied in cotton diversity studies, creating
genetic linkage maps and identifying QTL for fibre traits
using biparental cross derivatives or association mapping
panel. Compared to other field crops, association mapping
in cotton has not been explored to a great extent. A recent
study by Kantartzi and Stewart (2008) identified 30 marker
trait associations with 19 SSR markers in G. arboreum
germplasm lines. The MLM models greatly reduced type I
error and revealed true associations for fibre traits. However,
measurement of the LD patterns for genomic regions and
extent of LD among different populations of the target organisms is the start point to design and execute association
mapping. A recent study reported significant LD between
pair of SSR loci within 36–37 cM distance in the diverse
upland cotton germplasm lines (Abdurakhmonov et al.
2008). Due to relatively less number of markers used in
finding associations may result in low resolution.
Our results demonstrated the efficiency of MLM models
in identifying true associations for seed quality traits. Adding
more number of markers and expanding the mapping panel
would result in greater precision and power. Looking at
the complex pathways involved in the synthesis of oil and
protein, the addition of more markers to catalogue multienvironment phenotypic variations would also improve the
understanding of the genetic factors governing these traits.
Acknowledgements
We thank the Department of Agricultural Chemistry, LSU, Baton
Rouge for seed quality trait analysis and all the RBTN coordinators for providing phenotypic data. Financial support from Cotton
Incorporated is highly appreciated.
References
Abdurakhmonov I. Y., Kohel R. J., Yu J. Z., Pepper A. E.,
Abdullaev A. A., Kushanov F. N. et al. 2008 Molecular
diversity and association mapping of fibre quality traits in exotic
G. hirsutum L. germplasm. Genomics 92(6), 478–487.
Achleitner A., Nicholas A., Tinker Zechner E. and Buerstmayr H.
2008 Genetic diversity among oat varieties of worldwide origin
and associations of AFLP markers with quantitative traits. Theor.
Appl. Genet. 117 (7), 1041–1053.
AOAC 1999 Official methods of analysis. 16th edition. Association
of official analytical chemists, Washington, USA.
Badigannavar A. M., Myers G. O. and Jones D. C. 2012 Molecular
diversity revealed by AFLP markers in upland cotton genotypes.
J. Crop Improv. 26, 627–640.
Bert P. F., Jouan I., Tourvieille de Labrouhe D., Serre F., Philippon
J., Nicolas J. and Vear P. 2003 Comparative genetic analysis of
quantitative traits in sunflower (Helianthus annuus L.). 2. Characterization of QTL involved in developmental and agronomic
traits. Theor. Appl. Genet. 107, 181–189.
Bradbury P. J., Zhang Z., Kroon D. E., Casstevens T. M., Ramdoss
Y. and Buckler E. S. 2007 TASSEL: Soft ware for association
mapping of complex traits in diverse samples. Bioinform 23,
2633–2635.
Chen Z. F., Zhang Z. W. and Cheng H. L. 1986 The analysis of
upland cotton quality. Acta Agron. Sin. 12, 195–200.
Chung J., Babka H. L., Graef G. L., Staswick P. E., Lee D. J., Cregan
P. B., Shoemaker R. C. and Specht J. E. 2003 The seed protein,
oil, and yield QTL on soybean linkage group I. Crop Sci. 43,
1053–1067.
Dani R. G. and Kohel R. J. 1989 Maternal effects and genera
tion mean analysis of seed-oil content in cotton (Gossypium
hirsutum). Theor. Appl. Genet. 77, 569–575.
Evanno G., Regnaut S. and Goudet J. 2005 Detecting the number of
clusters of individuals using the software structure: a simulation
study. Mol. Ecol. 14, 2611–2620.
Excoffier L., Laval G. and Schneider S. 2005 Arlequin ver. 3.0:
an integrated software package for population genetics data
analysis. Evol. Bioinform. Online 1, 47–50.
Flint-Garcia S. A., Thuillet A. C., Yu J., Pressoir G., Romero S. M.,
Mitchell S. E. et al. 2005 Maize association population: a high
resolution platform for quantitative trait locus dissection. Plant J.
44, 1054–1064.
Gotmare V., Singh P., Mayee C. D., Deshpande V. and Bhagat C.
2004 Genetic variability for seed oil content and seed index in
some wild species and perennial races of cotton. Plant Breed.
123, 207–208.
Gupta P., Rustgi S. and Kulwal P. 2005 Linkage disequilibrium
and association studies in higher plants: present status and future
prospects. Plant Mol. Biol. 57(4), 461–485.
Gupta V., Mukhopadhyay A., Arumugam N., Sodhi Y. S., Pental D.
and Pradhan A. K. 2004 Molecular tagging of erucic acid trait in
oilseed mustard (Brassica juncea) by QTL mapping and single
nucleotide polymorphisms in FAE1 gene. Theor. Appl. Genet.
108, 743–749.
Hardy O. J. and Vekemans X. 2002 SPAGEDi: a versatile computer
program to analyse spatial genetic structure at the individual or
population levels. Mol. Ecol. Notes 2, 618–620.
Heuzé V., Tran G., Bastianelli D., Hassoun P. and Lebas
F. 2013 Cottonseed meal. Feedipedia.org. A programme by
INRA, CIRAD, AFZ and FAO (http://www.feedipedia.org/node/
550).
Hu X., Sullivan-Gilbert M., Gupta M. and Thompson S. A. 2006
Mapping of the loci controlling oleic and linolenic acid contents
and development of fad2 and fad3 allele-specific markers in
canola (Brassica napus L.) Theor. Appl. Genet. 113, 497–507.
Kantartzi S. K. and Stewart J. M. 2008 Association analysis of fibre
traits in Gossypium arboreum accessions. Plant Breed. 127, 173–
179.
Kianian S. F., Egli M. A., Phillips R. L., Rines H. W., Somers D.
A., Gengenbach B. G. et al. 1999 Association of a major groat oil
content QTL and an acetyl-CoA carboxylase gene in oat. Theor.
Appl. Genet. 98, 884–894.
Journal of Genetics, Vol. 94, No. 1, March 2015
93
Ashok Badigannavar and Gerald O. Myers
Kohel R. J. 1978. Survey of G. hirsutum germplasm collections for
seed oil percentage and seed characteristics. USDA-ARS Report.
S-187.
Kohel R. J. and Cherry J. P. 1983 Variation of cottonseed quality
with stratified harvests. Crop Sci. 23, 1119–1124.
Kohel R. J., Glueck J. and Rooney L. W. 1985 Comparison of
cotton germplasm collections for seed protein content. Crop Sci.
25, 961–963.
Kumar S., Tamura K. and Nei M. 2004 MEGA4: integrated
software for molecular evolutionary genetics analysis and
sequence alignment. Brief Bioinform. 5, 150–163.
Lacape J. M., Nguyen T. B., Thibivilliers S., Bojinov B.,
Courtois B., Cantrell R. G. et al. 2003 A combined RFLP–SSR–
AFLP map of tetraploid cotton based on a Gossypium hirsutum ×
Gossypium barbadense backcross population. Genome 46, 612–
626.
Mansoor S. and Paterson A. H. 2012 Genomes for jeans: cotton
genomics for engineering superior fibre. Trends Biotech. 30, 521–
527.
Mert M., Akiscan Y. and Gencer O. 2004 Inheritance of oil and
protein in some cotton generations. Asian J. Plant Sci. 3, 174–
176.
Myers G. O., Baogong J., Akash M. W., Badigannavar A. M. and
Saha S. 2009 Chromosomal assignment of AFLP markers in
upland cotton (Gossypium hirsutum L.) Euphytica 165, 391–399.
Nei M. and Li W. H. 1979 Mathematical model for studying genetic
variation in terms of restriction endonucleases. Proc. Natl. Acad.
Sci. USA 76, 5269–5273.
Panthee D. R., Pantalone V. R., West D. R., Saxton A. M. and
Sams C. E. 2005 Quantitative trait loci for seed protein and oil
concentration, and seed size in soybean. Crop Sci. 45, 2015–
2022.
Peakall R. and Smouse P. E. 2012 GenAlEx 6.5: genetic analysis in
Excel. Population genetic software for teaching and research-an
update. Bioinformatics 28, 2537–2539.
Pritchard J. K., Stephens M. and Donnelly P. 2000 Inference of
population structure using multilocus genotype data. Genetics
155, 945–959.
Rosenberg N., Pritchard J. K., Weber J. L., Cann H. and Kidd
K. 2002 Genetic structure of human populations. Science 298,
2381–2385.
SAS 2009 SAS Statistical Analysis Software for Windows 9.1.3.
Cary, USA.
See D., Kanazin V., Kephart K. and Blake T. 2002 Mapping genes
controlling variation in barley grain protein concentration. Crop
Sci. 42, 680–685.
Singh M., Singh T. H. and Chahal G. S. 1985 Genetic analysis
of some seed quality characters in Upland cotton (Gossypium
hirsutum L.) Theor. Appl. Genet. 71, 126–128.
Sneath P. H. A. and Sokal R. R. 1973 Numerical taxonomy: The
principals and practice of numerical classification, pp. 573.
Freeman, San Francisco, USA.
Song X. and Tian-Zhen Zhang 2007 Identification of quantitative
trait loci controlling seed physical and nutrient traits in cotton.
Seed Sci. Res. 17, 243–251.
Sun S. K., Chen J. H., Xian S. K. and Wei S. J. 1987 Study on the
nutritional quality of cotton seeds. Sci. Agric. Sin. 5, 12–16.
Tan Y. F., Sun M., Xing Y. Z., Hua J. P., Sun X. L., Zhang Q. F. and
Corke H. 2001 Mapping quantitative trait loci for milling quality,
protein content and color characteristics of rice using a recombinant inbred line population derived from an elite rice hybrid.
Theor. Appl. Genet. 103, 1037–1045.
Tar’an B., Warkentin T., Somers D. J., Miranda D., Vandenberg A.,
Blade and Bing D 2004 Identification of quantitative trait loci for
grain yield, seed protein concentration and maturity in field pea
(Pisum sativum L.) Euphytica 136, 297–306.
Vos P., Hogers R., Bleeker M., Reijans M., Van de Lee T., Hornes
M. et al. 1995 AFLP: A new technique for DNA fingerprinting.
Nucl. Acids Res. 23, 4407–4414.
Wallace T. P., Bowman D., Campbell B. T., Chee P., Gutierrez O.
A., Kohel R. J. et al. 2009 Status of the USA cotton germplasm
collection and crop vulnerability. Genet. Resour. Crop Evol. 56,
507–532.
Weir B. S. 1996 Genetic data analysis II. Sinauer Associates,
Sunderland, USA.
Wright S. 1951 The genetical structure of populations. Ann. Eugen.
15, 323–354.
Wu J., Jenkins J. N., McCarty J. C. and Thaxton P. 2009
Seed trait associations with Gossypium barbadense L. chromosomes/arms in a G. hirsutum L. background. Euphytica 167, 371–
380.
Ye Z. H., Lu Z. Z. and Zhu J. 2003 Genetic analysis
for developmental behavior of some seed quality traits in
Upland cotton (Gossypum hirsutum L.). Euphytica 129, 183–
191.
Yu J., Pressoir G., Briggs W. H., Vroh B. I., Yamasaki M., Doebley
J. F. et al. 2006 A unified mixed-model method for association
mapping that accounts for multiple levels of relatedness. Nat.
Genet. 38, 203–208.
Yu J., Zhang K., Li S., Yu S., Zhai H., Wu M. et al. 2012 Mapping
quantitative trait loci for lint yield and fibre quality across environments in a Gossypium hirsutum × Gossypium barbadense
backcross inbred line population. Theor. Appl. Genet. 126, 275–
287.
Zeng L., Meredith W. R. Jr, Gutiirrez O. A. and Boykin D. L.
2009 Identification of associations between SSR markers and
fibre traits in an exotic germplasm derived from multiple crosses
among Gossypium tetraploid species. Theor. Appl. Genet. 119(1),
93–103.
Zhao J. Y., Becker H. C., Zhang D. Q., Zhang Y. F. and Ecke
W. 2006 Conditional QTL mapping of oil content in rapeseed with respect to protein content and traits related to plant
development and grain yield. Theor. Appl. Genet. 113, 33–
38.
Received 3 February 2014, in revised form 30 September 2014; accepted 17 October 2014
Unedited version published online: 20 October 2014
Final version published online: 12 March 2015
94
Journal of Genetics, Vol. 94, No. 1, March 2015