
Rabbi et al. 2014 Virus Research

G Model VIRUS-96176; No. of Pages 10
ARTICLE IN PRESS
Virus Research xxx (2014) xxx–xxx
Contents lists available at ScienceDirect
Virus Research
journal homepage: www.elsevier.com/locate/virusres

High-resolution mapping of resistance to cassava mosaic geminiviruses in cassava using genotyping-by-sequencing and its implications for breeding ☆

Ismail Y. Rabbi a,∗, Martha T. Hamblin b, P. Lava Kumar a, Melaku A. Gedil a, Andrew S. Ikpan a, Jean-Luc Jannink b,c, Peter A. Kulakow a

a International Institute for Tropical Agriculture (IITA), Ibadan, Nigeria
b Department of Plant Breeding and Genetics, Cornell University, Ithaca, NY, USA
c USDA-ARS, R.W. Holley Center for Agriculture and Health, Ithaca, NY, USA

Article info
Article history: Available online xxx
Keywords: Cassava mosaic disease; Breeding; Phenotyping; Monogenic resistance; Genotyping-by-sequencing; QTL

Abstract
Cassava mosaic disease (CMD), caused by different species of cassava mosaic geminiviruses (CMGs), is the most important disease of cassava in Africa and the Indian sub-continent. The cultivated cassava species is protected from CMD by polygenic resistance introgressed from the wild species Manihot glaziovii and a dominant monogenic type of resistance, named CMD2, discovered in African landraces. The ability of the monogenic resistance to confer high levels of resistance in different genetic backgrounds has led recently to its extensive usage in breeding across Africa as well as pre-emptive breeding in Latin America. However, most of the landraces carrying the monogenic resistance are morphologically very similar and come from a geographically restricted area of West Africa, raising the possibility that the diversity of the single-gene resistance could be very limited, or even located at a single locus. Several mapping studies, employing bulk segregant analysis, in different genetic backgrounds have reported additional molecular markers linked to supposedly new resistance genes.
However, it is not possible to tell if these are indeed new genes in the absence of an adequate genetic map framework or allelism tests. To address this important question, a high-density single nucleotide polymorphism (SNP) map of cassava was developed through genotyping-by-sequencing a bi-parental mapping population (N = 180) that segregates for the dominant monogenic resistance to CMD. Virus screening using PCR showed that CMD symptoms and presence of virus were strongly correlated (r = 0.98). Genome-wide scan and high-resolution composite interval mapping using 6756 SNPs uncovered a single locus with large effect (R² = 0.74). Projection of the previously published resistance-linked microsatellite markers showed that they co-occurred in the same chromosomal location surrounding the presently mapped resistance locus. Moreover, their relative distance to the mapped resistance locus correlated with the reported degree of linkage with the resistance phenotype. Cluster analysis of the landraces first shown to have this type of resistance revealed that they are very closely related, if not identical. These findings suggest that there is a single source of monogenic resistance in the crop's genepool tracing back to a common ancestral clone. In the absence of further resistance diversification, the long-term effectiveness of the single gene resistance is known to be precarious, given the potential to be overcome by CMGs due to their fast-paced evolutionary rate. However, combining the quantitative with the qualitative type of resistance may ensure that this resistance gene continues to protect cassava, a crop on which millions of people in Africa depend, against the devastating onslaught of CMGs.
© 2013 The Authors. Published by Elsevier B.V. All rights reserved.

☆ This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial-No Derivative Works License, which permits non-commercial use, distribution, and reproduction in any medium, provided the original author and source are credited.
∗ Corresponding author at: IITA, Headquarters & West Africa Hub, PMB 5320, Oyo Road, Ibadan 200001, Oyo State, Nigeria. Tel.: +234 2 7517472x2719; fax: +44 208 7113785.
0168-1702/$ – see front matter © 2013 The Authors. Published by Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.virusres.2013.12.028
Please cite this article in press as: Rabbi, I.Y., et al., High-resolution mapping of resistance to cassava mosaic geminiviruses in cassava using genotyping-by-sequencing and its implications for breeding. Virus Res. (2014), http://dx.doi.org/10.1016/j.virusres.2013.12.028

1. Introduction

1.1. Cassava mosaic disease

Cassava (Manihot esculenta Crantz, family Euphorbiaceae) is a starchy root crop that supplies carbohydrate energy to millions of people in the tropics (Ceballos et al., 2004) and it is being used increasingly as an industrial crop (Jansson et al., 2009). Though its remarkable ability to tolerate unfavourable conditions such as drought and poor soils makes it a food security crop in many parts of sub-Saharan Africa (SSA), on-farm productivity of cassava has remained stagnant for many years due to several production constraints. Cassava mosaic disease (CMD), caused by several species of cassava mosaic geminiviruses (CMGs), is the most economically important constraint to cassava in SSA and the Indian sub-continent (Herrera-Campo et al., 2011). Though significant efforts have been expended on combating this disease, it still causes huge losses to production.
The most striking example of the devastating potential of CMD to undermine food security in Africa is the severe pandemic that started as an epidemic in Uganda in the 1990s and led farmers to abandon the crop in many parts of the country (Otim-Nape and Thresh, 1998), and later spread to most countries in East and Central Africa (Legg and Fauquet, 2004). The pandemic is characterized by rapid spread through super-abundant Bemisia tabaci vectors (Legg and Ogwal, 1998) and is associated with a recombinant strain of the East African cassava mosaic virus – Uganda (EACMV-UG) along with African cassava mosaic virus (ACMV), both belonging to the genus Begomovirus within the family Geminiviridae (Harrison et al., 1997).

Important control measures against CMD include rogueing of symptomatic plants, use of virus-free planting materials and deployment of resistant varieties. The first two options are not only labour intensive and difficult to implement but also require continuous and long-term intervention. Use of resistant varieties is the most effective solution in mitigating the negative effect of CMD in farmers' fields because this approach not only reduces yield losses due to the disease, but also reduces levels of the virus inoculum in the farming system, particularly in varieties that suppress virus accumulation.

1.2. Sources of resistance to the disease

Currently deployed resistance against CMD in Africa is of two types: (i) quantitative resistance derived from Manihot glaziovii; and (ii) qualitative resistance conferred by single resistance gene(s). The quantitative resistance was introgressed into cultivated cassava following an unsuccessful worldwide search for resistant clones in the 1930s (Nichols, 1947; Jennings, 1976). Genetic studies reveal that the polygenic resistance from M. glaziovii is recessive with a heritability of about 60% (Jennings, 1976).
The second type of resistance, which is conditioned by a single-gene with a dominant effect, was discovered in the 1980s in landraces from Nigeria and other West African countries (Akano et al., 2002; Fregene et al., 2001). These landraces, which display near-immunity against nearly all species of CMGs, are currently maintained in the IITA germplasm collection referred to as the Tropical Manihot esculenta (TMe) series. Diversity studies using molecular markers have previously shown that most of the original landraces bearing this qualitative resistance to CMD are genetically very similar if not identical (Fregene et al., 2000; Lokko et al., 2006). This suggests that the genetic base of this type of resistance in the African cassava genepool may be narrow, or even just a single locus. In contrast, the relative ease with which the highly heritable monogenic resistance can be transferred between germplasm through simple crosses, has resulted in its extensive usage in breeding across Africa as well as pre-emptive breeding in Latin America (Okogbenin et al., 2007). The long-term stability of this single-gene type of resistance in diverse geographical regions with heterogeneous species and recombinants of CMGs is uncertain given the high evolutionary rate of geminiviruses (Duffy and Holmes, 2009). Several genetic mapping studies have been conducted to find molecular markers linked to the qualitative resistance in the African germplasm. The first study identified two markers, a microsatellite (SSRY28) and an RFLP (GY1) that flank a single locus named CMD2 at distances of 9 and 8 cM, respectively (Akano et al., 2002). Subsequent to the discovery of the CMD2 locus, later studies have reported several additional markers that are linked to new resistance genes in other genetic backgrounds, including landraces and improved varieties derived from them (Lokko et al., 2005; Okogbenin et al., 2012). 
However, nearly all of these studies relied on the bulk segregant analysis (BSA) approach and/or very sparse maps for interval mapping analysis. The BSA approach provides little or no information regarding the chromosomal location of the identified markers, making it difficult to ascertain the number of unique loci/genes associated with a trait. When sparse maps are used, the confidence interval surrounding a QTL is usually large, making it difficult to determine the precise QTL location. For example, the CMD2-containing linkage group of Akano et al. (2002) had a total of five markers. Lokko et al. (2005) used a linkage map with just 45 markers, of which only three were in the linkage group containing the resistance locus. The objective of this study was to provide a comprehensive framework for describing the breadth of the genetic base of the single-gene resistance to CMD in the African cassava germplasm. Firstly, a full-sib mapping population segregating for qualitative resistance to CMD was developed and phenotyped for three growing seasons. The population was genotyped using genotypeby-sequencing (GBS), and a dense genetic linkage map with more than 8000 single nucleotide polymorphism (SNPs) was constructed. Using this resource, a high-resolution genetic mapping of the CMD resistance locus was carried out. The markers previously reported to be linked to CMD resistance were then projected onto the newly generated genetic map. This revealed their genomic locations, and the spatial relationship between them and the mapped resistance locus from the present study. To confirm the relationship among the CMD resistant landraces, cluster analysis was carried out using genome-wide SNP markers. 2. Materials and methods 2.1. Mapping population development, phenotyping and genotyping A full-sib mapping population segregating for dominant monogenic resistance to CMD was generated by crossing two non-inbred clones. Both parents are elite lines developed by IITA in Nigeria. 
The female parent, IITA-TMS-011412, is highly resistant to CMD and also rich in pro-vitamin A. Cloned in 1974, the male parent, IITA-TMS-4(2)1425, is an improved variety from a cross between a landrace from Nigeria (TME109, locally known as Oyarugbafunfun) and variety 58308, a hybrid derived directly from recombination of the M. glaziovii × M. esculenta triple-backcrosses (Jennings, 1976). Variety 58308 was the main source of the quantitative resistance to CMD and produced some of the first generation Tropical Manihot Selection lines (see the discussion section for more background). IITA-TMS-4(2)1425 shows considerable susceptibility to CMD (Fig. 1).

The 180 F1 seeds produced were germinated in sterilized garden soil and transplanted one month after sowing. At maturity, the seedlings were cloned, regardless of whether they were infected or not, and planted at Ibadan, Nigeria (7.40° North latitude, 3.90° East longitude) using a randomized complete block design. Each clone was planted in two replicated plots of five stands per plot with plant spacing of 1 m × 0.5 m for three 12-month long cropping seasons established in 2011, 2012 and 2013. Generation-to-generation propagation through cloning was based on use of 12-cm long stem cuttings in both the infected and non-infected F1s. A local landrace that is highly susceptible to CMD, TME117, was planted as a spreader row every fifth plot and as a border row surrounding the experimental field to facilitate whitefly-mediated inoculation of the F1 population. The Ibadan site is known for high CMD pressure, and the planting period coincides with high whitefly activity, providing a high probability of natural exposure of the plant population to CMD inoculum.

Fig. 1. CMD symptom on the mapping population parents, (a) IITA-TMS-011412 and (b) IITA-TMS-4(2)1425; (c) The frequency distribution of CMD scores using the BLUPs calculated from the three-year data.

Individual plants were evaluated for CMD symptoms at one and three months after planting using a scale ranging from 1 to 5, where 1 denotes symptomless plants and 5 the most severe symptoms (severe mosaic and distortion of leaves). The entire population was screened for presence of CMGs, particularly for the presence of ACMV and EACMCV, the two predominant species prevailing in West Africa, to confirm virus infection in the infected plants. The third fully expanded leaf from the top was sampled from each of the 5 plants per plot; the samples were pooled, and DNA was extracted and analyzed for ACMV and EACMCV using a multiplex PCR protocol (Alabi et al., 2008).

2.2. Genotyping-by-sequencing

DNA was extracted from 180 F1 individuals and the two parents using a modified Dellaporta method (Dellaporta et al., 1983). Genotyping-by-sequencing as described by Elshire et al. (2011) was carried out at the Institute of Genomic Diversity, Cornell University. Briefly, DNAs from the F1 individuals and parents were digested individually with ApeKI restriction enzyme, which recognizes a five base-pair sequence (GCWGC, where W is either A or T). This enzyme was chosen because of its partial sensitivity to DNA methylation, which helps to avoid repetitive regions, and because of its frequency of DNA cutting (Elshire et al., 2011). Two 95-plex GBS sequencing libraries were prepared by ligating the digested DNA to unique nucleotide adapters (barcodes) followed by standard PCR. Sequencing was performed using Illumina HiSeq2000.
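To make the ApeKI recognition sequence concrete, the following minimal Python sketch lists every occurrence of the GCWGC motif (W = A or T) in a DNA string. It is an illustration only: the function name and the toy sequence are hypothetical, and the sketch reports motif positions without modelling methylation sensitivity or the exact cut position within the site.

```python
import re

# ApeKI recognition sequence as stated in the text: GCWGC, where W is A or T.
# A lookahead is used so that overlapping motifs are also reported.
APEKI_SITE = re.compile(r"(?=GC[AT]GC)")

def find_apeki_sites(sequence: str) -> list[int]:
    """Return 0-based start positions of every GCWGC motif in the sequence."""
    return [match.start() for match in APEKI_SITE.finditer(sequence.upper())]

# Hypothetical toy sequence, not taken from the cassava genome.
toy_sequence = "ttGCAGCaaGCTGCGCGGCccGCAGCTGC"
sites = find_apeki_sites(toy_sequence)
print(sites)
```

In a real GBS workflow the fragments between neighbouring sites, not the sites themselves, are what gets barcoded and sequenced; the sketch only shows how the degenerate base W expands to two concrete motifs.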
The sequencing reads from different genotypes were de-convoluted using the barcodes and aligned to version 4.1 of the cassava reference genome (www.phytozome.org/cassava) using the Burrows-Wheeler Aligner (Li and Durbin, 2009). SNPs were extracted using the GBS pipeline implemented in TASSEL software (Bradbury et al., 2007), and genotypes were called using a custom R script.

2.3. Data analysis

The pseudo-testcross linkage mapping strategy that is employed in the analysis of full-sib mapping populations requires unambiguous scoring of the parental genotypes at each marker. To ensure this, the parental DNAs were sequenced redundantly four times and their Illumina reads were pooled to increase the number and accuracy of the called SNPs. Following alignment of the reads against the reference genome, the SNPs that segregated in the parents as ab × ab (both parents heterozygous), aa × ab (male parent heterozygous), and ab × aa (female parent heterozygous) were extracted. Prior to linkage analysis, standard quality control was used to filter out SNPs from paralogous sequences (i.e. loci which appear as heterozygous in both parents and all progenies). Also filtered were loci showing significant deviation from expected genotypic frequencies based on chi-square test (threshold for removal: P ≤ 0.05) as well as those with missing information in more than 20% of the genotyped individuals in the mapping population.

2.3.1. Mapping of GBS-derived ApeKI SNPs

Genetic linkage maps were constructed using JoinMap version 4.1 (Van Ooijen, 2006). Following calculation of pair-wise recombination frequencies, linkage groups were identified using the logarithm of odds (LOD) score of independence between pairs of loci at a threshold of 10. Due to the large number of markers per linkage group, the maximum-likelihood mapping algorithm implemented in JoinMap 4.1 was used to find the order of the markers in the linkage groups.
This method is better suited to large datasets than the regression mapping method (Cheema and Dicks, 2009) and incorporates several numerical methods: simulated annealing for estimating the best map order by minimizing the sum of recombination frequencies in adjacent segments; Gibbs sampling for estimation of multipoint recombination frequency, given the current map order; and spatial sampling of loci to reduce the influence of unknown or dominant genotypes as well as potential errors. Simulated annealing was carried out using a chain length of 30,000 with an acceptance probability threshold of 0.25. Gibbs sampling for maximum likelihood estimation of multipoint recombination frequencies (Jansen et al., 2001) was done using a chain length of 50,000 after a burn-in length of 20,000.

2.3.2. Phenotypic data analysis

Because the categorical disease severity scores fitted a bi-modal distribution, for statistical analyses (ANOVA and QTL mapping) the trait was converted to a binary variable (either resistant or susceptible). Individuals with a categorical CMD severity score larger than one were classified as Affected; all others were classified as Unaffected. A logistic regression model, fitted as a generalized linear model, was used to estimate the effects of genotype, replication, environment and genotype-by-environment interaction as follows:

$y_{ijkl} = \mu + \beta_i + R_{ij} + G_k + \beta_i * G_k + e_{ijkl}$

where $y_{ijkl}$ is the phenotype; $\mu$ the mean; $\beta_i$ the year effect; $R_{ij}$ the replication effect; $G_k$ the clone effect; $\beta_i * G_k$ the interaction between clone and year; and $e_{ijkl}$ the residual. A mixed model was used to obtain best linear unbiased predictors (BLUPs) for each genotype for the combined three-year data. The mixed model was computed using the R package lme4 (Vazquez et al., 2010), considering the effects of the genotypes as random, while replications within environments were regarded as fixed because trials were carried out for three years. Broad-sense heritability for CMD resistance at one and three months after planting was calculated using the formula

$H^2 = \sigma_g^2 / (\sigma_g^2 + \sigma_e^2)$

where $\sigma_g^2$ and $\sigma_e^2$ are the variance components for the genotype effect and the residual error, respectively, based on individual plants. Correlations were calculated among disease resistance score BLUPs for three growing seasons (2011, 2012 and 2013).

2.3.3. High-resolution mapping of the CMD resistance locus

Mapping of the CMD resistance locus in the present population was carried out with the BLUPs obtained for each year, and across the combined analysis of the 2011, 2012 and 2013 data. QTL analysis was performed using three complementary approaches. Firstly, because of the high marker density, single marker-trait association for all 6756 SNPs was carried out. The markers were considered as fixed effects in a linear model implemented in the GLM function of TASSEL (Bradbury et al., 2007). The genome-wide significance threshold for the F-statistic was determined by the Bonferroni method (Bland and Altman, 1995). Secondly, standard interval mapping (interval step of 2.5 cM) was carried out using the regression mapping function "scanone" implemented in the R/qtl package (Broman et al., 2003). The genome-wide significance (α = 1%) for declaring a significant QTL locus was determined using 1000 permutations.
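The permutation approach to a genome-wide threshold can be sketched as follows. This is a simplified illustration, not the R/qtl procedure: it swaps in the squared Pearson correlation between each marker and the phenotype as the test statistic (R/qtl works with LOD scores), and all function names, the toy data and the seed are hypothetical. The idea it shows is the same: shuffle the phenotypes, record the largest statistic over all markers, and take the (1 − α) quantile of those maxima.

```python
import random

def r_squared(x, y):
    """Squared Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)

def permutation_threshold(markers, phenotype, n_perm=1000, alpha=0.01, seed=0):
    """(1 - alpha) quantile of the genome-wide maximum statistic under permutation."""
    rng = random.Random(seed)
    shuffled = list(phenotype)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any marker-phenotype association
        maxima.append(max(r_squared(m, shuffled) for m in markers))
    maxima.sort()
    index = min(int((1 - alpha) * n_perm), n_perm - 1)
    return maxima[index]

# Hypothetical toy data: 20 random biallelic markers scored on 60 individuals.
toy_rng = random.Random(1)
toy_phenotype = [toy_rng.random() for _ in range(60)]
toy_markers = [[toy_rng.randint(0, 1) for _ in range(60)] for _ in range(20)]
toy_threshold = permutation_threshold(toy_markers, toy_phenotype, n_perm=200)
```

Because the maximum is taken over all markers within each permutation, the resulting threshold accounts for multiple testing across the genome without assuming that markers are independent.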
The 95% Bayesian credible interval for the CMD resistance locus was determined using the function "bayesint" implemented in R/qtl. The proportion of phenotypic variance explained by the resistance locus was obtained by fitting a linear model. Thirdly, QTL analysis using the Composite Interval Mapping (CIM) method was carried out with the number of marker covariates set to three. The number of markers for use as co-factors was determined using the automatic co-factor selection function (stepwiseqtl) implemented in R/qtl. The CIM method reduces residual variation and thereby increases the resolution of the QTL location, and with it the possibility of detecting any additional genomic regions that underlie resistance to CMD.

Anchoring CMD-resistance-linked markers in the present high-density genetic map

The high-density SNP map developed in this study was used to anchor published loci associated with qualitative resistance to CMD (Table 1). Primer sequences that flank five of these markers (viz. SSRY28, NS198, SSRY106, NS158 and NS169) were used in BLAST searches of the cassava reference genome sequence (www.phytozome.org/cassava). Marker positions were interpolated onto the genetic map on the basis of the scaffolds harbouring them (Table 1). To obtain a linkage/recombination profile of SNPs along the linkage group that bears the CMD resistance locus, pairwise estimates of linkage disequilibrium (r²; Flint-Garcia et al., 2003) were calculated for the SNPs from the entire mapping population using the software package Haploview v. 3.31 (Barrett et al., 2005) and plotted in a matrix form.

2.3.4. Genetic relatedness of the CMD-resistant landraces

In addition to the mapping analysis in the bi-parental population, the genetic relatedness was examined among the TME clones that were originally identified to be sources of the resistance to CMGs.
A total of 2069 GBS markers from 34 clones, including 29 landraces and five TMS clones with the quantitative resistance (as an out-group), were obtained from a previous study (Ly et al., 2013), and used to perform hierarchical clustering using the Euclidean distance between the genotypes. These distances were used to construct a relationship dendrogram of the clones. 3. Results 3.1. Segregation for resistance to CMD in the mapping population The frequency distribution of the CMD severity scores in the mapping population revealed a bi-modal pattern with two peaks (Fig. 1): nearly half of the progenies and the female parent (IITATMS-011412) were resistant to CMD and showed no symptoms while the remainder of the F1 individuals showed disease symptoms ranging from mild (score 2) to severe (score 5). Resistance to CMD found in the female parent is therefore likely to be a single gene with dominant effect. There was very little variation within plots with respect to CMD symptom expression: all stands either showed similar symptoms in infected susceptible plots or no symptoms at all in the resistant plots. The consistency in symptom expression is largely due to the clonal origin from infected cuttings which ensures transmission of viruses across cropping cycles. Moreover, there was very little year-to-year variation in terms of CMD incidences: The disease ratings in the 2011, 2012 and 2013 growing seasons were highly correlated (Pearson’s correlation coefficient, r > 0.91). This was reflected in the large broad-sense heritability of the resistance trait as measured at one and three months after planting (H2 of 0.89 and 0.93, respectively). Analysis of variance using the logistic regression model showed a highly significant effect for clone (p = <2E−16), while the other factors such as environment (year), clone × environment interaction and replication were not significant. Table 1 Summary of the known markers tagging CMD resistance in cassava and their linkage group. 
Marker | Primer sequence | Linkage group | Scaffold (v4.1) | Study | Resistance source
S5214 780931 | GBS-SNP | 16 | scaffold05214 | Present study | IITA-TMS-011412
S5214 30911 | GBS-SNP | 16 | Scaffold05214 | Rabbi et al. (in press) | IITA-TMS-961089A
SSRY28 (CMD2) | Fw:TTGACATGAGTGATATTTTCTTGAG Rev:GCTGCGTGCAAAACTAAAAT | 16 | scaffold05214 | Akano et al. (2002), Lokko et al. (2005), Okogbenin et al. (2012) | TME3; TME7; IITA-TMS-972205
SSR NS158 | Fw:GTGCGAAATGGAAATCAATG Rev:TGAAATAGTGATACATGCAAAAGGA | 16 | scaffold06906 | Okogbenin et al. (2007) | TME3
SSR NS169 | Fw:GTGCGAAATGGAAATCAATG Rev:GCCTTCTCAGCATATGGAGC | 16 | scaffold06906 | Okogbenin et al. (2007) | TME3
RFLP RME-1 | Fw:ATGTTAATGTAATGAAAGAGC Rev:AGAAGAGGGTAGGAGTTATGT | 16 | No match | Okogbenin et al. (2007) | TME3
SSR NS198 | Fw:TGCAGCATATCAGGCATTTC Rev:TGGAAGCATGCATCAAATGT | 16 | Scaffold04175 | Okogbenin et al. (2012) | IITA-TMS-972205
SSRY106 | Fw:GGAAACTGCTTGCACAAAGA Rev:CAGCAAGACCATCACCAGTTT | 16 | scaffold07933 | Lokko et al. (2005) | TME7

3.2. Screening of the population for cassava mosaic geminiviruses using PCR

The PCR-based screening of the mapping population detected one or more of the CMGs in 87 of 180 plots assayed, while 93 plots were negative (Table 2). Only two species of CMGs, ACMV and EACMCV, were detected in the trial, which is consistent with known CMG prevalence in West Africa. ACMV was detected in all the 87 virus-positive plots, whereas EACMCV was detected as a coinfection with ACMV in 16 plots (17% incidence). No case of single infection by EACMCV was detected. CMGs were detected in 79 of 81 symptomatic plots, indicating a strong positive correlation between the visual scoring of disease and the PCR results. In addition, PCR also detected occurrence of ACMV in 7 asymptomatic plots and ACMV and EACMCV in one plot. Mean severity of CMD symptoms was 3.1 in plants infected with ACMV alone and 3.2 in plants infected with both ACMV and EACMCV, which suggests an apparent lack of a synergistic effect on symptom severity in dually infected plants.

Table 2 Screening results of the mapping population for the presence of ACMV and EACMV.
Disease status | Virus not detected | ACMV only | ACMV + EACMV
Symptomatic | 1 | 65 | 15
Asymptomatic | 92 | 6 | 1

3.3. Genotyping of SNP markers and construction of a dense genetic map

In all, 17,682 SNPs were obtained from the SNP calling pipeline. The SNP data were subsequently filtered for markers with more than 20% missing values across the genotyped individuals. Also removed were loci that deviated from the expected genotypic frequencies at Chi-square significance threshold of P < 0.05. Linkage analysis was done using 8704 SNPs that passed these two QC filters.

A high-density genetic linkage map was constructed using the Maximum-Likelihood approach implemented in JoinMap 4.1 (Table 3). A total of 6756 SNP markers were mapped across 19 linkage groups with between 115 and 559 SNPs (average = 256). With an average inter-SNP distance of 0.52 cM, this is the densest map developed for cassava so far (Fig. 2). Despite the high density of the GBS-derived SNPs mapped, several regions without markers were observed. Most notable were a single region on linkage group 4 and two regions in linkage group 18. A possible reason for such gaps could be lack of polymorphic markers as a result of identity-by-descent of this region in the two parents.

Fig. 2. Overview of genetic map developed from the 6756 ApeKI-derived SNPs (axes: chromosome and location in cM).
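The two QC filters described above (missing-data rate and chi-square test for segregation distortion) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: it handles only testcross-type SNPs expected to segregate 1:1 (aa × ab or ab × aa), keeps a SNP when P is greater than 0.05, and uses hypothetical function names and genotype coding; the authors' custom scripts are not described in the text.

```python
import math

def chi2_pvalue_1to1(n_a: int, n_b: int) -> float:
    """P-value of a 1-df chi-square test against an expected 1:1 ratio."""
    total = n_a + n_b
    if total == 0:
        return 1.0
    expected = total / 2
    chi2 = ((n_a - expected) ** 2 + (n_b - expected) ** 2) / expected
    # Survival function of the chi-square distribution with 1 df.
    return math.erfc(math.sqrt(chi2 / 2))

def passes_qc(genotypes, max_missing=0.20, p_threshold=0.05) -> bool:
    """genotypes: list of 'aa', 'ab' or None (missing call) for one SNP."""
    n = len(genotypes)
    if n == 0:
        return False
    called = [g for g in genotypes if g is not None]
    if (n - len(called)) / n > max_missing:
        return False  # more than 20% missing calls
    p_value = chi2_pvalue_1to1(called.count("aa"), called.count("ab"))
    return p_value > p_threshold  # drop SNPs with distorted segregation

# Hypothetical SNPs scored on 180 F1 individuals.
balanced = ["aa"] * 90 + ["ab"] * 90
distorted = ["aa"] * 130 + ["ab"] * 50
sparse = ["aa"] * 70 + ["ab"] * 70 + [None] * 40
print(passes_qc(balanced), passes_qc(distorted), passes_qc(sparse))
```

SNPs heterozygous in both parents (ab × ab) would need a 2-df test against a 1:2:1 ratio instead; that case is left out to keep the sketch short.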
Most of the 12,977 scaffolds that constitute the current 533 Mb of the version 4.1 cassava genome assembly are fairly small; nearly 95% of them are 200 kb or smaller. A total of 1093 unique scaffolds were anchored in the present map, with between 19 and 89 scaffolds per linkage group. Despite their relatively small number, the anchored scaffolds covered a total of 313.3 Mb and accounted for 58.7% of the current cassava genome assembly. The complete genetic map developed from this work is available in Supplementary Table 1.

3.4. High-resolution mapping of CMD resistance locus

Based on the qualitative nature of the resistance to CMD in the present mapping population, a single locus was expected to underlie the resistance phenotype. The high-density genetic map developed with 6756 GBS SNPs permitted a genome-wide search for the locus underlying the qualitative resistance to CMD. A strong association was detected in linkage group 16 that peaked at around 69.12 cM (Fig. 3). This association decreases on both sides of the peak as a result of increasing recombination between the markers and the underlying resistance gene. Most SNPs showing highly significant association (P < 1E−40) came from the 1.46 Mb-long scaffold 5214, with marker S5214 780931 at the peak; this marker explained 74% of the disease resistance variance. Additionally, only those SNPs that are informative in the CMD-resistant female parent show significant associations, while those segregating only in the susceptible male parent do not (Fig. 3). The presently mapped resistance locus occurs in the vicinity of the previously mapped CMD2 locus (Akano et al., 2002); marker S5214 780931 is just 623.24 kb away from a microsatellite marker, SSRY28 (located between 157,470 bp and 157,616 bp), that was first reported to be closely linked to the CMD2 locus (Akano et al., 2002), indicating that the same gene may account for the observed resistance to CMD in the present mapping population.

The results from the interval-mapping based QTL analysis recapitulated those from single-marker trait associations, and uncovered a single peak with a maximum LOD value of 43 on linkage group 16 (Supplementary Fig. 1). Despite employing the Composite Interval Mapping method, no additional peaks exceeding the significance threshold were detected, confirming that only a single locus conferred the qualitative resistance in the present population. The SNP marker S5214 780931, which is flanked by S5214 472282 and S5214 1084049, was the closest to the QTL peak (Table 4). The approximate 95% Bayesian credible interval for the mapped locus spans 68.8–70.74 cM along linkage group 16, irrespective of the scoring time (one or three months after planting) or growing season (2011, 2012 or 2013). The profiles of LOD scores along linkage group 16 were very similar for the disease scores recorded at one and three months after planting as well as for the three seasons of data, supporting the high heritability observed for this trait. Comparison of the phenotypes of the F1s against the genotypes of the resistance-linked marker S5214 780931 showed a small proportion of recombinants that carry the SNP allele linked to resistance but show susceptibility to the disease, and vice versa.

Fig. 3. Single-marker association with qualitative resistance to CMD. (a) Genome-wide P-values for 6756 SNPs across 19 linkage groups, showing that the strongest association signal was located in linkage group 16. The x-axis shows the SNPs along each chromosome; the y-axis is the −log10(P-value) for the association. (b) An association plot for linkage group 16 showing SNPs that are informative in the resistant (IITA-TMS-011412) and susceptible (IITA-TMS-4(2)1425) parents. The SNPs from scaffold 5214 with the strongest association to the resistance phenotype are highlighted.

3.5. Genomic localization of markers flanking previously mapped qualitative resistance against CMD

A major objective of the present study was to use the high-density SNP genetic map to anchor seven molecular markers previously reported to be linked to single dominant gene resistance to CMD in four other genetic backgrounds (Table 1). The SSR and RFLP markers used in those studies were therefore anchored in the present map via their harbouring scaffolds. These markers came from scaffold 05214 (SSRY28), scaffold 06906 (NS158 and NS169), scaffold 04175 (NS198) and scaffold 07933 (SSRY106), all of which fall within the same genomic region of linkage group 16 of the present map (Fig. 4). The primer sequences for the RFLP marker RME-1 did not identify a suitable match in the reference genome.

Table 3
Summary statistics of the genetic linkage map developed from ApeKI SNP markers.

Linkage group   Number of SNPs   Length (cM)   Average distance (cM)   Number of scaffolds anchored   Cumulative scaffold size (bp)
1               473              242           0.51                    66                             30,226,735
2               344              195           0.57                    52                             20,822,503
2.2             366              175           0.48                    59                             20,298,754
3               419              168           0.40                    55                             12,608,655
4               255              194           0.76                    57                             13,843,915
5               543              230           0.42                    61                             15,175,445
6               275              182           0.67                    73                             15,911,788
7               559              256           0.46                    66                             21,949,504
8               454              222           0.49                    78                             18,054,518
9               207              72            0.35                    19                             6,148,875
10              299              148           0.50                    55                             15,132,961
11              304              203           0.67                    41                             13,797,633
12              115              99            0.87                    21                             6,111,492
13              302              200           0.67                    60                             17,048,413
14              543              242           0.45                    89                             19,282,711
15              451              225           0.50                    49                             18,831,996
16              281              152           0.54                    64                             16,048,872
17              275              156           0.57                    70                             17,372,231
18              291              154           0.53                    58                             14,665,500
Total           6756             256 a         0.52 a                  1093                           313,332,501

a Average values per linkage group.
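The totals in Table 3 and the genome coverage quoted in the text can be re-derived from the per-linkage-group rows. The sketch below is an illustrative Python check and is not part of the original analysis; the names `TABLE3`, `ASSEMBLY_SIZE_BP` and `summarize` are introduced here for illustration only. The assembly size is the rounded 533 Mb quoted in the text, so the computed coverage should land close to, but not necessarily exactly on, the reported 58.7%.

```python
# Rows transcribed from Table 3:
# (linkage group, number of SNPs, length in cM, scaffolds anchored, scaffold size in bp)
TABLE3 = [
    ("1", 473, 242, 66, 30_226_735),
    ("2", 344, 195, 52, 20_822_503),
    ("2.2", 366, 175, 59, 20_298_754),
    ("3", 419, 168, 55, 12_608_655),
    ("4", 255, 194, 57, 13_843_915),
    ("5", 543, 230, 61, 15_175_445),
    ("6", 275, 182, 73, 15_911_788),
    ("7", 559, 256, 66, 21_949_504),
    ("8", 454, 222, 78, 18_054_518),
    ("9", 207, 72, 19, 6_148_875),
    ("10", 299, 148, 55, 15_132_961),
    ("11", 304, 203, 41, 13_797_633),
    ("12", 115, 99, 21, 6_111_492),
    ("13", 302, 200, 60, 17_048_413),
    ("14", 543, 242, 89, 19_282_711),
    ("15", 451, 225, 49, 18_831_996),
    ("16", 281, 152, 64, 16_048_872),
    ("17", 275, 156, 70, 17_372_231),
    ("18", 291, 154, 58, 14_665_500),
]

# Assumption: the "533 Mb" assembly size from the text, which is a rounded value.
ASSEMBLY_SIZE_BP = 533_000_000


def summarize(rows, assembly_bp=ASSEMBLY_SIZE_BP):
    """Aggregate the per-linkage-group rows into map-wide totals."""
    total_snps = sum(r[1] for r in rows)
    total_cm = sum(r[2] for r in rows)
    total_scaffolds = sum(r[3] for r in rows)
    total_bp = sum(r[4] for r in rows)
    return {
        "snps": total_snps,
        "scaffolds": total_scaffolds,
        "anchored_bp": total_bp,
        "coverage_pct": 100.0 * total_bp / assembly_bp,
        # Map-wide average spacing between adjacent SNPs, in cM.
        "mean_spacing_cm": total_cm / total_snps,
    }


SUMMARY = summarize(TABLE3)

if __name__ == "__main__":
    for key, value in SUMMARY.items():
        print(f"{key}: {value}")
```

Running the script prints the aggregated values, which can be compared by eye with the "Total" row of Table 3 and with the coverage stated in the text.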
In a parallel study using another bi-parental mapping population, derived from a cross between another improved variety that is nearly immune to CMD (IITA-TMS-961089A) and a susceptible landrace (TME117), another SNP marker was identified from the same scaffold (S5214 30911). It was strongly associated with the disease resistance and explained 60% of phenotypic variation (Rabbi et al., in press). The reported percentage of variation explained by these markers shows a gradient that peaks around scaffold 05214, the region that is likely to contain the resistance gene concerned (Fig. 4). Markers farther from this region were reported to be less tightly linked to the resistance gene, as reflected in the lower percentage of variation that they explain; this trend agrees with the genome-wide association results, particularly for the markers segregating from the female parent (Fig. 3).

Table 4
Markers flanking the presently mapped CMD resistance locus calculated from disease severity scored at one and three months after planting. The table also presents the linkage group, position and interval mapping-based percentage of phenotype variation explained by the closest marker.

Trait     SNP             Linkage group   Position (cM)   Peak LOD a   R2      H2
CMD1S b   S5214 1084049   16              68.88           45.37        0.708   0.892
          S5214 780931    16              70.00           45.59
          S5214 472282    16              70.74           44.99
CMD3S b   S5214 1084049   16              68.88           42.98        0.696   0.928
          S5214 780931    16              69.31           43.20
          S5214 472282    16              70.74           42.33

a Logarithm of odds score for presence of QTL.
b Values calculated using the three-year BLUPs; R2 = percentage variation explained by QTL; H2 = heritability using the three-year data.

Fig. 4. Graphical display of the variation in linkage disequilibrium (r2) along linkage group 16, calculated for every pair of SNPs in the bi-parental mapping population (left), and the genetic map (right). The location of the mapped CMD resistance locus (underlined SNP, viz. S5214 780931) in scaffold 05214 is shown on the right, relative to the other scaffolds (S7933, S4175 and S6906) containing microsatellite markers reported to be linked to resistance to CMD. The percentage of phenotypic variation explained by these markers (PVE, measured by r2) is also presented. Dark shading corresponds to stronger LD (higher r2). Names of other SNPs in the linkage group were omitted due to space constraints but are available in Supplementary Table 1.

3.6. Recombination pattern in linkage group 16

To visualize the degree of linkage and the recombination pattern between the presently mapped CMD resistance locus and other previously published SSR markers along the linkage group, the chromosome-wide pattern of LD on linkage group 16 was examined (Fig. 4). The resulting haplotype map provides a framework for interpreting the results of the previous mapping studies, particularly the proportion of phenotypic variation explained by the various microsatellite markers and their relative locations in the present map. Though different parents were used in the present and previous mapping studies, these populations have all undergone a single round of meiosis and are thus expected to have a similar extent of linkage disequilibrium across their chromosomes. The region bearing the CMD resistance locus was characterized by two large haplotype blocks (dark-grey shading between 0 and 32 cM and between 36 and 62 cM, respectively). Moderate LD was detected between these blocks (light-grey shading).
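The pairwise r2 values summarized in Fig. 4 can be illustrated with a small calculation. The sketch below is a hedged Python example, not the procedure used for the figure (this section does not state which software or estimator produced it): it approximates LD as the squared Pearson correlation between allele dosages (0, 1 or 2), a common shortcut when genotypes are unphased. The SNP names and genotype vectors in `TOY` are hypothetical and serve only to exercise the functions.

```python
from itertools import combinations
from math import fsum


def genotype_r2(snp_a, snp_b):
    """Squared Pearson correlation between two SNPs coded as allele dosages.

    Individuals with a missing call (None) at either SNP are skipped.
    Returns NaN if fewer than two individuals remain or a SNP is monomorphic.
    """
    pairs = [(a, b) for a, b in zip(snp_a, snp_b) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return float("nan")
    mean_a = fsum(a for a, _ in pairs) / n
    mean_b = fsum(b for _, b in pairs) / n
    cov = fsum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = fsum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = fsum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        return float("nan")
    return cov * cov / (var_a * var_b)


def pairwise_r2(genotypes):
    """Compute r2 for every unordered pair of SNPs in a {name: dosages} mapping."""
    return {
        (x, y): genotype_r2(genotypes[x], genotypes[y])
        for x, y in combinations(genotypes, 2)
    }


# Hypothetical toy data: three SNPs scored in eight individuals.
TOY = {
    "snp_1": [0, 1, 1, 0, 1, 0, 1, 0],
    "snp_2": [0, 1, 1, 0, 1, 0, 1, 1],  # differs from snp_1 in one individual
    "snp_3": [0, 0, 1, 1, 0, 0, 1, 1],  # uncorrelated with snp_1
}
LD = pairwise_r2(TOY)

if __name__ == "__main__":
    for pair, value in LD.items():
        print(pair, round(value, 3))
```

In this toy set, a single discordant individual is enough to pull r2 between `snp_1` and `snp_2` well below 1, which mirrors how recombinants erode LD with distance from the resistance locus.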
The first block encompasses scaffold 4175, which harbours microsatellite marker NS198, reported by Okogbenin et al. (2012) to explain 11% of variation in CMD resistance in a bi-parental population. SNPs from this scaffold and those near the CMD2 locus in scaffold 05214 also show moderate LD (r2 ∼ 0.10). The CMD resistance locus occurs in the second LD block. Scaffold 5214 harbours the SNPs that were strongly associated with the resistance as well as the microsatellite marker SSRY28 reported previously (Akano et al., 2002). Though discovered in different genetic backgrounds, the resistance-linked SNPs and SSRY28 explain between 60% and 70% of the disease resistance variance. These markers are just 623.24 kb apart. Another scaffold in this block (07933), which harbours microsatellite marker SSRY106, was reported to explain nearly 40% of variation in CMD resistance (Lokko et al., 2005).

3.7. Genetic relatedness of CMD-resistant landraces using genome-wide SNPs

In addition to genetic mapping of the bi-parental population, cluster analysis was performed with key landraces known to possess strong resistance to CMD (Fig. 5). To estimate the “residual genetic distance” between identical clones resulting from genotyping error, several clones (some of which have different names as a result of adoption in different regions) were redundantly genotyped. These pairs include the male parent (IITA-TMS-4(2)1425) and “Kibandameno-white”; IITA-TMS-30572, known as “Migyera” in Uganda; and TME12 (A and B). Most of the CMD-resistant landraces, including those that were first discovered in Nigeria, are clearly very closely related, or perhaps even identical, based on comparison of the distance between them and the residual distance between the redundantly genotyped clones, confirming previous studies using AFLPs (Fregene et al., 2000). These landraces (TME3, TME4, TME6, TME7, TME11, TME12 and TME14) have a very similar morphological appearance, the most prominent feature of which is red petioles.

[Fig. 5 plot: cluster dendrogram (hclust, complete linkage); vertical axis: Height. Leaf labels in plotted order: IITA_TMS_30572, IITA_TMS_011412, S_TME1, S_TME203, S_TME2, R_TME204, R_TME199, R_TME225, R_TME419, S_TME510, S_TME450, R_TME282, S_TME399, S_TME379, R_TME5, R_TME4, R_TME6, R_TME14, R_TME7, R_TME12, R_TME3, R_TME11, S_TME693, S_TME237, S_TME279, R_TME9, S_TME117, S_TME778, R_TME8, S_TME207, R_TME13, KIBANDAMENO_WHITE, IITA_TMS_4.2.1425.]

Fig. 5. A hierarchical clustering tree based on dissimilarities between a selection of landraces using 2069 SNP markers. Most of the lower TME series landraces that are resistant to CMD form a single, tight cluster (bottom). The prefix “R” denotes CMD-resistant and “S” denotes CMD-susceptible varieties.

4. Discussion

4.1. A historical perspective of development and diffusion of early CMD resistant varieties across Africa

Breeding for resistance to CMD has been a major goal of cassava improvement programmes in Africa for more than 70 years. Prior to the discovery of the single-gene resistance (Akano et al., 2002), the primary defence against the disease was the polygenic resistance introgressed into cultivated cassava (M. esculenta) from M. glaziovii after three cycles of backcrossing (Nichols, 1947). Examining the breeding history of improved varieties with the quantitative resistance to CMD reveals a very narrow genetic base tracing back to a single derivative of the M. glaziovii × M. esculenta crosses (Jennings, 2003). None of these descendants is immune to infection by CMGs (Nichols, 1947), although some express mild and sometimes transient symptoms as a result of incomplete systemic infection that leads to reversion of symptoms (Jennings, 1976; Fargette et al., 1994), while others are quite susceptible to the disease. These TMS varieties that trace back to M. glaziovii have been widely adopted in Africa and helped revive cassava farming following the devastation of many local landraces in East Africa (Legg and Fauquet, 2004). Some of the adopted TMS varieties are IITA-TMS-60142, IITA-TMS-30337 and IITA-TMS-30572. Since the 1990s, IITA has been making crosses to combine the single dominant gene (CMD2) with the multigenic M. glaziovii resistance and has produced clones with near immunity to CMD (Legg and Fauquet, 2004).

4.2. Mapping resolution of the CMD2 locus

Genetic mapping of qualitative resistance to CMD in this study uncovered a single locus on linkage group 16 with a large peak LOD (>40). The peak marker at this locus, S5214 780931, explained 74% of the phenotypic variance and is co-located with SSRY28 on scaffold 5214, indicating that the dominant monogenic resistance gene from the female parent is likely to be the CMD2 locus of Akano et al. (2002). The dense genetic map obtained from GBS has enabled a higher level of mapping resolution of the CMD resistance locus. Linkage group 16 has a total of 281 SNPs, and the approximate 95% Bayesian credible interval around the mapped locus is just 1.1 cM. This resolution was not obtainable using the traditional markers from previous mapping efforts.

4.3. All markers linked to qualitative resistance occur in the same chromosome region

The high-density SNP genetic map was used to anchor markers previously reported to be linked to the dominant monogenic resistance to CMD. These markers (Table 1) were interpolated in the present map via the scaffolds that harbour them.
All of them (except RME1 and RME4, for which suitable matches in the current reference genome were not found) came from scaffolds occurring in the same region of linkage group 16. Overall, there is a general congruence between the proportion of genetic variation reported for these microsatellite markers, the marker-trait association profile in linkage group 16 (Fig. 3) and the pattern of linkage disequilibrium (measured as r2) between microsatellite locations and the putative position of the resistance locus in the present population (Fig. 4). This pattern supports the idea that most of the microsatellite markers reported to be linked to varying degrees with CMD resistance are associated with a single resistance locus that is most likely the CMD2 gene. Moreover, the strength of the linkage decreased with increasing distance from the gene location. In a parallel mapping study using another improved variety that is nearly immune to CMD (IITA-TMS-961089A) and a susceptible landrace (TME117), a strong QTL signal was found in scaffold 05214 that also explained 60% of resistance variation, implicating the same CMD2 locus (Rabbi et al., in press). Considering the results of the present study and those of Akano et al. (2002), Okogbenin et al. (2012), and Lokko et al. (2005), it is highly likely that there is a single gene, or a cluster of resistance genes (Michelmore and Meyers, 1998), underlying the qualitative resistance. Indeed, the conservation of QTL positions among the different sources of CMD resistance is not surprising, given that the majority of the highly resistant cultivars developed recently in Africa trace to a few landrace accessions from Nigeria that were used over the last 12 or so years of resistance breeding, in which the merits of the TME materials were appreciated. The cluster analysis (Fig. 5) confirmed that several landraces that were first discovered to be nearly immune to CMD were genetically very similar. These findings agree with a previous study (Fregene et al., 2000).
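The comparison described in Section 3.7, in which distances between landraces are judged against the "residual" distance between redundantly genotyped clones, can be sketched as follows. This is an illustrative Python example under stated assumptions, not the authors' pipeline: the distance used here is a simple identity-by-state measure on allele dosages, the threshold rule (largest distance seen among known replicate pairs) is one possible choice, and all clone names and genotypes in `TOY` are hypothetical.

```python
from itertools import combinations


def ibs_distance(g1, g2):
    """Mean absolute dosage difference, scaled to [0, 1], over sites called in both samples."""
    shared = [(a, b) for a, b in zip(g1, g2) if a is not None and b is not None]
    if not shared:
        raise ValueError("no sites genotyped in both samples")
    return sum(abs(a - b) for a, b in shared) / (2 * len(shared))


def flag_putative_duplicates(genotypes, replicate_pairs):
    """Flag pairs that are at least as close as the most distant known replicate pair.

    The threshold is an assumption of this sketch: it treats the largest
    distance between redundantly genotyped clones as the genotyping-error floor.
    """
    threshold = max(ibs_distance(genotypes[a], genotypes[b]) for a, b in replicate_pairs)
    close = [
        (a, b)
        for a, b in combinations(genotypes, 2)
        if ibs_distance(genotypes[a], genotypes[b]) <= threshold
    ]
    return threshold, close


# Hypothetical toy genotypes (allele dosages; None marks a missing call).
TOY = {
    "clone_A": [0, 1, 2, 1, 0, 2, 1, 1, 0, 2],
    "clone_A_rep": [0, 1, 2, 1, 0, 2, 1, None, 0, 1],  # same clone, genotyped twice
    "clone_B": [0, 1, 2, 1, 0, 2, 1, 1, 1, 2],
    "clone_C": [2, 0, 0, 1, 2, 0, 1, 0, 2, 0],
}
THRESHOLD, CLOSE = flag_putative_duplicates(TOY, [("clone_A", "clone_A_rep")])

if __name__ == "__main__":
    print("residual distance threshold:", THRESHOLD)
    print("pairs within threshold:", CLOSE)
```

In this toy set, `clone_B` falls within the replicate-derived threshold of `clone_A` and would be treated as a putative duplicate, whereas `clone_C` would not. With real data, several replicate pairs would normally be used to set the threshold.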
Other duplicates, which are also morphologically very similar, were identified. These landraces were collected from South West and Central Nigeria and have different local names. The resistance in the landrace TME9, which occurs in a separate clade on the phylogenetic tree, is also likely to be CMD2. This landrace is found in the pedigree of IITA-TMS-961089A, which was used in a parallel mapping study; resistance-linked markers in that study also came from scaffold 05214 (Rabbi et al., in press). It is postulated that all of these landraces, which come from West Africa, trace back to a single CMD-resistant mutant that was selected by farmers and rapidly diffused in the region through varietal dissemination efforts.

4.4. Screening of cassava mosaic geminiviruses in the mapping population

The CMG screening results showed that ACMV was the predominant species in the field, whereas EACMV occurred at lower frequency and only with ACMV as a dual infection. The preponderance of ACMV reflects the known proportions of the two viruses in West Africa and contrasts with the predominance of EACMV-UG in the pandemic regions of Eastern Africa (Legg and Fauquet, 2004). The severe form of EACMV-UG has displaced other CMG species and strains during the recent pandemic that swept through the region (Legg et al., 2006). An important question is whether the CMD2 type of resistance available in the West African cassava germplasm can protect against these more virulent strains of CMG that do not contribute to the disease pressure in West Africa. Germplasm screening in Uganda has demonstrated that the qualitative resistance locus in the TME landraces is highly effective against this virus (Kawuki et al., 2011). Similar results have been obtained through biolistic inoculation methods using pseudo-recombinants of ACMV and EACMV (Sserubombwe et al., 2008). Some of the screened clones include TME5 and TME14, which cluster together with TME3, the original source of the CMD2 locus (Fig. 5). The evidence suggests that the CMD2 locus is a monogenic resistance with broad specificity against cassava-infecting geminiviruses.

4.5. Prospects for long-term efficacy of CMD2 resistance gene

Despite the apparent broad specificity of the CMD2 gene, whether the locus can confer durable resistance according to the classical definition (Johnson, 1984) depends on whether it will remain effective over a prolonged period of widespread use under conditions conducive to the disease. It is well known that monogenic resistance can be ephemeral and subject to boom-and-bust cycles (McDonald and Linde, 2002) as a result of rapid genetic evolution in the pathogen, though there is evidence of durable resistance from single-gene action (Johnson, 1984; Stuthman et al., 2007). Since its discovery in the 1980s and subsequent extensive use in breeding programmes in sub-Saharan Africa, there has been no report of breakdown of the CMD2 type of resistance in the field, indicating that the locus could be durable against CMGs. Still, the long-term effectiveness of the resistance locus needs to be augmented by combining it with the quantitative resistance derived from M. glaziovii. Indeed, crosses combining these two types of resistance have given rise to progeny that are nearly immune to all known CMG species, including the virulent EACMV-Uganda (Legg and Fauquet, 2004; Monde et al., 2012). Additionally, more effort is needed for a comprehensive screening of the resistant landraces to identify any additional sources that may exist. Though increased resolution was achieved in the mapping analysis, a larger mapping population is required for fine-mapping and cloning of the locus concerned. This will provide insights into the mechanism of the resistance, which so far seems to be highly effective against various CMG species.

Acknowledgments

This research was supported by the CGIAR Research Programme on Roots, Tubers and Bananas and the “Next Generation Cassava Breeding Project”, through funding from the Bill & Melinda Gates Foundation and the Department for International Development of the United Kingdom. We acknowledge the help of Charlotte Acharya for preparation of GBS libraries and sequencing, and Oluwafemi Alaba for sample preparation and DNA extraction.

Appendix A. Supplementary data

Supplementary material related to this article can be found, in the online version, at http://dx.doi.org/10.1016/j.virusres.2013.12.028.

References

Akano, A., Dixon, A., Mba, C., Barrera, E., Fregene, M., 2002. Genetic mapping of a dominant gene conferring resistance to cassava mosaic disease. Theor. Appl. Genet. 105, 521–525. Alabi, O.J., Kumar, P.L., Naidu, R.A., 2008. Multiplex PCR method for the detection of African cassava mosaic virus and East African cassava mosaic Cameroon virus in cassava. J. Virol. Methods 154, 111–120. Barrett, J.C., Fry, B., Maller, J., Daly, M.J., 2005. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 21, 263–265. Bland, J.M., Altman, D.G., 1995. Multiple significance tests: the Bonferroni method. Br. Med. J. 310, 170. Bradbury, P.J., Zhang, Z., Kroon, D.E., Casstevens, T.M., Ramdoss, Y., Buckler, E.S., 2007. TASSEL: software for association mapping of complex traits in diverse samples. Bioinformatics 23, 2633–2635.
Broman, K.W., Wu, H., Sen, Ś., Churchill, G.A., 2003. R/qtl: QTL mapping in experimental crosses. Bioinformatics 19, 889–890. Ceballos, H., Iglesias, C.A., Pérez, J.C., Dixon, A.G., 2004. Cassava breeding: opportunities and challenges. Plant Mol. Biol. 56, 503–516. Cheema, J., Dicks, J., 2009. Computational approaches and software tools for genetic linkage map estimation in plants. Brief. Bioinform. 10, 595–608. Dellaporta, S.L., Wood, J., Hicks, J.B., 1983. A plant DNA minipreparation: version II. Plant Mol. Biol. Rep. 1, 19–21. Duffy, S., Holmes, E.C., 2009. Validation of high rates of nucleotide substitution in geminiviruses: phylogenetic evidence from East African cassava mosaic viruses. J. Gen. Virol. 90, 1539–1547. Elshire, R.J., Glaubitz, J.C., Sun, Q., Poland, J.A., Kawamoto, K., Buckler, E.S., Mitchell, S.E., 2011. A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. PLoS One 6, e19379. Fargette, D., Jeger, M., Fauquet, C., Fishpool, L.D.C., 1994. Analysis of temporal disease progress of African cassava mosaic virus. Phytopathology 84, 91–98. Flint-Garcia, S.A., Thornsberry, J.M., Buckler, I.V.E.S., 2003. Structure of linkage disequilibrium in plants. Annu. Rev. Plant Biol. 54, 357–374. Fregene, M., Bernal, A., Duque, M., Dixon, A., Tohme, J., 2000. AFLP analysis of African cassava (Manihot esculenta Crantz) germplasm resistant to the cassava mosaic disease (CMD). Theor. Appl. Genet. 100, 678–685. Fregene, M., Okogbenin, E., Mba, C., Angel, F., Suarez, M.C., Janneth, G., et al., 2001. Genome mapping in cassava improvement: challenges, achievements and opportunities. Euphytica 120, 159–165. Harrison, B.D., Zhou, X., Otim-Nape, G.W., Liu, Y., Robinson, D.J., 1997. Role of a novel type of double infection in the geminivirus-induced epidemic of severe cassava mosaic in Uganda. Ann. Appl. Biol. 131, 437–448. Herrera-Campo, B.V., Hyman, G., Belloti, A., 2011. 
Threats to cassava production: known and potential geographic distribution of four key biotic constraints. Food Sec. 3, 329–345. Jansson, C., Westerbergh, A., Zhang, J., Hu, X., Sun, C., 2009. Cassava, a potential biofuel crop in (the) People’s Republic of China. Appl. Energy 86, S95–S99. Jennings, D.L., 1976. Breeding for Resistance to African Cassava Mosaic Disease: Progress and Prospects. In: Interdisciplinary Workshop. IDRC, Muguga (Kenya). Jennings, D.L., 2003. Historical perspective on breeding for resistance to cassava Brown Streak Virus Disease. In: Hillocks, R.J. (Ed.), Cassava Brown Streak Virus Disease: Past, Present, and Future. Proceedings of an International Workshop. Mombasa, Kenya, 27–30 October 2002. Natural Resources International Limited, Aylesford, UK, p. 100. Jansen, J., de Jong, A.G., van Ooijen, J.W., 2001. Constructing dense genetic linkage maps. Theor. Appl. Genet. 102, 1113–1122. Johnson, R., 1984. A critical analysis of durable resistance. Annu. Rev. Phytopathol. 22, 309–330. Kawuki, R.S., Pariyo, A., Amuge, T., Nuwamanya, E., Ssemakula, G., Tumwesigye, S., Bua, A., Baguma, Y., Omongo, C., Alicai, T., Orone, J., 2011. A breeding scheme for local adoption of cassava (Manihot esculenta Crantz). J. Plant Breed. Crop Sci. 3, 120–130. Legg, J.P., Fauquet, C.M., 2004. Cassava mosaic geminiviruses in Africa. Plant Mol. Biol. 56, 585–599. Legg, J.P., Ogwal, S., 1998. Changes in the incidence of African cassava mosaic geminivirus and the abundance of its whitefly vector along south–north transects in Uganda. J. Appl. Entomol. 122, 169–178. Legg, J.P., Owor, B., Sseruwagi, P., Ndunguru, J., 2006. Cassava mosaic virus disease in East and Central Africa: epidemiology and management of a regional pandemic. Adv. Virus Res. 67, 355–418. Li, H., Durbin, R., 2009. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760.
Lokko, Y., Danquah, E.Y., Offei, S.K., Dixon, A.G.O., Gedil, M.A., 2005. Molecular markers associated with a new source of resistance to the cassava mosaic disease. Afr. J. Biotechnol. 4, 873–881. Lokko, Y., Dixon, A., Offei, S., Danquah, E., Fregene, M., 2006. Assessment of genetic diversity among African cassava Manihot esculenta Crantz accessions resistant to the cassava mosaic virus disease using SSR markers. Genet. Resour. Crop Evol. 53, 1441–1453. Ly, D., Hamblin, M., Rabbi, I., Melaku, G., Bakare, M., Gauch, H.G., et al., 2013. Relatedness and genotype × environment interaction affect prediction accuracies in genomic selection: a study in Cassava. Crop Sci. 53, 1312–1325. McDonald, B.A., Linde, C.C., 2002. Pathogen population genetics, evolutionary potential, and durable resistance. Annu. Rev. Phytopathol. 40, 349–379. Michelmore, R.W., Meyers, B.C., 1998. Clusters of resistance genes in plants evolve by divergent selection and a birth-and-death process. Genome Res. 8, 1113–1130. Monde, G., Walangululu, J., Bragard, C., 2012. Screening cassava for resistance to cassava mosaic disease by grafting and whitefly inoculation. Arch. Phytopathol. Pfl. 45, 2189–2201. Nichols, R.F.W., 1947. Breeding cassava for virus resistance. East Afr. Agric. J. 12, 184–194. Okogbenin, E., Egesi, C.N., Olasanmi, B., Ogundapo, O., Kahya, S., Hurtado, P., et al., 2012. Molecular marker analysis and validation of resistance to cassava mosaic disease in elite cassava genotypes in Nigeria. Crop Sci. 52, 2576–2586. Okogbenin, E., Porto, M.C.M., Egesi, C., Mba, C., Espinosa, E., Santos, L.G., et al., 2007. Marker-assisted introgression of resistance to cassava mosaic disease into Latin American germplasm for the genetic improvement of cassava in Africa. Crop Sci. 47, 1895–1904. Otim-Nape, G.W., Thresh, J.M., 1998. The current pandemic of cassava mosaic virus disease in Uganda. In: Jones, G. (Ed.), The Epidemiology of Plant Diseases. Kluwer, Dordrecht, Netherlands, pp. 423–443. Rabbi, I.Y., Hamblin, M., Gedil, M., Kulakow, P., Ferguson, M., Ikpan, A.S., Ly, D., Jannink, J.-L. Genetic mapping using genotyping-by-sequencing in the clonally-propagated cassava. Crop Sci. (in press). Sserubombwe, W.S., Briddon, R.W., Baguma, Y.K., Ssemakula, G.N., Bull, S.E., Bua, A., Alicai, T., Omongo, C., Otim-Nape, G.W., Stanley, J., 2008. Diversity of begomoviruses associated with mosaic disease of cultivated cassava (Manihot esculenta Crantz) and its wild relative (Manihot glaziovii Müll. Arg.) in Uganda. J. Gen. Virol. 89, 1759–1769. Stuthman, D.D., Leonard, K.J., Miller-Garvin, J., 2007. Breeding crops for durable resistances. In: Sparks, D.L. (Ed.), Advances in Agronomy, vol. 95. Elsevier, Amsterdam, pp. 319–367. Van Ooijen, J.W., 2006. JoinMap 4. Software for the Calculation of Genetic Linkage Maps in Experimental Populations. Kyazma BV, Wageningen, Netherlands. Vazquez, A.I., Bates, D.M., Rosa, G.J.M., Gianola, D., Weigel, K.A., 2010. Technical note: an R package for fitting generalized linear mixed models in animal breeding. J. Anim. Sci. 88, 497–504.
Virus Research xxx (2014) xxx–xxx

High-resolution mapping of resistance to cassava mosaic geminiviruses in cassava using genotyping-by-sequencing and its implications for breeding☆

Ismail Y. Rabbi a,∗, Martha T. Hamblin b, P. Lava Kumar a, Melaku A. Gedil a, Andrew S. Ikpan a, Jean-Luc Jannink b,c, Peter A. Kulakow a

a International Institute for Tropical Agriculture (IITA), Ibadan, Nigeria
b Department of Plant Breeding and Genetics, Cornell University, Ithaca, NY, USA
c USDA-ARS, R.W. Holley Center for Agriculture and Health, Ithaca, NY, USA

Article info

Article history: Available online xxx

Keywords: Cassava mosaic disease; Breeding; Phenotyping; Monogenic resistance; Genotyping-by-sequencing; QTL

Abstract

Cassava mosaic disease (CMD), caused by different species of cassava mosaic geminiviruses (CMGs), is the most important disease of cassava in Africa and the Indian sub-continent. The cultivated cassava species is protected from CMD by polygenic resistance introgressed from the wild species Manihot glaziovii and a dominant monogenic type of resistance, named CMD2, discovered in African landraces. The ability of the monogenic resistance to confer high levels of resistance in different genetic backgrounds has led recently to its extensive usage in breeding across Africa as well as pre-emptive breeding in Latin America. However, most of the landraces carrying the monogenic resistance are morphologically very similar and come from a geographically restricted area of West Africa, raising the possibility that the diversity of the single-gene resistance could be very limited, or even located at a single locus. Several mapping studies, employing bulk segregant analysis, in different genetic backgrounds have reported additional molecular markers linked to supposedly new resistance genes. However, it is not possible to tell if these are indeed new genes in the absence of an adequate genetic map framework or allelism tests. To address this important question, a high-density single nucleotide polymorphism (SNP) map of cassava was developed through genotyping-by-sequencing of a bi-parental mapping population (N = 180) that segregates for the dominant monogenic resistance to CMD. Virus screening using PCR showed that CMD symptoms and presence of virus were strongly correlated (r = 0.98). Genome-wide scan and high-resolution composite interval mapping using 6756 SNPs uncovered a single locus with large effect (R2 = 0.74). Projection of the previously published resistance-linked microsatellite markers showed that they co-occurred in the same chromosomal location surrounding the presently mapped resistance locus. Moreover, their relative distance to the mapped resistance locus correlated with the reported degree of linkage with the resistance phenotype. Cluster analysis of the landraces first shown to have this type of resistance revealed that they are very closely related, if not identical. These findings suggest that there is a single source of monogenic resistance in the crop’s genepool tracing back to a common ancestral clone. In the absence of further resistance diversification, the long-term effectiveness of the single-gene resistance is known to be precarious, given the potential for it to be overcome by CMGs due to their fast-paced evolutionary rate. However, combining the quantitative with the qualitative type of resistance may ensure that this resistance gene continues to protect cassava, a crop on which millions of people in Africa depend, against the devastating onslaught of CMGs.

© 2013 The Authors. Published by Elsevier B.V. All rights reserved.

☆ This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial-No Derivative Works License, which permits non-commercial use, distribution, and reproduction in any medium, provided the original author and source are credited.
∗ Corresponding author at: IITA, Headquarters & West Africa Hub, PMB 5320, Oyo Road, Ibadan 200001, Oyo State, Nigeria. Tel.: +234 2 7517472x2719; fax: +44 208 7113785.
0168-1702/$ – see front matter © 2013 The Authors. Published by Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.virusres.2013.12.028

1. Introduction

1.1. Cassava mosaic disease

Cassava (Manihot esculenta Crantz, family Euphorbiaceae) is a starchy root crop that supplies carbohydrate energy to millions of people in the tropics (Ceballos et al., 2004), and it is being used increasingly as an industrial crop (Jansson et al., 2009). Though its remarkable ability to tolerate unfavourable conditions such as drought and poor soils makes it a food security crop in many parts of sub-Saharan Africa (SSA), on-farm productivity of cassava has remained stagnant for many years due to several production constraints. Cassava mosaic disease (CMD), caused by several species of cassava mosaic geminiviruses (CMGs), is the most economically important constraint to cassava in SSA and the Indian sub-continent (Herrera-Campo et al., 2011). Though significant efforts have been expended on combating this disease, it still causes huge losses to production.
The most striking example of the devastating potential of CMD to undermine food security in Africa is the severe pandemic that started as an epidemic in Uganda in the 1990s and led farmers to abandon the crop in many parts of the country (Otim-Nape and Thresh, 1998), and later spread to most countries in East and Central Africa (Legg and Fauquet, 2004). The pandemic is characterized by rapid spread through super-abundant Bemisia tabaci vectors (Legg and Ogwal, 1998) and is associated with a recombinant strain of the East African cassava mosaic virus – Uganda (EACMV-UG) along with African cassava mosaic virus (ACMV) belonging to the genus Begomovirus, within the Geminiviridae family (Harrison et al., 1997).

Important control measures against CMD include rogueing of symptomatic plants, use of virus-free planting materials and deployment of resistant varieties. The first two options are not only labour intensive and difficult to implement but also require continuous and long-term intervention. Use of resistant varieties is the most effective solution in mitigating the negative effect of CMD in farmers’ fields because this approach not only reduces yield losses due to the disease, but also reduces levels of the virus inoculum in the farming system, particularly in varieties that suppress virus accumulation.

1.2. Sources of resistance to the disease

Currently deployed resistance against CMD in Africa is of two types: (i) quantitative resistance derived from Manihot glaziovii; and (ii) qualitative resistance conferred by a single resistance gene. The quantitative resistance was introgressed into cultivated cassava following an unsuccessful worldwide search for resistant clones in the 1930s (Nichols, 1947; Jennings, 1976). Genetic studies reveal that the polygenic resistance from M. glaziovii is recessive with a heritability of about 60% (Jennings, 1976).
The second type of resistance, which is conditioned by a single gene with a dominant effect, was discovered in the 1980s in landraces from Nigeria and other West African countries (Akano et al., 2002; Fregene et al., 2001). These landraces, which display near-immunity against nearly all species of CMGs, are currently maintained in the IITA germplasm collection referred to as the Tropical Manihot esculenta (TMe) series. Diversity studies using molecular markers have previously shown that most of the original landraces bearing this qualitative resistance to CMD are genetically very similar if not identical (Fregene et al., 2000; Lokko et al., 2006). This suggests that the genetic base of this type of resistance in the African cassava genepool may be narrow, or even just a single locus. Nevertheless, the relative ease with which the highly heritable monogenic resistance can be transferred between germplasm through simple crosses has resulted in its extensive usage in breeding across Africa as well as pre-emptive breeding in Latin America (Okogbenin et al., 2007). The long-term stability of this single-gene type of resistance in diverse geographical regions with heterogeneous species and recombinants of CMGs is uncertain given the high evolutionary rate of geminiviruses (Duffy and Holmes, 2009).

Several genetic mapping studies have been conducted to find molecular markers linked to the qualitative resistance in the African germplasm. The first study identified two markers, a microsatellite (SSRY28) and an RFLP (GY1), that flank a single locus named CMD2 at distances of 9 and 8 cM, respectively (Akano et al., 2002). Following the discovery of the CMD2 locus, later studies reported several additional markers said to be linked to new resistance genes in other genetic backgrounds, including landraces and improved varieties derived from them (Lokko et al., 2005; Okogbenin et al., 2012).
However, nearly all of these studies relied on the bulk segregant analysis (BSA) approach and/or very sparse maps for interval mapping analysis. The BSA approach provides little or no information regarding the chromosomal location of the identified markers, making it difficult to ascertain the number of unique loci/genes associated with a trait. When sparse maps are used, the confidence interval surrounding a QTL is usually large, making it difficult to determine the precise QTL location. For example, the CMD2-containing linkage group of Akano et al. (2002) had a total of five markers. Lokko et al. (2005) used a linkage map with just 45 markers, of which only three were in the linkage group containing the resistance locus.

The objective of this study was to provide a comprehensive framework for describing the breadth of the genetic base of the single-gene resistance to CMD in the African cassava germplasm. Firstly, a full-sib mapping population segregating for qualitative resistance to CMD was developed and phenotyped for three growing seasons. The population was genotyped using genotyping-by-sequencing (GBS), and a dense genetic linkage map with more than 8000 single nucleotide polymorphisms (SNPs) was constructed. Using this resource, high-resolution genetic mapping of the CMD resistance locus was carried out. The markers previously reported to be linked to CMD resistance were then projected onto the newly generated genetic map. This revealed their genomic locations, and the spatial relationship between them and the mapped resistance locus from the present study. To confirm the relationship among the CMD resistant landraces, cluster analysis was carried out using genome-wide SNP markers.

2. Materials and methods

2.1. Mapping population development, phenotyping and genotyping

A full-sib mapping population segregating for dominant monogenic resistance to CMD was generated by crossing two non-inbred clones. Both parents are elite lines developed by IITA in Nigeria.
The female parent, IITA-TMS-011412, is highly resistant to CMD and also rich in pro-vitamin A. Cloned in 1974, the male parent, IITA-TMS-4(2)1425, is an improved variety from a cross between a landrace from Nigeria (TME109, locally known as Oyarugbafunfun) and variety 58308, a hybrid derived directly from recombination of the M. glaziovii × M. esculenta triple-backcrosses (Jennings, 1976). Variety 58308 was the main source of the quantitative resistance to CMD and produced some of the first-generation Tropical Manihot Selection lines (see the discussion section for more background). IITA-TMS-4(2)1425 shows considerable susceptibility to CMD (Fig. 1).

The 180 F1 seeds produced were germinated in sterilized garden soil and transplanted one month after sowing. At maturity, the seedlings were cloned, regardless of whether they were infected or not, and planted at Ibadan, Nigeria (7.40° North latitude, 3.90° East longitude) using a randomized complete block design. Each clone was planted in two replicated plots of five stands per plot with plant spacing of 1 m × 0.5 m for three 12-month long cropping seasons established in 2011, 2012 and 2013. Generation-to-generation propagation through cloning was based on use of 12-cm long stem cuttings in both the infected and non-infected F1s. A local landrace that is highly susceptible to CMD, TME117, was planted as a spreader row every fifth plot and as a border row surrounding the experimental field to facilitate whitefly-mediated inoculation of the F1 population. The Ibadan site is known for high CMD pressure and the planting period coincides with high whitefly activity, providing a high probability of natural exposure of the plant population to CMD inoculum.

Fig. 1. CMD symptoms on the mapping population parents, (a) IITA-TMS-011412 and (b) IITA-TMS-4(2)1425; (c) the frequency distribution of CMD scores using the BLUPs calculated from the three-year data.

Individual plants were evaluated for CMD symptoms at one and three months after planting using a scale ranging from 1 to 5, where 1 denotes symptomless plants and 5 the most severe symptoms (severe mosaic and distortion of leaves). The entire population was screened for presence of CMGs, particularly for the presence of ACMV and EACMCV, the two predominant species prevailing in West Africa, to confirm virus infection in the infected plants. The third fully expanded leaf from the top was sampled from each of the 5 plants per plot; the samples were pooled, and DNA was extracted and analyzed for ACMV and EACMCV using a multiplex PCR protocol (Alabi et al., 2008).

2.2. Genotyping-by-sequencing

DNA was extracted from 180 F1 individuals and the two parents using a modified Dellaporta method (Dellaporta et al., 1983). Genotyping-by-sequencing as described by Elshire et al. (2011) was carried out at the Institute of Genomic Diversity, Cornell University. Briefly, DNAs from the F1 individuals and parents were digested individually with ApeKI restriction enzyme, which recognizes a five base-pair sequence (GCWGC, where W is either A or T). This enzyme was chosen because of its partial sensitivity to DNA methylation, which helps to avoid regions of repetitive elements, and because of its cutting frequency (Elshire et al., 2011). Two 95-plex GBS sequencing libraries were prepared by ligating the digested DNA to unique nucleotide adapters (barcodes) followed by standard PCR. Sequencing was performed using Illumina HiSeq2000.
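As an illustration of the ApeKI recognition sequence described above, the following minimal Python sketch scans a DNA string for the motif GCWGC (W = A or T). The toy sequence and the function names are hypothetical, and splitting at the motif start is a simplification of the enzyme's actual cut position; it is not part of the GBS protocol of Elshire et al. (2011).

```python
import re

# ApeKI recognition sequence from the text: GCWGC, where W is A or T.
# The lookahead lets overlapping motifs be reported as well.
APEKI_SITE = re.compile(r"(?=GC[AT]GC)")


def apeki_sites(sequence: str) -> list:
    """Return 0-based start positions of every GCWGC motif."""
    return [m.start() for m in APEKI_SITE.finditer(sequence.upper())]


def fragment_lengths(sequence: str) -> list:
    """Fragment lengths when the sequence is split at each motif start.

    Assumption: cutting at the motif start is a simplification used only
    to show how site density determines fragment size.
    """
    cuts = [0] + apeki_sites(sequence) + [len(sequence)]
    return [b - a for a, b in zip(cuts, cuts[1:]) if b > a]


toy = "AAGCAGCTTTTGCTGCAAGCGGCTT"  # hypothetical sequence
print(apeki_sites(toy))       # GCGGC is not matched because G is not W
print(fragment_lengths(toy))
```

Because the motif is only five bases long and tolerates two bases at the middle position, sites are expected to be frequent, which is consistent with the text's remark about cutting frequency.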
The sequencing reads from different genotypes were de-convoluted using the barcodes and aligned to version 4.1 of the cassava reference genome (www.phytozome.org/cassava) using the Burrows-Wheeler Aligner (BWA; Li and Durbin, 2009). SNPs were extracted using the GBS pipeline implemented in TASSEL software (Bradbury et al., 2007), and genotypes were called using a custom R script.

2.3. Data analysis

The pseudo-testcross linkage mapping strategy that is employed in the analysis of full-sib mapping populations requires unambiguous scoring of the parental genotypes at each marker. To ensure this, the parental DNAs were sequenced redundantly four times and their Illumina reads were pooled to increase the number and accuracy of the called SNPs. Following alignment of the reads against the reference genome, the SNPs that segregated in the parents as ab × ab (both parents heterozygous), aa × ab (male parent heterozygous), and ab × aa (female parent heterozygous) were extracted. Prior to linkage analysis, standard quality control was used to filter out SNPs from paralogous sequences (i.e. loci which appear as heterozygous in both parents and all progenies). Also filtered were loci showing significant deviation from expected genotypic frequencies based on a chi-square test (threshold for removal: P ≤ 0.05) as well as those with missing information in more than 20% of the genotyped individuals in the mapping population.

2.3.1. Mapping of GBS-derived ApeKI SNPs

Genetic linkage maps were constructed using JoinMap version 4.1 (Van Ooijen, 2006). Following calculation of pair-wise recombination frequencies, linkage groups were identified using the logarithm of odds (LOD) score of independence between pairs of loci at a threshold of 10. Due to the large number of markers per linkage group, the maximum-likelihood mapping algorithm implemented in JoinMap 4.1 was used to find the order of the markers in the linkage groups.
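The marker quality-control filters described in Section 2.3 (segregation type, chi-square test at P ≤ 0.05, and at most 20% missing data) can be sketched as follows. This is a minimal illustration and not the authors' custom R script: the function name, the genotype coding ("aa", "ab", "bb", None for missing), and the choice to treat impossible genotype calls as missing are all assumptions.

```python
from collections import Counter

# 5% chi-square critical values for 1 and 2 degrees of freedom.
CHI2_CRIT = {1: 3.841, 2: 5.991}

# Expected progeny genotype frequencies for the three segregation types
# named in the text, keyed by (female genotype, male genotype).
EXPECTED = {
    ("ab", "ab"): {"aa": 0.25, "ab": 0.50, "bb": 0.25},
    ("aa", "ab"): {"aa": 0.50, "ab": 0.50},
    ("ab", "aa"): {"aa": 0.50, "ab": 0.50},
}


def keep_snp(female, male, progeny, max_missing=0.20):
    """Return True if a SNP passes the segregation and missingness filters."""
    expected = EXPECTED.get((female, male))
    if expected is None or not progeny:
        return False  # not one of the three informative cross types
    # Assumption: calls that cannot arise from the cross count as missing.
    called = [g for g in progeny if g in expected]
    if 1 - len(called) / len(progeny) > max_missing:
        return False
    counts = Counter(called)
    n = len(called)
    chi2 = sum((counts.get(g, 0) - p * n) ** 2 / (p * n)
               for g, p in expected.items())
    # Remove loci with P <= 0.05, i.e. chi2 at or above the critical value.
    return chi2 < CHI2_CRIT[len(expected) - 1]


print(keep_snp("aa", "ab", ["aa"] * 50 + ["ab"] * 50))  # balanced 1:1 locus
print(keep_snp("ab", "ab", ["ab"] * 100))               # paralog-like locus
```

Note that a locus at which both parents and every progeny appear heterozygous also fails the chi-square test against the 1:2:1 expectation, so in this sketch the paralog filter and the segregation filter coincide.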
This method is suitable for dealing with large datasets compared to the regression mapping method (Cheema and Dicks, 2009) and incorporates several numerical methods: simulated annealing for estimating the best map order by minimizing the sum of recombination frequencies in adjacent segments; Gibbs sampling for estimation of multipoint recombination frequency, given the current map order; and spatial sampling of loci to reduce the influence of unknown or dominant genotypes as well as potential errors. Simulated annealing was carried out using a chain length of 30,000 with an acceptance probability threshold of 0.25. Gibbs sampling for maximum likelihood estimation of multipoint recombination frequencies (Jansen et al., 2001) was done using a chain length of 50,000 after a burn-in length of 20,000.

2.3.2. Phenotypic data analysis

Because the categorical disease severity scores fitted a bi-modal distribution, for statistical analyses (ANOVA and QTL mapping) the trait was converted to a binary variable (either resistant or susceptible). Individuals with categorical CMD severity score larger than one were classified as Affected; all others were classified as Unaffected. A logistic regression model using a generalized linear model was used to estimate the effect of the genotype, replication, environment and genotype-by-environment interaction as follows:

$$ y_{ijkl} = \mu + \beta_i + R_{ij} + G_k + \beta_i \ast G_k + e_{ijkl} $$

where $y_{ijkl}$ was the phenotype; $\mu$ the mean; $\beta_i$ the year effect; $R_{ij}$ the replication effect; $G_k$ the clone effect; $\beta_i \ast G_k$ the interaction between clone and year; and $e_{ijkl}$ the residual. A mixed model was used to obtain best linear unbiased predictors (BLUPs) for each genotype for the combined three-year data.
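To make the simulated-annealing step of the map-ordering procedure described above more concrete, the toy Python sketch below searches for the marker order that minimizes the sum of recombination fractions between adjacent markers. It is not JoinMap's implementation: the segment-reversal move, the geometric cooling schedule, the parameter values and the six hypothetical marker positions are all assumptions made for illustration.

```python
import math
import random


def map_cost(order, rec):
    """Sum of recombination fractions between adjacent markers in the order."""
    return sum(rec[a][b] for a, b in zip(order, order[1:]))


def anneal_order(rec, steps=20000, t_start=0.5, cooling=0.9995, seed=1):
    """Toy simulated annealing over marker orders (all settings are assumed)."""
    rng = random.Random(seed)
    order = list(range(len(rec)))
    rng.shuffle(order)
    cost = map_cost(order, rec)
    best, best_cost = order[:], cost
    temp = t_start
    for _ in range(steps):
        i, j = sorted(rng.sample(range(len(order)), 2))
        # Candidate move: reverse the segment between positions i and j.
        candidate = order[:i] + order[i:j + 1][::-1] + order[j + 1:]
        delta = map_cost(candidate, rec) - cost
        # Always accept improvements; accept worse orders with a probability
        # that shrinks as the temperature falls.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            order, cost = candidate, cost + delta
            if cost < best_cost:
                best, best_cost = order[:], cost
        temp *= cooling
    return best, best_cost


def haldane(d_morgans):
    """Recombination fraction for a map distance under Haldane's function."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_morgans))


positions = [0.00, 0.05, 0.12, 0.20, 0.31, 0.40]  # hypothetical, in Morgans
rec = [[haldane(abs(a - b)) for b in positions] for a in positions]
best, best_cost = anneal_order(rec)
print(best, round(best_cost, 4))
```

With markers simulated along a line, the true left-to-right order (or its mirror image) has the smallest adjacent-recombination sum, so the search should recover one of those two orders.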
The mixed model was computed using the R package lme4 (Vazquez et al., 2010), considering the effects of the genotypes as random, while replications within environments were regarded as fixed because trials were carried out for three years. Broad-sense heritability for CMD resistance at one and three months after planting was calculated using the formula

$$ H^2 = \frac{\sigma_g^2}{\sigma_g^2 + \sigma_e^2} $$

where $\sigma_g^2$ and $\sigma_e^2$ are the variance components for the genotype effect and the residual error, respectively, based on individual plants. Correlations were calculated among disease resistance score BLUPs for three growing seasons (2011, 2012 and 2013).

2.3.3. High-resolution mapping of the CMD resistance locus

Mapping of the CMD resistance locus in the present population was carried out with the BLUPs obtained for each year, and across the combined analysis of the 2011, 2012 and 2013 data. QTL analysis was performed using three complementary approaches. First, because of the high marker density, single marker-trait association for all 6756 SNPs was carried out. The markers were considered as fixed effects in a linear model implemented in the GLM function of TASSEL (Bradbury et al., 2007). The genome-wide significance threshold for the F-statistic was determined by the Bonferroni method (Bland and Altman, 1995). Secondly, standard interval mapping (interval step of 2.5 cM) was carried out using the regression mapping function “scanone” implemented in the R/qtl package (Broman et al., 2003). The genome-wide significance threshold (α = 1%) for declaring a significant QTL was determined using 1000 permutations.
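Two of the quantities described above lend themselves to a short worked example: the Bonferroni-corrected significance threshold for the 6756 single-marker tests, and broad-sense heritability from variance components. In the sketch below, the family-wise alpha of 0.05 is an assumption (the text does not state the level used with the Bonferroni method), and the variance components are hypothetical numbers, not estimates from this study.

```python
import math


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test P-value cut-off that controls the family-wise error rate."""
    return alpha / n_tests


def broad_sense_heritability(var_g, var_e):
    """H^2 = var_g / (var_g + var_e), as in the formula in the text."""
    return var_g / (var_g + var_e)


n_snps = 6756  # number of mapped SNPs tested, from the text
p_cut = bonferroni_threshold(n_snps)  # alpha = 0.05 is an assumed level
print(p_cut, -math.log10(p_cut))

# Hypothetical variance components, chosen only to illustrate the formula.
print(broad_sense_heritability(var_g=8.0, var_e=1.0))
```

On a −log10(P) scale this threshold sits just above 5, far below the strongest associations reported for linkage group 16.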
The 95% Bayesian credible interval for the CMD resistance locus was determined using the function “bayesint” implemented in R/qtl. The proportion of phenotypic variance explained by the resistance locus was obtained by fitting a linear model. Thirdly, QTL analysis using the Composite Interval Mapping (CIM) method was carried out with the number of marker covariates set to three. The number of markers for use as co-factors was determined using the automatic co-factor selection function (stepwiseqtl) implemented in R/qtl. The CIM method reduces residual variation and thereby increases the resolution of the QTL location and, with it, the possibility of detecting any additional genomic regions that underlie resistance to CMD.

Anchoring CMD-resistance-linked markers in the present high-density genetic map

The high-density SNP map developed in this study was used to anchor published loci associated with qualitative resistance to CMD (Table 1). Primer sequences that flank five of these markers (viz. SSRY28, NS198, SSRY106, NS158 and NS169) were used in BLAST searches of the cassava reference genome sequence (www.phytozome.org/cassava). Marker positions were interpolated onto the genetic map on the basis of the scaffolds harbouring them (Table 1). To obtain a linkage/recombination profile of SNPs along the linkage group that bears the CMD resistance locus, pairwise estimates of linkage disequilibrium (r²; Flint-Garcia et al., 2003) were calculated for the SNPs from the entire mapping population using the software package Haploview v. 3.31 (Barrett et al., 2005) and plotted in a matrix form.

2.3.4. Genetic relatedness of the CMD-resistant landraces

In addition to the mapping analysis in the bi-parental population, the genetic relatedness was examined among the TME clones that were originally identified to be sources of the resistance to CMGs.
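The pairwise linkage disequilibrium statistic r² used for the linkage/recombination profile described above can be illustrated with a minimal Python sketch. It assumes that haplotype phase is known and that alleles are coded 0/1; Haploview itself estimates haplotype frequencies from unphased genotypes, so this is a simplified stand-in rather than a reproduction of that software.

```python
def ld_r2(hap_a, hap_b):
    """r^2 between two biallelic loci from phased 0/1 haplotype vectors."""
    n = len(hap_a)
    if n == 0 or n != len(hap_b):
        raise ValueError("haplotype vectors must be non-empty and equal length")
    p_a = sum(hap_a) / n                    # frequency of allele 1 at locus A
    p_b = sum(hap_b) / n                    # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return float("nan")                 # a monomorphic locus has no LD
    d = p_ab - p_a * p_b                    # disequilibrium coefficient D
    return d * d / denom


# Hypothetical haplotypes: complete association versus independence.
print(ld_r2([0, 1, 0, 1], [0, 1, 0, 1]))
print(ld_r2([0, 0, 1, 1], [0, 1, 0, 1]))
```

In a population one meiosis removed from its parents, as here, high r² values between distant markers reflect the small number of recombination events rather than historical LD.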
A total of 2069 GBS markers from 34 clones, including 29 landraces and five TMS clones with the quantitative resistance (as an out-group), were obtained from a previous study (Ly et al., 2013), and used to perform hierarchical clustering using the Euclidean distance between the genotypes. These distances were used to construct a relationship dendrogram of the clones.

3. Results

3.1. Segregation for resistance to CMD in the mapping population

The frequency distribution of the CMD severity scores in the mapping population revealed a bi-modal pattern with two peaks (Fig. 1): nearly half of the progenies and the female parent (IITA-TMS-011412) were resistant to CMD and showed no symptoms while the remainder of the F1 individuals showed disease symptoms ranging from mild (score 2) to severe (score 5). Resistance to CMD found in the female parent is therefore likely to be conferred by a single gene with dominant effect. There was very little variation within plots with respect to CMD symptom expression: all stands either showed similar symptoms in infected susceptible plots or no symptoms at all in the resistant plots. The consistency in symptom expression is largely due to the clonal origin from infected cuttings, which ensures transmission of viruses across cropping cycles. Moreover, there was very little year-to-year variation in terms of CMD incidence: the disease ratings in the 2011, 2012 and 2013 growing seasons were highly correlated (Pearson’s correlation coefficient, r > 0.91). This was reflected in the large broad-sense heritability of the resistance trait as measured at one and three months after planting (H² of 0.89 and 0.93, respectively). Analysis of variance using the logistic regression model showed a highly significant effect for clone (p < 2E−16), while the other factors such as environment (year), clone × environment interaction and replication were not significant.

Table 1 Summary of the known markers tagging CMD resistance in cassava and their linkage group.
Marker | Primer sequence | Linkage group | Scaffold (v4.1) | Study | Resistance source
S5214 780931 | GBS-SNP | 16 | scaffold05214 | Present study | IITA-TMS-011412
S5214 30911 | GBS-SNP | 16 | scaffold05214 | Rabbi et al. (in press) | IITA-TMS-961089A
SSRY28 (CMD2) | Fw:TTGACATGAGTGATATTTTCTTGAG Rev:GCTGCGTGCAAAACTAAAAT | 16 | scaffold05214 | Akano et al. (2002), Lokko et al. (2005), Okogbenin et al. (2012) | TME3; TME7; IITA-TMS-972205
SSR NS158 | Fw:GTGCGAAATGGAAATCAATG Rev:TGAAATAGTGATACATGCAAAAGGA | 16 | scaffold06906 | Okogbenin et al. (2007) | TME3
SSR NS169 | Fw:GTGCGAAATGGAAATCAATG Rev:GCCTTCTCAGCATATGGAGC | 16 | scaffold06906 | Okogbenin et al. (2007) | TME3
RFLP RME-1 | Fw:ATGTTAATGTAATGAAAGAGC Rev:AGAAGAGGGTAGGAGTTATGT | 16 | No match | Okogbenin et al. (2007) | TME3
SSR NS198 | Fw:TGCAGCATATCAGGCATTTC Rev:TGGAAGCATGCATCAAATGT | 16 | scaffold04175 | Okogbenin et al. (2012) | IITA-TMS-972205
SSRY106 | Fw:GGAAACTGCTTGCACAAAGA Rev:CAGCAAGACCATCACCAGTTT | 16 | scaffold07933 | Lokko et al. (2005) | TME7

3.2. Screening of the population for cassava mosaic geminiviruses using PCR

The PCR-based screening of the mapping population detected one or more of the CMGs in 87 of 180 plots assayed, while 93 plots were negative (Table 2). Only two species of CMGs, ACMV and EACMCV, were detected in the trial, which is consistent with known CMGs prevalence in West Africa. ACMV was detected in all the 87 virus-positive plots, whereas EACMCV was detected as a coinfection with ACMV in 16 plots (17% incidence). No case of single infection by EACMCV was detected. CMGs were detected in 79 of 81 symptomatic plots, indicating a strong positive correlation between the visual scoring of disease and the PCR results. In addition, PCR also detected occurrence of ACMV in 7 asymptomatic plots and ACMV and EACMCV in one plot. Mean severity of CMD symptoms in ACMV infected plants was 3.1 and in plants infected with ACMV and EACMCV was 3.2, which suggests an apparent lack of a synergistic effect in heightening symptom severity in dually infected plants.

Table 2 Screening results of the mapping population for the presence of ACMV and EACMV.

Disease status | Virus not detected | ACMV only | ACMV + EACMV
Symptomatic | 1 | 65 | 15
Asymptomatic | 92 | 6 | 1

3.3. Genotyping of SNP markers and construction of a dense genetic map

In all, 17,682 SNPs were obtained from the SNP calling pipeline. The SNP data were subsequently filtered for markers with more than 20% missing values across the genotyped individuals. Also removed were loci that deviated from the expected genotypic frequencies at a chi-square significance threshold of P < 0.05. Linkage analysis was done using 8704 SNPs that passed these two QC filters. A high-density genetic linkage map was constructed using the maximum-likelihood approach implemented in JoinMap 4.1 (Table 3). A total of 6756 SNP markers were mapped across 19 linkage groups with between 115 and 559 SNPs (average = 256). With an average inter-SNP distance of 0.52 cM, this is the densest map developed for cassava so far (Fig. 2). Despite the high density of the GBS-derived SNPs mapped, several regions without markers were observed. Most notable were a single region on linkage group 4 and two regions in linkage group 18. A possible reason for such gaps could be a lack of polymorphic markers as a result of identity-by-descent of this region in the two parents.

Fig. 2. Overview of genetic map developed from the 6756 ApeKI-derived SNPs. [Genetic map; x-axis: Chromosome (linkage groups 1–18, including 2.2); y-axis: Location (cM), 0–250.]
Most of the 12,977 scaffolds that constitute the current 533 Mb of the version 4.1 cassava genome assembly are fairly small; nearly 95% of them are 200 Kb or smaller. A total of 1093 unique scaffolds were anchored in the present map, and ranged from 19 to 89 in the different linkage groups. Despite their relatively small number, the anchored scaffolds covered a total size of 313.3 Mb, and accounted for 58.7% of the current cassava genome assembly. The complete genetic map developed from this work is available in Supplementary Table 1.

Table 3 Summary statistics of the genetic linkage map developed from ApeKI SNP markers.

Linkage group | Number of SNPs | Length (cM) | Average distance (cM) | Number of scaffolds anchored | Cumulative scaffold size (base-pairs)
1 | 473 | 242 | 0.51 | 66 | 30,226,735
2 | 344 | 195 | 0.57 | 52 | 20,822,503
2.2 | 366 | 175 | 0.48 | 59 | 20,298,754
3 | 419 | 168 | 0.40 | 55 | 12,608,655
4 | 255 | 194 | 0.76 | 57 | 13,843,915
5 | 543 | 230 | 0.42 | 61 | 15,175,445
6 | 275 | 182 | 0.67 | 73 | 15,911,788
7 | 559 | 256 | 0.46 | 66 | 21,949,504
8 | 454 | 222 | 0.49 | 78 | 18,054,518
9 | 207 | 72 | 0.35 | 19 | 6,148,875
10 | 299 | 148 | 0.50 | 55 | 15,132,961
11 | 304 | 203 | 0.67 | 41 | 13,797,633
12 | 115 | 99 | 0.87 | 21 | 6,111,492
13 | 302 | 200 | 0.67 | 60 | 17,048,413
14 | 543 | 242 | 0.45 | 89 | 19,282,711
15 | 451 | 225 | 0.50 | 49 | 18,831,996
16 | 281 | 152 | 0.54 | 64 | 16,048,872
17 | 275 | 156 | 0.57 | 70 | 17,372,231
18 | 291 | 154 | 0.53 | 58 | 14,665,500
Total | 6756 | 256 (a) | 0.52 (a) | 1093 | 313,332,501

(a) Average values per linkage group.

3.4. High-resolution mapping of CMD resistance locus

Based on the qualitative nature of the resistance to CMD in the present mapping population, a single locus was expected to underlie the resistance phenotype. The high-density genetic map developed with 6756 GBS SNPs permitted a genome-wide search for the locus underlying the qualitative resistance to CMD. A strong association was detected in linkage group 16 that peaked at around 69.12 cM (Fig. 3). This association decreases on both sides of the peak as a result of increasing recombination between the markers and the underlying resistance gene. Most SNPs showing highly significant association (P < 1E−40) came from the 1.46 Mb-long scaffold 5214, with marker S5214 780931 at the peak; this marker explained 74% of the disease resistance variance. Additionally, only those SNPs that are informative in the CMD-resistant female parent show the significant associations while those segregating only in the susceptible male parent do not (Fig. 3). The presently mapped resistance locus occurs in the vicinity of the previously mapped CMD2 locus (Akano et al., 2002); marker S5214 780931 is just 623.24 kb away from a microsatellite marker, SSRY28 (located between 157,470 bp and 157,616 bp), that was first reported to be closely linked to the CMD2 locus (Akano et al., 2002), indicating that the same gene may account for the observed resistance to CMD in the present mapping population.

Fig. 3. Single-marker association with qualitative resistance to CMD. (a) Genome-wide P-values for 6756 SNPs across 19 linkage groups showing the strongest association signal was located in linkage group 16. The x-axis shows the SNPs along each chromosome; y-axis is the −log10(P-value) for the association. (b) An association plot for linkage group 16 showing SNPs that are informative in resistant (IITA-TMS-011412) and susceptible (IITA-TMS-4(2)1425) parents. The SNPs from scaffold 5214 with strongest association to resistance phenotype are highlighted.

The results from the interval-mapping based QTL analysis recapitulated those from single-marker trait associations, and uncovered a single peak with a maximum LOD value of 43 on linkage group 16 (Supplementary Fig. 1). Despite employing the Composite Interval Mapping method, no additional peaks exceeding the significance threshold were detected, confirming that only a single locus conferred the qualitative resistance in the present population. The SNP marker S5214 780931, flanked by S5214 472282 and S5214 1084049, was the closest to the QTL peak (Table 4). The approximate 95% Bayesian credible interval for the mapped locus spans the 68.8–70.74 cM region of LG16, irrespective of the scoring time (one or three months after planting) or growing season (2011, 2012 or 2013). The profiles of LOD scores along linkage group 16 were very similar for the disease scores recorded at one and three months after planting as well as for the three seasons of data, supporting the high heritability observed for this trait. Comparison of the phenotypes of the F1s against the genotypes at the resistance-linked marker S5214 780931 showed a small proportion of recombinants that carry the SNP allele linked to resistance but show susceptibility to the disease, and vice versa.

3.5. Genomic localization of markers flanking previously mapped qualitative resistance against CMD

A major objective of the present study was to use the high-density SNP genetic map to anchor seven molecular markers previously reported to be linked to single dominant gene resistance to CMD in four other genetic backgrounds (Table 1). The SSR and RFLP markers used in those studies were therefore anchored in the present map via their harbouring scaffolds. These markers came from scaffold 05214 (SSRY28), scaffold 06906 (NS158 and NS169), scaffold 04175 (NS198) and scaffold 07933 (SSRY106), all of which fall within the same genomic region of linkage group 16 of the present map (Fig. 4). The primer sequences for the RFLP marker RME-1 did not identify a suitable match in the reference genome.
In a parallel study using another bi-parental mapping population derived from a cross between another improved variety that is nearly immune to CMD (IITA-TMS-961089A) and a susceptible landrace (TME117), another SNP marker was identified from the same scaffold (S5214 30911). It was strongly associated with the disease resistance and explained 60% of phenotypic variation (Rabbi et al., in press). The reported percentage of variation explained by these markers shows a gradient that peaks around scaffold 05214, the region that is likely to contain the concerned resistance gene (Fig. 4). Markers farther from this region have been reported to be less tightly linked to the resistance gene, as shown by the low percentage of variation that they explain, a trend that agrees with the GWA results, particularly considering the segregating markers from the female parent (Fig. 3).

Table 4 Markers flanking the presently mapped CMD resistance locus calculated from disease severity scored at one and three months after planting. The table also presents the linkage group, position and interval mapping-based percentage of phenotype variation explained by the closest marker.

Trait | SNP | Linkage group | Position (cM) | Peak LOD (a) | R2 | H2
CMD1S (b) | S5214 1084049 | 16 | 68.88 | 45.37 | |
CMD1S (b) | S5214 780931 | 16 | 70.00 | 45.59 | 0.708 | 0.892
CMD1S (b) | S5214 472282 | 16 | 70.74 | 44.99 | |
CMD3S (b) | S5214 1084049 | 16 | 68.88 | 42.98 | |
CMD3S (b) | S5214 780931 | 16 | 69.31 | 43.20 | 0.696 | 0.928
CMD3S (b) | S5214 472282 | 16 | 70.74 | 42.33 | |

(a) Logarithm of odds score for presence of QTL.
(b) Values calculated using the three-year BLUPs; R2 = percentage variation explained by QTL; H2 = heritability using the three-year data.

Fig. 4. Graphical display of the variation in the linkage disequilibrium (r²) along linkage group 16 calculated for every pair of SNPs in the bi-parental mapping population (left); and the genetic map (right). The location of the mapped CMD resistance locus (underlined SNP, viz. S5214 780931) in scaffold 05214 relative to other scaffolds (S7933, S4175 and S6906) containing microsatellite markers reported to be linked to resistance to CMD is shown on the right. The percentage of phenotypic variation explained by the markers (PVE, measured by r²) is also presented. The dark shading corresponds to stronger LD (higher r²). Names of other SNPs in the linkage group were omitted due to space constraints, but are available in Supplementary Table 1.

3.6. Recombination pattern in linkage group 16

To visualize the degree of linkage and recombination pattern between the presently mapped CMD resistance locus and other previously published SSR markers along the linkage group, the chromosome-wide pattern of LD on linkage group 16 was examined (Fig. 4). The resulting haplotype map is useful in providing a framework for interpreting the results of the previous mapping studies, particularly the proportion of phenotypic variation explained by the various microsatellite markers and their relative locations in the present map. Though different parents are used in the present and previous mapping studies, these populations have all undergone a single round of meiosis and are thus expected to have a similar extent of linkage disequilibrium across their chromosomes. The region bearing the CMD resistance locus was characterized by two large haplotype blocks (dark-grey shading between 0 and 32 cM and between 36 and 62 cM, respectively). Moderate LD was detected between these blocks (light-grey shading).
The first block encompasses scaffold 4175, which harbours microsatellite marker NS198, reported by Okogbenin et al. (2012) to explain 11% of the variation in CMD resistance in a bi-parental population. SNPs from this scaffold and those near the CMD2 locus in scaffold 05214 also show moderate LD (r2 ∼ 0.10). The CMD resistance locus occurs in the second LD block. Scaffold 05214 harbours the SNPs that were strongly associated with the resistance as well as the previously reported microsatellite marker SSRY28 (Akano et al., 2002). Although discovered in different genetic backgrounds, the resistance-linked SNPs and SSRY28 explain between 60% and 70% of the disease resistance variance. These markers are just 623.24 kb apart. Another scaffold in this block (07933), which harbours microsatellite marker SSRY106, was reported to explain nearly 40% of the variation in CMD resistance (Lokko et al., 2005).

3.7. Genetic relatedness of CMD-resistant landraces using genome-wide SNPs

In addition to genetic mapping of the bi-parental population, cluster analysis was performed with key landraces known to possess strong resistance to CMD (Fig. 5). To estimate the “residual genetic distance” between identical clones resulting from genotyping error, several clones, some of which have different names as a result of adoption in different regions, were redundantly genotyped. These pairs include the male parent (IITA-TMS-4(2)1425) and “Kibandameno-white”; IITA-TMS-30572, known as “Migyera” in Uganda; and TME12 (A and B). Most of the CMD-resistant landraces, including those that were first discovered in Nigeria, are clearly very closely related, or perhaps even identical, based on a comparison of the distance between them with the residual distance between the redundantly genotyped clones, confirming previous studies using AFLPs (Fregene et al., 2000). These landraces (TME3, TME4, TME6, TME7, TME11, TME12 and TME14) have a very similar morphological appearance, the most prominent feature of which is red petioles.

[Figure 5 image: cluster dendrogram (hclust, complete linkage, on the pairwise distance matrix); vertical axis: Height. Leaf labels: IITA_TMS_30572, IITA_TMS_011412, S_TME1, S_TME203, S_TME2, R_TME204, R_TME199, R_TME225, R_TME419, S_TME510, S_TME450, R_TME282, S_TME399, S_TME379, R_TME5, R_TME4, R_TME6, R_TME14, R_TME7, R_TME12, R_TME3, R_TME11, S_TME693, S_TME237, S_TME279, R_TME9, S_TME117, S_TME778, R_TME8, S_TME207, R_TME13, KIBANDAMENO_WHITE, IITA_TMS_4.2.1425.]

Fig. 5. A hierarchical clustering tree based on dissimilarities between a selection of landraces using 2069 SNP markers. Most of the lower TME series landraces that are resistant to CMD form a single, tight cluster (bottom). The prefix “R” denotes CMD-resistant and “S” denotes CMD-susceptible varieties.

4. Discussion

4.1. A historical perspective on the development and diffusion of early CMD-resistant varieties across Africa

Breeding for resistance to CMD has been a major goal of cassava improvement programmes in Africa for more than 70 years. Prior to the discovery of the single-gene resistance (Akano et al., 2002), the primary defence against the disease was the polygenic resistance introgressed into cultivated cassava (M. esculenta) from M. glaziovii after three cycles of backcrossing (Nichols, 1947). Examining the breeding history of improved varieties with the quantitative resistance to CMD reveals a very narrow genetic base tracing back to a single derivative of the M. glaziovii × M. esculenta crosses (Jennings, 2003). None of these descendants is immune to infection by CMGs (Nichols, 1947), although some express mild and sometimes transient symptoms as a result of incomplete systemic infection that leads to reversion of symptoms (Jennings, 1976; Fargette et al., 1994), while others are quite susceptible to the disease. These TMS varieties that trace back to M. glaziovii have been widely adopted in Africa and helped revive cassava farming following the devastation of many local landraces in East Africa (Legg and Fauquet, 2004). Some of the adopted TMS varieties are IITA-TMS-60142, IITA-TMS-30337 and IITA-TMS-30572. Since the 1990s, IITA has been making crosses to combine the single dominant gene (CMD2) with the multigenic M. glaziovii resistance and has produced clones with near immunity to CMD (Legg and Fauquet, 2004).

4.2. Mapping resolution of the CMD2 locus

Genetic mapping of qualitative resistance to CMD in this study uncovered a single locus on linkage group 16 with a large peak LOD (>40). This locus, S5214 780931, explained 74% of the phenotypic variance and is co-located with SSRY28 on scaffold 05214, indicating that the dominant monogenic resistance gene from the female parent is likely to be the CMD2 locus of Akano et al. (2002). The dense genetic map obtained from GBS has enabled a higher level of mapping resolution of the CMD resistance locus. Linkage group 16 has a total of 281 SNPs, and the approximate 95% Bayesian credible interval around the mapped locus is just 1.1 cM. This resolution was not obtainable using the traditional markers of previous mapping efforts.

4.3. All markers linked to qualitative resistance occur in the same chromosome region

The high-density SNP genetic map was used to anchor markers previously reported to be linked to the dominant monogenic resistance to CMD. These markers (Table 1) were interpolated in the present map via the scaffolds that harbour them.
All of them (except RME1 and RME4, for which suitable matches in the current reference genome were not found) came from scaffolds occurring in the same region of linkage group 16. Overall, there is a general congruence between the proportion of genetic variation reported for these microsatellite markers, the marker-trait association profile in linkage group 16 (Fig. 3) and the pattern of linkage disequilibrium (measured as r2) between microsatellite locations and the putative position of the resistance locus in the present population (Fig. 4). This pattern supports the idea that most of the microsatellite markers reported to be linked to varying degrees with CMD resistance are associated with a single resistance locus that is most likely the CMD2 gene. Moreover, the strength of the linkage decreased with increasing distance from the gene location. In a parallel mapping study using another improved variety that is nearly immune to CMD (IITA-TMS-961089A) and a susceptible landrace (TME117), a strong QTL signal was found in scaffold 05214 that also explained 60% of the resistance variation, implicating the same CMD2 locus (Rabbi et al., in press). Considering the results of the present study and those of Akano et al. (2002), Okogbenin et al. (2012), and Lokko et al. (2005), it is highly likely that a single gene, or a cluster of resistance genes (Michelmore and Meyers, 1998), is involved. Indeed, the conservation of QTL positions among the different sources of CMD resistance is not surprising, given that the majority of the highly resistant cultivars developed recently in Africa trace to a few landrace accessions from Nigeria that were used over the last 12 or so years of resistance breeding, during which the merits of the TME materials were appreciated. The phylogeny (Fig. 5) confirmed that several landraces that were first discovered to be nearly immune to CMD are genetically very similar. These findings agree with a previous study (Fregene et al., 2000).
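The comparison described in Section 3.7, in which the distance between two landraces is judged against the residual distance between redundantly genotyped clones, can be sketched as follows. The sketch assumes genotypes coded as allele dosages (0, 1, 2, with None for missing calls) and uses the proportion of mismatching calls as the distance. The distance measure actually used for Fig. 5 is not stated here, and all function names, accession names and toy genotypes are hypothetical.

```python
def mismatch_distance(geno_a, geno_b):
    """Proportion of SNPs with different calls, ignoring missing calls (None)."""
    shared = [(a, b) for a, b in zip(geno_a, geno_b)
              if a is not None and b is not None]
    if not shared:
        raise ValueError("no SNPs called in both samples")
    return sum(a != b for a, b in shared) / len(shared)


def residual_distance(replicate_pairs):
    """Largest distance observed between replicates of the same clone.

    This is an empirical estimate of the distance that genotyping error
    alone can produce between genetically identical samples.
    """
    return max(mismatch_distance(a, b) for a, b in replicate_pairs)


def putative_duplicates(genotypes, threshold):
    """Return pairs of names whose distance does not exceed the threshold."""
    names = sorted(genotypes)
    return [
        (x, y)
        for i, x in enumerate(names)
        for y in names[i + 1:]
        if mismatch_distance(genotypes[x], genotypes[y]) <= threshold
    ]


# Toy data (hypothetical): one clone genotyped twice, plus three accessions.
replicates = [([0, 1, 2, 1, 0, 2, 1, 1, 0, 2], [0, 1, 2, 1, 0, 2, 1, 1, 0, 1])]
accessions = {
    "acc_A": [0, 1, 2, 1, 0, 2, 1, 1, 0, 2],
    "acc_B": [0, 1, 2, 1, 0, 2, 1, 1, 1, 2],
    "acc_C": [2, 0, 0, 1, 2, 0, 1, 0, 2, 0],
}
threshold = residual_distance(replicates)
duplicates = putative_duplicates(accessions, threshold)
```

Using the largest replicate distance as the threshold is a deliberately simple choice; with more replicate pairs, a mean or an upper quantile of the replicate distances might be preferred.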
Other duplicates, which are also morphologically very similar, were identified. These landraces were collected from South West and Central Nigeria and have different local names. The resistance in the landrace TME9, which also occurs in a separate clade on the phylogenetic tree, is likely to be CMD2. This landrace is found in the pedigree of IITA-TMS-961089A, which was used in a parallel mapping study in which the resistance-linked markers also came from scaffold 05214 (Rabbi et al., in press). It is postulated that all of these landraces, which come from West Africa, trace back to a single CMD-resistant mutant that was selected by farmers and rapidly diffused in the region through varietal dissemination efforts.

4.4. Screening of cassava mosaic geminiviruses in the mapping population

The CMG screening results showed that ACMV was the predominant species in the field, whereas EACMV occurred at lower frequency and only with ACMV as a dual infection. The preponderance of ACMV reflects the known proportions of the two viruses in West Africa and contrasts with the predominance of EACMV-UG in the pandemic regions of Eastern Africa (Legg and Fauquet, 2004). The severe form of EACMV-UG has displaced other CMG species and strains during the recent pandemic that swept through the region (Legg et al., 2006). An important question is whether the CMD2 type of resistance available in the West African cassava germplasm can protect against these more virulent strains of CMG that do not contribute to the disease pressure in West Africa. Germplasm screening in Uganda has demonstrated that the qualitative resistance locus in the TME landraces is highly effective against this virus (Kawuki et al., 2011). Similar results have been obtained through biolistic inoculation methods using pseudo-recombinants of ACMV and EACMV (Sserubombwe et al., 2008). Some of the screened clones include TME5 and TME14, which cluster together with TME3, the original source of the CMD2 locus (Fig. 5). The evidence suggests that the CMD2 locus confers monogenic resistance with broad specificity against cassava-infecting geminiviruses.

4.5. Prospects for long-term efficacy of CMD2 resistance gene

Despite the apparent broad specificity of the CMD2 gene, whether the locus can confer durable resistance according to the classical definition (Johnson, 1984) depends on whether it will remain effective over a prolonged period of widespread use under conditions conducive to the disease. It is well known that monogenic resistance can be ephemeral and subject to boom-and-bust cycles (McDonald and Linde, 2002) as a result of rapid genetic evolution in the pathogen, although there is evidence of durable resistance from single-gene action (Johnson, 1984; Stuthman et al., 2007). Since its discovery in the 1980s and subsequent extensive use in breeding programmes in sub-Saharan Africa, there has been no report of breakdown of the CMD2 type of resistance in the field, indicating that the locus could be durable against CMGs. Still, the long-term effectiveness of the resistance locus needs to be augmented by combining it with the quantitative resistance derived from M. glaziovii. Indeed, crosses combining these two types of resistance have given rise to progeny that are nearly immune to all known CMG species, including the virulent EACMV-Uganda (Legg and Fauquet, 2004; Monde et al., 2012). Additionally, more effort is needed for a comprehensive screening of the resistant landraces to identify any additional sources of resistance that may exist. Although increased resolution was achieved in the mapping analysis, a larger mapping population is required for fine-mapping and cloning of the locus. This will provide insights into the mechanism of the resistance, which so far seems to be highly effective against various CMG species.

Acknowledgments

This research was supported by the CGIAR Research Programme on Roots, Tubers and Bananas and the “Next Generation Cassava Breeding Project”, through funding from the Bill & Melinda Gates Foundation and the Department for International Development of the United Kingdom. We acknowledge the help of Charlotte Acharya for preparation of GBS libraries and sequencing, and Oluwafemi Alaba for sample preparation and DNA extraction.

Appendix A. Supplementary data

Supplementary material related to this article can be found, in the online version, at http://dx.doi.org/10.1016/j.virusres.2013.12.028.

References

Akano, A., Dixon, A., Mba, C., Barrera, E., Fregene, M., 2002. Genetic mapping of a dominant gene conferring resistance to cassava mosaic disease. Theor. Appl. Genet. 105, 521–525.
Alabi, O.J., Kumar, P.L., Naidu, R.A., 2008. Multiplex PCR method for the detection of African cassava mosaic virus and East African cassava mosaic Cameroon virus in cassava. J. Virol. Methods 154, 111–120.
Barrett, J.C., Fry, B., Maller, J., Daly, M.J., 2005. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 21, 263–265.
Bland, J.M., Altman, D.G., 1995. Multiple significance tests: the Bonferroni method. Br. Med. J. 310, 170.
Bradbury, P.J., Zhang, Z., Kroon, D.E., Casstevens, T.M., Ramdoss, Y., Buckler, E.S., 2007. TASSEL: software for association mapping of complex traits in diverse samples. Bioinformatics 23, 2633–2635.
Broman, K.W., Wu, H., Sen, Ś., Churchill, G.A., 2003. R/qtl: QTL mapping in experimental crosses. Bioinformatics 19, 889–890.
Ceballos, H., Iglesias, C.A., Pérez, J.C., Dixon, A.G., 2004. Cassava breeding: opportunities and challenges. Plant Mol. Biol. 56, 503–516.
Cheema, J., Dicks, J., 2009. Computational approaches and software tools for genetic linkage map estimation in plants. Brief. Bioinform. 10, 595–608.
Dellaporta, S.L., Wood, J., Hicks, J.B., 1983. A plant DNA minipreparation: version II. Plant Mol. Biol. Rep. 1, 19–21.
Duffy, S., Holmes, E.C., 2009. Validation of high rates of nucleotide substitution in geminiviruses: phylogenetic evidence from East African cassava mosaic viruses. J. Gen. Virol. 90, 1539–1547.
Elshire, R.J., Glaubitz, J.C., Sun, Q., Poland, J.A., Kawamoto, K., Buckler, E.S., Mitchell, S.E., 2011. A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. PLoS One 6, e19379.
Fargette, D., Jeger, M., Fauquet, C., Fishpool, L.D.C., 1994. Analysis of temporal disease progress of African cassava mosaic virus. Phytopathology 84, 91–98.
Flint-Garcia, S.A., Thornsberry, J.M., Buckler IV, E.S., 2003. Structure of linkage disequilibrium in plants. Annu. Rev. Plant Biol. 54, 357–374.
Fregene, M., Bernal, A., Duque, M., Dixon, A., Tohme, J., 2000. AFLP analysis of African cassava (Manihot esculenta Crantz) germplasm resistant to the cassava mosaic disease (CMD). Theor. Appl. Genet. 100, 678–685.
Fregene, M., Okogbenin, E., Mba, C., Angel, F., Suarez, M.C., Janneth, G., et al., 2001. Genome mapping in cassava improvement: challenges, achievements and opportunities. Euphytica 120, 159–165.
Harrison, B.D., Zhou, X., Otim-Nape, G.W., Liu, Y., Robinson, D.J., 1997. Role of a novel type of double infection in the geminivirus-induced epidemic of severe cassava mosaic in Uganda. Ann. Appl. Biol. 131, 437–448.
Herrera-Campo, B.V., Hyman, G., Bellotti, A., 2011. Threats to cassava production: known and potential geographic distribution of four key biotic constraints. Food Sec. 3, 329–345.
Jansen, J., de Jong, A.G., van Ooijen, J.W., 2001. Constructing dense genetic linkage maps. Theor. Appl. Genet. 102, 1113–1122.
Jansson, C., Westerbergh, A., Zhang, J., Hu, X., Sun, C., 2009. Cassava, a potential biofuel crop in (the) People’s Republic of China. Appl. Energy 86, S95–S99.
Jennings, D.L., 1976. Breeding for Resistance to African Cassava Mosaic Disease: Progress and Prospects. In: Interdisciplinary Workshop. IDRC, Muguga (Kenya).
Jennings, D.L., 2003. Historical perspective on breeding for resistance to cassava Brown Streak Virus Disease. In: Hillocks, R.J. (Ed.), Cassava Brown Streak Virus Disease: Past, Present, and Future. Proceedings of an International Workshop. Mombasa, Kenya, 27–30 October 2002. Natural Resources International Limited, Aylesford, UK, p. 100.
Johnson, R., 1984. A critical analysis of durable resistance. Annu. Rev. Phytopathol. 22, 309–330.
Kawuki, R.S., Pariyo, A., Amuge, T., Nuwamanya, E., Ssemakula, G., Tumwesigye, S., Bua, A., Baguma, Y., Omongo, C., Alicai, T., Orone, J., 2011. A breeding scheme for local adoption of cassava (Manihot esculenta Crantz). J. Plant Breed. Crop Sci. 3, 120–130.
Legg, J.P., Fauquet, C.M., 2004. Cassava mosaic geminiviruses in Africa. Plant Mol. Biol. 56, 585–599.
Legg, J.P., Ogwal, S., 1998. Changes in the incidence of African cassava mosaic geminivirus and the abundance of its whitefly vector along south–north transects in Uganda. J. Appl. Entomol. 122, 169–178.
Legg, J.P., Owor, B., Sseruwagi, P., Ndunguru, J., 2006. Cassava mosaic virus disease in East and Central Africa: epidemiology and management of a regional pandemic. Adv. Virus Res. 67, 355–418.
Li, H., Durbin, R., 2009. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760.
Lokko, Y., Danquah, E.Y., Offei, S.K., Dixon, A.G.O., Gedil, M.A., 2005. Molecular markers associated with a new source of resistance to the cassava mosaic disease. Afr. J. Biotechnol. 4, 873–881.
Lokko, Y., Dixon, A., Offei, S., Danquah, E., Fregene, M., 2006. Assessment of genetic diversity among African cassava Manihot esculenta Crantz accessions resistant to the cassava mosaic virus disease using SSR markers. Genet. Resour. Crop Evol. 53, 1441–1453.
Ly, D., Hamblin, M., Rabbi, I., Melaku, G., Bakare, M., Gauch, H.G., et al., 2013. Relatedness and genotype × environment interaction affect prediction accuracies in genomic selection: a study in Cassava. Crop Sci. 53, 1312–1325.
McDonald, B.A., Linde, C.C., 2002. Pathogen population genetics, evolutionary potential, and durable resistance. Annu. Rev. Phytopathol. 40, 349–379.
Michelmore, R.W., Meyers, B.C., 1998. Clusters of resistance genes in plants evolve by divergent selection and a birth-and-death process. Genome Res. 8, 1113–1130.
Monde, G., Walangululu, J., Bragard, C., 2012. Screening cassava for resistance to cassava mosaic disease by grafting and whitefly inoculation. Arch. Phytopathol. Pfl. 45, 2189–2201.
Nichols, R.F.W., 1947. Breeding cassava for virus resistance. East Afr. Agric. J. 12, 184–194.
Okogbenin, E., Egesi, C.N., Olasanmi, B., Ogundapo, O., Kahya, S., Hurtado, P., et al., 2012. Molecular marker analysis and validation of resistance to cassava mosaic disease in elite cassava genotypes in Nigeria. Crop Sci. 52, 2576–2586.
Okogbenin, E., Porto, M.C.M., Egesi, C., Mba, C., Espinosa, E., Santos, L.G., et al., 2007. Marker-assisted introgression of resistance to cassava mosaic disease into Latin American germplasm for the genetic improvement of cassava in Africa. Crop Sci. 47, 1895–1904.
Otim-Nape, G.W., Thresh, J.M., 1998. The current pandemic of cassava mosaic virus disease in Uganda. In: Jones, G. (Ed.), The Epidemiology of Plant Diseases. Kluwer, Dordrecht, The Netherlands, pp. 423–443.
Rabbi, I.Y., Hamblin, M., Gedil, M., Kulakow, P., Ferguson, M., Ikpan, A.S., Ly, D., Jannink, J-L. Genetic mapping using genotyping-by-sequencing in the clonally-propagated cassava. Crop Sci. (in press).
Sserubombwe, W.S., Briddon, R.W., Baguma, Y.K., Ssemakula, G.N., Bull, S.E., Bua, A., Alicai, T., Omongo, C., Otim-Nape, G.W., Stanley, J., 2008. Diversity of begomoviruses associated with mosaic disease of cultivated cassava (Manihot esculenta Crantz) and its wild relative (Manihot glaziovii Müll. Arg.) in Uganda. J. Gen. Virol. 89, 1759–1769.
Stuthman, D.D., Leonard, K.J., Miller-Garvin, J., 2007. Breeding crops for durable resistances. In: Sparks, D.L. (Ed.), Advances in Agronomy, vol. 95. Elsevier, Amsterdam, pp. 319–367.
Van Ooijen, J.W., 2006. JoinMap 4. Software for the Calculation of Genetic Linkage Maps in Experimental Populations. Kyazma BV, Wageningen, Netherlands.
Vazquez, A.I., Bates, D.M., Rosa, G.J.M., Gianola, D., Weigel, K.A., 2010. Technical note: an R package for fitting generalized linear mixed models in animal breeding. J. Anim. Sci. 88, 497–504.