
Development of a set of SNP markers present in expressed genes of the apple

Genomics, 2008
Genomics 92 (2008) 353–358
Journal homepage: www.elsevier.com/locate/ygeno

David Chagné a,1, Ksenija Gasic b,1, Ross N. Crowhurst c, Yuepeng Han b, Heather C. Bassett a, Deepa R. Bowatte a, Timothy J. Lawrence c, Erik H.A. Rikkerink c, Susan E. Gardiner a, Schuyler S. Korban b,⁎

a The Horticulture and Food Research Institute of New Zealand (HortResearch) Palmerston North, PB 11030, Manawatu Mail Centre, Palmerston North 4442, New Zealand
b Department of Natural Resources & Environmental Sciences, University of Illinois, Urbana, IL 61801, USA
c HortResearch Mount Albert, PB 92169, Auckland 1142, New Zealand

Article info
Article history: Received 5 June 2008; accepted 29 July 2008; available online 14 September 2008
Keywords: Malus × domestica; single nucleotide polymorphisms; expressed sequence tags; candidate genes; bin mapping

Abstract
Molecular markers associated with gene coding regions are useful tools for bridging functional and structural genomics. Due to their high abundance in plant genomes, single nucleotide polymorphisms (SNPs) are present within virtually all genomic regions, including most coding sequences. The objective of this study was to develop a set of SNPs for the apple by taking advantage of the wealth of genomics resources available for the apple, including a large collection of expressed sequence tags (ESTs). Using bioinformatics tools, a search for SNPs was conducted within an EST database of approximately 350,000 sequences developed from a variety of apple accessions. This resulted in the identification of a total of 71,482 putative SNPs.
As the apple genome is reported to be an ancient polyploid, attempts were made to verify whether the SNPs detected in silico were attributable to allelic polymorphisms or to gene duplication, i.e., paralogous or homeologous sequence variations. To this end, a set of 464 PCR primer pairs was designed, amplification was performed on two subsets of plants, and the PCR products were sequenced. The SNPs retrieved from these sequences were then mapped onto apple genetic maps, including a newly constructed map of a Royal Gala × A689-24 cross and a Malling 9 × Robusta 5 map, using a bin mapping strategy. The SNP genotyping was performed using the high-resolution melting (HRM) technique. A total of 93 new markers containing 210 coding SNPs were successfully mapped. This new set of SNP markers for the apple offers new opportunities for understanding the genetic control of important horticultural traits using quantitative trait loci (QTL) or linkage disequilibrium analysis. These also serve as useful markers for aligning physical and genetic maps, and as potential transferable markers across the Rosaceae family. © 2008 Elsevier Inc. All rights reserved.

Introduction
Recent advances in plant genomics have moved beyond model systems to various plant species of significant agronomical and horticultural importance. Since the release of genome sequences of Arabidopsis and rice in the past few years [1–4], a number of comprehensive tools such as bioinformatics tools for sequence assembly and functional annotation, microarray platforms for high-throughput gene expression, transformation systems, and large cDNA and gDNA libraries have been developed for a range of species, including those with a relatively small research investment such as the apple (Malus × domestica Borkh.) [5].
Following five years of intensive research efforts and investment in genomics, the apple now possesses a large collection of expressed sequence tags (ESTs) (>250 K, dbEST April 2008) comparable to those of well-investigated animal livestock species and cereal crops. Moreover, a public database for Rosaceae genomics (www.rosaceae.org), a number of saturated genetic maps [6], and whole genome sequencing, currently in progress (R. Velasco, unpublished), are also available.

⁎ Corresponding author. Fax: +1 217 333 8298. E-mail address: korban@uiuc.edu (S.S. Korban).
1 The first two authors contributed equally to this work.
0888-7543/$ – see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.ygeno.2008.07.008

A major challenge for plant biologists working with apple is to integrate these various tools to better understand genome structure and function in this woody temperate perennial fruit crop. This will provide opportunities for developing robust and reliable molecular markers linked to genes of interest as well as isolation and characterization of target genes and regulatory elements for crop improvement efforts using marker-assisted breeding and/or plant transformation using native genes and regulatory elements, i.e., cis-genesis [7]. To pursue effective crop improvement strategies, when molecular controls of traits of interest are under investigation, these must be associated with observed phenotypic variations. One strategy that can be used to achieve this goal involves correlating observed DNA sequence polymorphism with the phenotype using either genetic linkage mapping or association studies. As a prelude to these approaches, a comprehensive set of molecular markers must be developed from characterized genes using genomics tools.

Single nucleotide polymorphisms (SNPs) are good candidates for marker development, as they constitute the most common DNA sequence variations found in genomes of most organisms, including the apple [5]. SNPs, defined broadly to include small deletions and insertions, are found in most genomic regions, including coding regions, thus rendering them effective markers for mapping genes. In general, implementation of SNPs for genetic studies, such as linkage mapping or association studies, involves a three-step process. The first step is the SNP discovery phase, which involves the detection of de novo polymorphism by conducting bioinformatics sequence comparisons or molecular biology techniques or a combination of both. This is followed by a validation step to distinguish DNA polymorphisms of genuine allelic variants from those of other biological phenomena, such as gene duplication events (i.e., paralogous or homeologous genes), as well as those of technical errors, primarily sequencing errors, as many databases are based on high-throughput single-pass sequencing. The final step involves characterization of large numbers of individuals using a high-throughput genotyping approach. A wide range of molecular techniques suitable for pursuing these three steps have been described [8], each characterized by a distinct cost-scale and throughput capacity, and utilizing different technology platforms.

An example of developing a SNP marker from a candidate gene for red flesh and foliage color in apple has recently been reported [9]. In a targeted study on a small number of candidate genes, Chagné et al. [9] have demonstrated that a single 4-bp insertion–deletion marker located within a candidate gene cosegregates with red color in a mapping population of more than 500 individuals. In this study, a comprehensive approach is adopted to identify a set of markers linked (and in a few cases possibly corresponding) to complex as well as simply inherited traits.
SNP markers located within expressed genes have been identified in public apple EST collections and then placed on a genetic map constructed for an important commercial apple cultivar, Royal Gala, using selective (or bin) mapping [10,11]. A critical application for these new markers will be the alignment of genetic and physical maps in order to assemble the forthcoming apple genome sequence, and the subsequent fast-tracking of positional cloning strategies for major genes and quantitative trait loci (QTL) controlling important agronomic characters. This resource of new SNP markers will be particularly useful for future applications in reverse genetics studies and in apple breeding, and also for understanding apple genome evolution within the family Rosaceae.

Results

Genetic map construction and bin mapping set
Genetic maps were constructed for the parents Royal Gala and A689-24 using 152 markers for each (Supplemental Material 1). These markers consisted of 73 AFLP fragments, 155 SSRs, 51 SSRs that were common between both maps, and 25 SCARs. A total of 20 and 18 linkage groups (LGs) spanning 997.1 and 1053.2 cM were obtained for Royal Gala and A689-24, respectively. For LG 1 in Royal Gala, only two SSRs were mapped and cosegregated perfectly, resulting in a 0 cM group. Most other LGs contained at least two previously described SSR markers, except for four LGs. For Royal Gala, LGs 12 and 13 had only one published SSR and LG 7 did not contain any SSR marker; however, a published SCAR marker (NLscDdARM) [12] was mapped. For A689-24, only one LG did not contain any published SSR marker; however, as it was the last group to remain orphaned, it was deduced that this probably corresponded to LG 7. Altogether, this allowed alignment of both maps to published apple maps, and assignment of the same LG numbering [13]. For the bin mapping set selection, markers from the Royal Gala dataset were inspected for absence of missing data and for suspect double-recombination events.
The number of markers was reduced to allow division of the genome map into 54 and 60 bins for Royal Gala and A689-24, respectively. A subset of 14 individuals was selected, representing a large number of distinct recombination events, and was designated as the bin mapping set. The bin mapping set was validated by adding 10 apple and pear SSR markers of known locations. These markers were chosen to fill in gaps that were not covered by our genetic maps, when compared with published maps. All 10 SSRs mapped to their expected positions (Supplemental Material 1), thus indicating that bin mapping was accurate enough to justify adding new markers. In addition, this strategy made it possible to increase the coverage of the framework map.

SNP discovery in the apple EST dataset
The new EST assembly using 350,051 EST sequences resulted in a total of 93,959 nonredundant sequences (NRs) at a clustering threshold of 95%. This included 37,885 contigs representing a total estimated sequence length of 32.1 Mb of expressed sequences, approximately 4% of the estimated genome size of the apple. Of these contigs, 17,825 contained at least four ESTs per alignment, and could thus be used for SNP discovery. A set of 68 contigs, each containing more than 200 ESTs in their alignment, was excluded because of computing time issues. Finally, a set of 17,757 contigs representing an estimated length of 10.7 Mb was used for SNP detection (Table 1). A total of 71,482 bi-allelic SNPs were detected in 9555 contigs (53.8%), corresponding to an average occurrence of one SNP every 149 bp. A total of 34,361 transitions (A↔G and C↔T) and 37,121 transversions (A↔C, A↔T, C↔G, G↔T) (Fig. 1) were detected, with A↔G being the most common (18,365; 25.7%) and C↔G the least common (6181; 8.6%) variation observed.

SNPs, bioinformatics analysis, and validation
A subset of 600 NR sequences with significant blast matches was selected.
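The SNP discovery figures reported above (substitution classes and average SNP spacing) can be reproduced with a few lines of arithmetic. The following is a minimal sketch in Python, not the authors' pipeline; the function names `snp_class` and `snp_spacing_bp` are hypothetical, and only the counts and the 10.7 Mb contig length come from the text.

```python
# Purine<->purine (A/G) and pyrimidine<->pyrimidine (C/T) changes are
# transitions; every other base pair is a transversion.
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def snp_class(allele_a: str, allele_b: str) -> str:
    """Classify a bi-allelic SNP as 'transition' or 'transversion'."""
    pair = frozenset((allele_a.upper(), allele_b.upper()))
    if len(pair) != 2 or not pair <= set("ACGT"):
        raise ValueError(f"not a bi-allelic nucleotide SNP: {allele_a}/{allele_b}")
    return "transition" if pair in TRANSITIONS else "transversion"


def snp_spacing_bp(total_bp: float, n_snps: int) -> float:
    """Average number of base pairs per SNP."""
    return total_bp / n_snps


# Counts reported in the Results section.
counts = {"transition": 34_361, "transversion": 37_121}
total = sum(counts.values())
spacing = snp_spacing_bp(10.7e6, total)  # 10.7 Mb of contig sequence

print(total, f"{spacing:.1f} bp per SNP")
```

With the reported totals, the two classes sum to the 71,482 SNPs given in the text, and the spacing falls between 149 and 150 bp, consistent with the stated "one SNP every 149 bp".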
Putative functions of these sequences were recorded based on their blastx scores. In addition, we recorded the types of SNPs detected, and whether or not sequence variations induced changes in amino acid sequences, i.e., synonymous vs nonsynonymous. A total of 1949 SNPs were identified, including 852 (43.7%) nonsynonymous SNPs and 1097 synonymous SNPs. For Royal Gala, 1266 SNPs were found to be heterozygous. From these 600 NRs, a set of 464 PCR primer pairs encompassing 1434 SNPs was designed (Supplemental Material 2). Of these primer pairs, 341 amplified a single PCR product size, while the remaining yielded either no amplification product or complex patterns (more than two size products). Two approaches were employed according to the set of plants used to validate the SNPs. In the first approach, 25 amplicons were sequenced from the bin mapping set; while in the second approach, 316 amplicons were sequenced from six apple genotypes. A total of 110 amplicons, 10 and 100 using the first and second approaches, respectively, yielded poor sequence quality that could not be used for SNP detection. The remaining 231 (67.7%) amplicons and their sequences were used for verification of SNPs detected using ESTs. Out of 728 putative SNPs found in transcribed regions of these 231 amplicons, 257 could not be verified because of poor sequence quality. For the remaining 471 putative SNPs, 327 (69.4%) were independently verified in new sequences and corresponded to true SNPs, whereas 112 (23.7%) were classified as paralogous sequence variations, as they did not show any segregation in the bin mapping set. The remaining 32 SNPs (6.8%) were deemed probable sequencing errors in EST datasets, as these SNPs were not found following independent sequencing.

Table 1
Description of the contig set used for single nucleotide polymorphism detection

                                       Number     Cumulative size (Mb)
ESTs                                   350,051
Singletons                             56,074
Contigs                                37,885
Total                                  93,959     32.1
NRs (more than 4 ESTs per contig)      17,825
NRs (more than 200 ESTs per contig)    68
Total NRs used for SNP detection       17,757     10.7
NRs containing SNP                     9,555

Fig. 1. Classes of single nucleotide polymorphism detected in the full contig set of the apple.

SNPs verified using the first approach were scored, and nine markers were mapped. For markers validated using the second approach, a high-resolution melting (HRM) analysis was used. For both approaches, SNP data were compared with framework markers, and the position of the new marker was assigned by visual inspection. When the HRM analysis was monomorphic for the Royal Gala × A689-24 population, the bin mapping set developed by Celton et al. [14], derived from a Malling 9 × Robusta 5 (M.9 × R5) map, was used instead. Of 167 amplicons tested using the HRM analysis, 84 (50.3%) were polymorphic and were mapped. A total of 93 markers segregated, and the locations of 90 markers were assigned to known bins in the maps of both Royal Gala × A689-24 and M.9 × R5 (Supplemental Material 3). Three markers could not be assigned to any LG. Altogether, a set of 93 new EST-based markers corresponding to 210 new coding SNPs were added to the apple genetic map.

To assess the utility of this resource for future comparative genomics studies in the family Rosaceae, contig sequences associated with these 93 markers were blasted against Prunus sequences. A large number of these apple contigs (73 out of 93) detected potentially orthologous sequences in Prunus (Blastn score < 1 × 10^-20).

Discussion

The efficacy of in silico SNP detection in apple ESTs
The availability of large sequence databases for a number of plant species makes it possible to identify DNA variations corresponding to SNPs [15,16]. SNP discovery within EST collections using bioinformatics tools has been successful in several plant species [17–20].
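In the Results, new markers were placed in bins by visually comparing their genotypes across the bin mapping set with those of framework markers. The same comparison can be sketched programmatically. The example below is an illustrative assumption, not the authors' procedure: the function `assign_bin`, the bin names, and the 14-character genotype strings (one character per individual, `a`/`b` for the two parental alleles, `-` for a missing call) are all hypothetical.

```python
def assign_bin(marker: str, bin_patterns: dict[str, str]) -> list[str]:
    """Return every bin whose genotype pattern is compatible with the marker.

    A missing call ('-') in the marker is compatible with any genotype,
    so a marker with missing data may match more than one bin.
    """
    hits = []
    for name, pattern in bin_patterns.items():
        if len(pattern) != len(marker):
            raise ValueError(f"pattern length mismatch for bin {name}")
        if all(m == "-" or m == p for m, p in zip(marker, pattern)):
            hits.append(name)
    return hits


# Hypothetical bin signatures for a 14-individual bin mapping set.
bins = {
    "LG2_bin1": "aabbabababbaab",
    "LG2_bin2": "aabbabababbabb",
    "LG15_bin3": "bbabaabbabaaba",
}

print(assign_bin("aabbabababbabb", bins))  # complete data: one bin
print(assign_bin("aabbabababba-b", bins))  # missing call: ambiguous
```

This sketch assumes the marker is scored in the same linkage phase as the framework markers; a real implementation would also have to test the inverted pattern and tolerate genotyping errors.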
In this study, a total of 71,482 SNPs were detected in 9555 nonredundant coding regions from a set of 350,051 apple ESTs. This corresponds to an average occurrence of one SNP every 149 bp, and approximately one out of two contigs contains at least one putative SNP. This is comparable to the frequency of SNP discovery observed in other outcrossing plant species, such as pine (one SNP every 102 bp) [18], and lower than white clover (one SNP every 54 bp) [21]. A preliminary SNP detection conducted using the HortResearch EST dataset alone [5] reported the presence of 18,408 SNPs in 3915 contigs. In this study, with an increase in the number of ESTs analyzed from 151,687 to 350,051, the number of contigs and the cumulative sequence lengths covered showed a 2.1- to 2.4-fold increase, resulting in a drastic increase (3.9-fold) in the number of SNPs detected. This increase is probably due to the larger number of genotypes used in generating cDNA libraries represented in the combined EST dataset. The analysis by Newcomb et al. [5] was based on a dataset predominantly (79.8%) consisting of sequences from one apple cultivar (Royal Gala), while the expanded set was derived from a diverse set of apple cultivars, thus reducing the contribution from Royal Gala sequences to 40.1%, followed by GoldRush (32.2%) and then M.9 (5.0%). These results confirm the reported hypothesis that the number of individuals used for generating ESTs has a strong influence on SNP detection and frequency [18]. The forthcoming whole genome sequence of the apple will be based on a single apple genotype, Golden Delicious. Although Golden Delicious is a diploid cultivar, thus permitting the detection of SNPs heterozygous for this cultivar, many variants present in the wider apple germplasm base will remain undetected.
With a single genotype sequence, the probability of detecting SNPs will be limited by the fact that only two haploid genomes are represented in any one individual, even in cases where the read depth is high (whole genome sequence data usually have a minimum read depth of 10× coverage). Indeed, out of 164 SNPs heterozygous in Royal Gala that were sequenced in Golden Delicious, 76 (46.3%) were found to be homozygous in Golden Delicious. While providing a useful framework for assessing genome variation, the apple whole genome sequence is not expected to provide a complete picture of the extant genetic variation present in the entire species. Additional sequencing using multiple genotypes will be required to enhance the power and/or efficacy of SNP detection and its downstream utilization. This is now possible even for crops such as the apple, given the recent reduction in sequencing costs due to new technologies and platforms for high-throughput sequencing. In this study, a markedly higher proportion of synonymous SNPs (56.3%) have been detected than expected (24%) if mutations occurred randomly. This observation is likely to be due to selective pressure operating on the position of SNPs within a gene and imposing variation constraints. This results in synonymous SNPs being more likely to be retained at certain sites for genes under purifying selection. However, it should be noted that the approach for determining the position of start codons used in this study is very conservative, i.e., comparison of highly conserved sequences between apple and queried protein sequences. This indicates that the set of sequences used in this study is biased and as a result may be subjected to a correspondingly biased set of selection pressures. This renders extending our conclusions to all other apple genes or to the whole genome difficult. 
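The text compares the observed proportion of synonymous SNPs (56.3%) with the roughly 24% expected if mutations occurred randomly, without stating how the expectation was derived. One simple model that yields a figure of this size is sketched below: every single-base substitution in each of the 61 sense codons of the standard genetic code is treated as equally likely. This model is an assumption made for illustration; the function name `substitution_counts` is hypothetical.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered with
# bases T, C, A, G and the third position varying fastest.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def substitution_counts() -> dict[str, int]:
    """Count all single-base substitutions in sense codons by effect."""
    counts = {"synonymous": 0, "nonsynonymous": 0, "nonsense": 0}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # skip the three stop codons
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                new_aa = CODON_TABLE[mutant]
                if new_aa == aa:
                    counts["synonymous"] += 1
                elif new_aa == "*":
                    counts["nonsense"] += 1
                else:
                    counts["nonsynonymous"] += 1
    return counts


counts = substitution_counts()
total = sum(counts.values())
expected_synonymous = counts["synonymous"] / total
print(counts, f"{expected_synonymous:.1%}")
```

Under this uniform model, 134 of the 549 possible substitutions are synonymous (about 24.4%), which is in line with the expectation quoted in the text. Unequal codon usage or a transition bias would shift the value.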
Although both synonymous and nonsynonymous SNPs are equally useful for mapping, nonsynonymous SNPs are arguably better targets for correlating genotypes with phenotypes in candidate gene mapping approaches, since nonsynonymous changes are more likely to lead to changes in protein structure which, in turn, are more likely to have an effect on plant phenotype. To determine whether nonsynonymous SNPs can contribute to changes in protein structure, additional bioinformatics analysis is required, e.g., analysis of a number of conservative versus nonconservative substitutions generated by nonsynonymous changes. However, synonymous SNPs may also contribute to phenotypic variations and should not be completely ruled out [22]. This dataset provides an important resource for association studies to determine those SNPs linked to trait(s) of interest to breeders.

True SNPs versus paralogous variations
When bioinformatics tools are used to assemble EST datasets and detect SNPs, alignments of contig sequences correspond only to putative gene contigs, and may contain both paralogous and homeologous sequences. Unfortunately, there is no set value for sequence assembly that can eliminate this problem, and the risk of compacting contigs creates difficulties. As the apple genome is known to have originated from an ancient tetraploidization [23], presence of homeologs in contigs of highly conserved genes is not surprising. Similarly, it is not unusual to find paralogs within contigs of genes that have undergone a recent duplication event or where the duplication event is more ancient, but both copies are under selective constraints. In this study, 23.7% of putative SNPs correspond to either paralogous or homeologous sequence variations, which is a lower proportion than that observed in white clover, another paleotetraploid outcrossing plant [21].
It might be expected that the approach used to select for NR contigs has resulted in additional bias toward contigs containing homeologous genes, and this may have influenced the frequency of SNPs detected. An independent assembly of the public Malus ESTs has been performed and it is available on the Genomics Database for Rosaceae (GDR; www.rosaceae.org; Malus assembly v3). This contig set has detected 14,298 SNPs in 23,868 contigs. However, this bioinformatics analysis is based on assembly parameters less stringent (CAP3, -p 90) than those used in this study, which likely increases the abundance of paralogous sequences clustered within contigs. Hence, we speculate that a substantial number of the putative SNPs detected from the GDR assembly must be due to nonallelic variations.
SNP validation and bin mapping
The SNP validation strategy used in this study consists of sequencing PCR products obtained from genomic DNA PCR amplification. Segregation is then tested in a bin mapping set using both resequencing and HRM. Allelic variants, true SNPs, are likely verified in alignments obtained from sequenced fragments. Moreover, if these SNPs exhibit variation for Royal Gala in EST alignments, they should also segregate in a controlled cross having Royal Gala as one of the parents. For those instances where in silico-detected SNPs could not be detected upon resequencing, these were deemed either sequencing or cloning errors. For those cases where verified sequence variants did not exhibit any segregation in the bin mapping set, these must have resulted from either paralogous or homeologous loci. Overall, only a small subset of original SNPs identified were evaluated by resequencing. From an initial 1434 putative SNPs located in 464 PCR amplicons designed, only 471 SNPs, including paralogous variants and sequencing errors, could be retrieved and analyzed. This is attributed to the sequencing technique used, which generated a high proportion of poor-quality sequences. Similar observations were made when a comparable method was used to develop SNP markers in Vitis vinifera [24]. When the HRM approach was used, a higher proportion of candidate SNPs were validated and mapped (84 out of 167 amplicons), and thus this approach was deemed more efficient. Relatively few HRM reactions gave no amplification products due to either one or more primers traversing a splicing site, but some did exhibit complex melting patterns that were difficult to score. Therefore, the HRM approach is the method of choice for future SNP development and for medium level throughput genotyping. Although the HRM technique has been used to detect mutations associated with chronic diseases in humans [25,26] and for detection of RNA editing in Arabidopsis [27], this is the first example, to our knowledge, of its use in gene mapping in plants. The strategy used in this study went beyond verification of segregation of apple SNP markers, as a selective mapping approach has been also used to identify putative chromosomal locations of these SNPs. Use of a carefully chosen set of individuals has enabled efficient validation with simultaneous mapping of SNPs. The apple genetic framework map developed in this study for this purpose is the first for the high-quality cultivar Royal Gala, and this is the first bin mapping set developed for this cultivar. Previously, a bin mapping strategy has been successfully employed in the peach, another Rosaceae species [10], and in an apple rootstock map [14].
Potential use of these new SNPs for marker/trait associations
Although some EST-based SSR and NBS-LRR markers have recently been mapped [14,28–30] (166 in total), published genome maps for the apple are relatively poor in transcribed sequences compared with other crops. This study adds a new set of 93 gene coding markers to the Royal Gala map, which is the largest increase in gene coding sequence for an apple genetic map reported so far. This set represents a good resource for identifying genome colocation events between candidate genes and QTLs. For example, our new set of markers include some resistance gene analogues that map close to major resistance genes for several pests and diseases. SNP markers developed from DQ644420 and EB121887 mapped to the middle and bottom of LG 2, respectively, where several resistance genes controlling the fungal disease apple scab have been reported [31]. Other previously mapped genes, such as ACC synthase (MDU73816, LG 15 [32]) and an allergen protein (EB133053, LG 13 [33]), have been remapped using these new apple SNPs, confirming the validity of the approach. Contigs containing these newly developed SNP markers will enable the development of orthologous markers for comparative genome mapping studies in Rosaceae. When the best blastn score against Prunus sequences was analyzed (Supplemental Material 3), high sequence similarities between ESTs used to design these new markers and Prunus ESTs were found, with 78.5% of the new apple contigs showing an expected blastn score <1 × 10⁻²⁰. This dataset presents a particularly useful resource to build on previous mapping studies [34], and thus for assessing the degree of synteny among members of the Rosaceae family. Candidate orthologous markers for these apple genes can be used across the family Rosaceae, using either sequencing or HRM to identify the SNPs present in each genus.
This study is the first example of systematic SNP development from sequence information in the apple. It has demonstrated the importance of using sequence databases containing a broad germ plasm base. The candidate genes and approaches used are valuable resources for future SNP development for genetic mapping, comparative genome mapping, and association studies.
Materials and methods
Construction of a genetic map for a ‘Royal Gala’ × A689/24 cross and development of a bin mapping set
A genetic map was constructed using a population of 173 individuals from a cross between Royal Gala and A689-24. Trees were grown at the HortResearch orchard in Havelock North, New Zealand. DNA was extracted as described by Gardiner et al. [35]. The map was constructed using simple sequence repeats (SSRs), amplified fragment length polymorphisms (AFLPs), and sequence characterized amplified regions (SCARs). SSRs were PCR amplified as described by Maliepaard et al. [36]. The parental linkage maps were constructed with the aid of Joinmap v3.0 [37] using markers informative for each of the parents, and according to the double-pseudo-testcross mapping strategy [38]. A minimum LOD score of 3.0 was used for grouping, the Kosambi function was used for map distance calculation, and the maps were constructed after three rounds. SNPs were validated by evaluating their segregation in a subset of plants selected from the whole mapping population using the following protocol. For each linkage group, markers were sorted and selected to cover the genome with intervals of 10 to 30 cM, referred to as bins. Markers with missing data were not selected. Those individuals presenting the most unique recombination events for these bins were selected to reduce redundancy in the progeny set. A set of 14 of these individuals (the bin mapping set), representing a large number of distinct recombination events over available linkage groups, was selected by an iterative approach using manual expert evaluation [14].
EST sources and database construction
A total of 350,051 sequences from three different sets of apple ESTs was used to populate the database (Supplemental Material 4). The first set was described by Newcomb et al. [5] and consists of 151,687 ESTs, with a large proportion (78.9%) obtained from the cultivar Royal Gala.
A second set was developed at the University of Illinois and consisted of 101,581 publicly released ESTs [39] and 80,660 new sequences originating from a number of apple accessions. A third, smaller set corresponded to various apple coding sequences deposited in GenBank. ESTs were assembled into a single nonredundant dataset and annotated using Bioview [40] as described by Newcomb et al. [5]. Sequencing errors were minimized during the assembly process by removing low-quality sequence regions. The cultivar of origin was recorded for each EST in order to identify the genetic background of each sequence, and this information for sets one and two is available on the GDR database.
SNP discovery and characterization
An automated SNP tool developed within Bioview [40] was used to search for SNPs within EST alignments. Sequence variations with both variants occurring at least twice in the contig alignments were retained in order to minimize detection of sequencing errors. Because no whole genome sequence was publicly available for apple against which gene sequences could be aligned, a conservative approach was used to ensure accurate open reading frame (ORF) annotation. A subset of highly conserved contigs was used rather than an automated bioinformatics analysis (e.g., based on ORF finder type scripts), which is prone to errors. A subset of 600 nonredundant contigs was selected according to the following criteria: (a) best blastx matches with expected values of less than 1 × 10⁻²⁰ when checked against plant proteins (UniRef90 database [41]), and (b) alignments between queried and translated sequences that begin with the first amino acid of the protein and in which the first 15 amino acids are identical.
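The redundancy rule described above, under which a sequence variation is retained only when both variants occur at least twice in the contig alignment, can be sketched as follows. This is a minimal illustration and not the Bioview SNP tool itself, whose implementation is not described here. The function name `call_putative_snps`, the `min_count` parameter, and the representation of the alignment as equal-length strings with `-` for gaps or uncovered positions are all assumptions made for the example.

```python
from collections import Counter


def call_putative_snps(aligned_reads, min_count=2):
    """Return (position, supported_bases) for columns with >= 2 well-supported variants.

    aligned_reads: list of equal-length strings over A/C/G/T, with '-' for
    gaps or positions not covered by a read (hypothetical input format).
    min_count: minimum number of reads that must carry each variant.
    """
    if not aligned_reads:
        return []
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("all aligned reads must have the same length")

    snps = []
    for pos in range(length):
        # Count only unambiguous bases in this alignment column.
        counts = Counter(read[pos] for read in aligned_reads if read[pos] in "ACGT")
        supported = sorted(base for base, n in counts.items() if n >= min_count)
        if len(supported) >= 2:
            snps.append((pos, supported))
    return snps


# Toy contig: column 2 has G (3 reads) and T (2 reads); column 5 has a singleton A.
reads = ["ACGTAC", "ACGTAC", "ACTTAC", "ACTTAC", "ACGTAA"]
print(call_putative_snps(reads))
```

In the toy alignment the singleton variant in the last column is discarded as a likely sequencing error, while the column where each base is seen at least twice is kept as a putative SNP. A filter of this kind cannot distinguish allelic variation from paralogous or homeologous variation, which is why resequencing and segregation testing are still needed.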
Contigs were first checked for the presence of SNPs heterozygous in Royal Gala, and then SNPs were annotated by comparing the amino acid translations to determine whether they were synonymous or nonsynonymous, i.e., whether they coded for an amino acid change.
SNP validation and mapping
A set of SNPs located in translated regions, both synonymous and nonsynonymous, were selected for a validation trial and to estimate the proportion of sequence variations corresponding to either true SNPs or sequencing errors or attributable to gene duplication. PCR primer pairs were designed using Primer 3 (http://frodo.wi.mit.edu/primer3/input.htm) to yield PCR products that could be sequenced in a single-pass sequencing reaction (300 to 550 bp), encompassing at least one SNP, and preferably showing some sequence variability for Royal Gala in EST alignments. Based on different sets of individuals used for PCR amplification and sequencing, two approaches were used to validate SNPs. The first approach consisted of resequencing PCR fragments amplified from the 14 highly informative individuals from the Royal Gala × A689-24 mapping population (bin mapping set) along with Golden Delicious and Coop 17. The second approach consisted of resequencing PCR fragments from six apple genotypes (Royal Gala, Malling 9, Golden Delicious, Coop 17, Fuji, and GoldRush) and then redesigning primers to generate shorter fragments (<300 bp), suitable for SNP analysis by the high-resolution melting (HRM) technique [42] on a LightCycler 480 instrument (Roche Diagnostics), in the bin mapping set. Briefly, this latter technique allows detection of mutations based on differential melting of PCR-amplified double-stranded DNA fragments. The melting analysis is performed at the end of the PCR reaction, and the reaction mix contains a high-fidelity intercalating dye. Products are slowly denatured, reannealed to initiate the formation of heteroduplexes, and then melted again.
The decrease in the fluorescence intensity is measured, and the difference in the melting temperature signals whether or not the sample contains heteroduplexes (and hence is heterozygous). For both approaches, raw sequence traces were aligned for each amplicon using Sequencher v4.5 (Gene Codes, Ann Arbor, MI, USA), and SNPs were visually scored for each genotype.
Acknowledgments
This work was funded by HortResearch (to DC and SEG). This project was also supported by the USDA Cooperative State Research, Education and Extension Service, National Research Initiative, Plant Genome Program Grant 2005-35300-15538 (to SSK). The authors thank Dr. Jean-Marc Celton for providing DNA from the M.9 × R5 population, Paula Jones and Anthony Thrush from Roche Diagnostic NZ Ltd for their help with the HRM technique, and Drs. Nnadozie Oraguzie and Vincent Bus for helpful comments on the manuscript.
Appendix A. Supplementary data
Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.ygeno.2008.07.008.
References
[1] J. Yu, S.N. Hu, J. Wang, G.K.S. Wong, S.G. Li, B. Liu, Y.J. Deng, L. Dai, Y. Zhou, X.Q. Zhang, et al., A draft sequence of the rice genome (Oryza sativa L. ssp indica), Science 296 (5565) (2002) 79–92.
[2] S.A. Goff, D. Ricke, T.H. Lan, G. Presting, R.L. Wang, M. Dunn, J. Glazebrook, A. Sessions, P. Oeller, H. Varma, et al., A draft sequence of the rice genome (Oryza sativa L. ssp japonica), Science 296 (5565) (2002) 92–100.
[3] X.Y. Lin, S.S. Kaul, S. Rounsley, T.P. Shea, M.I. Benito, C.D. Town, C.Y. Fujii, T. Mason, C.L. Bowman, M. Barnstead, et al., Sequence and analysis of chromosome 2 of the plant Arabidopsis thaliana, Nature 402 (6763) (1999) 761.
[4] K. Mayer, C. Schuller, R. Wambutt, G. Murphy, G. Volckaert, T. Pohl, A. Dusterhoft, W. Stiekema, K.D. Entian, N. Terryn, et al., Sequence and analysis of chromosome 4 of the plant Arabidopsis thaliana, Nature 402 (6763) (1999) 769.
[5] R.D. Newcomb, R.N.
Crowhurst, A.P. Gleave, E.H.A. Rikkerink, A.C. Allan, L.L. Beuning, J.H. Bowen, E. Gera, K.R. Jamieson, B.J. Janssen, et al., Analyses of expressed sequence tags from apple, Plant Physiol. 141 (1) (2006) 147–166.
[6] S.E. Gardiner, V.G.M. Bus, R.L. Rushholme, D. Chagné, E. Rikkerink, Apple, in: C.R. K (Ed.), Genome Mapping & Molecular Breeding in Plants, Horticultural trees, vol. 4, Springer, New York, 2006, pp. 1–62.
[7] C.M. Rommens, All-native DNA transformation: a new approach to plant genetic engineering, Trends Plant Sci. 9 (9) (2004) 457–464.
[8] D. Chagné, J. Batley, D. Edwards, J. Forster, Chapter 5: Single Nucleotide Polymorphism genotyping in plants, in: N.C. Oraguzie, E. Rikkerink, S.E. Gardiner, N.H. De Silva (Eds.), Association mapping in plants, Springer, New York, USA, 2006, pp. 77–94.
[9] D. Chagné, C.M. Carlisle, C. Blond, R.K. Volz, C.J. Whitworth, N.C. Oraguzie, R.N. Crowhurst, A.C. Allan, R.V. Espley, R.P. Hellens, et al., Mapping a candidate gene (MdMYB10) for red flesh and foliage colour in apple, BMC Genomics (2007) 8.
[10] W. Howad, T. Yamamoto, E. Dirlewanger, R. Testolin, P. Cosson, G. Cipriani, A.J. Monforte, L. Georgi, A.G. Abbott, P. Arus, Mapping with a few plants: Using selective mapping for microsatellite saturation of the Prunus reference map, Genetics 171 (3) (2005) 1305–1309.
[11] T.J. Vision, D.G. Brown, D.B. Shmoys, R.T. Durrett, S.D. Tanksley, Selective mapping: A strategy for optimizing the construction of high-density linkage maps, Genetics 155 (1) (2000) 407–420.
[12] P. Roche, G. van Arkel, A.W. van Heusden, A specific PCR assay for resistance to biotypes 1 and 2 of the rosy leaf curling aphid in apple based on an RFLP marker closely linked to the Sd(1) gene, Plant Breeding 116 (6) (1997) 567–572.
[13] R. Liebhard, B. Koller, L. Gianfranceschi, C. Gessler, Creating a saturated reference map for the apple (Malus × domestica Borkh.) genome, Theor. Appl. Genet. 106 (8) (2003) 1497–1508.
[14] J.-M. Celton, D.S. Tustin, D.
Chagné, S.E. Gardiner, Construction of a dense genetic linkage map for apple rootstocks using SSRs developed from Malus ESTs and Pyrus genomic sequences. Tree Genet. Genomes, doi:10.1007/s11295-008-0171-z.
[15] L. Picoult-Newberg, T.E. Ideker, M.G. Pohl, S.L. Taylor, M.A. Donaldson, D.A. Nickerson, M. Boyce-Jacino, Mining SNPs from EST databases, Genome Res. 9 (2) (1999) 167–174.
[16] P. Taillon-Miller, Z. Gu, Q. Li, L. Hillier, P.-Y. Kwok, Overlapping genomic sequences: A treasure trove of single-nucleotide polymorphisms, Genome Res. 8 (7) (1998) 748–754.
[17] J. Batley, G. Barker, H. O'Sullivan, K.J. Edwards, D. Edwards, Mining for single nucleotide polymorphisms and insertions/deletions in maize expressed sequence tag data, Plant Physiol. 132 (1) (2003) 84–91.
[18] L. Le Dantec, D. Chagné, D. Pot, O. Cantin, P. Garnier-Gere, F. Bedon, J.M. Frigerio, P. Chaumeil, P. Leger, V. Garcia, et al., Automated SNP detection in expressed sequence tags: statistical considerations and application to maritime pine sequences, Plant Mol. Biol. 54 (3) (2004) 461–470.
[19] C. Lopez, B. Piegu, R. Cooke, M. Delseny, J. Tohme, V. Verdier, Using cDNA and genomic sequences as tools to develop SNP strategies in cassava (Manihot esculenta Crantz), Theor. Appl. Genet. 110 (3) (2005) 425–431.
[20] D.J. Somers, R. Kirkpatrick, M. Moniwa, A. Walsh, Mining single-nucleotide polymorphisms from hexaploid wheat ESTs, Genome 46 (3) (2003) 431–437.
[21] N.O.I. Cogan, R.C. Ponting, A.C. Vecchies, M.C. Drayton, J. George, P.M. Dracatos, M.P. Dobrowolski, T.I. Sawbridge, K.F. Smith, G.C. Spangenberg, et al., Gene-associated single nucleotide polymorphism discovery in perennial ryegrass (Lolium perenne L.), Mol. Genet. Genom. 276 (2) (2006) 101–112.
[22] C. Kimchi-Sarfaty, J.M. Oh, I.-W. Kim, Z.E. Sauna, A.M. Calcagno, S.V. Ambudkar, M.M.
Gottesman, A “Silent” Polymorphism in the MDR1 Gene Changes Substrate Specificity, Science 315 (5811) (2007) 525–528.
[23] R.C. Evans, C.S. Campbell, The origin of the apple subfamily (Maloideae; Rosaceae) is clarified by DNA sequence data from duplicated GBSSI genes, Am. J. Bot. 89 (9) (2002) 1478–1484.
[24] M. Troggio, G. Malacarne, G. Coppola, C. Segala, D.A. Cartwright, M. Pindo, M. Stefanini, R. Mank, M. Moroldo, M. Morgante, et al., A Dense Single-Nucleotide Polymorphism-Based Genetic Linkage Map of Grapevine (Vitis vinifera L.) Anchoring Pinot Noir Bacterial Artificial Chromosome Contigs, Genetics 176 (4) (2007) 2637–2650.
[25] M. Liew, L. Nelson, R. Margraf, S. Mitchell, M. Erali, R. Mao, E. Lyon, C. Wittwer, Genotyping of human platelet antigens 1 to 6 and 15 by high-resolution amplicon melting and conventional hybridization probes, J. Mol. Diagnost. 8 (1) (2006) 97–104.
[26] J. Montgomery, C.T. Wittwer, J.O. Kent, L.M. Zhou, Scanning the cystic fibrosis transmembrane conductance regulator gene using high-resolution DNA melting analysis, Clin. Chem. 53 (11) (2007) 1891–1898.
[27] A.L. Chateigner-Boutin, I. Small, A rapid high-throughput method for the detection and quantification of RNA editing based on high-resolution melting of amplicons, Nucl. Acids Res. 35 (17) (2007).
[28] F. Calenge, C.G. van der Linden, E. van de Weg, H.J. Schouten, G. van Arkel, C. Denance, C.E. Durel, Resistance gene analogues identified through the NBS-profiling method map close to major genes and QTL for disease resistance in apple, Theoret. Appl. Genet. 110 (4) (2005) 660–668.
[29] E. Silfverberg-Dilworth, C.L. Matasci, W.E. van de Weg, M.P.W. van Kaauwen, M. Walser, L.P. Kodde, V. Soglio, L. Gianfranceschi, C.E. Durel, F. Costa, et al., Microsatellite markers spanning the apple (Malus × domestica Borkh.) genome, Tree Genet. Genomes 2 (4) (2006) 202–224.
[30] N. Suresh, C. Hampson, K. Gasic, G. Bakkeren, S.S.
Korban, Development and linkage mapping of E-STS and RGA markers for functional gene homologues in apple, Genome 49 (8) (2006) 959–968.
[31] V.G.M. Bus, E.H.A. Rikkerink, W.E.v.d. Weg, R.L. Rusholme, S.E. Gardiner, H.C.M. Bassett, L.P. Kodde, L. Parisi, F.N.D. Laurens, E.J. Meulenbroek, et al., The Vh2 and Vh4 scab resistance genes in two differential hosts derived from Russian apple R12740-7A map to the same linkage group of apple, Mol. Breeding 15 (1) (2005) 103–116.
[32] F. Costa, S. Stella, W.E. Van de Weg, W. Guerra, M. Cecchinel, J. Dallavia, B. Koller, S. Sansavini, Role of the genes Md-ACO1 and Md-ACS1 in ethylene production and shelf life of apple (Malus domestica Borkh), Euphytica 141 (1-2) (2005) 181–190.
[33] Z.S. Gao, W.E. van de Weg, J.G. Schaart, H.J. Schouten, D.H. Tran, L.P. Kodde, I.M. van der Meer, A.H.M. van der Geest, J. Kodde, H. Breiteneder, et al., Genomic cloning and linkage mapping of the Mal d 1 (PR-10) gene family in apple (Malus domestica), Theoret. Appl. Genet. 111 (1) (2005) 171–183.
[34] E. Dirlewanger, E. Graziano, T. Joobeur, F. Garriga-Caldere, P. Cosson, W. Howad, P. Arus, Comparative mapping and marker-assisted selection in Rosaceae fruit crops, Proc. Natl. Acad. Sci. U.S.A. 101 (26) (2004) 9891–9896.
[35] S.E. Gardiner, H.C.M. Bassett, D.A.M. Noiton, V.G. Bus, M.E. Hofstee, A.G. White, R.D. Ball, R.L.S. Forster, E.H.A. Rikkerink, A detailed linkage map around an apple scab resistance gene demonstrates that two disease resistance classes both carry the V (f) gene, Theoret. Appl. Genet. 93 (4) (1996) 485–493.
[36] C. Maliepaard, F.H. Alston, G. Van Arkel, L.M. Brown, E. Chevreau, F. Dunemann, K.M. Evans, S. Gardiner, P. Guilford, A.W. Van Heusden, et al., Aligning male and female linkage maps of apple (Malus pumila Mill.) using multi-allelic markers, Theoret. Appl. Genet. 97 (1-2) (1998) 60–73.
[37] J.W. Van Ooijen, R.E.
Voorrips, JoinMap® 3.0, Software for the calculation of genetic linkage maps, Plant Research International, Wageningen, The Netherlands, 2001.
[38] D. Grattapaglia, R. Sederoff, Genetic linkage maps of Eucalyptus grandis and Eucalyptus urophylla using a pseudo-testcross: Mapping strategy and RAPD markers. Genetics 137 (4) (1994) 1121–1137.
[39] S. Naik, C. Hampson, K. Gasic, G. Bakkeren, S.S. Korban, Development and linkage mapping of E-STS and RGA markers for functional gene homologs in apple, Genome 49 (2006) 959–968.
[40] R.N. Crowhurst, M. Davy, C. Deng, BioView - an enterprise bioinformatics system for automated analysis and annotation of non-genomic DNA sequence, 3rd Rosaceae Genomics Conference: 2006, Napier, New Zealand, 2006.
[41] B.E. Suzek, H.Z. Huang, P. McGarvey, R. Mazumder, C.H. Wu, UniRef: comprehensive and non-redundant UniProt reference clusters, Bioinformatics 23 (10) (2007) 1282–1288.
[42] M. Liew, R. Pryor, R. Palais, C. Meadows, M. Erali, E. Lyon, C. Wittwer, Genotyping of single-nucleotide polymorphisms by high-resolution melting of small amplicons, Clin. Chem. 50 (7) (2004) 1156–1164.