Exploiting EST databases for the development and characterization of EST-SSR markers in castor bean (Ricinus communis L.)

RESEARCH ARTICLE Open Access

Exploiting EST databases for the development and characterization of EST-SSR markers in castor bean (Ricinus communis L.)

Lijun Qiu 1,3, Chun Yang 1, Bo Tian 1, Jun-Bo Yang 2, Aizhong Liu 1*

Abstract

Background: The castor bean (Ricinus communis L.), a monotypic species in the spurge family (Euphorbiaceae, 2n = 20), is an important non-edible oilseed crop widely cultivated in tropical, sub-tropical and temperate countries for its high economic value. Because of the high level of ricinoleic acid (over 85%) in its seed oil, castor bean seed derivatives are often used in aviation oil, lubricants, nylon, dyes, inks, soaps, adhesives and biodiesel. Owing to the lack of efficient molecular markers, little is known about the population genetic diversity of castor bean germplasm or the genetic relationships within it. Efficient and robust molecular markers are increasingly needed for breeding and improving castor bean varieties. The advent of modern genomics has produced large amounts of publicly available DNA sequence data. In particular, expressed sequence tags (ESTs) provide valuable resources for developing gene-associated SSR markers.

Results: In total, 18,928 publicly available non-redundant castor bean EST sequences, representing approximately 17.03 Mb, were evaluated and 7732 SSR sites in 5,122 ESTs were identified by data mining. Castor bean exhibited a considerably high frequency of EST-SSRs. We developed and characterized 118 polymorphic EST-SSR markers from 379 primer pairs flanking repeats by screening 24 castor bean samples collected from different countries. A total of 350 alleles were identified from the 118 polymorphic SSR loci, ranging from 2 to 6 per locus (A) with an average of 2.97. The EST-SSR markers developed displayed moderate gene diversity (He) with an average of 0.41.
Genetic relationships among 24 germplasms were investigated using the genotypes of 350 alleles, showing a geographic pattern of genotypes across the genetic diversity centers of castor bean.

Conclusion: Castor bean EST sequences exhibited a considerably high frequency of SSR sites and were rich resources for developing EST-SSR markers. These EST-SSR markers would be particularly useful for both genetic mapping and population structure analysis, facilitating breeding and crop improvement of castor bean.

Background

Castor bean (Ricinus communis L., Euphorbiaceae, 2n = 20) is an important non-edible oilseed crop and its seed derivatives are often used in aviation oil, lubricants, nylon, dyes, inks, soaps, adhesives and biodiesel. Among all the vegetable oils, castor bean oil is distinctive due to its high level of ricinoleic acid (over 85%), a fatty acid consisting of 18 carbons, a double bond between C9 and C10, and a hydroxyl group attached to C12. Ricinoleic acid is responsible for the interest in castor bean oil, which has the highest and most stable viscosity index among all the vegetable oils combined with high lubricity, especially under low-temperature conditions. Although castor bean seeds are known to have been used by people as early as about 4000 BC [1], the origin of castor bean cultivation remains an open question. Castor bean's contemporary distribution in the warmer regions is worldwide, although its origin is obscured by wide dissemination in ancient times and the ease and rapidity with which it becomes established. Castor bean is indigenous to the southeastern Mediterranean Basin, Eastern Africa, and India, and most probably originated in tropical Africa [2,3]. Because of its high economic value, castor bean is widely cultivated in tropical, sub-tropical and temperate countries, particularly India, China and Brazil [4]. Due to increased demand for castor bean in many countries, breeding and improvement of varieties are drawing great attention from breeders [5].

* Correspondence: liuaizhong@xtbg.ac.cn
1 Key Laboratory of Tropical Forest Ecology, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, 88 Xuefu Road, Kunming 650223, PR China
Full list of author information is available at the end of the article

Qiu et al. BMC Plant Biology 2010, 10:278
http://www.biomedcentral.com/1471-2229/10/278

© 2010 Qiu et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Although the genus Ricinus is considered monotypic, castor bean varies greatly in its growth habit, color of foliage and stems, seed size and oil content [6,7]. Most types are large perennials that often develop into small trees in tropical or subtropical areas; in areas prone to frost, however, the plant is usually shorter and smaller and is grown as an annual. Castor bean thus exhibits great phenotypic diversity and phenotypic plasticity in response to environmental factors. However, little is known about castor bean's genetic diversity and the genetic basis of its phenotypic plasticity. Castor bean is usually considered to be both self- and cross-pollinated by wind, but controlled crossing studies suggest that outcrossing is a frequent mode of reproduction [8,9].

Germplasm collections constitute one of the world's most readily available sources of plant genetic material [10]. The USDA-ARS Plant Genetic Resources Conservation Unit (at Griffin, GA, USA) collected and maintained diverse germplasm resources of castor bean worldwide, which provided valuable germplasms for castor bean breeding and improvement of varieties.
There is an increasing need for efficient molecular markers to distinguish varieties reliably, establish their purity, and fingerprint released varieties, hybrids and the parental lines of castor bean germplasm held in different countries during breeding and improvement of varieties. Most cultivars have low productivity. The castor bean seed, meanwhile, contains the highly toxic protein ricin, which seriously limits its usage. The main goal of breeders is to develop high-productivity and nontoxic varieties of castor bean. Developing robust and reliable molecular markers associated with traits of interest will enhance the efficiency of breeding programs.

Simple sequence repeats (SSRs) or microsatellites showing extensive length polymorphisms have been widely used in DNA fingerprinting, genetic diversity studies, construction of genetic linkage maps and breeding applications [11]. Previous studies of genetic diversity suggested that SSRs are more informative and robust than other available molecular marker resources, such as amplified fragment length polymorphism (AFLP) and random amplified polymorphic DNA (RAPD), in castor bean [12,13]. In particular, SSR markers are readily transferable between laboratories as each locus is defined by the primer sequence. SSRs can be used not only for identifying cultivars but also for genetic mapping and marker-assisted selection [14,15]. Development of SSR markers specific to castor bean is critical and should be a priority for assisting in the breeding and improvement of varieties [5]. The SSR markers of castor bean are, however, very limited to date because the de novo development of SSRs is a costly and time-consuming endeavor [16,17]. The advent of the modern genomics age has produced large amounts of publicly available DNA sequence data. In particular, expressed sequence tags (ESTs) provide a valuable resource for identifying and developing gene-associated SSR markers.
Linkage of EST-SSR markers with desired characters may lead to the identification of genes controlling these traits [18]. In addition, EST-SSRs are universal and can be applied in comparative mapping and linkage map construction [19,20]. Therefore, in recent years, EST-SSRs have already been developed for various crops such as wheat and rice [21-25], barley [26-28], grape [29], tomato [30], sugar cane [19], coffee [31-33], oil palm [34] and rubber tree [35].

To our knowledge, there has been no report of development of EST-SSR markers in castor bean to date. Therefore, we report our work on EST-SSRs derived from castor bean ESTs in the database of the National Center for Biotechnology Information (NCBI), USA, based on (1) the frequency and distribution of SSRs in castor bean ESTs, (2) the establishment and validation of EST-SSR markers for detection of polymorphism in castor bean, and (3) the assessment of genetic relationships among 24 germplasm accessions collected from the main diversity centers of castor bean using the EST-SSR markers developed. These rich SSR resources from the castor bean EST database are publicly available, and the polymorphic EST-SSR markers reported herein would be particularly useful for genetic map-based analyses as well as population genetic studies, facilitating breeding and crop improvement of castor bean.

Results

Frequency and distribution of microsatellites

A total of 18,928 non-redundant, trimmed castor bean EST sequences were identified from 62,611 publicly available EST sequences by running the EST-TRIMMER and the CD-HIT programs. The search for microsatellites in the 18,928 non-redundant castor bean ESTs, representing approximately 13.68 Mb, revealed 7,732 microsatellites in 5,376 ESTs; nearly one in 3.5 unique ESTs (28.4%) contained at least one SSR; 2,356 ESTs contained more than one SSR and 573 SSRs were found as compound SSRs. This corresponds to an average distance between SSRs of approximately 1.77 kb (i.e.
one SSR per 1.77 kb) or one SSR per 2.45 ESTs. The SSRs identified comprised 1939 di-, 3698 tri-, 220 tetra-, 61 penta-, 138 hexa-, and 1676 mononucleotide repeats (Table 1). The trinucleotides are the dominant motifs (Figure 1). Among motif repeats, 1624 A/T repeats, accounting for 96.9% of total mononucleotide repeats (1676), were the dominant mono- motifs; 1350 AG/CT repeats, accounting for 69.6% of total dinucleotide repeats (1939), were the dominant di- motifs. However, the trinucleotide motifs were relatively diverse with 321 AAG/CTT, the richest repeat among tri- motifs, accounting for 8.7% of total trinucleotide motifs (3698). Similarly, there were no obvious dominant motifs among the tetra-, penta- and hexanucleotide motifs. Inspection of SSR location on EST sequences showed that 1344 mono- repeats (80.2%), 1362 di- repeats (70.3%), 183 tetra- repeats (83.2%), and 47 penta- repeats (77.1%) occurred within untranslated regions (UTRs), while 2813 tri- repeats (76.1%) and 101 hexa- repeats (73.2%) occurred within exon regions (see Figure 1).

Table 1 Occurrence of 7732 SSRs identified in a set of 18,928 non-redundant castor bean ESTs

Columns: SSR motif; number of SSRs at each number of repeats, listed in increasing order over the range 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, >15 (blank cells omitted); row total.

A/T  435 288 209 138 119 83 352  1624
C/G  9 14 11 6 4 3 5  52
AC/GT  49 27 11 8 11 3 2 1 2 4  117
AG/CT  623 200 130 81 43 58 29 56 38 17 25 49  1350
AT/TA  181 63 37 40 28 17 28 15 14 6 7 33  469
CG/GC  2 1  3
AAC/GTT  142 41 31 11 5 1 2  233
AAG/CTT  419 184 109 58 42 20 18 17 1 1  869
AAT/ATT  166 96 39 34 2 8 2 2 1 1  351
ACC/GGT  326 125 54 28 7 13 1  554
ACG/CGT  41 18 8 2 3 2  74
ACT/AGT  24 17 8 3 1  53
AGC/GCT  349 135 47 28 22 7 5 1  614
AGG/CCT  177 50 24 19 10 3 1 1  285
ATC/GAT  295 82 30 27 18 6 1 2  461
CCG/CGG  136 34 18 16  204
AAAC/GTTT  12 1  13
AAAG/CTTT  54 24 5 3 4  90
AAAT/ATTT  33 3 1  37
Other Tetra-*  56 17 6 1  80
AAAGA  10 1  11
Other Penta-*  44 5 1  50
Hexa-*  106 19 11 2  138
N  444 302 220 144 123 86 357  1676
NN  855 290 179 129 82 78 59 71 53 23 34 86  1939
NNN  2095 782 368 226 110 59 27 22 5 0 0 2 2  3698
NNNN  155 45 11 4 4 0 0 0 0 0 0 0 1  220
NNNNN  54 5 0 1 0 1 0 0 0 0 0 0 0  61
NNNNNN  106 19 11 2 0 0 0 0 0 0 0 0 0  138
TOTAL  2410 1706 680 412 243 142 549 383 296 197 146 122 446  7732

* Motifs with fewer than 10 SSRs are not listed.

Figure 1 Number of mono-, di-, tri-, tetra-, penta- and hexa-SSRs and their distribution between UTR and exon regions. (Bar chart; x-axis: SSR type; y-axis: SSR number, 0 to 4000; series: exon region and UTR region.)

Polymorphism and genera transferability of EST-SSR markers

Out of 6056 SSRs embedded within 3871 ESTs, excluding 1676 MNRs, primer pairs could be designed for 4223 SSR loci (69.7%) using PRIMER3. For the remaining loci, either the sequence flanking the SSR was too short or the sequence was unsuitable for primer design. Three hundred and seventy-nine primer pairs flanking 151 dinucleotide repeats (DNRs), 185 trinucleotide repeats (TNRs), 35 tetranucleotide repeats (TeNRs), 4 pentanucleotide repeats (PNRs) and 4 hexanucleotide repeats (HNRs) were assayed to test the polymorphism and cross-genera transferability of EST-SSRs in 24 accessions collected worldwide (see additional file 1, Table S1). In 308 (81.2%) cases, PCR products could be amplified from genomic DNA, while 71 primer pairs failed completely, amplified too weakly, or amplified multiple bands; these 71 primer pairs were excluded from further analysis (see additional file 2, Table S2). In 21 cases, the amplicons obtained were obviously larger than expected from the EST sequence, probably due to the presence of introns.
The amplification of introns may cause problems, since fragments above 300 bp could not be scored accurately for small differences in fragment size. Additionally, it can be assumed that in several cases the observed polymorphism is caused by a size polymorphism within the intron, which may overshadow a putative polymorphism of the microsatellite. Thus the 21 primer pairs containing obvious introns and producing fragments over 300 bp were also excluded from further analyses. One hundred and sixty-nine primer pairs were monomorphic, covering 56 di- motif loci, 104 tri- motif loci and 9 tetra- motif loci. In total, 118 polymorphic EST-SSR markers from 287 primer pairs were identified, including 68 di- motif loci, 42 tri- motif loci and 8 tetra- motif loci (see additional file 2, Table S2). The proportion of polymorphic primers was 41.1%. The polymorphic proportions of di-, tri-, and tetra- motif loci were 54.8%, 28.8% and 47%, respectively. From the 118 loci we identified 350 alleles with an average of 2.97 alleles per locus (Table S3, Figure 2). Of the 350 alleles, 223 alleles were from di- loci with an average of 3.28 per locus, and 107 alleles were from tri- loci with an average of 2.49 per locus. Across 118 loci, gene diversity (expected heterozygosity, He) ranged from 0.08 to 0.78 (mean = 0.41 ± 0.02). Among the 68 dinucleotide loci and 42 trinucleotide loci, the mean He values were 0.44 and 0.37, respectively. Dinucleotide SSRs were significantly more polymorphic than trinucleotide SSRs (nA and He, both P < 0.01; 2-sample t test). Across 118 loci, PIC values ranged from 0.07 to 0.73 (mean = 0.36 ± 0.02), suggesting that the EST-SSR markers developed had a moderate level of polymorphism.
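For reference, the two diversity statistics reported above can be computed from the allele frequencies at a locus. The following is a minimal Python sketch, assuming the standard definitions He = 1 - sum(p_i^2) and PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2. This section does not state which estimator was used (for example, with or without a sample-size correction), so the sketch should not be read as the authors' exact calculation; the function names and the example allele sizes are hypothetical.

```python
from itertools import combinations


def allele_frequencies(alleles):
    """Return {allele: frequency} from a flat list of observed allele calls."""
    counts = {}
    for allele in alleles:
        counts[allele] = counts.get(allele, 0) + 1
    total = len(alleles)
    return {allele: n / total for allele, n in counts.items()}


def gene_diversity(freqs):
    """Expected heterozygosity, He = 1 - sum(p_i^2) (no sample-size correction)."""
    return 1.0 - sum(p * p for p in freqs.values())


def pic(freqs):
    """Polymorphism information content:
    PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2."""
    p = list(freqs.values())
    homozygosity = sum(x * x for x in p)
    cross_term = sum(2 * (x * x) * (y * y) for x, y in combinations(p, 2))
    return 1.0 - homozygosity - cross_term


# Hypothetical locus: two alleles (fragment sizes in bp) at equal frequency.
example = allele_frequencies([150, 150, 154, 154])
print(gene_diversity(example))  # 0.5
print(pic(example))             # 0.375
```

PIC is always at most He for the same locus, which is consistent with the mean PIC (0.36) lying below the mean He (0.41) reported here.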
BLAST analyses showed that 76 EST sequences from the developed 118 polymorphic SSR markers shared significant homology to Arabidopsis loci. The functional annotations of the markers developed are listed in Table S3 (see additional file 3).

To test the cross-genera transferability of EST-SSRs identified in castor bean to Jatropha curcas and Speranskia cantonensis, the 308 primer pairs that successfully amplified PCR products in castor bean were tested on genomic DNA of J. curcas and S. cantonensis under the same PCR conditions used for castor bean. Of the 308 primer pairs, 155 (50.2%) amplified in S. cantonensis and 74 (24.0%) amplified in J. curcas (see additional file 1, Table S1).

Figure 2 PCR products and their length polymorphisms of four EST-SSR markers (Rc05, Rc85, Rc28 and Rc158) on agarose gel among 24 germplasms (see Table 2 for the codes of germplasms).

Genetic relationships among germplasms

A dendrogram based on the UPGMA Nei-Li criteria was generated with five distinct clusters (Figure 3). Cluster I included two African (SA and MA) and two South American (BR and PE) accessions; Cluster II contained one African (DZ), one Russian (RU), and two west Asian (PK and IR) accessions; Cluster III comprised one North American (MX) and two Indian (IN-1 and IN-2) accessions; Cluster IV covered all Chinese (CN1-9) and Vietnamese (VN1-2) accessions. The dendrogram based on Neighbor-Joining criteria was very similar to the UPGMA tree, and the five distinct clusters (Cluster I, Cluster II, Cluster III, Cluster IV and Cluster V in Figure 3) were again identified, though there were slight differences in branch length within clusters (data not shown).
Figure 3 Dendrogram constructed from genetic distances estimated from genotypes of 118 EST-SSRs among 24 germplasms using the UPGMA Nei-Li criteria within PAUP*. The numbers beside lines denote the branch length (see Table 2 for the codes of germplasms).
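The clustering idea behind such a dendrogram can be illustrated with a small sketch. The Python code below is not the PAUP* procedure used for Figure 3. It assumes that each accession is represented by the set of alleles scored as present, computes the Nei-Li distance 1 - 2*n_xy/(n_x + n_y) between accessions, and repeatedly joins the two clusters with the smallest average pairwise distance (UPGMA, i.e. average linkage). The accession labels and allele identifiers are hypothetical, and branch lengths are not computed.

```python
from itertools import combinations


def nei_li_distance(a, b):
    """Nei-Li distance between two non-empty sets of scored alleles:
    1 - 2 * (shared alleles) / (alleles in a + alleles in b)."""
    shared = len(a & b)
    return 1.0 - 2.0 * shared / (len(a) + len(b))


def upgma(profiles):
    """Cluster {name: set_of_alleles} by UPGMA and return the tree topology
    as nested tuples. Assumes every profile is a non-empty set."""
    # Each cluster is (subtree, list of member names).
    clusters = [(name, [name]) for name in sorted(profiles)]

    def average_distance(members1, members2):
        # UPGMA: unweighted mean over all between-cluster member pairs.
        total = sum(
            nei_li_distance(profiles[x], profiles[y])
            for x in members1
            for y in members2
        )
        return total / (len(members1) * len(members2))

    while len(clusters) > 1:
        # Find the closest pair of clusters (first pair wins on ties).
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_distance(clusters[ij[0]][1], clusters[ij[1]][1]),
        )
        (tree1, members1), (tree2, members2) = clusters[i], clusters[j]
        merged = ((tree1, tree2), members1 + members2)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


# Hypothetical accessions, each with a set of integer allele identifiers.
toy = {
    "A": {1, 2, 3},
    "B": {1, 2, 4},
    "C": {7, 8, 9},
    "D": {7, 8, 10},
}
print(upgma(toy))  # (('A', 'B'), ('C', 'D'))
```

In this toy data set, accessions that share two of three alleles are joined first, and the two resulting pairs, which share no alleles, are joined last.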