Analysis of SSR and SNP Loci in Agrotis segetum Based on Transcriptome Sequencing

      Abstract:
      Objective To characterize the SSR and SNP loci of Agrotis segetum using transcriptome sequencing data.
      Method Total RNA was extracted from A. segetum, and the transcriptome was sequenced on the Illumina platform. SSR and SNP loci in the unigene sequences were then identified with MISA and GATK, respectively.
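
      The SSR search step can be illustrated with a minimal sketch. This is not MISA and not the configuration used in the study: the function name find_ssrs and the minimum repeat counts in MIN_REPEATS are assumptions chosen only for illustration, and the sketch detects perfect repeats only.

```python
import re

# Assumed minimum repeat counts per motif length (illustrative only;
# these are NOT the thresholds used in the study or MISA's settings).
MIN_REPEATS = {1: 12, 2: 6, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    `start` is the 0-based position of the first base of the repeat run.
    """
    seq = seq.upper()
    hits = []
    for size, min_rep in sorted(min_repeats.items()):
        # One motif of `size` bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" or "ATGATG"); the shorter motif size reports them.
            if any(
                motif == motif[:k] * (size // k)
                for k in range(1, size)
                if size % k == 0
            ):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // size))
    return sorted(hits)


# Hypothetical test sequence: a 12-base A run and five copies of ATG.
example = "GGC" + "A" * 12 + "CGT" + "ATG" * 5 + "CC"
ssrs = find_ssrs(example)
```

      On the hypothetical sequence above, the sketch reports one mononucleotide run and one trinucleotide repeat. A real analysis would run MISA itself on the full unigene FASTA file.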
      Result The A. segetum transcriptome database contained 66 469 unigene sequences. A MISA search of these unigenes identified 4 438 SSR loci distributed across 4 048 unigenes, which accounted for 6.09% of all sequences. Mononucleotide and trinucleotide repeats were the main types, accounting for 54.73% and 29.29% of all repeats, respectively, and A/T was the dominant motif. SSR motifs were repeated 4 to 24 times, with 5 repeats being the most common (21.67%). SSR loci ranged from 12 bp to 127 bp in length and included 3 493 moderately polymorphic loci and 820 highly polymorphic loci. A GATK search of the unigenes identified 371 148 SNP loci, comprising 237 619 transitions and 133 529 transversions. C/T was the most frequent SNP type (18.61% of all SNP loci), followed by G/A (18.28%); both are transitions. The proportion of transitions (64.02%) was significantly higher than that of transversions (35.98%).
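
      The proportions reported above follow directly from the counts, and the transition/transversion distinction is a simple rule on the two bases involved. The sketch below recomputes the percentages and classifies single-base substitutions; the helper names pct and snp_class are illustrative assumptions and are not part of GATK.

```python
# Counts as reported in the Result paragraph.
N_UNIGENES = 66469
SSR_UNIGENES = 4048
TRANSITIONS = 237619
TRANSVERSIONS = 133529

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


def snp_class(ref, alt):
    """Classify a single-base substitution.

    Purine<->purine or pyrimidine<->pyrimidine changes are transitions;
    purine<->pyrimidine changes are transversions.
    """
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or not {ref, alt} <= PURINES | PYRIMIDINES:
        raise ValueError("expected two different bases from A, C, G, T")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


total_snps = TRANSITIONS + TRANSVERSIONS
ssr_unigene_pct = pct(SSR_UNIGENES, N_UNIGENES)
transition_pct = pct(TRANSITIONS, total_snps)
transversion_pct = pct(TRANSVERSIONS, total_snps)
ti_tv_ratio = round(TRANSITIONS / TRANSVERSIONS, 2)
```

      The 6.09% figure corresponds to the share of unigenes that carry at least one SSR (4 048 of 66 469), not to the number of SSR loci per unigene.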
      Conclusion SSR loci in the A. segetum transcriptome database occur at a relatively high frequency, are diverse in type, and are rich in polymorphism, and transitions are the main type of SNP variation. These results provide an important scientific basis for future studies of A. segetum on population genetic structure and differentiation, evolutionary relationships, migration patterns, and integrated control of this pest.

       
