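[Keywords]
[Abstract]
To characterize the genetic information of Pangasius sutchi in depth and to support gene function research, total RNA was extracted separately from the brain, gill, heart, liver, spleen, head kidney, stomach, intestine, gonad, and muscle tissues of sexually mature P. sutchi, pooled in equal mass into a single sample, and sequenced using single-molecule real-time sequencing on the PacBio high-throughput sequencing platform, yielding full-length transcriptome information for the tissues of sexually mature P. sutchi. A total of 1 487 336 high-quality reads were obtained, with a mean length and N50 of 83 592 and 162 901 bp, respectively. After correction, 1 005 955 circular consensus sequences (circular consensus sequencing, CCS) were obtained, and after filtering, 667 973 polyA-containing full-length non-concatemer (FLNC) sequences were identified, with a mean length and N50 of 2 057 and 2 359 bp, respectively. In total, 614 078 (91.93%) FLNC sequences were used for gene and transcript annotation; the 19 835 known genes identified corresponded to 80 915 transcripts, and the 9 348 novel genes corresponded to 9 954 novel transcripts. In addition, 50 311 open reading frames, 79 922 alternative splicing events, 18 fusion genes, and 20 215 alternative polyadenylation sites were predicted. Among the novel genes, 3 912, 2 385, 2 167, 81, and 1 520 were annotated in the NR, GO, KEGG, KOG, and SwissProt databases, respectively. A further 4 624 long non-coding RNAs were predicted, regulating 32 283 target mRNAs. The full-length transcriptome sequencing data and functional annotation enrich the genetic resources available for P. sutchi, and this study provides a basis for further research on its biological characteristics and gene function.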
[Keywords]
[Abstract]
Pangasius sutchi, an economically important freshwater fish in Southeast Asia, is characterized by rapid growth, ease of cultivation, rich nutritional content, and the absence of small intermuscular bones. First introduced to China from Thailand in 1978, P. sutchi was successfully bred artificially in 1997 and has since been widely promoted in Guangdong, Guangxi, and Hainan. Current research on P. sutchi has focused primarily on breeding models, nutritional feed development, disease control, and fish product processing technology, with less emphasis on basic biology, particularly molecular biology. To elucidate the genetic basis of P. sutchi and support molecular biology research, this study sequenced the full-length transcriptome of the brain, gill, heart, liver, spleen, head kidney, stomach, intestine, gonad, and muscle of sexually mature P. sutchi using single-molecule real-time (SMRT) sequencing on the PacBio Sequel platform. A total of 1 487 336 high-quality reads were obtained, averaging 83 592 bp in length with an N50 of 162 901 bp. After self-correction, 1 005 955 circular consensus sequences (CCS) were derived, and after filtering, 667 973 polyA-containing full-length non-concatemer (FLNC) sequences were identified, averaging 2 057 bp in length with an N50 of 2 359 bp. For gene and transcript annotation, 614 078 (91.93%) FLNC sequences were used, identifying 19 835 known genes and 9 348 novel genes. In addition, 50 311 open reading frames (ORFs), 79 922 alternative splicing events, 18 fusion genes, and 20 215 alternative polyadenylation sites were predicted. Of the 9 348 novel genes, 3 912, 2 385, 2 167, 81, and 1 520 were annotated in the NR (non-redundant protein sequences), GO (Gene Ontology), KEGG (Kyoto Encyclopedia of Genes and Genomes), KOG (eukaryotic orthologous groups), and SwissProt databases, respectively.
GO enrichment analysis revealed that 1 309, 1 351, and 1 524 novel genes were enriched in the cellular process, cellular anatomical entity, and binding terms, respectively. KEGG enrichment indicated that the novel genes were primarily enriched in cellular processes such as eukaryotes (106), signal transduction (276), folding, sorting and degradation (79), amino acid metabolism (63), and endocrine system (197). A total of 4 624 lncRNAs were predicted in P. sutchi, regulating 32 283 target mRNAs. GO enrichment showed that the target mRNAs were mainly enriched in cellular process (12 084), cellular anatomical entity (18 034), and binding (12 772). KEGG analysis indicated that the target mRNAs were predominantly enriched in the transport and catabolism (1 437), signal transduction (4 165), folding, sorting and degradation (643), carbohydrate metabolism (584), and immune system (2 135) pathways. The full-length transcriptome sequencing data and functional annotation from this study enrich the genetic resources of P. sutchi and provide a basis for further research on its biological characteristics and gene function.
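As a reading aid for the summary statistics quoted above, the following minimal Python sketch shows one common way an N50 value is computed from a list of sequence lengths, and reproduces the reported proportion of FLNC sequences used for annotation (614 078 of 667 973). The function names `n50` and `percent` and the toy length list are hypothetical illustrations; they are not the pipeline used in this study.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


def percent(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


# Hypothetical toy lengths (bp), for illustration only; not study data.
toy_lengths = [500, 1200, 2000, 2400, 3100]
print(n50(toy_lengths))

# Counts taken from the abstract: FLNC sequences used for annotation / all FLNC.
print(percent(614_078, 667_973))
```

Because the longest sequences contribute most of the bases, N50 is typically larger than the mean length, in line with the FLNC values reported above (mean 2 057 bp, N50 2 359 bp).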
[CLC number]
Q 786;S 965.1
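[Funding]
National Natural Science Foundation of China (31201996); Guangdong Ocean University "Nanhai Scholars Program" Young Talent Project (QNXZ201903, 201807); Guangdong Ocean University Doctoral Start-up Project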