Journal of Computer Science and Technology ›› 2021, Vol. 36 ›› Issue (2): 288-298. doi: 10.1007/s11390-021-0798-x
Special topic: Emerging Areas
Xiu-Juan Lei1, Senior Member, CCF, Member, ACM, IEEE, Chen Bian1, and Yi Pan2,3,*, Senior Member, IEEE
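1. Background: Circular RNA (circRNA) is a special class of endogenous non-coding RNA with distinctive properties and diverse functions. In recent years, with the rapid development of high-throughput sequencing, circRNAs have been detected in archaea, plants, and animals, which has attracted wide attention from researchers. As circRNA data have accumulated, some of their biological functions have become clearer, such as acting as miRNA sponges, participating in transcriptional regulation, binding RNA-binding proteins, and being translated. Dysregulation of circRNAs can therefore lead to cellular dysfunction, abnormal expression, and growth defects. Studies have confirmed that many circRNAs are closely linked to the onset and progression of complex diseases such as gastric cancer, colorectal cancer, liver cancer, and glioma.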
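2. Objective: Identifying circRNA-disease associations through biological experiments costs considerable money and time, which has greatly limited progress in this direction. Building on the many databases now available and on progress in research on other RNAs, computational methods can address the high cost of experimental approaches. Because the number of confirmed circRNA-disease associations is still relatively small, we propose a new computational model that relies on fewer circRNA-disease associations and more circRNA biological information to discover unknown circRNA-disease associations.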
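3. Methods: We propose a model that predicts circRNA-disease associations based on an improved weighted biased meta-structure algorithm. Step 1: To expand the number of circRNAs, we obtained the expression profiles of 1,511 circRNAs from exoRBase. We then obtained the known associations between these 1,511 circRNAs and diseases from four databases: CircR2Disease, CircAtlas 2.0, Circ2Disease, and CircRNADisease. Step 2: We computed the expression profile similarity, sequence similarity, and Gaussian kernel similarity of circRNAs, as well as the semantic similarity and Gaussian kernel similarity of diseases. Step 3: We combined the circRNA-disease association network, the integrated circRNA similarity network, and the integrated disease similarity network into a heterogeneous network. Step 4: We applied the improved weighted biased meta-structure algorithm to the heterogeneous network to predict circRNA-disease associations.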
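Steps 2 and 3 can be illustrated with a minimal sketch. It assumes that the "Gaussian kernel similarity" is the Gaussian interaction profile (GIP) kernel commonly used in association prediction, which the abstract does not spell out, and it uses that kernel alone as a stand-in for the integrated similarity networks. The function names, the `gamma_prime` parameter, and the toy association matrix are all hypothetical and chosen for illustration only.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of a 0/1 matrix.

    Assumed form: K[i][j] = exp(-gamma * ||row_i - row_j||^2), where gamma is
    gamma_prime divided by the mean squared norm of all rows.
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel


def transpose(matrix):
    """Return the transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]


def build_heterogeneous_adjacency(circ_sim, assoc, dis_sim):
    """Assemble the block matrix [[S_c, A], [A^T, S_d]] as one adjacency matrix."""
    assoc_t = transpose(assoc)
    top = [circ_sim[i] + assoc[i] for i in range(len(circ_sim))]
    bottom = [assoc_t[j] + dis_sim[j] for j in range(len(dis_sim))]
    return top + bottom


# Toy association matrix: 3 circRNAs (rows) x 2 diseases (columns).
assoc = [[1, 0],
         [1, 1],
         [0, 1]]

circ_sim = gip_kernel(assoc)            # circRNA-circRNA similarity (3 x 3)
dis_sim = gip_kernel(transpose(assoc))  # disease-disease similarity (2 x 2)
network = build_heterogeneous_adjacency(circ_sim, assoc, dis_sim)  # 5 x 5
```

In the full model the similarity blocks would come from integrating several measures (expression profile, sequence, and semantic similarity), and the meta-structure algorithm of Step 4 would run on the resulting network; neither is reproduced here.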
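4. Results: Under leave-one-out cross validation, 10-fold cross validation, and 5-fold cross validation, our model, CDWBMS, achieved areas under the ROC curve (AUC) of 0.9216, 0.9172, and 0.9005, respectively. CDWBMS also performed well in accuracy (0.86), F1-score (0.88), and Matthews correlation coefficient (0.727). Case studies on gastric cancer, colorectal cancer, and breast cancer show that CDWBMS can predict unknown circRNA-disease associations.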
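For readers unfamiliar with the reported metrics, the following sketch shows how accuracy, F1-score, Matthews correlation coefficient, and AUC are conventionally computed from labels and prediction scores. It uses the standard textbook definitions, not the authors' evaluation code; the function names, the toy labels and scores, and the 0.5 decision threshold are assumptions made for illustration.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels in {0, 1}."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 0 and p == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    return (tp + tn) / (tp + fp + tn + fn)


def f1_score(tp, fp, tn, fn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def matthews_corrcoef(tp, fp, tn, fn):
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def roc_auc(y_true, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 3 known associations (label 1) and 3 unknown pairs (label 0).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

counts = confusion_counts(y_true, y_pred)
acc = accuracy(*counts)
f1 = f1_score(*counts)
mcc = matthews_corrcoef(*counts)
auc = roc_auc(y_true, scores)
```

AUC is threshold-free, whereas accuracy, F1-score, and the Matthews correlation coefficient depend on where the score is cut, which is why all of them are usually reported together.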
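5. Conclusions: Given the currently small number of known circRNA-disease associations, our model expands the number of circRNAs and so provides more room to explore new, unknown circRNA-disease relationships. In addition, the improvements to the meta-structure algorithm further raise its prediction performance. However, most disease-related circRNAs have no expression profile data. We will therefore further integrate circRNA-disease association data and combine them with circRNA biological data to improve the effectiveness of the model.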