Journal of Computer Science and Technology ›› 2021, Vol. 36 ›› Issue (2): 288-298. doi: 10.1007/s11390-021-0798-x

Special topic: Emerging Areas

Predicting CircRNA-Disease Associations Based on Improved Weighted Biased Meta-Structure

Xiu-Juan Lei1, Senior Member, CCF, Member, ACM, IEEE, Chen Bian1, and Yi Pan2,3,*, Senior Member, IEEE

  1 School of Computer Science, Shaanxi Normal University, Xi'an 710119, China;
  2 School of Computer Science and Control Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China;
  3 Department of Computer Science, Georgia State University, Atlanta, GA 30302, U.S.A.
  • Received: 2020-07-13  Revised: 2021-02-23  Online: 2021-03-05  Published: 2021-04-01
  • Contact: Yi Pan  E-mail: yipan@gsu.edu
  • About author: Xiu-Juan Lei is a professor and Ph.D. supervisor at Shaanxi Normal University, Xi'an. She received her Ph.D. degree from Northwestern Polytechnical University, Xi'an, in 2005. Her research interests include bioinformatics and intelligent computing.
  • Supported by: The work was supported by the National Natural Science Foundation of China under Grant Nos. 61972451, 61672334 and 61902230, and the Fundamental Research Funds for the Central Universities of China under Grant No. GK201901010.

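1. Background: Circular RNA (circRNA) is a special class of endogenous non-coding RNA with distinctive properties and diverse functions. In recent years, with the rapid development of high-throughput sequencing, circRNAs have been detected in archaea, plants and animals, which has attracted wide attention from researchers. As circRNA data have accumulated, some of their biological functions have become clearer, such as acting as miRNA sponges, participating in transcriptional regulation, binding RNA-binding proteins, and being translated. Dysregulation of circRNAs can therefore lead to cellular dysfunction, abnormal expression and growth defects. Studies have confirmed that many circRNAs are closely linked to the onset and progression of complex diseases such as gastric cancer, colorectal cancer, liver cancer and glioma.
2. Objective: Identifying circRNA-disease associations through biological experiments is costly and time-consuming, which has severely limited progress in this direction. Building on the databases now available and on advances in research on other RNAs, computational methods can address the high cost of experimental approaches. Given that the number of verified circRNA-disease associations is still relatively small, we propose a new computational model that uses a small number of known associations together with richer circRNA biological information to discover unknown circRNA-disease associations.
3. Methods: We propose a model that predicts circRNA-disease associations with an improved weighted biased meta-structure algorithm. Step 1: to enlarge the set of circRNAs, we obtained the expression profiles of 1511 circRNAs from exoRBase, and then collected the known associations between these 1511 circRNAs and diseases from four databases: CircR2Disease, CircAtlas 2.0, Circ2Disease and CircRNADisease. Step 2: we computed expression profile similarity, sequence similarity and Gaussian kernel similarity for circRNAs, and semantic similarity and Gaussian kernel similarity for diseases. Step 3: we combined the circRNA-disease association network, the integrated circRNA similarity network and the integrated disease similarity network into a heterogeneous network. Step 4: we applied the improved weighted biased meta-structure algorithm on the heterogeneous network to predict circRNA-disease associations.
4. Results: Under leave-one-out cross-validation, 10-fold cross-validation and 5-fold cross-validation, our model achieves areas under the ROC curve (AUC) of 0.9216, 0.9172 and 0.9005, respectively. In addition, CDWBMS performs well in terms of accuracy (0.86), F1-score (0.88) and Matthews correlation coefficient (0.727). Case studies on gastric cancer, colorectal cancer and breast cancer show that CDWBMS can predict unknown circRNA-disease associations.
5. Conclusions: Given the currently small number of known circRNA-disease associations, our model enlarges the set of circRNAs considered, leaving more room to explore new, unknown circRNA-disease relationships. In addition, the improvements to the meta-structure algorithm further raise its predictive performance. However, most disease-related circRNAs lack expression profile data. We will therefore further integrate circRNA-disease association data and combine them with circRNA biological data to improve the effectiveness of the model.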
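Steps 2 and 3 of the method can be illustrated with a small sketch. It is not the authors' implementation: it assumes the widely used Gaussian interaction profile form of the Gaussian kernel similarity (bandwidth normalised by the mean squared profile norm), it uses only that one similarity instead of the integrated similarities described above, and it does not reproduce the meta-structure search itself. All function names and the toy association matrix are hypothetical.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel over the rows of a 0/1 matrix.

    Assumed form: K(i, j) = exp(-gamma * ||row_i - row_j||^2), where
    gamma = gamma_prime / mean(||row||^2).
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel


def transpose(matrix):
    """Return the transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]


def heterogeneous_adjacency(sim_c, assoc, sim_d):
    """Assemble the block matrix [[sim_c, assoc], [assoc^T, sim_d]]."""
    assoc_t = transpose(assoc)
    top = [list(sim_c[i]) + list(assoc[i]) for i in range(len(sim_c))]
    bottom = [list(assoc_t[j]) + list(sim_d[j]) for j in range(len(sim_d))]
    return top + bottom


# Hypothetical toy data: 3 circRNAs (rows) x 2 diseases (columns).
assoc = [[1, 0], [1, 1], [0, 1]]
sim_c = gip_kernel(assoc)             # circRNA-circRNA similarity
sim_d = gip_kernel(transpose(assoc))  # disease-disease similarity
network = heterogeneous_adjacency(sim_c, assoc, sim_d)
```

In this toy case `network` is a 5 x 5 symmetric matrix whose first three rows and columns index circRNAs and whose last two index diseases. A path-based or meta-structure-based scoring method would then operate on this matrix.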

Abstract: Circular RNAs (circRNAs) are RNAs with a special closed-loop structure that play important roles in tumors and other diseases. Because biological experiments are time-consuming, computational methods are an attractive alternative for predicting associations between circRNAs and diseases. Taking the limited number of verified circRNA-disease associations into account, we propose a method named CDWBMS, which integrates a small number of verified circRNA-disease associations with abundant circRNA information to discover novel circRNA-disease associations. CDWBMS adopts an improved weighted biased meta-structure search algorithm on a heterogeneous network to predict associations between circRNAs and diseases. Under leave-one-out cross-validation (LOOCV), 10-fold cross-validation and 5-fold cross-validation, CDWBMS yields area under the receiver operating characteristic curve (AUC) values of 0.9216, 0.9172 and 0.9005, respectively. Furthermore, case studies show that CDWBMS can predict unknown circRNA-disease associations. In conclusion, CDWBMS is an effective method for exploring disease-related circRNAs.
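
The evaluation measures named for CDWBMS (AUC, and in the structured abstract accuracy, F1-score and Matthews correlation coefficient) follow standard definitions. The sketch below shows how they can be computed from labels and prediction scores. The function names, the 0.5 threshold and the toy scores are assumptions for illustration and are not taken from the paper.

```python
import math


def auc_score(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def confusion_metrics(labels, scores, threshold=0.5):
    """Accuracy, F1-score and Matthews correlation coefficient at a threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    accuracy = (tp + tn) / len(labels)
    f1_denom = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denom if f1_denom else 0.0
    mcc_denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0
    return {"accuracy": accuracy, "f1": f1, "mcc": mcc}


# Hypothetical toy predictions: 3 known associations, 3 unlabeled pairs.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
auc = auc_score(labels, scores)
metrics = confusion_metrics(labels, scores)
```

In a leave-one-out setting, each known association would be hidden in turn, the model re-scored, and the held-out score ranked against the unlabeled pairs before computing the AUC. That loop is omitted here because it depends on the prediction model.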

Key words: circular RNA (circRNA), circRNA-disease association, meta-structure, heterogeneous network
