2014, Vol. 29, Issue (6): 1083-1093. doi: 10.1007/s11390-014-1492-z

Topic: Emerging Areas

• Special Section on Selected Paper from NPC 2011

CPL: Detecting Protein Complexes by Propagating Labels on Protein-Protein Interaction Network

Qi-Guo Dai¹ (代启国), Mao-Zu Guo¹ (郭茂祖), Xiao-Yan Liu¹ (刘晓燕), Zhi-Xia Teng¹,² (滕志霞), Chun-Yu Wang¹ (王春宇)

  1. School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
  2. School of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China
  • Received: 2013-11-28  Revised: 2014-09-18  Online: 2014-11-05  Published: 2014-11-05
  • About the author: Qi-Guo Dai received his B.S. degree in computer science from Hubei University of Automobile Technology in 2006 and his M.S. degree in computer science from Beijing University of Technology in 2010. He is a Ph.D. candidate in the School of Computer Science and Technology, Harbin Institute of Technology, China. His research interests include bioinformatics and machine learning.
  • Supported by:

    The work was supported by the National Natural Science Foundation of China under Grant Nos. 61271346, 61172098, and 91335112, the Specialized Research Fund for the Doctoral Program of Higher Education of China under Grant No. 20112302110040, and the Fundamental Research Funds for the Central Universities of China under Grant No. HIT.KISTP.201418.

Abstract: Proteins usually bind together to form complexes, which play an important role in cellular activities. Many graph clustering methods have been proposed to identify protein complexes by finding dense regions in protein-protein interaction networks. We present a novel framework, CPL, that detects protein complexes by propagating labels through the interactions in a network, where each label denotes a complex identifier. With proper propagation, proteins in the same complex are assigned the same label. Unlike previous methods, CPL makes no strong assumptions about the topological structure of the complexes. CPL is tested on several publicly available yeast protein-protein interaction networks and compared with several state-of-the-art methods. The results suggest that CPL identifies complexes more accurately than the existing methods, and a gene ontology analysis of functional homogeneity shows that the complexes it detects are highly biologically relevant.
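
The idea of propagating complex identifiers along interactions can be illustrated with a minimal sketch of plain label propagation on a toy network. This is not the CPL algorithm itself, whose propagation rule is not given on this page. The function name `propagate_labels`, the protein identifiers `P1` to `P6`, the fixed visiting order, and the tie-breaking rule are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def propagate_labels(edges, max_rounds=100):
    """Group the nodes of an undirected graph by plain label propagation.

    Illustrative sketch only: a generic scheme, not the published CPL rule.
    """
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    # Every protein starts with its own label (a candidate complex identifier).
    labels = {node: node for node in neighbours}

    for _ in range(max_rounds):
        changed = False
        # A fixed visiting order keeps this sketch deterministic.
        for node in sorted(neighbours):
            counts = Counter(labels[n] for n in neighbours[node])
            top = max(counts.values())
            best = {lab for lab, c in counts.items() if c == top}
            # Keep the current label if it is among the most frequent ones;
            # otherwise adopt the largest tied label (an arbitrary rule,
            # assumed here only for reproducibility).
            if labels[node] not in best:
                labels[node] = max(best)
                changed = True
        if not changed:
            break

    # Proteins that ended up sharing a label form one predicted group.
    groups = defaultdict(set)
    for node, lab in labels.items():
        groups[lab].add(node)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical toy network: two triangles joined by the single edge P3-P4.
toy_edges = [
    ("P1", "P2"), ("P1", "P3"), ("P2", "P3"),
    ("P3", "P4"),
    ("P4", "P5"), ("P4", "P6"), ("P5", "P6"),
]
print(propagate_labels(toy_edges))
# [['P1', 'P2', 'P3'], ['P4', 'P5', 'P6']]
```

On this toy input each triangle settles on a single label after one round, so the two densely connected groups are separated even though nothing about their shape was specified in advance. Generic label propagation usually breaks ties at random, so results on real networks can vary between runs.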

Copyright © Editorial Office of Journal of Computer Science and Technology