Journal of Computer Science and Technology ›› 2021, Vol. 36 ›› Issue (2): 310-322. doi: 10.1007/s11390-021-0844-8

Special Issue: Emerging Areas

• Special Section on AI and Big Data Analytics in Biology and Medicine •

Collaborative Matrix Factorization with Soft Regularization for Drug-Target Interaction Prediction

Li-Gang Gao1,2, Meng-Yun Yang1,2,3, and Jian-Xin Wang1,2,*, Senior Member, CCF, IEEE, Member, ACM        

  • 1 School of Computer Science and Engineering, Central South University, Changsha 410083, China;
    2 Hunan Provincial Key Laboratory of Bioinformatics, Central South University, Changsha 410083, China;
    3 School of Science, Shaoyang University, Shaoyang 422000, China
  • Received: 2020-07-29 Revised: 2021-03-09 Online: 2021-03-05 Published: 2021-04-01
  • Contact: Jian-Xin Wang
  • About author: Li-Gang Gao received his B.S. degree in information and computational science from the University of South China, Hengyang, in 2018. He is a graduate student at the School of Computer Science and Engineering, Central South University, Changsha. His main research interests include drug-target interaction prediction and recommendation systems.
  • Supported by:
    This work was supported by the National Natural Science Foundation of China under Grant No. 61972423, and Hunan Provincial Science and Technology Program under Grant No. 2018wk4001.

Identifying potential drug-target interactions (DTIs) is critical in drug discovery. Drug-target interaction prediction methods based on collaborative filtering have demonstrated attractive prediction performance. However, many of these models cannot accurately express the relationship between similarity features and DTI features. To represent this correlation rationally, we propose a novel matrix factorization method called collaborative matrix factorization with soft regularization (SRCMF). SRCMF improves prediction performance by combining drug and target similarity information with matrix factorization. In contrast to general collaborative matrix factorization, the fundamental idea of SRCMF is to make the similarity features and the potential features of DTI approximately equal rather than identical. Specifically, SRCMF obtains low-rank feature representations of drug similarity and target similarity, and then uses a soft regularization term to constrain the approximation between the drug (target) similarity features and the drug (target) potential features of DTI. To comprehensively evaluate the prediction performance of SRCMF, we conduct cross-validation experiments under three different settings. In terms of the area under the precision-recall curve (AUPR), SRCMF achieves better prediction results than six state-of-the-art methods. Moreover, under different noise levels in the similarity data, the prediction performance of SRCMF is much better than that of collaborative matrix factorization. In conclusion, SRCMF is robust, which leads to improved performance in drug-target interaction prediction.
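The idea of soft regularization described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the objective, the alternating least-squares updates, the spectral construction of the similarity features, and all names (`low_rank_features`, `soft_regularized_mf`, `lam`, `lam_d`, `lam_t`) are assumptions made for illustration. The sketch minimizes ||Y - A Bᵀ||² + lam(||A||² + ||B||²) + lam_d ||A - P||² + lam_t ||B - Q||², where P and Q are low-rank features of the drug and target similarity matrices, so the interaction features A and B are pulled toward P and Q without being forced to equal them.

```python
import numpy as np

def low_rank_features(S, k):
    """Top-k spectral features F with F @ F.T approximating symmetric S (assumed construction)."""
    S = (S + S.T) / 2.0                      # enforce symmetry
    vals, vecs = np.linalg.eigh(S)           # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:k]         # indices of the k largest eigenvalues
    return vecs[:, idx] * np.sqrt(np.clip(vals[idx], 0.0, None))

def objective(Y, A, B, P, Q, lam, lam_d, lam_t):
    """Reconstruction error plus ridge and soft-regularization penalties."""
    return (np.linalg.norm(Y - A @ B.T) ** 2
            + lam * (np.linalg.norm(A) ** 2 + np.linalg.norm(B) ** 2)
            + lam_d * np.linalg.norm(A - P) ** 2
            + lam_t * np.linalg.norm(B - Q) ** 2)

def soft_regularized_mf(Y, Sd, St, k=2, lam=0.1, lam_d=1.0, lam_t=1.0, n_iter=20):
    """Alternating least squares for the assumed soft-regularized objective."""
    P = low_rank_features(Sd, k)             # drug similarity features
    Q = low_rank_features(St, k)             # target similarity features
    A, B = P.copy(), Q.copy()                # start from the similarity features
    I = np.eye(k)
    history = [objective(Y, A, B, P, Q, lam, lam_d, lam_t)]
    for _ in range(n_iter):
        # Closed-form update: A (BᵀB + (lam + lam_d) I) = Y B + lam_d P
        A = np.linalg.solve(B.T @ B + (lam + lam_d) * I, (Y @ B + lam_d * P).T).T
        # Closed-form update: B (AᵀA + (lam + lam_t) I) = Yᵀ A + lam_t Q
        B = np.linalg.solve(A.T @ A + (lam + lam_t) * I, (Y.T @ A + lam_t * Q).T).T
        history.append(objective(Y, A, B, P, Q, lam, lam_d, lam_t))
    return A, B, history

# Toy data (hypothetical): 4 drugs, 3 targets, hand-written similarity matrices.
Y = np.array([[1, 0, 0], [1, 1, 0], [0, 0, 1], [0, 1, 1]], dtype=float)
Sd = np.array([[1.0, 0.8, 0.1, 0.2],
               [0.8, 1.0, 0.2, 0.3],
               [0.1, 0.2, 1.0, 0.7],
               [0.2, 0.3, 0.7, 1.0]])
St = np.array([[1.0, 0.6, 0.1],
               [0.6, 1.0, 0.4],
               [0.1, 0.4, 1.0]])

A, B, history = soft_regularized_mf(Y, Sd, St)
scores = A @ B.T                             # predicted interaction scores
```

Setting `lam_d` and `lam_t` very large pushes A and B toward P and Q, which approaches a hard constraint. Smaller values let the interaction data override noisy similarity information, which is the intuition behind the robustness claim above.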

Key words: drug-target interaction; collaborative matrix factorization; soft regularization; noisy data

ISSN 1000-9000(Print)

CN 11-2296/TP
