Collaborative Matrix Factorization with Soft Regularization for Drug-Target Interaction Prediction

doi:10.1007/s11390-021-0844-8

Abstract

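Abstract: Identifying potential drug-target interactions (DTIs) is key to drug discovery. Matrix factorization methods based on collaborative filtering have been widely applied to drug repositioning and DTI prediction because they naturally perform dimensionality reduction and uncover latent features. However, models based on collaborative matrix factorization simply set the features of the similarity data equal to those of the DTI data, which does not accurately represent the relationship between the features. To represent the correlation between data features reasonably, we propose a new matrix factorization method, collaborative matrix factorization with soft regularization (SRCMF). SRCMF improves prediction performance by incorporating drug and target similarity information into the matrix factorization model. Its basic idea is to introduce soft regularization terms that constrain the DTI latent features and the similarity features to be as close as possible rather than exactly equal. Specifically, SRCMF uses the soft regularization terms to relax the equality between the drug (target) similarity features and the drug (target) latent features of DTI, thereby representing the relationship between the features more reasonably. To comprehensively evaluate the prediction performance of SRCMF, we conduct ten-fold cross-validation experiments under three different prediction task settings and report the corresponding AUPR and F1 values. Compared with six state-of-the-art DTI prediction methods, SRCMF shows better prediction performance in all three prediction tasks. To verify the robustness of SRCMF on noisy data, we add Gaussian noise to the similarity data and run DTI prediction experiments. The results show that, at different noise levels, SRCMF outperforms the compared methods in both prediction performance and robustness. In addition, to verify the effectiveness of SRCMF in practical drug discovery, we conduct a case study to validate the predicted potential DTIs. The results show that six of the ten top-scoring DTIs predicted by SRCMF on the GPCR dataset have been validated in practice. This further demonstrates that SRCMF is effective for DTI prediction.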

Abstract: Identifying potential drug-target interactions (DTIs) is critical in drug discovery. DTI prediction methods based on collaborative filtering have demonstrated attractive prediction performance. However, many such models cannot accurately express the relationship between similarity features and DTI features. To represent this correlation rationally, we propose a novel matrix factorization method called collaborative matrix factorization with soft regularization (SRCMF). SRCMF improves prediction performance by combining drug and target similarity information with matrix factorization. In contrast to general collaborative matrix factorization, the fundamental idea of SRCMF is to make the similarity features and the latent features of DTI approximately equal rather than identical. Specifically, SRCMF obtains low-rank feature representations of drug similarity and target similarity, and then uses a soft regularization term to constrain the drug (target) similarity features to approximate the drug (target) latent features of DTI. To comprehensively evaluate the prediction performance of SRCMF, we conduct cross-validation experiments under three different settings. In terms of the area under the precision-recall curve (AUPR), SRCMF achieves better prediction results than six state-of-the-art methods. Moreover, under different noise levels in the similarity data, the prediction performance of SRCMF is much better than that of collaborative matrix factorization. In conclusion, SRCMF is robust, which leads to improved performance in drug-target interaction prediction.
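
The soft-regularization idea described above can be illustrated with a small NumPy sketch. The abstract does not give the objective function, so everything here is an assumption made for illustration: the function name `srcmf_style_loss`, the factor names `A`, `B` (DTI latent features) and `P`, `Q` (low-rank similarity features), the weights `lam_sim` and `mu`, and the squared Frobenius-norm form of each term. It is not the paper's exact formulation, only one plausible way to penalize the distance between the two feature sets instead of forcing them to be equal.

```python
import numpy as np


def srcmf_style_loss(Y, Sd, St, A, B, P, Q, lam_sim=1.0, mu=1.0):
    """Hypothetical soft-regularized objective (an assumption, not the paper's).

    Y  : drugs x targets interaction matrix
    Sd : drugs x drugs similarity matrix
    St : targets x targets similarity matrix
    A,B: latent features of drugs and targets learned from Y
    P,Q: low-rank features of drugs and targets learned from Sd and St
    Returns (total loss, soft-regularization term).
    """
    # Reconstruction error of the interaction matrix from its latent features.
    fit_dti = np.linalg.norm(Y - A @ B.T, "fro") ** 2
    # Reconstruction error of the similarity matrices from their own features.
    fit_sim = (np.linalg.norm(Sd - P @ P.T, "fro") ** 2
               + np.linalg.norm(St - Q @ Q.T, "fro") ** 2)
    # Soft constraint: A should be close to P and B close to Q, not identical.
    soft = np.linalg.norm(A - P, "fro") ** 2 + np.linalg.norm(B - Q, "fro") ** 2
    return fit_dti + lam_sim * fit_sim + mu * soft, soft


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.normal(size=(5, 2))   # 5 toy drugs, rank 2
    B = rng.normal(size=(4, 2))   # 4 toy targets, rank 2
    Y, Sd, St = A @ B.T, A @ A.T, B @ B.T
    # Slightly different similarity features incur a small, finite penalty.
    total, soft = srcmf_style_loss(Y, Sd, St, A, B, A + 0.05, B - 0.05)
    print(f"total loss = {total:.4f}, soft term = {soft:.4f}")
```

In this sketch, setting `P = A` and `Q = B` recovers the hard-equality case that the abstract attributes to general collaborative matrix factorization, while a finite `mu` lets the two feature sets differ at a cost proportional to their squared distance.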
