Machine Learning Techniques for Software Maintainability Prediction: Accuracy Analysis

doi:10.1007/s11390-020-9668-1

Abstract

Maintaining software after it has been deployed on the end-user side is laborious and, over the software's lifetime, usually costs considerably more than the initial development. Industry's expectation of lower costs, maintenance costs in particular, has made the prediction of software maintainability an important research topic. Researchers and practitioners have proposed and evaluated a variety of techniques, ranging from statistical to machine learning (ML) methods, to predict software maintainability better. This review analyzes the empirical evidence on the accuracy of software product maintainability prediction (SPMP) using ML techniques. It analyzes and discusses the findings of 77 selected studies published from 2000 to 2018 according to the following criteria: maintainability prediction techniques, validation methods, accuracy criteria, overall accuracy of ML techniques, and the techniques offering the best performance. The review followed the well-known systematic review process. The results show that ML techniques are frequently used to predict maintainability. In particular, artificial neural network (ANN), support vector machine/regression (SVM/R), regression & decision tree (DT), and fuzzy & neuro-fuzzy (FNF) techniques are more accurate in terms of PRED and MMRE. N-fold and leave-one-out cross-validation, together with the MMRE and PRED accuracy criteria, are frequently used in empirical studies. In general, ML techniques outperformed non-ML techniques such as regression analysis (RA), while FNF outperformed SVM/R, DT, and ANN in most experiments. However, although many techniques were reported as superior, no single one can be identified as the best.
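The two accuracy criteria and the leave-one-out validation named in the abstract can be made concrete with a small sketch. It assumes the definitions commonly used in the estimation literature: MMRE is the mean magnitude of relative error, and PRED(q) is the fraction of predictions whose relative error is at most q (q = 0.25 is a common choice and is assumed here). The function names, the sample numbers, and the mean-of-the-rest baseline are illustrative assumptions and are not taken from the reviewed studies.

```python
def mre(actual, predicted):
    """Magnitude of relative error for one observation (actual must be non-zero)."""
    return abs(actual - predicted) / abs(actual)


def mmre(actuals, predictions):
    """Mean magnitude of relative error over all observations (lower is better)."""
    errors = [mre(a, p) for a, p in zip(actuals, predictions)]
    return sum(errors) / len(errors)


def pred(actuals, predictions, q=0.25):
    """Fraction of predictions whose relative error is at most q (higher is better)."""
    errors = [mre(a, p) for a, p in zip(actuals, predictions)]
    return sum(1 for e in errors if e <= q) / len(errors)


def loo_mean_baseline(values):
    """Leave-one-out validation of a trivial baseline:
    each value is predicted as the mean of all the other values."""
    predictions = []
    for i in range(len(values)):
        train = values[:i] + values[i + 1:]  # hold out observation i
        predictions.append(sum(train) / len(train))
    return predictions


if __name__ == "__main__":
    # Hypothetical maintenance-effort values, for illustration only.
    effort = [100.0, 200.0, 400.0, 250.0, 180.0]
    held_out_predictions = loo_mean_baseline(effort)
    print("MMRE:", mmre(effort, held_out_predictions))
    print("PRED(0.25):", pred(effort, held_out_predictions))
```

A real study would replace the baseline with an ML model (for example an ANN or SVR) trained on the retained observations in each fold; the two metric functions would stay the same.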
