FATOC: Bug Isolation Based Multi-Fault Localization by Using OPTICS Clustering

    Abstract: Bug isolation is a popular approach to multi-fault localization (MFL): all failed test cases are clustered into several groups, and the failed test cases in each group, combined with all passed test cases, are used to localize a single fault. However, existing clustering algorithms cannot always produce completely correct clusters, which is a potential threat to bug-isolation-based MFL approaches. To address this issue, we first analyze how clustering accuracy influences MFL performance. The results of a controlled study indicate that using the clustering algorithm with the highest accuracy achieves the best MFL performance. Moreover, previous studies on clustering algorithms show that the elements of a higher-density cluster are more similar to one another. Motivated by these observations, we propose a novel approach, FATOC (One-Fault-at-a-Time via OPTICS Clustering). FATOC first uses the OPTICS (Ordering Points to Identify the Clustering Structure) clustering algorithm to group the failed test cases and then identifies the cluster with the highest density. OPTICS is a density-based clustering algorithm that can reduce misgrouping and calculate a density value for each cluster; this density value helps find the cluster with the highest clustering effectiveness. FATOC then combines the failed test cases in this cluster with all passed test cases to localize a single fault with a traditional spectrum-based fault localization (SBFL) formula. After this fault is localized and fixed, FATOC applies the same method to localize the next single fault, until all test cases pass. Our evaluation shows that FATOC significantly outperforms the traditional SBFL technique and a state-of-the-art MFL approach, MSeer, on 804 multi-fault versions of nine real-world programs. Specifically, FATOC's performance is 10.32% higher than that of traditional SBFL when using the Ochiai formula, as measured by the A-EXAM metric. In addition, when 1%, 3%, and 5% of the statements of all subject programs are checked, FATOC locates 36.91%, 48.50%, and 66.93% of all faults, respectively, which is also better than both traditional SBFL and MSeer.
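To make the single-fault step concrete, the following is a minimal Python sketch of one FATOC-style iteration, not the authors' implementation. It assumes that the failed test cases have already been clustered (for example by an OPTICS implementation) and that a density value per cluster is available; how FATOC derives that value is not stated in the abstract, so the numbers here are made up. The names `pick_densest_cluster`, `ochiai_scores`, and `rank_statements`, the coverage data, and the test identifiers are all hypothetical. The Ochiai score of a statement is taken in its usual form, ef / sqrt(total_failed * (ef + ep)), where ef and ep count the failed and passed tests that execute the statement.

```python
from math import sqrt


def pick_densest_cluster(cluster_density):
    """Return the id of the cluster with the highest density value.

    cluster_density maps a cluster id to a density value that is assumed
    to come from the clustering step (hypothetical input).
    """
    return max(cluster_density, key=cluster_density.get)


def ochiai_scores(coverage, failed, passed):
    """Compute Ochiai suspiciousness for every statement covered by the
    selected failed tests or by the passed tests.

    coverage maps a test id to the set of statement ids it executes.
    """
    failed, passed = list(failed), list(passed)
    total_failed = len(failed)
    statements = set()
    for test in failed + passed:
        statements |= coverage[test]
    scores = {}
    for stmt in statements:
        ef = sum(1 for t in failed if stmt in coverage[t])  # failed tests covering stmt
        ep = sum(1 for t in passed if stmt in coverage[t])  # passed tests covering stmt
        denom = sqrt(total_failed * (ef + ep))
        scores[stmt] = ef / denom if denom > 0 else 0.0
    return scores


def rank_statements(scores):
    """Most suspicious first; ties broken by statement id."""
    return sorted(scores, key=lambda s: (-scores[s], s))


# Hypothetical data: two passed tests, three failed tests in two clusters.
coverage = {
    "p1": {1, 2, 3},
    "p2": {1, 2, 4},
    "f1": {1, 3, 5},
    "f2": {1, 5},
    "f3": {1, 2, 4, 6},
}
clusters = {"A": ["f1", "f2"], "B": ["f3"]}
cluster_density = {"A": 0.9, "B": 0.4}  # assumed values, not computed here

best = pick_densest_cluster(cluster_density)
scores = ochiai_scores(coverage, clusters[best], ["p1", "p2"])
ranking = rank_statements(scores)
print(best, ranking)
```

In this toy data, statement 5 is executed by both failed tests in the selected cluster and by no passed test, so it ranks first. In the full approach, this step would be repeated after each fix until no test case fails.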

     
