Journal of Computer Science and Technology, 2019, Vol. 34, Issue (5): 942-956. doi: 10.1007/s11390-019-1953-5

Special Section on Software Systems 2019

Bug Triaging Based on Tossing Sequence Modeling

Sheng-Qu Xi¹, Yuan Yao¹,* (Member, CCF), Xu-Sheng Xiao² (Member, ACM, IEEE), Feng Xu¹ (Member, CCF), Jian Lv¹ (Fellow, CCF)

  ¹ State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China
  ² Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, Ohio 44106-7071, U.S.A.
  • Received: 2019-03-01; Revised: 2019-08-05; Online: 2019-08-31; Published: 2019-08-31
  • Contact: Yuan Yao, E-mail: y.yao@nju.edu.cn
  • About the author: Sheng-Qu Xi is currently a Ph.D. student in a five-year program at the State Key Laboratory for Novel Software Technology, Department of Computer Science and Technology, Nanjing University, Nanjing. He received his B.S. degree in computer science from Nanjing University, Nanjing, in 2014. His main research interests include mining bug reports and deep learning for software engineering.
  • Supported by:
    This work was supported by the National Natural Science Foundation of China under Grant Nos. 61690204, 61672274, and 61702252, and by the Collaborative Innovation Center of Novel Software Technology and Industrialization at Nanjing University.

Abstract: Bug triaging, which routes bug reports to potential fixers, is an integral step in software development and maintenance. To make bug triaging more efficient, many researchers have proposed machine learning and information retrieval techniques that identify suitable fixers for a given bug report. However, none of the existing proposals simultaneously takes into account the following three aspects that matter for the efficiency of bug triaging: 1) the textual content of the bug reports, 2) the metadata of the bug reports, and 3) the tossing sequence of the bug reports. To make use of all three aspects, we propose iTriage, which first adopts a sequence-to-sequence model to jointly learn the features of textual content and tossing sequence, and then uses a classification model to integrate the features from textual content, metadata, and tossing sequence. Evaluation results on three different open-source projects show that the proposed approach significantly improves the accuracy of bug triaging compared with state-of-the-art approaches.

Key words: bug triaging, tossing sequence, software repository mining
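
The three aspects named in the abstract can be pictured as three feature groups that are concatenated before classification. The following is a minimal sketch under loud assumptions, not the iTriage implementation: bag-of-words counts stand in for the features learned by the sequence-to-sequence model, a one-nearest-neighbour lookup stands in for the classification model, and every name (`BugReport`, `combine`, `predict_fixer`, the toy developers and fields) is hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BugReport:
    """Hypothetical container for the three aspects named in the abstract."""
    text: str                                            # textual content
    metadata: dict                                       # e.g. {"component": "ui"}
    tossed_through: list = field(default_factory=list)  # developers who passed it on
    fixer: Optional[str] = None                          # known only for resolved reports


def text_features(text, vocab):
    """Bag-of-words counts (assumed stand-in for learned text features)."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocab]


def metadata_features(metadata, fields):
    """One-hot encoding over known (field, value) pairs."""
    return [1.0 if metadata.get(name) == value else 0.0 for name, value in fields]


def tossing_features(tossed_through, developers):
    """Mark each developer who already handled the report."""
    seen = set(tossed_through)
    return [1.0 if dev in seen else 0.0 for dev in developers]


def combine(report, vocab, fields, developers):
    """Concatenate the three feature groups into one vector."""
    return (text_features(report.text, vocab)
            + metadata_features(report.metadata, fields)
            + tossing_features(report.tossed_through, developers))


def predict_fixer(query_vec, labelled):
    """Return the fixer of the closest labelled vector (stand-in classifier)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(labelled, key=lambda pair: sq_dist(pair[0], query_vec))[1]


# Toy data, invented purely for illustration.
vocab = ["crash", "render", "login", "timeout"]
fields = [("component", "ui"), ("component", "auth")]
developers = ["alice", "bob", "carol"]

history = [
    BugReport("crash in render loop", {"component": "ui"}, ["bob"], "alice"),
    BugReport("login timeout on retry", {"component": "auth"}, ["alice"], "carol"),
]
labelled = [(combine(r, vocab, fields, developers), r.fixer) for r in history]

query = BugReport("render crash after resize", {"component": "ui"}, ["bob"])
query_vec = combine(query, vocab, fields, developers)
predicted = predict_fixer(query_vec, labelled)
print(predicted)
```

In this toy setup the query shares its vocabulary hits, component, and earlier handler with the first historical report, so the lookup returns that report's fixer. A real system would replace the hand-built features with learned ones.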
