Journal of Computer Science and Technology, 2019, Vol. 34, Issue (5): 942-956. doi: 10.1007/s11390-019-1953-5

Special Issue: Software Systems

• Special Section on Software Systems 2019 •

Bug Triaging Based on Tossing Sequence Modeling

Sheng-Qu Xi¹, Yuan Yao¹,* (Member, CCF), Xu-Sheng Xiao² (Member, ACM, IEEE), Feng Xu¹ (Member, CCF), Jian Lv¹ (Fellow, CCF)

  ¹ State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China;
  ² Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, Ohio 44106-7071, U.S.A.
  • Received: 2019-03-01 Revised: 2019-08-05 Online: 2019-08-31 Published: 2019-08-31
  • Contact: Yuan Yao
  • About author: Sheng-Qu Xi is currently a Ph.D. student in a five-year program at the State Key Laboratory for Novel Software Technology, Department of Computer Science and Technology, Nanjing University, Nanjing. He received his B.S. degree in computer science from Nanjing University, Nanjing, in 2014. His major research interests include mining bug reports and deep learning for software engineering.
  • Supported by:
    This work is supported by the National Natural Science Foundation of China under Grant Nos. 61690204, 61672274, and 61702252, and the Collaborative Innovation Center of Novel Software Technology and Industrialization at Nanjing University.

Bug triaging, which routes bug reports to potential fixers, is an integral step in software development and maintenance. To make bug triaging more efficient, many researchers have proposed machine learning and information retrieval techniques that identify suitable fixers for a given bug report. However, none of the existing proposals simultaneously takes into account the following three aspects that matter for the efficiency of bug triaging: 1) the textual content of the bug reports, 2) the metadata of the bug reports, and 3) the tossing sequence of the bug reports. To make use of all three aspects, we propose iTriage, which first adopts a sequence-to-sequence model to jointly learn the features of the textual content and the tossing sequence, and then uses a classification model to integrate the features from the textual content, metadata, and tossing sequence. Evaluation results on three different open-source projects show that the proposed approach significantly improves the accuracy of bug triaging compared with state-of-the-art approaches.
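To make the three information sources concrete, the following is a minimal sketch of how one bug report might be represented and turned into a training pair for a sequence-to-sequence model. It is an illustration only, not the iTriage implementation: the `BugReport` container, the start/end markers, the whitespace tokenizer, the `classifier_input` layout, and the assumption that the last developer in the tossing sequence is the fixer are all hypothetical choices made for this example.

```python
from dataclasses import dataclass

# Hypothetical decoder-side markers; not taken from the paper.
START, END = "<s>", "</s>"


@dataclass
class BugReport:
    """Toy container for the three information sources named in the abstract."""
    text: str               # textual content (e.g., summary and description)
    metadata: dict          # e.g., product and component fields (assumed names)
    tossing_sequence: list  # developers the report passed through, in order


def to_seq2seq_pair(report):
    """Build a (source, target) pair for a sequence-to-sequence model.

    Source: lowercased whitespace tokens of the report text.
    Target: the tossing sequence wrapped in start/end markers.
    """
    source = report.text.lower().split()
    target = [START] + list(report.tossing_sequence) + [END]
    return source, target


def final_fixer(report):
    """Assumption: the last developer in the tossing sequence fixed the bug."""
    return report.tossing_sequence[-1]


def classifier_input(report, learned_features):
    """Pair learned text/tossing features with metadata for a classifier.

    `learned_features` stands in for whatever vector the sequence model
    produces; here it is simply passed through unchanged.
    """
    meta = sorted(f"{key}={value}" for key, value in report.metadata.items())
    return {"dense": list(learned_features), "categorical": meta}


report = BugReport(
    text="Crash when opening preferences dialog",
    metadata={"product": "Platform", "component": "UI"},
    tossing_sequence=["alice", "bob", "carol"],
)
source, target = to_seq2seq_pair(report)
```

In this sketch the sequence model would see `source` as input and `target` as output, while the downstream classifier would receive the learned features together with the metadata fields and be trained to predict `final_fixer(report)`.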

Key words: bug triaging; tossing sequence; software repository mining
