Abstract: In practice, some bugs have more impact than others and thus deserve more immediate attention. Under tight schedules and with limited human resources, developers may not have enough time to inspect all bugs, so they often concentrate on those that are highly impactful. In the literature, the term high-impact bug refers to a bug that appears at an unexpected time or location and brings more unexpected effects (i.e., a surprise bug), or that breaks pre-existing functionality and damages the user experience (i.e., a breakage bug). Unfortunately, identifying high-impact bugs among the thousands of bug reports in a bug tracking system is no easy feat. An automated technique that identifies high-impact bug reports can therefore help developers become aware of such bugs early, rectify them quickly, and minimize the damage they cause. Because only a small proportion of bugs are high-impact bugs, identifying high-impact bug reports is a difficult task. In this paper, we propose an approach to identify high-impact bug reports by leveraging imbalanced learning strategies. We investigate the effectiveness of various variants, each of which combines one particular imbalanced learning strategy with one particular classification algorithm. In particular, we choose four widely used strategies for dealing with imbalanced data and four state-of-the-art text classification algorithms, and conduct experiments on datasets from four different open source projects. We mainly perform an analytical study on two types of high-impact bugs, i.e., surprise bugs and breakage bugs. The results show that different variants perform differently, and that the best performing variants, SMOTE (synthetic minority over-sampling technique) + KNN (K-nearest neighbours) for surprise bug identification and RUS (random under-sampling) + NB (naive Bayes) for breakage bug identification, outperform the two state-of-the-art approaches by Thung et al. and Garcia and Shihab in terms of F1-score.
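To make the two balancing strategies named in the abstract concrete, below is a minimal stdlib-only sketch of random under-sampling (RUS) and a simplified SMOTE-style over-sampling. This is not the paper's pipeline: the feature extraction, classifiers, and datasets are not reproduced, and the "SMOTE-like" variant interpolates between random minority pairs rather than true k-nearest neighbours as real SMOTE does. The toy data and all function names are illustrative assumptions.

```python
import random
from collections import Counter

def random_under_sample(X, y, majority=0, seed=42):
    """RUS: randomly drop majority-class samples until classes are balanced."""
    rng = random.Random(seed)
    maj = [i for i, label in enumerate(y) if label == majority]
    mino = [i for i, label in enumerate(y) if label != majority]
    keep = rng.sample(maj, len(mino)) + mino
    return [X[i] for i in keep], [y[i] for i in keep]

def smote_like_over_sample(X, y, minority=1, seed=42):
    """Simplified SMOTE: synthesize minority samples by linear interpolation
    between two randomly chosen minority samples (real SMOTE restricts the
    second sample to the first one's k-nearest neighbours)."""
    rng = random.Random(seed)
    mino = [X[i] for i, label in enumerate(y) if label == minority]
    n_new = sum(1 for label in y if label != minority) - len(mino)
    X2, y2 = list(X), list(y)
    for _ in range(n_new):
        a, b = rng.choice(mino), rng.choice(mino)
        gap = rng.random()  # random point on the segment between a and b
        X2.append([ai + gap * (bi - ai) for ai, bi in zip(a, b)])
        y2.append(minority)
    return X2, y2

# Toy imbalanced data: 10 majority samples (label 0), 2 minority (label 1).
X = [[float(i), float(i % 3)] for i in range(12)]
y = [0] * 10 + [1] * 2
Xr, yr = random_under_sample(X, y)       # 2 vs 2 after RUS
Xs, ys = smote_like_over_sample(X, y)    # 10 vs 10 after over-sampling
print(Counter(yr), Counter(ys))
```

Either balanced set would then be fed to an off-the-shelf classifier such as KNN or naive Bayes; the paper's contribution lies in systematically comparing such strategy-classifier combinations.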
A preliminary version of the paper was published in the Proceedings of COMPSAC 2016. This work is supported by the National Natural Science Foundation of China under Grant Nos. 61602403 and 61402406 and the National Key Technology Research and Development Program of the Ministry of Science and Technology of China under Grant No. 2015BAH17F01.
About author: Xin Xia received his Ph.D. degree in computer science from the College of Computer Science and Technology, Zhejiang University, Hangzhou, in 2014. He is currently a research assistant professor in the College of Computer Science and Technology at Zhejiang University, Hangzhou. His research interests include software analytics, empirical studies, and mining software repositories.
Cite this article:
Xin-Li Yang, David Lo, Xin Xia, Qiao Huang, Jian-Ling Sun. High-Impact Bug Report Identification with Imbalanced Learning Strategies[J]. Journal of Computer Science and Technology, 2017, 32(1): 181-198.
References:
[1] D'Ambros M, Lanza M, Robbes R. An extensive comparison of bug prediction approaches. In Proc. the 7th IEEE Working Conference on Mining Software Repositories (MSR), May 2010, pp.31-41.
[2] Rahman F, Devanbu P. How, and why, process metrics are better. In Proc. the 35th International Conference on Software Engineering, May 2013, pp.432-441.
[3] Nam J, Pan S J, Kim S. Transfer defect learning. In Proc. the 35th International Conference on Software Engineering, May 2013, pp.382-391.
[4] Kamei Y, Shihab E, Adams B, Hassan A E, Mockus A, Sinha A, Ubayashi N. A large-scale empirical study of just-in-time quality assurance. IEEE Transactions on Software Engineering, 2013, 39(6): 757-773.
[5] Yang X, Lo D, Xia X, Zhang Y, Sun J. Deep learning for just-in-time defect prediction. In Proc. the IEEE International Conference on Software Quality, Reliability and Security (QRS), August 2015, pp.17-26.
[6] Hassan A E. The road ahead for mining software repositories. In Proc. the Frontiers of Software Maintenance, September 28-October 4, 2008, pp.48-57.
[7] Godfrey M W, Hassan A E, Herbsleb J, Murphy G C, Robillard M, Devanbu P, Mockus A, Perry D E, Notkin D. Future of mining software archives: A roundtable. IEEE Software, 2009, 26(1): 67-70.
[8] Shihab E, Jiang Z M, Ibrahim W M, Adams B, Hassan A E. Understanding the impact of code and process metrics on post-release defects: A case study on the Eclipse project. In Proc. the ACM-IEEE International Symposium on Empirical Software Engineering and Measurement, September 2010.
[9] Shihab E, Mockus A, Kamei Y, Adams B, Hassan A E. High impact defects: A study of breakage and surprise defects. In Proc. the 19th ACM SIGSOFT FSE and the 13th ESEC, September 2011, pp.300-310.
[10] Anvik J, Hiew L, Murphy G C. Coping with an open bug repository. In Proc. the OOPSLA Workshop on Eclipse Technology eXchange, October 2005, pp.35-39.
[11] Ohira M, Kashiwa Y, Yamatani Y, Yoshiyuki H, Maeda Y, Limsettho N, Fujino K, Hata H, Ihara A, Matsumoto K. A dataset of high-impact bugs: Manually-classified issue reports. In Proc. the 12th IEEE Working Conference on Mining Software Repositories (MSR), May 2015, pp.518-521.
[12] Lamkanfi A, Demeyer S, Giger E, Goethals B. Predicting the severity of a reported bug. In Proc. the 7th IEEE International Working Conference on Mining Software Repositories (MSR), May 2010, pp.1-10.
[13] Xia X, Lo D, Wen M, Shihab E, Zhou B. An empirical study of bug report field reassignment. In Proc. the Software Evolution Week-IEEE Conference on Software Maintenance, Reengineering, and Reverse Engineering, February 2014, pp.174-183.
[14] Kim S, Whitehead E J, Zhang Y. Classifying software changes: Clean or buggy? IEEE Transactions on Software Engineering, 2008, 34(2): 181-196.
[15] Rahman F, Posnett D, Devanbu P. Recalling the "imprecision" of cross-project defect prediction. In Proc. the 20th ACM SIGSOFT International Symposium on the Foundations of Software Engineering, November 2012, Article No. 61.
[16] Canfora G, De Lucia A, Di Penta M, Oliveto R, Panichella A, Panichella S. Multi-objective cross-project defect prediction. In Proc. the 6th IEEE International Conference on Software Testing, Verification and Validation (ICST), March 2013, pp.252-261.
[17] Xia X, Lo D, Shihab E, Wang X, Yang X. ELBlocker: Predicting blocking bugs with ensemble imbalance learning. Information and Software Technology, 2015, 61: 93-106.
[18] Xia X, Lo D, Shihab E, Wang X. Automated bug report field reassignment and refinement prediction. IEEE Transactions on Reliability, 2016, 65(3): 1094-1113.
[19] Thung F, Lo D, Jiang L. Automatic defect categorization. In Proc. the 19th Working Conference on Reverse Engineering (WCRE), October 2012, pp.205-214.
[20] Garcia H V, Shihab E. Characterizing and predicting blocking bugs in open source projects. In Proc. the 11th Working Conference on Mining Software Repositories, May 31-June 1, 2014, pp.72-81.
[21] Yang X, Lo D, Huang Q, Xia X, Sun J. Automated identification of high-impact bug reports leveraging imbalanced learning strategies. In Proc. the 40th IEEE Annual Computer Software and Applications Conference (COMPSAC), June 2016, pp.227-232.
[22] Chen T H, Nagappan M, Shihab E, Hassan A E. An empirical study of dormant bugs. In Proc. the 11th Working Conference on Mining Software Repositories, May 31-June 1, 2014, pp.82-91.
[23] Lovins J B. Development of a stemming algorithm. Mechanical Translation and Computational Linguistics, 1968, 11: 22-31.
[24] Kamei Y, Monden A, Matsumoto S, Kakimoto T, Matsumoto K. The effects of over and under sampling on fault-prone module detection. In Proc. the 1st International Symposium on Empirical Software Engineering and Measurement, September 2007, pp.196-204.
[25] Khoshgoftaar T M, Yuan X, Allen E B. Balancing misclassification rates in classification-tree models of software quality. Empirical Software Engineering, 2000, 5(4): 313-330.
[26] Han J, Kamber M. Data Mining: Concepts and Techniques. Morgan Kaufmann, 2006.
[27] He H, Garcia E A. Learning from imbalanced data. IEEE Transactions on Knowledge and Data Engineering, 2009, 21(9): 1263-1284.
[28] Chawla N V, Bowyer K W, Hall L O, Kegelmeyer W P. SMOTE: Synthetic minority over-sampling technique. Journal of Artificial Intelligence Research, 2002, 16: 321-357.
[29] Anvik J, Murphy G C. Reducing the effort of bug report triage: Recommenders for development-oriented decisions. ACM Transactions on Software Engineering and Methodology, 2011, 20(3): Article No. 10.
[30] Aggarwal C. Data Mining: The Textbook. Springer International Publishing, 2015.
[31] Xia X, Feng Y, Lo D, Chen Z, Wang X. Towards more accurate multi-label software behavior learning. In Proc. the Software Evolution Week-IEEE Conference on Software Maintenance, Reengineering and Reverse Engineering (CSMR-WCRE), February 2014, pp.134-143.
[32] Xia X, Lo D, Wang X, Zhou B. Tag recommendation in software information sites. In Proc. the 10th Working Conference on Mining Software Repositories, May 2013, pp.287-296.
[33] Shihab E, Ihara A, Kamei Y, Ibrahim W M, Ohira M, Adams B, Hassan A E, Matsumoto K. Studying re-opened bugs in open source software. Empirical Software Engineering, 2013, 18(5): 1005-1042.
[34] Menzies T, Marcus A. Automated severity assessment of software defect reports. In Proc. the IEEE International Conference on Software Maintenance, May 2008, pp.346-355.
[35] Cliff N. Ordinal Methods for Behavioral Data Analysis. Psychology Press, 2014.
[36] Zaman S, Adams B, Hassan A E. Security versus performance bugs: A case study on Firefox. In Proc. the 8th Working Conference on Mining Software Repositories, May 2011, pp.93-102.
[37] Nistor A, Jiang T, Tan L. Discovering, reporting, and fixing performance bugs. In Proc. the 10th Working Conference on Mining Software Repositories, May 2013, pp.237-246.
[38] Huang L, Ng V, Persing I, Geng R, Bai X, Tian J. AutoODC: Automated generation of orthogonal defect classifications. In Proc. the 26th IEEE/ACM International Conference on Automated Software Engineering (ASE), November 2011, pp.412-415.
[39] Jeong G, Kim S, Zimmermann T. Improving bug triage with bug tossing graphs. In Proc. the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering, August 2009, pp.111-120.
[40] Bhattacharya P, Neamtiu I, Shelton C R. Automated, highly-accurate, bug assignment using machine learning and tossing graphs. Journal of Systems and Software, 2012, 85(10): 2275-2292.
[41] Xia X, Lo D, Wang X, Yang X, Li S, Sun J. A comparative study of supervised learning algorithms for re-opened bug prediction. In Proc. the 17th European Conference on Software Maintenance and Reengineering (CSMR), March 2013, pp.331-334.
[42] Xia X, Lo D, Shihab E, Wang X, Zhou B. Automatic, high accuracy prediction of reopened bugs. Automated Software Engineering, 2014, 22(1): 75-109.
[43] Lamkanfi A, Demeyer S, Soetens Q D, Verdonck T. Comparing mining algorithms for predicting the severity of a reported bug. In Proc. the 15th European Conference on Software Maintenance and Reengineering (CSMR), March 2011, pp.249-258.
[44] Wang S, Yao X. Using class imbalance learning for software defect prediction. IEEE Transactions on Reliability, 2013, 62(2): 434-443.
[45] Pelayo L, Dick S. Applying novel resampling strategies to software defect prediction. In Proc. the Annual Meeting of the North American Fuzzy Information Processing Society, June 2007, pp.69-72.
Copyright 2010 by Journal of Computer Science and Technology