Journal of Computer Science and Technology, 2019, Vol. 34, Issue 5: 1063-1078. doi: 10.1007/s11390-019-1960-6

Special Issue: Software Systems

Threshold Extraction Framework for Software Metrics

Mohammed Alqmase, Mohammad Alshayeb*, Lahouari Ghouti   

  1. Information and Computer Science Department, King Fahd University of Petroleum and Minerals, Dhahran 31261, Saudi Arabia
  • Received: 2018-10-13 Revised: 2019-03-23 Online: 2019-08-31 Published: 2019-08-31
  • Contact: Mohammad Alshayeb
  • About author: Mohammed Alqmase received his M.S. degree in computer science from King Fahd University of Petroleum and Minerals, Dhahran, in 2019, and his B.S. degree in information technology (IT) from King Abdul-Aziz University, Jeddah, in 2013. He worked as a content management system analyst for Hippo CMS in 2017. He also worked as an instructor at Sana'a Community College, Sana'a, Yemen, from 2013 to 2015. His research interests include sentiment analysis, natural language processing, machine learning, algorithms, and software engineering.

Software metrics are used to measure different attributes of software. To apply these metrics in practice, metric thresholds are needed. Many researchers have attempted to identify these thresholds based on personal experience. However, the resulting experience-based thresholds cannot be generalized because of the variability in personal experience and the subjectivity of opinions. This paper proposes an automated clustering framework based on the expectation maximization (EM) algorithm, in which clusters are generated using a simplified 3-metric set: lines of code (LOC), lack of cohesion in methods (LCOM), and coupling between objects (CBO). Given these clusters, different threshold levels for software metrics are systematically determined such that each threshold reflects a specific level of software quality. The proposed framework comprises two major steps: the clustering step, in which the historical software quality dataset is decomposed into a fixed set of clusters using the EM algorithm, and the threshold extraction step, in which thresholds specific to each software metric in the resulting clusters are estimated using statistical measures such as the mean (μ) and the standard deviation (σ) of each software metric in each cluster. The paper's findings highlight the capability of EM-based clustering, using a minimal metric set, to group software quality datasets according to different quality levels.
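The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the sample values are hypothetical, the mixture uses diagonal covariances and a deterministic initialisation chosen here for simplicity, and the names `em_cluster` and `cluster_stats` are invented for the example. Because the abstract does not state how μ and σ are combined into a threshold, the sketch only reports both values per metric and cluster.

```python
import math
from statistics import mean, pstdev

METRICS = ("LOC", "LCOM", "CBO")

# Hypothetical per-class measurements (LOC, LCOM, CBO), invented for illustration.
SAMPLE = [
    (40, 2, 1), (55, 3, 2), (60, 1, 2), (45, 2, 3),
    (300, 20, 8), (320, 25, 9), (280, 18, 7), (310, 22, 10),
    (1500, 200, 30), (1600, 220, 28), (1400, 180, 32), (1550, 210, 35),
]


def log_density(x, mu, var):
    """Log-density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mu, var))


def em_cluster(data, k, iters=50, var_floor=1e-6):
    """Clustering step: fit a k-component mixture with EM, return hard labels."""
    n, d = len(data), len(data[0])
    ordered = sorted(data)
    # Assumed initialisation: means spread over the sorted data, variances set
    # to the global variance shrunk by k**2.
    mus = [list(ordered[(2 * j + 1) * n // (2 * k)]) for j in range(k)]
    start = [max(pstdev(col) ** 2 / k ** 2, var_floor) for col in zip(*data)]
    variances = [list(start) for _ in range(k)]
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iters):
        # E-step: responsibilities, normalised in log space to avoid underflow.
        resp = []
        for x in data:
            logs = [math.log(w) + log_density(x, mu, var)
                    for w, mu, var in zip(weights, mus, variances)]
            top = max(logs)
            unnorm = [math.exp(v - top) for v in logs]
            total = sum(unnorm)
            resp.append([u / total for u in unnorm])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            mus[j] = [sum(r[j] * x[i] for r, x in zip(resp, data)) / nj
                      for i in range(d)]
            variances[j] = [
                max(sum(r[j] * (x[i] - mus[j][i]) ** 2
                        for r, x in zip(resp, data)) / nj, var_floor)
                for i in range(d)
            ]
    # Assign each class to its most probable component.
    return [max(range(k), key=lambda j: r[j]) for r in resp]


def cluster_stats(data, labels, k):
    """Threshold extraction step: (mean, std) of every metric in each cluster."""
    stats = {}
    for j in range(k):
        members = [x for x, lab in zip(data, labels) if lab == j]
        if members:
            stats[j] = {name: (mean(col), pstdev(col))
                        for name, col in zip(METRICS, zip(*members))}
    return stats


labels = em_cluster(SAMPLE, k=3)
stats = cluster_stats(SAMPLE, labels, k=3)
for j, per_metric in stats.items():
    print(j, {m: (round(mu, 1), round(sd, 1)) for m, (mu, sd) in per_metric.items()})
```

On real data one would normally use a tested mixture-model implementation and choose the number of clusters deliberately; the hand-written EM loop here is only meant to make the two steps concrete.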

Key words: metric threshold; expectation maximization; empirical study

ISSN 1000-9000(Print)

CN 11-2296/TP
