Special Issue: Software Systems


Software Project Effort Estimation Based on Multiple Parametric Models Generated Through Data Clustering

Juan J. Cuadrado Gallego1, Daniel Rodríguez1, Miguel Angel Sicilia1, Miguel Garre Rubio1 and Angel García Crespo2

  1 Department of Computer Science, The University of Alcala, Alcala, Spain
  2 Department of Computer Science, Carlos III University, Madrid, Spain
  • Received: 2006-03-15; Revised: 2007-02-15; Online: 2007-05-10; Published: 2007-05-10

Parametric software effort estimation models usually consist of only a single mathematical relationship. With the advent of software repositories containing data from heterogeneous projects, these types of models suffer from poor adjustment and predictive accuracy. One possible way to alleviate this problem is to use a set of mathematical equations obtained by dividing the historical project dataset, according to different parameters, into subdatasets called partitions. In turn, partitions are divided into clusters that serve as a tool for building more accurate models. In this paper, we describe the process, tool, and results of such an approach through a case study using a publicly available repository, ISBSG. Results suggest the adequacy of the technique as an extension of existing single-expression models, without making the estimation process much more complex than one that uses a single estimation model. A tool to support the process is also presented.
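
The idea of replacing one global equation with one equation per cluster can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the paper's actual procedure: the project data are synthetic (not ISBSG), the partitioning step is skipped, a basic one-dimensional k-means on log(size) stands in for whatever clustering algorithm the authors use, and each cluster gets a power-law model effort = a * size^b fitted by ordinary least squares on log-transformed values. The function names are hypothetical.

```python
import math

def fit_power_law(projects):
    """Fit effort = a * size**b by least squares on log-log data.

    Assumes at least two projects with distinct sizes.
    """
    xs = [math.log(size) for size, _ in projects]
    ys = [math.log(effort) for _, effort in projects]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b

def cluster_by_size(projects, k, iterations=20):
    """Group projects with a basic 1-D k-means on log(size)."""
    logs = sorted(math.log(size) for size, _ in projects)
    # Spread the initial centroids evenly over the sorted values.
    centroids = [logs[(2 * i + 1) * len(logs) // (2 * k)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for project in projects:
            x = math.log(project[0])
            nearest = min(range(k), key=lambda i: abs(x - centroids[i]))
            clusters[nearest].append(project)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            sum(math.log(s) for s, _ in c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

def estimate(size, centroids, models):
    """Estimate effort with the model of the nearest cluster."""
    x = math.log(size)
    nearest = min(range(len(centroids)), key=lambda i: abs(x - centroids[i]))
    a, b = models[nearest]
    return a * size ** b

# Synthetic (size, effort) pairs: two groups that follow different laws.
small = [(s, 10.0 * s ** 1.0) for s in (50, 80, 100, 150, 200)]
large = [(s, 2.0 * s ** 1.2) for s in (1000, 1500, 2000, 3000, 4000)]
projects = small + large

centroids, clusters = cluster_by_size(projects, k=2)
models = [fit_power_law(c) for c in clusters]   # one equation per cluster
global_model = fit_power_law(projects)          # single-equation baseline

print("per-cluster (a, b):", models)
print("single model (a, b):", global_model)
print("estimate for size 120:", estimate(120, centroids, models))
```

On this synthetic data the per-cluster fits recover the two generating laws, while the single global fit has to compromise between them. Real repositories are noisier, so the gain would need to be checked empirically.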

Key words: universal abstract consistency class; universal unifying principle; universal refutation; soundness; completeness

ISSN 1000-9000(Print)

CN 11-2296/TP

Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences