Journal of Computer Science and Technology, 2018, Vol. 33, Issue (5): 876-899. doi: 10.1007/s11390-018-1864-x

Special Issue: Surveys

Empirical Research in Software Engineering-A Literature Survey

Li Zhang1,2, Senior Member, CCF, Jia-Hao Tian1, Member, CCF, Jing Jiang1,*, Member, CCF, Yi-Jun Liu1,2, Meng-Yuan Pu1,2, Tao Yue3, Senior Member, IEEE   

    1 State Key Laboratory of Software Development Environment, Beihang University, Beijing 100191, China;
    2 College of Software, Beihang University, Beijing 100191, China;
    3 Simula Research Laboratory, Martin Lingesvei 25, 1364 Fornebu, Norway
  • Received: 2018-03-05 Revised: 2018-05-14 Online: 2018-09-17 Published: 2018-09-17
  • Contact: Jing Jiang
  • Supported by:
    This work was supported by the National Natural Science Foundation of China under Grant Nos. 61672078 and 61732019, and the National Key Research and Development Program of China under Grant No. 2018YFB1004202.

Empirical research plays a significant role in software engineering (SE), where it is used to evaluate software artifacts and technologies. A large number of empirical research articles have been published recently, and empirical software engineering (ESE) has a large research community. In this paper, we examine both the overall landscape and the detailed implementations of ESE, investigating the frequently applied empirical methods, the targeted research purposes, the data sources used, and the data processing approaches and tools applied. The aim is to identify new trends and obtain interesting observations about ESE across different sub-fields of software engineering. We conduct a mapping study of 538 articles selected from January 2013 to November 2017, guided by four research questions. We observe that the application of empirical methods in software engineering is continuously increasing and that the most commonly applied methods are experiment, case study, and survey. Moreover, open source projects are the most frequently used data sources. We also observe that most researchers pay attention to the validity of their studies and to the possibility of replicating them. These observations are analyzed and presented as diagrams. We also reveal shortcomings and needed knowledge and strategies in ESE, and propose recommendations for researchers.

Key words: empirical software engineering; empirical method; systematic mapping study

