Li Zhang, Jia-Hao Tian, Jing Jiang, Yi-Jun Liu, Meng-Yuan Pu, Tao Yue. Empirical Research in Software Engineering-A Literature Survey[J]. Journal of Computer Science and Technology, 2018, 33(5): 876-899. DOI: 10.1007/s11390-018-1864-x

Empirical Research in Software Engineering-A Literature Survey

Funds: This work was supported by the National Natural Science Foundation of China under Grant Nos. 61672078 and 61732019, and the National Key Research and Development Program of China under Grant No. 2018YFB1004202.
  • Corresponding author: Jing Jiang, E-mail: jiangjing@buaa.edu.cn

  • Received Date: March 04, 2018
  • Revised Date: May 13, 2018
  • Published Date: September 16, 2018
  • Empirical research plays a significant role in software engineering (SE), where it is applied to evaluate software artifacts and technologies. A great number of empirical research articles have been published recently, and there is a large research community in empirical software engineering (ESE). In this paper, we identify both the overall landscape and the detailed implementations of ESE, and investigate the frequently applied empirical methods, the targeted research purposes, the data sources used, and the data processing approaches and tools applied in ESE. The aim is to identify new trends and obtain interesting observations about empirical software engineering across different sub-fields of software engineering. We conduct a mapping study of 538 selected articles published from January 2013 to November 2017, guided by four research questions. We observe that the application of empirical methods in software engineering is continuously increasing, and that the most commonly applied methods are experiment, case study, and survey. Moreover, open source projects are the most frequently used data sources. We also observe that most researchers have paid attention to the validity of their studies and the possibility of replicating them. These observations are carefully analyzed and presented in diagrams. We also reveal shortcomings and needed knowledge/strategies in ESE and propose recommendations for researchers.