Journal of Computer Science and Technology, 2020, Vol. 35, Issue (5): 963-978. doi: 10.1007/s11390-020-0407-4

Special topic: Software Systems

Mining Design Pattern Use Scenarios and Related Design Pattern Pairs: A Case Study on Online Posts

Dong Liu1, Zhi-Lei Ren1,2,*, Member, CCF, ACM, Zhong-Tian Long3, Guo-Jun Gao1, and He Jiang1,2, Member, CCF, ACM, IEEE   

  1 School of Software, Dalian University of Technology, Dalian 116024, China;
  2 Key Laboratory for Ubiquitous Network and Service Software of Liaoning Province, Dalian 116000, China;
  3 DUT-RU International School of Information Science & Engineering at DUT, Dalian University of Technology, Dalian 116620, China
  • Received: 2020-02-27 Revised: 2020-07-31 Online: 2020-09-20 Published: 2020-09-29
  • Contact: Zhi-Lei Ren E-mail: zren@dlut.edu.cn
  • About the author: Dong Liu received his M.S. degree in computer science and technology from Hebei University of Technology, Baoding, in 2016. He is currently a Ph.D. candidate at Dalian University of Technology, Dalian. His current research interests include mining software repositories and data-driven methods in software engineering.
  • Supported by:
    This work was supported by the National Key Research and Development Program of China under Grant No. 2018YFB1003903, and the National Natural Science Foundation of China under Grant Nos. 61722202 and 61772107.

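1. Context:
In design pattern documentation (e.g., design pattern books), authors usually describe each design pattern with a fixed template. These templates typically comprise multiple attributes, such as a pattern's “intent”, “structure”, and “sample code”. For such documentation to remain useful to today's developers, the descriptions of these attributes should be tied to current programming technologies. For example, in the classic Gang-of-Four design pattern book, the “known uses” attribute describes concrete use scenarios of a design pattern in practice, and the “related patterns” attribute states which other design patterns it is related to. Because the book was published long ago, its content can hardly reflect current programming technologies. However, given the diversity of programming technologies, it is difficult to update these attributes manually in a timely and effective way.
2. Objective:
This paper aims to build a method that mines design pattern use scenarios and related design pattern pairs from programming-related online posts, so as to supplement the content of the two attributes above. The method is then applied to posts on Stack Overflow as a case study.
3. Method:
First, the design pattern tags on Stack Overflow are identified and the posts carrying these tags are extracted. Then, topic modeling is used to discover the topic distribution of each design pattern. Finally, by analyzing the topics of individual design patterns and of co-occurring design pattern pairs, the use scenarios of each design pattern and the related design pattern pairs are obtained.
4. Results & Findings:
Through this case study, we obtained 195 design pattern use scenarios and 70 related design pattern pairs, involving 61 design patterns in total. These results reflect currently popular software frameworks, such as Ruby on Rails, Spring, and ORM, as well as programming areas such as mobile development, machine learning, and web services, and are therefore of reference value to today's developers.
5. Conclusions:
This paper proposes a method for mining and enriching the content of design pattern attributes and reports a case study that discovered 195 design pattern use scenarios and 70 related design pattern pairs in Stack Overflow posts. These results can supplement design pattern documentation and help developers better understand how design patterns are used in programming practice and how they relate to one another. In future work, we will apply the method to other data sources and try to increase its degree of automation.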
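
The first and last steps of the method (extracting posts by design pattern tag, then looking at pairs of patterns that co-occur on a post) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the tag set, the toy posts, and the function names `posts_by_pattern` and `cooccurring_pairs` are all assumptions made for the example, and the study analyzes the topics of co-occurring pairs rather than only counting them.

```python
from collections import Counter
from itertools import combinations

# Hypothetical tag vocabulary; the study identifies design pattern tags on
# Stack Overflow, but this particular set is assumed for illustration only.
PATTERN_TAGS = {"singleton", "factory-pattern", "observer-pattern", "repository-pattern"}

# Toy posts as (post id, set of tags). Real input would be Stack Overflow posts.
posts = [
    (1, {"java", "singleton", "factory-pattern"}),
    (2, {"c#", "repository-pattern", "entity-framework"}),
    (3, {"java", "singleton", "factory-pattern", "spring"}),
    (4, {"python", "numpy"}),
    (5, {"android", "observer-pattern", "singleton"}),
]

def posts_by_pattern(posts, pattern_tags):
    """Group post ids under every design pattern tag they carry."""
    grouped = {}
    for post_id, tags in posts:
        for tag in tags & pattern_tags:
            grouped.setdefault(tag, []).append(post_id)
    return grouped

def cooccurring_pairs(posts, pattern_tags):
    """Count how often two design pattern tags appear on the same post."""
    counts = Counter()
    for _, tags in posts:
        patterns = sorted(tags & pattern_tags)  # sorted so each pair has one canonical order
        counts.update(combinations(patterns, 2))
    return counts

grouped = posts_by_pattern(posts, PATTERN_TAGS)
pairs = cooccurring_pairs(posts, PATTERN_TAGS)
```

In this toy data, `grouped` maps each pattern tag to the posts that would feed its topic model, and `pairs` gives candidate pattern pairs whose shared posts could then be inspected for relatedness.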

Keywords: design pattern, software documentation, Q&A community, topic model

Abstract: In common design pattern collections, e.g., design pattern books, design patterns are documented with templates that consist of multiple attributes, such as intent, structure, and sample code. To stay useful to modern developers, the descriptions of design patterns should keep pace with current programming technologies. This holds especially for two attributes: “known uses”, which exemplifies the use scenarios of a design pattern in practice, and “related patterns”, which describes how a design pattern relates to others within a context. However, it is not easy to update the contents of these attributes manually due to the diversity of programming technologies. To address this problem, we conducted a case study that mines design pattern use scenarios and related design pattern pairs from Stack Overflow posts to enrich the two attributes. We first extracted the question posts relevant to each design pattern by identifying the design pattern tags. Then, the topics of the posts were discovered by applying topic modeling techniques. Finally, by analyzing the topics specified for each design pattern, we detected 195 design pattern use scenarios and 70 related design pattern pairs, involving 61 design patterns in total. These findings are associated with a variety of popular software frameworks and programming techniques. They could complement the existing design pattern collections and help developers better understand the usage and relatedness of design patterns in today's programming practice.

Key words: design pattern, software documentation, Stack Overflow, topic model
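
The topic-modeling step mentioned in the abstract can be illustrated with a small, self-contained sketch. The abstract does not say which topic model or settings were used, so everything here is an assumption: the sketch uses latent Dirichlet allocation fitted by collapsed Gibbs sampling as a stand-in, the hyperparameters `alpha` and `beta` are illustrative, and the tokenized posts are invented toy data for a single design pattern tag.

```python
import random

def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling.

    Returns (doc-topic counts, topic-word counts, vocabulary).
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * V for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []
    # Give every token a random initial topic and record the counts.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Unnormalized conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta) / (topic_total[k] + V * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1
    return doc_topic, topic_word, vocab

def top_words(topic_word, vocab, n=3):
    """Return the n highest-count words of every topic."""
    return [
        [vocab[v] for v in sorted(range(len(vocab)), key=lambda v: -row[v])[:n]]
        for row in topic_word
    ]

# Invented, already-tokenized posts standing in for questions tagged with one pattern.
docs = [
    ["singleton", "thread", "safe", "lock"],
    ["thread", "lock", "lazy", "singleton"],
    ["spring", "bean", "scope", "singleton"],
    ["bean", "spring", "scope", "inject"],
]
doc_topic, topic_word, vocab = lda_gibbs(docs, num_topics=2)
```

Calling `top_words(topic_word, vocab)` lists the dominant words per topic, which is the kind of output a person would read to name a candidate use scenario. The sampler is seeded, so repeated runs give the same counts.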
