Journal of Computer Science and Technology, 2019, Vol. 34, Issue (5): 993-1006. doi: 10.1007/s11390-019-1956-2
Topics: Data Management and Data Mining; Software Systems
• Special Section on Software Systems 2019 •
Chun-Yang Ling, Yan-Zhen Zou*, Member, CCF, ACM, IEEE, Ze-Qi Lin, Bing Xie, Senior Member, CCF
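Searching the APIs (application programming interfaces) of a software project is an important way for developers to reuse software. Current natural-language-based API search and recommendation faces two main challenges: 1) as software projects grow more complex, the accuracy of natural language queries needs to improve; 2) the semantic relationships among the target APIs need to be shown, so that developers can better understand the APIs' logic and usage scenarios. To address these challenges, this paper proposes a graph-embedding-based method for recommending API subgraphs. First, the method automatically builds a code structure graph from a software project's source code and represents the source code through graph embedding. Then, a developer enters a natural language question, and the method searches for and returns a connected API subgraph made up of the relevant APIs and their associated information, which improves the efficiency of API search and reuse. The open-source projects Apache Lucene, POI, and JodaTime are used as examples for experimental validation. The results show that the F1 score of the proposed method is 10% higher than that of the existing shortest-path-based method, while the average response time is reduced by a factor of about 60.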
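The pipeline described in the abstract (code structure graph, node embeddings, natural language query, connected API subgraph) can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the node names, edges, and two-dimensional vectors below are hypothetical and chosen by hand, whereas a real system would extract the graph from source code and learn the embeddings. The sketch matches query keywords to candidate nodes by substring, picks one candidate per keyword so that the chosen nodes lie close together in the embedding space, and then joins them with breadth-first shortest paths so that the returned subgraph is connected.

```python
from collections import deque
from itertools import combinations, product
import math

# Hypothetical toy code structure graph (undirected edges between code elements).
EDGES = [
    ("Index", "Index.addDoc"), ("Index.addDoc", "Doc"),
    ("Doc", "Doc.addField"), ("Doc.addField", "Field"),
    ("Index", "Store"), ("Form", "FormField"),
]

# Hand-made placeholder embeddings; a real system would learn these from the graph.
EMBEDDING = {
    "Index": (0, 0), "Index.addDoc": (1, 0), "Doc": (2, 0),
    "Doc.addField": (3, 0), "Field": (4, 0), "Store": (0, 1),
    "Form": (10, 9), "FormField": (10, 10),
}

def build_graph(edges):
    """Build an adjacency map from a list of undirected edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def candidates(keyword, nodes):
    """Nodes whose name contains the keyword (case-insensitive), sorted."""
    return sorted(n for n in nodes if keyword.lower() in n.lower())

def spread(nodes, embedding):
    """Sum of pairwise Euclidean distances between the nodes' vectors."""
    return sum(math.dist(embedding[a], embedding[b])
               for a, b in combinations(nodes, 2))

def select_nodes(keywords, embedding):
    """Pick one candidate per keyword so the picks are closest in vector space."""
    pools = [candidates(k, embedding) for k in keywords]
    if any(not pool for pool in pools):
        return ()
    return min(product(*pools), key=lambda combo: spread(combo, embedding))

def shortest_path(graph, start, goal):
    """Breadth-first search; returns the node path or None if unreachable."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in sorted(graph.get(node, ())):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None

def recommend_subgraph(keywords, graph, embedding):
    """Return the node set of a connected subgraph covering the chosen nodes."""
    chosen = select_nodes(keywords, embedding)
    if not chosen:
        return set()
    result = {chosen[0]}
    for target in chosen[1:]:
        path = shortest_path(graph, chosen[0], target)
        if path is None:
            return set()  # chosen nodes are not connected in the graph
        result.update(path)
    return result

graph = build_graph(EDGES)
print(sorted(recommend_subgraph(["adddoc", "field"], graph, EMBEDDING)))
```

In this toy setup the keyword "field" matches three nodes, and the embedding distance is what steers the selection away from the unrelated `FormField` node before any path search runs. Exhaustive enumeration over candidate combinations is used only because the example is tiny.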