We use cookies to improve your experience with our site.

Indexed in:

SCIE, EI, Scopus, INSPEC, DBLP, CSCD, etc.

Submission System
(Author / Reviewer / Editor)
Bin-Bin Liu, Wei Dong, Jia-Xin Liu, Ya-Ting Zhang, Dai-Yan Wang. ProSy: API-Based Synthesis with Probabilistic Model[J]. Journal of Computer Science and Technology, 2020, 35(6): 1234-1257. DOI: 10.1007/s11390-020-0520-4
Citation: Bin-Bin Liu, Wei Dong, Jia-Xin Liu, Ya-Ting Zhang, Dai-Yan Wang. ProSy: API-Based Synthesis with Probabilistic Model[J]. Journal of Computer Science and Technology, 2020, 35(6): 1234-1257. DOI: 10.1007/s11390-020-0520-4

ProSy: API-Based Synthesis with Probabilistic Model

Funds: This paper was supported by the National Natural Science Foundation of China under Grant No. 61690203, and the National Key Research and Development Program of China under Grant No. 2018YFB0204301.
More Information
  • Author Bio:

    Bin-Bin Liu is a Ph.D. candidate in software engineering at National University of Defense Technology, Changsha. He received his B.S. degree in computer science and technology and M.S. degree in software engineering from National University of Defense Technology, Changsha, in 2013 and 2015 respectively. His research interests include intelligent methods in software engineering, and data-driven software engineering.

  • Corresponding author:

    Wei Dong E-mail: wdong@nudt.edu.cn

  • Received Date: April 09, 2020
  • Revised Date: October 21, 2020
  • Published Date: November 19, 2020
  • Program synthesis is an exciting topic that desires to generate programs satisfying user intent automatically. But in most cases, only small programs for simple or domain-specific tasks can be synthesized. The major obstacle of synthesis lies in the huge search space. A common practice in addressing this problem is using a domain-specific language, while many approaches still wish to synthesize programs in general programming languages. With the rapid growth of reusable libraries, component-based synthesis provides a promising way, such as synthesizing Java programs which are only composed of APIs (application programming interfaces). However, the efficiency of searching for proper solutions for complex tasks is still a challenge. Given an unfamiliar programming task, programmers would search for API usage knowledge from various coding resources to reduce the search space. Considering this, we propose a novel approach named ProSy to synthesize API-based programs in Java. The key novelty is to retrieve related knowledge from Javadoc and Stack Overflow and then construct a probabilistic reachability graph. It assigns higher probabilities to APIs that are more likely to be used in implementing the given task. In the synthesis process, the program sketch with a higher probability will be considered first; thus, the number of explored reachable paths would be decreased. Some extension and optimization strategies are further studied in the paper. We implement our approach and conduct several experiments on it. We compare ProSy with SyPet and other state-of-the-art API-based synthesis approaches. The experimental results show that ProSy reduces the synthesis time of SyPet by up to 80%.
  • [1]
    Gulwani S, Polozov O, Singh R. Program synthesis. Foundations and Trends in Programming Languages, 2017, 4(1/2):1-119.
    [2]
    Pnueli A, Rosner R. On the synthesis of a reactive module. In Proc. the 16th Annual ACM Symposium on Principles of Programming Languages, January 1989, pp.179-190.
    [3]
    Gulwani S, Jain P. Programming by examples:PL meets ML. In Proc. the 15th Asian Symposium on Programming Languages and Systems, November 2017, pp.3-20.
    [4]
    Alur R, Bodík R, Juniwal G, Martin M M K, Raghothaman M, Seshia S A, Singh R, Solar-Lezama A, Torlak E, Udupa A. Syntax-guided synthesis. In Proc. the 13th International Conference on Formal Methods in Computer-Aided Design, October 2013, pp.1-8.
    [5]
    Gulwani S, Jha S, Tiwari A, Venkatesan R. Synthesis of loop-free programs. In Proc. the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2011, pp.62-73.
    [6]
    Solar-Lezama A. Program sketching. Int. J. Softw. Tools Technol. Transf., 2013, 15(5/6):475-495.
    [7]
    Parisotto E, Mohamed A, Singh R, Li L, Zhou D, Kohli P. Neuro-symbolic program synthesis. In Proc. the 5th International Conference on Learning Representations, April 2017.
    [8]
    Balog M, Gaunt A L, Brockschmidt M, Nowozin S, Tarlow D. DeepCoder:Learning to write programs. In Proc. the 5th International Conference on Learning Representations, April 2017.
    [9]
    Perelman D, Gulwani S, Grossman D, Provost P. Testdriven synthesis. In Proc. the ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2014, pp.408-418.
    [10]
    Feng Y, Martins R, Geffen J V, Dillig I, Chaudhuri S. Component-based synthesis of table consolidation and transformation tasks from examples. In Proc. the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2017, pp.422-436.
    [11]
    Jha S, Gulwani S, Seshia S A, Tiwari A. Oracle-guided component-based program synthesis. In Proc. the 32nd ACM/IEEE International Conference on Software Engineering, May 2010, pp.215-224.
    [12]
    Feng Y, Martins R, Wang Y, Dillig I, Reps T W. Component-based synthesis for complex APIs. In Proc. the 44th ACM SIGPLAN Symposium on Principles of Programming Languages, January 2017, pp.599-612.
    [13]
    Shi K, Steinhardt J, Liang P. FrAngel:Component-based synthesis with control structures. Proc. the ACM on Programming Languages, 2019, 3(POPL):Article No. 73.
    [14]
    Mei H, Zhang L. Can big data bring a breakthrough for software automation? SCIENCE CHINA Information Sciences, 2018, 61(5):Article No. 056101.
    [15]
    Petri C A, Reisig W. Petri net. Scholarpedia, 2008, 3(4):Article No. 6477.
    [16]
    Mikolov T, Chen K, Corrado G, Dean J. Efficient estimation of word representations in vector space. arXiv:1301.3781, 2013. https://arxiv.org/abs/1301.3781, Sept. 2020.
    [17]
    Pennington J, Socher R, Manning C D. Glove:Global vectors for word representation. In Proc. the 2014 Conference on Empirical Methods in Natural Language Processing, October 2014, pp.1532-1543.
    [18]
    Eén N, Sörensson N. Translating pseudo-boolean constraints into SAT. J. Satisf. Boolean Model. Comput., 2006, 2(1/2/3/4):1-26.
    [19]
    Darwin I F. Java Cookbook. O'Reilly, 2001.
    [20]
    Murali V, Qi L, Chaudhuri S, Jermaine C. Neural sketch learning for conditional program generation. In Proc. the 6th International Conference on Learning Representations, April 2018.
    [21]
    Gulwani S. Automating string processing in spreadsheets using input-output examples. In Proc. the 38th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, January 2011, pp.317-330.
    [22]
    Singh R, Gulwani S. Learning semantic string transformations from examples. Proc. the VLDB Endowment, 2012, 5(8):740-751.
    [23]
    Albarghouthi A, Gulwani S, Kincaid Z. Recursive program synthesis. In Proc. the 25th International Conference on Computer Aided Verification, July 2013, pp.934-950.
    [24]
    Singh R, Gulwani S. Synthesizing number transformations from input-output examples. In Proc. the 24th International Conference on Computer Aided Verification, July 2012, pp.634-651.
    [25]
    Gulwani S, Harris W R, Singh R. Spreadsheet data manipulation using examples. Communications of the ACM, 2012, 55(8):97-105.
    [26]
    Gulwani S, Korthikanti V A, Tiwari A. Synthesizing geometry constructions. In Proc. the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2011, pp.50-61.
    [27]
    Feser J K, Chaudhuri S, Dillig I. Synthesizing data structure transformations from input-output examples. In Proc. the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2015, pp.229-239.
    [28]
    Kalyan A, Mohta A, Polozov O, Batra D, Jain P, Gulwani S. Neural-guided deductive search for real-time program synthesis from examples. In Proc. the 6th International Conference on Learning Representations, April 2018.
    [29]
    Ling W, Blunsom P, Grefenstette E, Hermann K M, Kociský T, Wang F, Senior A W. Latent predictor networks for code generation. In Proc. the 54th Annual Meeting of the Association for Computational Linguistics, August 2016.
    [30]
    Yin P, Neubig G. A syntactic neural model for general purpose code generation. In Proc. the 55th Annual Meeting of the Association for Computational Linguistics, July 2017, pp.440-450.
    [31]
    Rabinovich M, Stern M, Klein D. Abstract syntax networks for code generation and semantic parsing. In Proc. the 55th Annual Meeting of the Association for Computational Linguistics, July 2017, pp.1139-1149.
    [32]
    Lee W, Heo K, Alur R, Naik M. Accelerating search-based program synthesis using learned probabilistic models. In Proc. the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2018, pp.436-449.
    [33]
    Chen Y, Martins R, Feng Y. Maximal multi-layer specification synthesis. In Proc. the ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, August 2019, pp.602-612.
    [34]
    Linstead E, Bajracharya S K, Ngo T C, Rigor P, Lopes C V, Baldi P. Sourcerer:Mining and searching Internet-scale software repositories. Data Min. Knowl. Discov., 2009, 18(2):300-336.
    [35]
    McMillan C, Grechanik M, Poshyvanyk D, Xie Q, Fu C. Portfolio:Finding relevant functions and their usage. In Proc. the 33rd International Conference on Software Engineering, May 2011, pp.111-120.
    [36]
    Lu M, Sun X, Wang S, Lo D, Duan Y. Query expansion via WordNet for effective code search. In Proc. the 22nd IEEE International Conference on Software Analysis, March 2015, pp.545-549.
    [37]
    Haiduc S, Bavota G, Marcus A, Oliveto R, Lucia A D, Menzies T. Automatic query reformulations for text retrieval in software engineering. In Proc. the 35th International Conference on Software Engineering, May 2013, pp.842-851.
    [38]
    Lv F, Zhang H, Lou J, Wang S, Zhang D, Zhao J. CodeHow:Effective code search based on API understanding and extended Boolean model (E). In Proc. the 30th IEEE/ACM International Conference on Automated Software Engineering, November 2015, pp.260-270.
    [39]
    Gu X, Zhang H, Kim S. Deep code search. In Proc. the 40th International Conference on Software Engineering, May 2018, pp.933-944.
    [40]
    Ke Y, Stolee K T, Goues C L, Brun Y. Repairing programs with semantic code search (T). In Proc. the 30th IEEE/ACM International Conference on Automated Software Engineering, November 2015, pp.295-306.
    [41]
    Reiss S P. Semantics-based code search. In Proc. the 31st International Conference on Software Engineering, May 2009, pp.243-253.
    [42]
    Stolee K T, Elbaum S G, Dobos D. Solving the search for source code. ACM Trans. Softw. Eng. Methodol., 2014, 23(3):Article No. 26.
    [43]
    Sirres R, Bissyandé T F, Kim D, Lo D, Klein J, Kim K, Traon Y L. Augmenting and structuring user queries to support efficient free-form code search. In Proc. the 40th International Conference on Software Engineering, May 2018, pp.945-945.
    [44]
    Nguyen A T, Hilton M, Codoban M, Nguyen H A, Mast L, Rademacher E, Nguyen T N, Dig D. API code recommendation using statistical learning from fine-grained changes. In Proc. the 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering, November 2016, pp.511-522.
    [45]
    Gu X, Zhang H, Zhang D, Kim S. Deep API learning. In Proc. the 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering, November 2016, pp.631-642.
    [46]
    Huang Q, Xia X, Xing Z, Lo D, Wang X. API method recommendation without worrying about the task-API knowledge gap. In Proc. the 33rd ACM/IEEE International Conference on Automated Software Engineering, September 2018, pp.293-304.
    [47]
    Liu X, Huang L, Ng V. Effective API recommendation without historical software repositories. In Proc. the 33rd ACM/IEEE International Conference on Automated Software Engineering, September 2018, pp.282-292.
    [48]
    Li X, Jiang H, Kamei Y, Chen X. Bridging semantic gaps between natural languages and APIs with word embedding. arXiv:1810.09723, 2018. https://arxiv.org/abs/1810.09723, Sept. 2020.
    [49]
    Mandelin D, Xu L, Bodík R, Kimelman D. Jungloid mining:Helping to navigate the API jungle. In Proc. the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2005, pp.48-61.
    [50]
    Chan W, Cheng H, Lo D. Searching connected API subgraph via text phrases. In Proc. the 20th ACM SIGSOFT Symposium on the Foundations of Software Engineering, November 2012, Article No. 10.
    [51]
    van Nguyen T, Rigby P C, Nguyen A T, Karanfil M, Nguyen T N. T2API:Synthesizing API code usage templates from English texts with statistical translation. In Proc. the 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering, November 2016, pp.1013-1017.
    [52]
    Nguyen T T, Nguyen H A, Pham N H, Al-Kofahi J M, Nguyen T N. Graph-based mining of multiple object usage patterns. In Proc. the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT International Symposium on Foundations of Software Engineering, August 2009, pp.383-392.
  • Related Articles

    [1]Xiaofang Zhou, Wen Hua. Preface[J]. Journal of Computer Science and Technology, 2020, 35(4): 721-723. DOI: 10.1007/s11390-020-0005-5
    [2]Min-Ling Zhang, Yu-Feng Li, Qi Liu. Preface[J]. Journal of Computer Science and Technology, 2020, 35(2): 231-233. DOI: 10.1007/s11390-020-0002-8
    [3]Su Song, Ke Liu, Zhi-Yong Liu. Preface[J]. Journal of Computer Science and Technology, 2019, 34(2): 253-255. DOI: 10.1007/s11390-019-1908-x
    [4]Jian Pei. Preface[J]. Journal of Computer Science and Technology, 2016, 31(4): 635-636. DOI: 10.1007/s11390-016-1652-4
    [5]Shi-Min Hu, Li-Gang Liu, Ralph R. Martin. Preface[J]. Journal of Computer Science and Technology, 2016, 31(3): 437-438. DOI: 10.1007/s11390-016-1637-3
    [6]Jie Tang, Xiao-Yan Zhu. Preface[J]. Journal of Computer Science and Technology, 2015, 30(4): 902-902. DOI: 10.1007/s11390-015-1568-4
    [7]Jian Pei. Preface[J]. Journal of Computer Science and Technology, 2015, 30(4): 655-656. DOI: 10.1007/s11390-015-1552-z
    [8]Xi-Lin Chen. Preface[J]. Journal of Computer Science and Technology, 2015, 30(2): 324-324. DOI: 10.1007/s11390-015-1525-2
    [9]Guo-Jie Li. Preface[J]. Journal of Computer Science and Technology, 2015, 30(2): 225-226. DOI: 10.1007/s11390-015-1516-3
    [10]Fei-Yue Wang, Ning-Hui Sun, Wen-Ji Mao, Xiao-Wei Li. Preface[J]. Journal of Computer Science and Technology, 2009, 24(6): 997-999.
  • Others

Catalog

    Article views (39) PDF downloads (0) Cited by()
    Related

    /

    DownLoad:  Full-Size Img  PowerPoint
    Return
    Return