A Communication Theory Perspective on Prompting Engineering Methods for Large Language Models

Yuan-Feng Song; Yuan-Qin He; Xue-Fang Zhao; Han-Lin Gu; Di Jiang; Hai-Jun Yang; Li-Xin Fan

doi:10.1007/s11390-024-4058-8

Volume 39 Issue 4

September 2024

Journal of Computer Science and Technology, 2024, 39(4): 984–1004. DOI: 10.1007/s11390-024-4058-8. CSTR: 32374.14.s11390-024-4058-8.

Citation: Song YF, He YQ, Zhao XF et al. A communication theory perspective on prompting engineering methods for large language models. Journal of Computer Science and Technology 39(4): 984–1004, July 2024. DOI: 10.1007/s11390-024-4058-8.

Yuan-Feng Song (宋元峰) ,
Yuan-Qin He (何元钦) ,
Xue-Fang Zhao (赵雪芳) ,
Han-Lin Gu (古瀚林) ,
Di Jiang (姜迪) ,
Hai-Jun Yang (杨海军) ,
Li-Xin Fan (范力欣)

AI Group, WeBank Co., Ltd, Shenzhen 518000, China

Author Bio:
Yuan-Feng Song is a researcher in the WeBank AI Group, WeBank, Shenzhen. His research interests include learning to rank, data visualization, and speech-driven applications. He has published several papers in venues such as KDD, ICDM, EMNLP, MM, TIST, TKDE, and SIGMOD.

Yuan-Qin He is currently a researcher with the WeBank AI Group, WeBank, Shenzhen. He received his B.S. degree in physics from Shanghai Jiao Tong University, and his Ph.D. degree in physics from the Technical University of Munich, Munich, in 2017. His research interests include machine learning and federated learning.

Xue-Fang Zhao received her Master's degree in computer science from Tsinghua University, Beijing, in 2020. She is currently a research engineer at the WeBank AI Group, WeBank, Shenzhen. Her research interests include natural language processing and speech recognition.

Han-Lin Gu received his B.S. degree in mathematics from the University of Science and Technology of China, Hefei, in 2017. He received his Ph.D. degree in mathematics from the Hong Kong University of Science and Technology, Hong Kong, in 2022. He now works as a senior researcher at the WeBank AI Group, WeBank, Shenzhen. His research interests include federated learning and privacy-preserving methodology. He has published a series of papers in venues including TPAMI, TDSC, IJCAI, and PAKDD.

Di Jiang received his Ph.D. degree in computer science from the Hong Kong University of Science and Technology, Hong Kong, in 2014. He is currently the principal scientist at the WeBank AI Group, WeBank, Shenzhen. His research interests include information retrieval, natural language processing, and massive data management.

Hai-Jun Yang received his B.E. degree in 2008 and his M.S. degree in 2011, both from Harbin Institute of Technology, Harbin. He is currently the Senior Manager of the AI Group at WeBank, Shenzhen, where he is mainly responsible for promoting the integration and deployment of AI technology at WeBank in customer service, risk control, marketing, and other business scenarios.

Li-Xin Fan is the Principal Scientist of Artificial Intelligence at WeBank, Shenzhen, and the Chairman of the Federal Learning Industry Ecological Development Alliance. His research fields include machine learning and deep learning, computer vision and pattern recognition, image and video processing, 3D big data processing, data visualization and rendering, augmented and virtual reality, mobile computing and ubiquitous computing, and intelligent man-machine interfaces. He is the author of more than 70 international journal and conference articles. He has worked at Nokia Research Center and Xerox Research Center Europe. His research includes the well-known Bag of Keypoints image classification method. He has long participated in NIPS/NeurIPS, ICML, CVPR, ICCV, ECCV, IJCAI, and other top artificial intelligence conferences, served as an area chair of AAAI, and organized workshops in various technical fields. He is also the inventor of more than one hundred patents filed in the United States, Europe, and China, and the chairman of the IEEE P2894 Explainable Artificial Intelligence (XAI) Standard Working Group.
Received Date: December 20, 2023
Accepted Date: April 11, 2024

Abstract

The rise of large language models (LLMs) has shifted the community from single-task-oriented natural language processing (NLP) research to a holistic, end-to-end multi-task learning paradigm. Within this line of research, LLM-based prompting methods have attracted much attention, partly because of the technological advantages brought by prompt engineering (PE) as well as the underlying NLP principles revealed by various prompting methods. Traditional supervised learning usually requires training a model on labeled data before making predictions. In contrast, PE methods directly exploit the capabilities of existing LLMs (e.g., GPT-3 and GPT-4) by composing appropriate prompts, especially in few-shot or zero-shot scenarios. Given the abundance of studies on prompting and the rapidly evolving nature of the field, this article aims to 1) present a novel perspective that reviews existing PE methods within the well-established framework of communication theory, 2) facilitate a deeper understanding of the development trends of existing PE methods used in three typical tasks, and 3) shed light on promising research directions for future PE methods.
Keywords: prompting method, large language model, communication theory
