

A Communication Theory Perspective on Prompting Engineering Methods for Large Language Models


    Abstract: The springing up of large language models (LLMs) has shifted the community from single-task-oriented natural language processing (NLP) research to a holistic, end-to-end multi-task learning paradigm. Along this line of research, LLM-based prompting methods have attracted much attention, partially due to the technological advantages brought by prompt engineering (PE) as well as the underlying NLP principles disclosed by various prompting methods. Traditional supervised learning usually requires training a model on labeled data and then making predictions. In contrast, PE methods directly use the powerful capabilities of existing LLMs (e.g., GPT-3 and GPT-4) by composing appropriate prompts, especially under few-shot or zero-shot scenarios. Facing the abundance of studies related to prompting and the ever-evolving nature of this field, this article aims to 1) illustrate a novel perspective that reviews existing PE methods within the well-established communication theory framework, 2) facilitate a better and deeper understanding of the developing trends of existing PE methods used in three typical tasks, and 3) shed light on promising research directions for future PE methods.
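To make the contrast drawn above concrete, the following is a minimal sketch of how a prompt might be composed in zero-shot versus few-shot settings (the function name and the sentiment task are illustrative assumptions, not from the article; no actual LLM is called):

```python
def compose_prompt(task_instruction, examples, query):
    """Assemble a prompt string: zero-shot when `examples` is empty,
    few-shot when labeled demonstrations are included before the query."""
    parts = [task_instruction]
    # Each demonstration pairs an input with its expected output.
    for x, y in examples:
        parts.append(f"Input: {x}\nOutput: {y}")
    # The query is left with an open "Output:" slot for the LLM to fill.
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)

# Zero-shot: only the instruction and the query.
zero_shot = compose_prompt(
    "Classify the sentiment as positive or negative.",
    [],
    "I loved this movie.",
)

# Few-shot: a handful of demonstrations precede the query.
few_shot = compose_prompt(
    "Classify the sentiment as positive or negative.",
    [("The plot was dull.", "negative"),
     ("A delightful surprise!", "positive")],
    "I loved this movie.",
)
```

In both cases no model parameters are updated; the task is specified entirely through the prompt, which is what distinguishes PE from the train-then-predict loop of traditional supervised learning.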

     
