Maximizing the Potential of LLMs: A Guide to Prompt Engineering Strategies

Translator | 朱先忠

Reviewer | 重楼

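Introduction

Natural language models have improved rapidly in recent years, and large language models such as GPT-3 and GPT-4 have taken center stage. They are popular because they can perform a wide variety of tasks with remarkable skill. In addition, as their parameter counts keep growing (into the billions!), these models have unexpectedly acquired new capabilities.

In this article, we look at LLMs (large language models), the tasks they can perform, their shortcomings, and prompt engineering strategies for a range of use cases.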

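What is an LLM?

An LLM is a neural network trained on a large amount of text data. Training lets the model learn patterns in the text, including grammar, syntax, and word associations. The model uses these learned patterns to generate human-like text, which makes it well suited to natural language processing (NLP) tasks.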

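Which LLMs are available?

Several LLMs are already in production use, with GPT-4 the most popular. Others include LLaMA, PaLM, BERT, and T5. Each has its own strengths and weaknesses. Some are open source, while others are closed source and can only be used through an API.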

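Shortcomings of LLMs

Despite their impressive performance, LLMs have limitations. A notable one is that they cannot reason beyond the information provided in the prompt. LLMs may also generate biased text, reflecting the data they were trained on. Finally, controlling an LLM's output precisely is a challenge in its own right. This is why prompt engineering strategies are needed to steer the model toward the desired output.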

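Which tasks can you perform?

We can direct an LLM to perform a specific task by formatting the prompt for that use case. Below is a list of typical tasks, with a prompt template and the resulting output for each.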

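Text summarization

LLMs can summarize long texts, making the content easier to understand and digest. We can ask for a summary with the following prompt template:

Template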

<Full text>
Summarize the text above: / Explain the text above in <N> sentences:

For example, we can ask for a summary of a text with the following prompt:

Prompt

In a mechanical watch the watch's gears are turned by a spiral spring called a mainspring. In a manual watch energy is stored in the mainspring by turning a knob, the crown on the side of the watch. Then the energy from the mainspring powers the watch movement until it runs down, requiring the spring to be wound again.
A self-winding watch movement has a mechanism which winds the mainspring using the natural motions of the wearer's body. The watch contains an oscillating weight that turns on a pivot. The normal movements of the watch in the user's pocket (for a pocketwatch) or on the user's arm (for a wristwatch) cause the rotor to pivot on its staff, which is attached to a ratcheted winding mechanism. The motion of the watch is thereby translated into circular motion of the weight which, through a series of reverser and reducing gears, eventually winds the mainspring. There are many different designs for modern self-winding mechanisms. Some designs allow winding of the watch to take place while the weight swings in only one direction while other, more advanced, mechanisms have two ratchets and wind the mainspring during both clockwise and anti-clockwise weight motions.
Explain the text above in 2 sentences:

The LLM returns the following output:

Output

A mechanical watch uses a mainspring to power the movement of the watch's gears. In a manual watch, the mainspring is wound by turning a knob, while a self-winding watch uses the natural motion of the wearer's body to wind the mainspring through an oscillating weight and ratcheted winding mechanism.
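
The summarization template can be filled in programmatically. Below is a minimal Python sketch; the function name `build_summary_prompt` and its parameters are illustrative assumptions, not part of any library, and sending the prompt to a model is left out.

```python
from typing import Optional


def build_summary_prompt(text: str, sentences: Optional[int] = None) -> str:
    """Place the full text first, then the summarization instruction."""
    if sentences is None:
        instruction = "Summarize the text above:"
    else:
        instruction = f"Explain the text above in {sentences} sentences:"
    # The instruction goes on its own line after the text, as in the template.
    return f"{text.strip()}\n{instruction}"


# Hypothetical usage with a shortened version of the watch text.
print(build_summary_prompt(
    "In a mechanical watch the watch's gears are turned by a mainspring.",
    sentences=2,
))
```

Keeping the sentence count as a parameter makes it easy to request summaries of different lengths from the same text.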

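Information extraction

LLMs can identify and extract specific information from large bodies of text. This is useful in fields such as news and business intelligence. The following template instructs the LLM to extract information:

Template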

<Full text>
Mention X in the text above: / List X in the text above:

For example, we can ask it to extract the place names from the following text:

Prompt

The earliest credible evidence for a successful design is the watch made by the Swiss watchmaker Abraham-Louis Perrelet, who lived in Le Locle. In late 1776 or early 1777, he invented a self-winding mechanism for pocket watches using an oscillating weight inside the watch that moved up and down. The Geneva Society of Arts, reporting on this watch in 1777, stated that 15 minutes walking was necessary to fully wind the watch.
List the names of places in the text above:

It returns the following output:

Output

Le Locle, Geneva.
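
An extraction answer like the one above usually needs to be turned into structured data before it can be used. The following Python sketch builds the prompt and splits a comma-separated answer into items; both function names are hypothetical, and it assumes the model answers as a comma-separated list, which is not guaranteed.

```python
from typing import List


def build_extraction_prompt(text: str, target: str) -> str:
    """Append a 'List X in the text above:' instruction to the text."""
    return f"{text.strip()}\nList {target} in the text above:"


def parse_list_answer(answer: str) -> List[str]:
    """Split an answer such as 'Le Locle, Geneva.' into clean items."""
    trimmed = answer.strip().rstrip(".")
    return [item.strip() for item in trimmed.split(",") if item.strip()]


places = parse_list_answer("Le Locle, Geneva.")
print(places)
```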

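Question answering

LLMs can also answer questions based on the information provided in the prompt. The following template can be used to direct an LLM to answer a question:

Template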

Answer the question based on the context below. <Other instructions>. Respond "Not sure" if not sure about the answer.
Context: <Context>
Q: <Question>
A:

For example, with the following prompt:

Prompt

Answer the question based on the context below. Answer in a single sentence. Respond "Not sure" if not sure about the answer.
Context: An automatic watch, also known as a self-winding watch or simply an automatic, is a mechanical watch where the natural motion of the wearer provides energy to wind the mainspring, making manual winding unnecessary if worn enough. It is distinguished from a manual watch in that a manual watch must have its mainspring wound by hand at regular intervals.
Q: Do automatic watches need to be manually wound?
A:

ChatGPT responds:

Output

No, automatic watches do not need to be manually wound if worn enough because the natural motion of the wearer provides energy to wind the mainspring.
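
The question-answering template has three slots: optional instructions, the context, and the question. A small Python sketch of one way to assemble it follows; `build_qa_prompt` is an illustrative name, not an existing API.

```python
def build_qa_prompt(context: str, question: str, extra_instructions: str = "") -> str:
    """Assemble the context-grounded Q&A prompt, ending with an open 'A:'."""
    header = "Answer the question based on the context below."
    if extra_instructions:
        header += f" {extra_instructions.strip()}"
    # The fallback instruction gives the model an explicit way to decline.
    header += ' Respond "Not sure" if not sure about the answer.'
    return f"{header}\nContext: {context.strip()}\nQ: {question.strip()}\nA:"


print(build_qa_prompt(
    context="An automatic watch is wound by the natural motion of the wearer.",
    question="Do automatic watches need to be manually wound?",
    extra_instructions="Answer in a single sentence.",
))
```

Ending the prompt with a bare "A:" leaves the answer as the only thing for the model to complete.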

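Text classification

LLMs can classify text into predefined categories, for example for sentiment analysis or spam detection. A text classification instruction can be formatted as follows:

Template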

Classify the text into <List of choices>:
Text: <Full text>
Classification:

For example, we can try to classify a text by whether its sentiment is positive, neutral, or negative:

Prompt

Classify the text into positive, neutral or negative:
Text: That shot selection was awful
Classification:

Output

Negative
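
Because the model replies in free text, it can help to map its answer back onto the list of allowed labels. The Python sketch below builds the classification prompt and normalizes a reply; the helper names are assumptions made for illustration, and a reply that matches no label is returned as `None` rather than guessed.

```python
from typing import List, Optional


def build_classification_prompt(text: str, labels: List[str]) -> str:
    """Format the labels as 'a, b or c' and append the text to classify."""
    if len(labels) < 2:
        raise ValueError("at least two labels are needed")
    choices = ", ".join(labels[:-1]) + f" or {labels[-1]}"
    return f"Classify the text into {choices}:\nText: {text}\nClassification:"


def normalize_label(answer: str, labels: List[str]) -> Optional[str]:
    """Return the matching label, ignoring case, spaces and a final period."""
    cleaned = answer.strip().rstrip(".").lower()
    for label in labels:
        if cleaned == label.lower():
            return label
    return None


labels = ["positive", "neutral", "negative"]
print(build_classification_prompt("That shot selection was awful", labels))
print(normalize_label("Negative", labels))
```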

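Conversation

LLMs can generate human-like conversation, which makes them useful in chatbots and customer service applications. We can instruct the LLM to follow the conversation in a specific way.

Template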

The following is a conversation with <Actor>. The <Actor> should answer <Further instructions>.
Person: <Conversation>
<Actor>: <Response>
Person: <Conversation>
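<Actor>:

The LLM fills in the response for the type of actor we specified, and we can continue the conversation by passing the whole conversation back in together with the new message. For example:

Prompt 1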

The following is a conversation with a lawyer. The lawyer should answer simply and in layman terms.
Person: Hello, I would like to know how to dispute a parking ticket
Lawyer:

Output 1

Sure, I can help. The first step is to check the ticket for any errors or mistakes. If you find any, you can use those as grounds to dispute the ticket. If there are no mistakes, you can still dispute the ticket by providing evidence to support your case, such as photos or witness statements. You can also request a hearing to present your case in person.

Prompt 2

The following is a conversation with a lawyer. The lawyer should answer simply and in layman terms.
Person: Hello, I would like to know how to dispute a parking ticket
Lawyer: Sure, I can help. The first step is to check the ticket for any errors or mistakes. If you find any, you can use those as grounds to dispute the ticket. If there are no mistakes, you can still dispute the ticket by providing evidence to support your case, such as photos or witness statements. You can also request a hearing to present your case in person.
Person: How do I request a hearing?
Lawyer:

Output 2

To request a hearing, you'll need to follow the instructions on the back of the ticket or on the website provided. Usually, you'll need to submit a written request within a certain timeframe, along with any evidence or supporting documents you have. Once your request is received, you'll be notified of the date, time, and location of your hearing.
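
Continuing a conversation means rebuilding the whole prompt on every turn: the header, all earlier exchanges, the new message, and an open line for the actor. The Python sketch below shows one way to do that; `build_conversation_prompt` and its parameter names are hypothetical, and it assumes an actor that reads naturally after "a", such as "lawyer".

```python
from typing import List, Tuple


def build_conversation_prompt(
    actor: str,
    instructions: str,
    history: List[Tuple[str, str]],
    new_message: str,
) -> str:
    """Rebuild the full prompt from earlier (person, actor) exchanges."""
    label = actor.capitalize()
    lines = [
        f"The following is a conversation with a {actor}. "
        f"The {actor} should answer {instructions}."
    ]
    for person_message, actor_reply in history:
        lines.append(f"Person: {person_message}")
        lines.append(f"{label}: {actor_reply}")
    lines.append(f"Person: {new_message}")
    lines.append(f"{label}:")  # left open for the model to complete
    return "\n".join(lines)


history = [(
    "Hello, I would like to know how to dispute a parking ticket",
    "Sure, I can help. The first step is to check the ticket for any errors.",
)]
print(build_conversation_prompt(
    "lawyer", "simply and in layman terms", history, "How do I request a hearing?"
))
```

After each model reply, append the new (person, actor) pair to `history` and call the function again with the next message.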

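Code generation

LLMs can generate program code, making it easier for developers to build complex software. We can write a code comment describing what we want the code to do, and the LLM will do its best to implement it. The more complex the task, the more specific and descriptive the comment needs to be.

The template may vary with the programming language we want to generate, but I suggest roughly following this one:

Template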

<Comment symbol of your language of choice (/* or ''' or #)>
Method: <Name of the method>
Language: <Programming language>
Description: <Description of what the method must do>
Parameters: <Parameters that the method will receive>
Context: <Further information needed to implement the method (URLs, databases, etc.)>
Output: <Output of the method>
<Comment symbol of your language of choice (/* or ''' or #)>

For example, to fetch data from a paginated API, we could write the following comment:

Prompt

/* Method: fetchFirst5Pages
 * Language: Javascript
 * Description: Fetch the first 5 pages from this paginated API: http://example.com/pages/?n=x
 * Context: This API accepts the following URL parameters:
 * n: the number of the page to fetch
 * Output: the content of the first 5 pages
 */

Output

Here's an example implementation of the fetchFirst5Pages method in JavaScript, assuming the use of the fetch function:

async function fetchFirst5Pages() {
  const baseUrl = 'http://example.com/pages/?n=';
  const pages = [];
  for (let i = 1; i <= 5; i++) {
    const url = baseUrl + i;
    const response = await fetch(url);
    if (!response.ok) {
      throw new Error(`Error fetching page ${i}`);
    }
    const pageData = await response.json();
    pages.push(pageData);
  }
  return pages;
}

This function uses a loop to fetch the first 5 pages from the API by appending the page number to the base URL. It checks if the response from the server is okay and throws an error if it's not. Then, it parses the JSON data from the response and adds it to an array of pages. Finally, it returns the array of pages.
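
The comment-style prompt can also be generated from structured fields, which helps when many methods follow the same template. Here is a minimal Python sketch that renders the fields as a `/* ... */` block comment; `build_codegen_prompt` is an illustrative name, and the block-comment style is an assumption that suits JavaScript but would need changing for other languages.

```python
from typing import List, Tuple


def build_codegen_prompt(fields: List[Tuple[str, str]]) -> str:
    """Render ordered (label, value) pairs as a /* ... */ block comment."""
    lines = []
    for label, value in fields:
        # Multi-line values keep their extra lines as separate comment lines.
        first, *rest = value.splitlines() or [""]
        lines.append(f"{label}: {first}")
        lines.extend(rest)
    return "/* " + "\n * ".join(lines) + "\n */"


print(build_codegen_prompt([
    ("Method", "fetchFirst5Pages"),
    ("Language", "Javascript"),
    ("Description", "Fetch the first 5 pages from this paginated API: http://example.com/pages/?n=x"),
    ("Context", "This API accepts the following URL parameters:\nn: the number of the page to fetch"),
    ("Output", "the content of the first 5 pages"),
]))
```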

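Reasoning

LLMs can struggle with some basic tasks, such as arithmetic problems. However, if we ask the LLM to reason about its response, we can help it reach the right answer. This is the template I recommend:

Template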

Q: <Question>
A: Let's work this out in a step by step way to be sure we have the right answer. First, <Instructions>.

In this example, we can see that a simple arithmetic question fails when we ask it without any reasoning:

Prompt 1

Answer the following question. Respond "Not sure" if not sure about the answer.
Q: The sum of prime numbers in the following list are a multiple of 3: [2, 4, 6, 7, 9, 14]
A: [Yes or No]

Output 1

No. The sum of prime numbers in the list [2, 4, 6, 7, 9, 14] is not a multiple of 3. Therefore, the answer is "No".

However, if we instruct the LLM to reason about its answer, we get a very different result:

Prompt 2

Q: The sum of prime numbers in the following list are a multiple of 3: [2, 4, 6, 7, 9, 14]
A: Let's work this out in a step by step way to be sure we have the right answer. First, we find the prime numbers in the list and add them together. Then we check whether the resulting number can be divided by 3.

Output 2

Sure, let's work this out step by step.
The prime numbers in the list are 2 and 7. So we can add them together:
2 + 7 = 9
The resulting number is 9. Now we need to check whether it is a multiple of 3. We can do this by checking if the remainder of 9 divided by 3 is equal to 0:
9 % 3 = 0
Since the remainder is 0, we know that 9 is indeed a multiple of 3.
Therefore, the answer is "Yes", the sum of prime numbers in the list [2, 4, 6, 7, 9, 14] is a multiple of 3.
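
Since an LLM can get arithmetic wrong, it is worth checking an answer like this independently. The short Python sketch below repeats the same steps (find the primes, add them, test divisibility by 3); the function names are illustrative, and the primality test is plain trial division, which is adequate for small numbers like these.

```python
from typing import List


def is_prime(n: int) -> bool:
    """Trial division; 0 and 1 are not prime."""
    if n < 2:
        return False
    for divisor in range(2, int(n ** 0.5) + 1):
        if n % divisor == 0:
            return False
    return True


def primes_in(numbers: List[int]) -> List[int]:
    return [n for n in numbers if is_prime(n)]


def prime_sum_is_multiple_of_3(numbers: List[int]) -> bool:
    return sum(primes_in(numbers)) % 3 == 0


numbers = [2, 4, 6, 7, 9, 14]
print(primes_in(numbers))                   # [2, 7]
print(sum(primes_in(numbers)))              # 9
print(prime_sum_is_multiple_of_3(numbers))  # True
```

The check agrees with the step-by-step answer: the primes are 2 and 7, their sum is 9, and 9 is a multiple of 3.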

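Strategies

While these templates help you tap into the potential of LLMs for your needs, LLMs can still struggle with certain complex tasks. Researchers have developed several techniques that help LLMs reach satisfactory answers. So far we have used only zero-shot prompting, which is defined next, but more sophisticated techniques are available.

Zero-Shot Prompting

Every example so far in this article has used zero-shot prompting. Zero-shot prompting means passing the LLM a prompt that contains only the question we want answered, with no examples of how it should respond in other cases. For instance, in an earlier example we did this:

Prompt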

Classify the text into positive, neutral or negative:
Text: That shot selection was awful
Classification:

Output

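Negative

This may not give the model enough context to answer exactly the way we want. Let's look at the alternatives.

Few-Shot Prompting

Unlike zero-shot prompting, with few-shot prompting we give the LLM a few examples of how it should respond before asking the question. For example, suppose we want to classify the sentiment of a text as in the earlier example, but we want the answer in a very specific format every time. We can supply a few examples up front:

Prompt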

Classify the text into positive, neutral or negative:
Text: Today the weather is fantastic
Classification: Pos
Text: The furniture is small.
Classification: Neu
Text: I don't like your attitude
Classification: Neg
Text: That shot selection was awful
Classification:

Output

Neg
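
A few-shot prompt is the instruction, followed by the labeled examples, followed by the unlabeled query in the same format. A minimal Python sketch of that assembly is shown here; `build_few_shot_prompt` is a hypothetical helper name, not a library function.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    instruction: str,
    examples: List[Tuple[str, str]],
    query: str,
) -> str:
    """Put labeled examples before the query so the model copies the format."""
    lines = [instruction]
    for text, label in examples:
        lines.append(f"Text: {text}")
        lines.append(f"Classification: {label}")
    lines.append(f"Text: {query}")
    lines.append("Classification:")  # left open for the model to complete
    return "\n".join(lines)


examples = [
    ("Today the weather is fantastic", "Pos"),
    ("The furniture is small.", "Neu"),
    ("I don't like your attitude", "Neg"),
]
print(build_few_shot_prompt(
    "Classify the text into positive, neutral or negative:",
    examples,
    "That shot selection was awful",
))
```

The abbreviated labels in the examples are what steer the model toward answering "Neg" instead of "Negative".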

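Chain of Thought Prompting

For more complex tasks, we can combine few-shot prompting with asking the model to reason about its answer. We can reuse the earlier example to see how reasoning and few-shot prompting affect the result:

Prompt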

Q: The sum of prime numbers in the following list are a multiple of 3: [2, 4, 6, 7, 9, 14]
A: The prime numbers in the list are 2 and 7. Their sum is 9. 9 % 3 = 0. The answer is yes.
Q: The sum of prime numbers in the following list are a multiple of 3: [1, 2, 4, 7, 8, 10]
A: The prime numbers in the list are 2 and 7. Their sum is 9. 9 % 3 = 0. The answer is yes.
Q: The sum of prime numbers in the following list are a multiple of 3: [1, 2, 3, 4]
A: The prime numbers in the list are 2 and 3. Their sum is 5. 5 % 3 = 2. The answer is no.
Q: The sum of prime numbers in the following list are a multiple of 3: [5, 6, 7, 8]
A:

Output

The prime numbers in the list are 5 and 7. Their sum is 12. 12 % 3 = 0. The answer is yes.
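
The worked answers in a chain-of-thought prompt have to be correct, or the model may imitate the mistake. One option is to generate them with code instead of writing them by hand. The Python sketch below does this for the prime-sum question; all names are illustrative assumptions, it expects each list to contain at least one prime, and it treats 1 as not prime, in line with the standard definition.

```python
from typing import List

QUESTION = "The sum of prime numbers in the following list are a multiple of 3: {numbers}"


def is_prime(n: int) -> bool:
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))


def join_values(values: List[int]) -> str:
    """Format [2, 7] as '2 and 7' and [2, 3, 5] as '2, 3 and 5'."""
    items = [str(v) for v in values]
    if len(items) == 1:
        return items[0]
    return ", ".join(items[:-1]) + " and " + items[-1]


def worked_answer(numbers: List[int]) -> str:
    """Write out the reasoning steps in the format used by the prompt."""
    primes = [n for n in numbers if is_prime(n)]
    if not primes:
        raise ValueError("this sketch expects at least one prime in the list")
    total = sum(primes)
    verdict = "yes" if total % 3 == 0 else "no"
    return (
        f"The prime numbers in the list are {join_values(primes)}. "
        f"Their sum is {total}. {total} % 3 = {total % 3}. The answer is {verdict}."
    )


def build_cot_prompt(demo_lists: List[List[int]], query_list: List[int]) -> str:
    lines = []
    for numbers in demo_lists:
        lines.append("Q: " + QUESTION.format(numbers=numbers))
        lines.append("A: " + worked_answer(numbers))
    lines.append("Q: " + QUESTION.format(numbers=query_list))
    lines.append("A:")  # left open for the model to complete
    return "\n".join(lines)


print(build_cot_prompt([[2, 4, 6, 7, 9, 14], [1, 2, 3, 4]], [5, 6, 7, 8]))
```

The same `worked_answer` function can be used afterwards to check the model's reply for the query list.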

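Conclusion

Large language models have revolutionized the field of natural language processing, but getting the most out of them depends on choosing the right prompt engineering strategy. By understanding the tasks LLMs can perform, their shortcomings, and the available prompt engineering strategies, developers can harness LLMs to build innovative and effective solutions.

More strategies and techniques will likely be developed in the near future, so keep an eye on further advances in this field to get the most out of LLMs. In addition, as LLMs keep growing by billions of additional parameters, they will likely be able to handle tasks we cannot even imagine today. It is amazing to think what will be possible with these new tools and which use cases they will serve in the future.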

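About the translator

朱先忠 is a 51CTO community editor, a 51CTO expert blogger and lecturer, a computer science teacher at a university in Weifang, and a veteran freelance programmer.

Original title: Maximizing the Potential of LLMs: A Guide to Prompt Engineering, by Roger Oriol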