Building an AI Agent from Scratch in Python: A 10,000-Word Guide

Published on 2025-3-11 02:16

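Earlier articles in this Agent series gave a broad introduction to AI agents: their characteristics, components, evolution, challenges, and future possibilities. In this article, we build an agent from scratch in Python. The agent will make decisions based on user input, select the appropriate tool, and carry out the task accordingly. Let's get started!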

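1. What Is an Agent?

An agent is an autonomous entity that perceives its environment, makes decisions, and takes actions to achieve specific goals. Agents vary in complexity, from simple reactive agents that respond to stimuli to more advanced agents that learn and adapt over time. Common types of agents include:

Reactive agents: respond directly to changes in the environment and keep no internal memory.
Model-based agents: use an internal model of the world to make decisions.
Goal-based agents: plan their actions around specific goals to be achieved.
Utility-based agents: evaluate candidate actions with a utility function and pick the one with the best expected outcome.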
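
To make the distinction concrete, here is a minimal sketch contrasting a reactive agent with a utility-based one. The thermostat scenario, the function names, and the utility function are all illustrative assumptions and are not part of the agent built later in this article.

```python
# Illustrative sketch only: the thermostat scenario and all names are assumptions.

def reactive_agent(temperature):
    """React to the current reading only: no memory and no planning."""
    if temperature < 18:
        return "heat"
    if temperature > 24:
        return "cool"
    return "idle"


def utility_based_agent(temperature, target=21.0):
    """Score every candidate action with a utility function and pick the best."""
    # Assumed effect of each action on the temperature (degrees per step).
    effects = {"heat": 2.0, "cool": -2.0, "idle": 0.0}

    def utility(action):
        # Higher utility means the predicted temperature is closer to the target.
        return -abs((temperature + effects[action]) - target)

    return max(effects, key=utility)


print(reactive_agent(16))        # heat
print(utility_based_agent(25))   # cool
```

The reactive agent maps a reading straight to an action, while the utility-based agent predicts the outcome of each action and compares them before choosing.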

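Chatbots, recommendation systems, and self-driving cars are all practical applications of agents; each relies on different types of agents to perform tasks efficiently and intelligently.

The core components of the agent we will build are:

Model: the agent's "brain", which processes input and generates responses.
Tools: predefined functions the agent can execute in response to user requests.
Toolbox: the collection of tools available to the agent.
System prompt: the instruction set that guides the agent in handling user input and choosing the right tool.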

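2. Implementation

Now let's roll up our sleeves and start building!

2.1 Prerequisites

The complete code for this tutorial is available in the AI Agents GitHub repository; the implementation is under "Build an Agent from Scratch". Before running the code, make sure your system meets the following prerequisites:

1. Python environment setup

Running the AI agent requires Python. Follow these steps to set up your environment:

Install Python (if not already installed): download and install Python from python.org (3.8 or later is recommended).
Verify the installation: run python --version on the command line to check that Python is installed correctly.
Create a virtual environment (recommended): a virtual environment is a good way to manage dependencies. Run python -m venv ai_agents_env to create one, then activate it with source ai_agents_env/bin/activate (on Windows, run ai_agents_env\Scripts\activate instead).
Install the required dependencies: navigate to the repository directory and run pip install -r requirements.txt.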
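
Once the steps above are done, a quick sanity check can confirm that the interpreter and packages are in place. The following is a minimal sketch, not part of the original repository; the helper names are hypothetical, and the module list assumes the three third-party packages this tutorial uses.

```python
import importlib.util
import sys

# Import names of the third-party packages this tutorial uses.
# Note that the python-dotenv package is imported as "dotenv".
REQUIRED_MODULES = ["requests", "termcolor", "dotenv"]


def python_is_supported(minimum=(3, 8)):
    """Return True if the running interpreter is at least `minimum`."""
    return sys.version_info >= minimum


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Python OK:", python_is_supported())
print("Missing modules:", missing_modules(REQUIRED_MODULES) or "none")
```

Run it inside the activated virtual environment; any name listed as missing still needs to be installed with pip.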

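2. Setting up Ollama locally

Ollama is used to run and manage local language models efficiently. Follow these steps to install and configure it:

Download and install Ollama: visit the official Ollama website, download the installer for your operating system, and follow the instructions for your platform.
Verify the Ollama installation: run ollama --version on the command line to check that Ollama is installed correctly.
Pull a model (if needed): some agent implementations require a specific model. You can pull one with ollama pull mistral.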
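
To confirm from Python that the local Ollama server is reachable and see which models have been pulled, a sketch along these lines may help. It queries the http://localhost:11434/api/tags endpoint using only the standard library. The helper names are hypothetical, and the response shape noted in the comment is an assumption about the server's JSON output.

```python
import json
import urllib.request

TAGS_URL = "http://localhost:11434/api/tags"


def model_names(tags_payload):
    """Extract model names from a decoded /api/tags response."""
    # Assumed shape: {"models": [{"name": "mistral:latest", ...}, ...]}
    return [m["name"] for m in tags_payload.get("models", []) if "name" in m]


def list_local_models(url=TAGS_URL, timeout=3):
    """Return the pulled model names, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return model_names(json.load(response))
    except (OSError, ValueError):
        # OSError covers connection failures; ValueError covers malformed JSON.
        return None


if __name__ == "__main__":
    models = list_local_models()
    if models is None:
        print("Ollama does not appear to be running (try 'ollama serve').")
    else:
        print("Available models:", models)
```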

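2.2 Implementation Steps

Step 1: Set up the environment

Besides Python itself, we need a few libraries. This tutorial uses requests and termcolor, the standard-library json module, and python-dotenv to manage environment variables. Install the third-party packages by running pip install requests termcolor python-dotenv on the command line.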
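
As an illustration of what managing environment variables can look like, here is a small sketch that reads the model endpoint from the environment and falls back to the local default. The variable name OLLAMA_ENDPOINT and the helper get_endpoint are hypothetical choices for this example; the tutorial code itself hard-codes the endpoint.

```python
import os

try:
    from dotenv import load_dotenv  # provided by the python-dotenv package
    load_dotenv()  # reads KEY=value pairs from a .env file, if one exists
except ImportError:
    pass  # fall back to variables already set in the shell

DEFAULT_ENDPOINT = "http://localhost:11434/api/generate"


def get_endpoint():
    """Return the configured endpoint, or the local default."""
    # OLLAMA_ENDPOINT is a hypothetical variable name chosen for this sketch.
    return os.getenv("OLLAMA_ENDPOINT", DEFAULT_ENDPOINT)


print(get_endpoint())
```

Keeping such settings in a .env file lets you point the agent at a different host without editing the code.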

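Step 2: Define the model class

First we need a model that can process user input. We will create an OllamaModel class that generates responses by talking to the local Ollama API. Here is a basic implementation: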

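import json
import operator
import os

import requests
from dotenv import load_dotenv
from termcolor import colored

# Load environment variables from a local .env file, if one exists.
load_dotenv()


class OllamaModel:
    def __init__(self, model, system_prompt, temperature=0, stop=None):
        """
        Initialize the OllamaModel with the given parameters.

        Args:
            model (str): Name of the model to use.
            system_prompt (str): System prompt to use.
            temperature (float): Temperature setting for the model.
            stop (str): Stop token for the model.
        """
        self.model_endpoint = "http://localhost:11434/api/generate"
        self.temperature = temperature
        self.model = model
        self.system_prompt = system_prompt
        self.headers = {"Content-Type": "application/json"}
        self.stop = stop

    def generate_text(self, prompt):
        """
        Generate a response from the Ollama model for the given prompt.

        Args:
            prompt (str): The user query to generate a response for.

        Returns:
            dict: The model's response as a dictionary, or a dictionary with
            an "error" key if the request or the parsing fails.
        """
        payload = {
            "model": self.model,
            "format": "json",
            "prompt": prompt,
            "system": self.system_prompt,
            "stream": False,
            # Ollama reads sampling parameters from the nested "options"
            # object; top-level "temperature" and "stop" keys are ignored.
            "options": {"temperature": self.temperature},
        }
        if self.stop:
            # Ollama expects stop sequences as a list of strings.
            payload["options"]["stop"] = [self.stop]
        try:
            request_response = requests.post(
                self.model_endpoint,
                headers=self.headers,
                data=json.dumps(payload)
            )
            print("REQUEST RESPONSE", request_response)
            request_response.raise_for_status()
            request_response_json = request_response.json()
            # With "format": "json", the generated text is itself a JSON string.
            response = request_response_json['response']
            response_dict = json.loads(response)
            print(f"\n\nResponse from Ollama model: {response_dict}")
            return response_dict
        except (requests.RequestException, ValueError, KeyError) as e:
            # ValueError also covers json.JSONDecodeError from malformed output.
            response = {"error": f"Error in invoking model! {str(e)}"}
            return response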

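The class is initialized with a model name, a system prompt, a temperature, and a stop token. The generate_text method sends a request to the model API and returns the response.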
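
Because the model's text output is itself a JSON string nested inside the HTTP response body, parsing it can fail in several ways. The following sketch isolates that step so it can be tried without a running server; parse_agent_reply is a hypothetical helper, and the sample bodies are made up for illustration.

```python
import json


def parse_agent_reply(body):
    """Extract the tool decision from a decoded /api/generate response body."""
    try:
        # The "response" field holds the generated text, expected to be JSON.
        reply = json.loads(body["response"])
    except (KeyError, TypeError, ValueError) as exc:
        return {"error": f"Could not parse model output: {exc!r}"}
    if not isinstance(reply, dict):
        return {"error": "Model output is not a JSON object"}
    return reply


ok = parse_agent_reply({"response": '{"tool_choice": "no tool", "tool_input": "Hi"}'})
bad = parse_agent_reply({"response": "not json"})
print(ok["tool_choice"])   # no tool
print("error" in bad)      # True
```

Returning an error dictionary instead of raising keeps the calling code simple: it only has to check for an "error" key.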

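Step 3: Create tools for the agent

Next we create the tools the agent can use. These tools are simple Python functions that perform specific tasks. Here is an example with a basic calculator and a string reverser: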

def basic_calculator(input_str):
    """
    Perform a numeric operation on two numbers, given as a JSON string or a dictionary.

    Args:
        input_str (str or dict): Either a JSON string representing a dictionary with the
            keys 'num1', 'num2', and 'operation', or the dictionary itself.
            Example: '{"num1": 5, "num2": 3, "operation": "add"}'
            or {"num1": 67869, "num2": 9030393, "operation": "divide"}

    Returns:
        str: The formatted result of the operation, or an error message if the
        input is invalid, the operation is unsupported, or the calculation
        fails (for example, division by zero).
    """
    try:
        if isinstance(input_str, dict):
            input_dict = input_str
        else:
            # Models often emit single-quoted pseudo-JSON: normalize the quotes
            # and drop any wrapping double quotes before parsing.
            input_str_clean = input_str.replace("'", "\"")
            input_str_clean = input_str_clean.strip().strip("\"")
            input_dict = json.loads(input_str_clean)
        if not all(key in input_dict for key in ['num1', 'num2', 'operation']):
            return "Error: Input must contain 'num1', 'num2', and 'operation'"
        num1 = float(input_dict['num1'])
        num2 = float(input_dict['num2'])
        operation = input_dict['operation'].lower()
    except (json.JSONDecodeError, KeyError, AttributeError):
        return "Invalid input format. Please provide valid numbers and operation."
    except (ValueError, TypeError):
        return "Error: Please provide valid numerical values."
    # Map operation names (and their synonyms) to the functions that implement them.
    operations = {
        'add': operator.add,
        'plus': operator.add,
        'subtract': operator.sub,
        'minus': operator.sub,
        'multiply': operator.mul,
        'times': operator.mul,
        'divide': operator.truediv,
        'floor_divide': operator.floordiv,
        'modulus': operator.mod,
        'power': operator.pow,
        'lt': operator.lt,
        'le': operator.le,
        'eq': operator.eq,
        'ne': operator.ne,
        'ge': operator.ge,
        'gt': operator.gt
    }
    if operation not in operations:
        return f"Unsupported operation: '{operation}'. Supported operations are: {', '.join(operations.keys())}"
    try:
        if (operation in ['divide', 'floor_divide', 'modulus']) and num2 == 0:
            return "Error: Division by zero is not allowed"
        result = operations[operation](num1, num2)
        if isinstance(result, bool):
            result_str = "True" if result else "False"
        elif isinstance(result, float):
            # Show at most six decimals and trim trailing zeros (22.0 -> "22").
            result_str = f"{result:.6f}".rstrip('0').rstrip('.')
        else:
            result_str = str(result)
        return f"The answer is: {result_str}"
    except Exception as e:
        return f"Error during calculation: {str(e)}"


def reverse_string(input_string):
    """
    Reverse the given string.

    Args:
        input_string (str): The string to reverse.

    Returns:
        str: The reversed string.
    """
    if not isinstance(input_string, str):
        return "Error: Input must be a string"
    reversed_string = input_string[::-1]
    result = f"The reversed string is: {reversed_string}"
    return result

Each function performs a specific task based on the input it receives: basic_calculator handles arithmetic operations, while reverse_string reverses a given string.
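
Two ideas carry most of the calculator: a dictionary that maps operation names to functions from the operator module, and a formatting step that trims trailing zeros. The sketch below isolates both in a reduced form; OPERATIONS, format_result, and apply_operation are names chosen for this illustration only.

```python
import operator

# A reduced dispatch table: operation name -> function implementing it.
OPERATIONS = {
    "add": operator.add,
    "divide": operator.truediv,
    "gt": operator.gt,
}


def format_result(value):
    """Format a result the way the calculator does."""
    if isinstance(value, bool):
        return str(value)
    if isinstance(value, float):
        # Six decimals, then strip trailing zeros and a dangling dot.
        return f"{value:.6f}".rstrip("0").rstrip(".")
    return str(value)


def apply_operation(name, a, b):
    """Look up the operation by name, apply it, and format the result."""
    return format_result(OPERATIONS[name](a, b))


print(apply_operation("add", 15.0, 7.0))     # 22
print(apply_operation("divide", 1.0, 3.0))   # 0.333333
print(apply_operation("gt", 2.0, 1.0))       # True
```

A dispatch table keeps the function free of long if/elif chains, and adding a synonym such as "plus" only requires one more dictionary entry.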

Step 4: Build the toolbox

The ToolBox class stores all the tools the agent can use and provides a description of each one:

class ToolBox:
    def __init__(self):
        # Maps each tool's function name to its docstring.
        self.tools_dict = {}

    def store(self, functions_list):
        """
        Store the name and docstring of each function in the list.

        Args:
            functions_list (list): List of function objects to store.

        Returns:
            dict: Dictionary with function names as keys and their docstrings as values.
        """
        for func in functions_list:
            self.tools_dict[func.__name__] = func.__doc__
        return self.tools_dict

    def tools(self):
        """
        Return the dictionary created by store() as a text string.

        Returns:
            str: The stored functions and their docstrings, one tool per entry.
        """
        tools_str = ""
        for name, doc in self.tools_dict.items():
            tools_str += f"{name}: \"{doc}\"\n"
        return tools_str.strip()

This class helps the agent understand which tools are available and what each one does.
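
To see what such a description string looks like, the sketch below builds one for a toy function. It uses inspect.getdoc rather than the raw __doc__ attribute, which is a swapped-in variation: getdoc strips the indentation and surrounding blank lines that raw docstrings carry. The shout function and the describe helper are hypothetical.

```python
import inspect


def shout(text):
    """
    Return the text in upper case.
    """
    return text.upper()


def describe(functions):
    """Build one 'name: "docstring"' line per function."""
    lines = []
    for func in functions:
        # inspect.getdoc removes the indentation that raw __doc__ keeps.
        lines.append(f'{func.__name__}: "{inspect.getdoc(func)}"')
    return "\n".join(lines)


print(describe([shout]))   # shout: "Return the text in upper case."
```

Since these descriptions end up in the system prompt, clear and compact docstrings directly affect how well the model can choose a tool.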

Step 5: Create the agent class

The agent needs to think, decide which tool to use, and execute it. Here is the code for the Agent class:

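# Literal braces are doubled ({{ and }}) because the template is filled in with
# str.format(); only {tool_descriptions} is a real placeholder.
agent_system_prompt_template = """
You are an intelligent AI assistant with access to specific tools. Your responses must ALWAYS use this JSON format:
{{
    "tool_choice": "name_of_the_tool",
    "tool_input": "inputs_to_the_tool"
}}
Tools and when to use them:
1. basic_calculator: use for any mathematical calculation
    - Input format: {{"num1": number, "num2": number, "operation": "add/subtract/multiply/divide"}}
    - Supported operations: add/plus, subtract/minus, multiply/times, divide
    - Example inputs and outputs:
        - Input: "Calculate 15 plus 7"
        - Output: {{"tool_choice": "basic_calculator", "tool_input": {{"num1": 15, "num2": 7, "operation": "add"}}}}
        - Input: "What is 100 divided by 5?"
        - Output: {{"tool_choice": "basic_calculator", "tool_input": {{"num1": 100, "num2": 5, "operation": "divide"}}}}
2. reverse_string: use for any request that involves reversing text
    - Input format: only the text string to be reversed
    - ALWAYS use this tool when the user mentions "reverse", "backwards", or asks to reverse text
    - Example inputs and outputs:
        - Input: "Reverse of 'Howwwww'?"
        - Output: {{"tool_choice": "reverse_string", "tool_input": "Howwwww"}}
        - Input: "What is the reverse of Python?"
        - Output: {{"tool_choice": "reverse_string", "tool_input": "Python"}}
3. no tool: use for general conversation and questions
    - Example inputs and outputs:
        - Input: "Who are you?"
        - Output: {{"tool_choice": "no tool", "tool_input": "I am an AI assistant that can help you with calculations, reverse text, and answer questions. I can perform mathematical operations and reverse strings. How can I help you today?"}}
        - Input: "How are you?"
        - Output: {{"tool_choice": "no tool", "tool_input": "I'm functioning well, thank you for asking! I'm here to help you with calculations, text reversal, or answer any questions you might have."}}
Strict rules:
1. For questions about identity, capabilities, or feelings:
    - ALWAYS use "no tool"
    - Provide a complete, friendly response
    - Mention your capabilities
2. For any text reversal request:
    - ALWAYS use "reverse_string"
    - Extract only the text to be reversed
    - Remove quotes, "reverse of", and other extra text
3. For any math operation:
    - ALWAYS use "basic_calculator"
    - Extract the numbers and the operation
    - Convert numbers written as words to digits
Here is a list of your tools along with their descriptions:
{tool_descriptions}
Remember: your response must ALWAYS be valid JSON with "tool_choice" and "tool_input" fields.
"""


class Agent:
    def __init__(self, tools, model_service, model_name, stop=None):
        """
        Initialize the agent with a list of tools and a model.

        Args:
            tools (list): List of tool functions.
            model_service (class): Model service class with a generate_text method.
            model_name (str): Name of the model to use.
            stop (str): Optional stop token passed on to the model service.
        """
        self.tools = tools
        self.model_service = model_service
        self.model_name = model_name
        self.stop = stop

    def prepare_tools(self):
        """
        Store the tools in the toolbox and return their descriptions.

        Returns:
            str: Descriptions of the tools stored in the toolbox.
        """
        toolbox = ToolBox()
        toolbox.store(self.tools)
        tool_descriptions = toolbox.tools()
        return tool_descriptions

    def think(self, prompt):
        """
        Run generate_text on the model using the system prompt template and tool descriptions.

        Args:
            prompt (str): The user query to generate a response for.

        Returns:
            dict: The model's response as a dictionary.
        """
        tool_descriptions = self.prepare_tools()
        agent_system_prompt = agent_system_prompt_template.format(tool_descriptions=tool_descriptions)
        if self.model_service == OllamaModel:
            model_instance = self.model_service(
                model=self.model_name,
                system_prompt=agent_system_prompt,
                temperature=0,
                stop=self.stop
            )
        else:
            # Other model services are assumed not to take a stop token.
            model_instance = self.model_service(
                model=self.model_name,
                system_prompt=agent_system_prompt,
                temperature=0
            )
        agent_response_dict = model_instance.generate_text(prompt)
        return agent_response_dict

    def work(self, prompt):
        """
        Parse the dictionary returned by think() and execute the matching tool.

        Args:
            prompt (str): The user query to generate a response for.

        Returns:
            The response from the executed tool, or tool_input if no matching tool is found.
        """
        agent_response_dict = self.think(prompt)
        if "error" in agent_response_dict:
            # The model call failed: report the error instead of dispatching.
            print(colored(agent_response_dict["error"], 'red'))
            return agent_response_dict["error"]
        tool_choice = agent_response_dict.get("tool_choice")
        tool_input = agent_response_dict.get("tool_input")
        for tool in self.tools:
            if tool.__name__ == tool_choice:
                response = tool(tool_input)
                print(colored(response, 'cyan'))
                return response
        # No matching tool ("no tool"): tool_input is the direct answer.
        print(colored(tool_input, 'cyan'))
        return tool_input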

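This class has three main methods:

prepare_tools: stores the tools and returns their descriptions.
think: decides which tool to use based on the user prompt.
work: executes the chosen tool and returns the result.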
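
The think-then-work flow can be tried without a running model by stubbing out the model's decision. In the sketch below, fake_think is a hypothetical stand-in that returns the kind of dictionary a model might produce, and dispatch is a simplified version of the tool lookup that uses a name-to-function dictionary instead of a loop.

```python
def fake_think(prompt):
    """Stand-in for the model: return a plausible tool decision."""
    if "reverse" in prompt.lower():
        return {"tool_choice": "reverse_string", "tool_input": "Python"}
    return {"tool_choice": "no tool", "tool_input": "Hello! How can I help?"}


def reverse_string(text):
    """Toy tool used only for this sketch."""
    return text[::-1]


def dispatch(decision, tools):
    """Run the tool named in the decision, or fall back to tool_input."""
    registry = {tool.__name__: tool for tool in tools}
    tool = registry.get(decision.get("tool_choice"))
    if tool is None:
        # "no tool" (or an unknown name): tool_input is the direct answer.
        return decision.get("tool_input")
    return tool(decision.get("tool_input"))


print(dispatch(fake_think("Reverse Python"), [reverse_string]))   # nohtyP
print(dispatch(fake_think("Who are you?"), [reverse_string]))     # Hello! How can I help?
```

Separating the decision from the dispatch makes the dispatch logic easy to test, since the model's output is just a dictionary.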

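Step 6: Run the agent

Finally, let's put everything together and run the agent. In the main section of the script, initialize the agent and start accepting user input: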

if __name__ == "__main__":
    """
    Instructions for using this agent:
    Example queries you can try:
    1. Calculator operations:
        - "Calculate 15 plus 7"
        - "What is 100 divided by 5?"
        - "Multiply 23 and 4"
    2. String reversal:
        - "Reverse the word 'hello world'"
        - "Can you reverse 'Python Programming'?"
    3. General questions (answered directly):
        - "Who are you?"
        - "What can you help me with?"
    Ollama commands (run these in your terminal):
        - List available models:    'ollama list'
        - Check running models:     'ps aux | grep ollama'
        - List model tags:          'curl http://localhost:11434/api/tags'
        - Pull a new model:         'ollama pull mistral'
        - Run the model server:     'ollama serve'
    """
    tools = [basic_calculator, reverse_string]
    model_service = OllamaModel
    # The model named here must already be pulled (for example with
    # 'ollama pull llama2'); change it to match a model you have locally.
    model_name = "llama2"
    # Stop token; "<|eot_id|>" is the end-of-turn token of Llama 3 models,
    # so adjust or drop it for other models.
    stop = "<|eot_id|>"
    agent = Agent(tools=tools, model_service=model_service, model_name=model_name, stop=stop)
    print("\nWelcome to the AI Agent! Type 'exit' to quit.")
    print("You can ask me to:")
    print("1. Perform calculations (e.g., 'Calculate 15 plus 7')")
    print("2. Reverse strings (e.g., 'Reverse hello world')")
    print("3. Answer general questions\n")
    while True:
        prompt = input("Ask me anything: ")
        if prompt.lower() == "exit":
            break
        agent.work(prompt)
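
The input loop can also be written so that it is testable without a keyboard. The following sketch is one possible variation, not the repository's code: run_loop is a hypothetical helper that takes the prompt handler and the line reader as parameters, skips empty input, and exits cleanly on Ctrl+C or end of input.

```python
def run_loop(handle, read_line=input):
    """Read prompts until 'exit' or end of input; return how many were handled."""
    handled = 0
    while True:
        try:
            prompt = read_line("Ask me anything: ")
        except (EOFError, KeyboardInterrupt):
            break  # Ctrl+D or Ctrl+C ends the session without a traceback
        if prompt.strip().lower() == "exit":
            break
        if not prompt.strip():
            continue  # ignore empty lines instead of sending them to the model
        handle(prompt)
        handled += 1
    return handled


# Scripted run: the lambda replaces input() and feeds canned lines.
scripted = iter(["Calculate 15 plus 7", "", "exit"])
seen = []
count = run_loop(seen.append, read_line=lambda _message: next(scripted))
print(count, seen)   # 1 ['Calculate 15 plus 7']
```

In the real script, the handler would be agent.work and read_line would stay at its default, input.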

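3. Conclusion

In this post, we started from what an agent is and implemented one step by step. We set up the environment, defined the model, created the necessary tools, and built a structured toolbox to support the agent's functionality. Finally, we brought everything together by running the agent.

This article is reposted from 柏企阅文; author: 柏企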
