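A Comprehensive Guide to MLflow for Machine Learning Lifecycle Management

Translator | 李睿

Reviewer | 重楼

MLflow is an open-source platform for managing the entire machine learning lifecycle. This guide starts at beginner level and works up to advanced usage, covering the most important features through Python code. By the end, you should have a well-rounded understanding of MLflow and be able to manage experiments, package code, manage models, and deploy them.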

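Introduction to MLflow

MLflow covers the key stages of the machine learning lifecycle: experimentation, reproducibility, and deployment. Its main components are:

MLflow Tracking: records and queries experiments.
MLflow Projects: packages machine learning code so that it is reusable and reproducible.
MLflow Models: deploys and manages models.
MLflow Model Registry: a central repository dedicated to managing models.

Setting up MLflow

Installation

Install MLflow with pip:

Shell:

pip install mlflow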

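Setting up the tracking server

The following command starts an MLflow tracking server that uses a SQLite database (./mlflow.db) as the backend store and the ./artifacts directory as the artifact store.

Shell:

mlflow server --backend-store-uri sqlite:///mlflow.db --default-artifact-root ./artifacts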
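Once the server is running, client code still has to be told where to find it. One mechanism MLflow supports is the MLFLOW_TRACKING_URI environment variable. The sketch below sets it from Python using only the standard library. It assumes the server listens on MLflow's default address (127.0.0.1:5000), and configure_tracking is a hypothetical helper name, not part of MLflow.

```python
import os

# Assumption: the server started above listens on MLflow's default address.
DEFAULT_TRACKING_URI = "http://127.0.0.1:5000"


def configure_tracking(uri: str = DEFAULT_TRACKING_URI) -> str:
    """Point MLflow clients started from this process at the tracking server.

    MLflow reads the MLFLOW_TRACKING_URI environment variable when no
    tracking URI has been set explicitly in code.
    """
    os.environ["MLFLOW_TRACKING_URI"] = uri
    return uri


tracking_uri = configure_tracking()
print(tracking_uri)
```

Calling mlflow.set_tracking_uri() with the same address inside a script is an alternative when changing the environment is not convenient.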

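MLflow Tracking

MLflow Tracking records experiments and lets you query them afterwards. To log a run, execute the following code:

Python:

import mlflow

with mlflow.start_run():  # Start a run (used as a context manager)
    mlflow.log_param("param1", 5)  # Log a parameter
    mlflow.log_metric("metric1", 0.85)  # Log a metric
    mlflow.log_artifact("path/to/artifact")  # Log an artifact (the file must exist)

To query the recorded runs:

Python:

import mlflow

runs = mlflow.search_runs()  # Returns a pandas DataFrame by default
print(runs)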

MLflow Projects

An MLflow Project is a way to organize and package code. A project is simply a directory that contains an MLproject file.

(1) Create the MLproject file

Here is an example MLproject file:

YAML:

name: MyProject

conda_env: conda.yaml

entry_points:
  main:
    parameters:
      param1: {type: int, default: 5}
    command: "python train.py --param1 {param1}"

(2) Run the project

To run a project, use the mlflow run command:

Shell:

mlflow run . -P param1=10
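To make the link between the MLproject file and the -P flag concrete, the following sketch mimics how a parameter value ends up in the entry-point command. It only illustrates the idea and is not MLflow's actual implementation; render_command is a hypothetical helper.

```python
def render_command(template: str, defaults: dict, overrides: dict) -> str:
    """Fill the {placeholders} of an entry-point command.

    Values passed with -P take precedence over the defaults declared in
    the MLproject file.
    """
    params = {**defaults, **overrides}
    return template.format(**params)


command = render_command(
    "python train.py --param1 {param1}",
    defaults={"param1": 5},
    overrides={"param1": 10},  # corresponds to: mlflow run . -P param1=10
)
print(command)
```

With no override, the default from the MLproject file (5) would be used instead.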

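MLflow Models

MLflow Models is a standard way of packaging machine learning models. The idea is that a model saved with MLflow can be stored in a number of formats and used from different languages, such as Python, R, and even Java.

(1) Save a model

Here is how to save a model in Python:

Python:

import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestClassifier

# X_train and y_train are assumed to be defined already
model = RandomForestClassifier()
model.fit(X_train, y_train)

# Log the fitted model under the artifact path "model"
mlflow.sklearn.log_model(model, "model")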

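(2) Load a model

A saved model can be loaded back as follows:

Python:

import mlflow.sklearn

# Replace <run_id> with the ID of the run that logged the model
model = mlflow.sklearn.load_model("runs:/<run_id>/model")
predictions = model.predict(X_test)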

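MLflow Model Registry

The MLflow Model Registry is a central repository for managing models.

(1) Register a model

A model must be logged before it can be registered:

Python:

import mlflow

# Replace <run_id> with the ID of the run that logged the model
result = mlflow.register_model("runs:/<run_id>/model", "MyModel")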
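Model URIs come up repeatedly in this guide: a runs:/ URI points at a model logged by a specific run, while a models:/ URI points at a version of a registered model. The small helpers below show how the two are assembled. The function names and the run ID "abc123" are made up for illustration.

```python
def runs_uri(run_id: str, artifact_path: str = "model") -> str:
    """URI of a model logged under `artifact_path` in the given run."""
    return f"runs:/{run_id}/{artifact_path}"


def models_uri(name: str, version: int) -> str:
    """URI of a specific version of a registered model."""
    return f"models:/{name}/{version}"


# "abc123" is a made-up run ID used only for illustration.
print(runs_uri("abc123"))
print(models_uri("MyModel", 1))
```

The first form is what mlflow.register_model() takes as its source. The second form can be used once the model has been registered, for example when loading or serving it.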

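(2) Manage model versions

You can then manage the different versions of a model by moving them between stages such as Staging and Production:

Python:

from mlflow.tracking import MlflowClient

client = MlflowClient()

# Move version 1 of "MyModel" to the Production stage
# (model stages are deprecated in recent MLflow releases in favor of aliases)
client.transition_model_version_stage(
    name="MyModel",
    version=1,
    stage="Production",
)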

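Advanced features and integrations

(1) Integration with GenAI

MLflow has good support for GenAI models, including OpenAI, Transformers, and LangChain. Here is an example of logging an OpenAI model so that it can be deployed later:

Python:

import mlflow
import mlflow.openai

with mlflow.start_run():
    # log_model stores the model configuration (model name, task, prompt
    # template), not an API response. This follows the MLflow 2.x signature;
    # "text" is an illustrative template variable name.
    mlflow.openai.log_model(
        model="text-davinci-003",
        task="completions",
        artifact_path="openai-model",
        prompt="Translate the following English text to French: '{text}'",
        max_tokens=60,
    )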

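(2) Prompt engineering UI

MLflow's prompt engineering user interface (UI) lets you develop and evaluate prompts interactively.

(3) Deployment

MLflow makes it easy to deploy a model. For example, a logged model can be served behind a REST API (replace <run_id> with the ID of the run that logged it):

Shell:

mlflow models serve -m runs:/<run_id>/model --port 1234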

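Example use cases for MLflow

Use case 1: Experiment tracking for hyperparameter tuning

When tuning the hyperparameters of a machine learning model, it is important to track the parameters and results of every experiment so that the best configuration can be identified. If you have not used MLflow before, install the MLflow library as described above before continuing.

Suppose there is a random forest classifier with a few hyperparameters to tune: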

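Python:

import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score

# Load the data
data = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2
)

# Hyperparameter values to test
n_estimators = [10, 50, 100]
max_depth = [5, 10, 20]

# Select (or create) the MLflow experiment
mlflow.set_experiment("RandomForest_Hyperparameter_Tuning")

# Train one model per combination and record each as its own run
for n in n_estimators:
    for depth in max_depth:
        with mlflow.start_run():
            model = RandomForestClassifier(n_estimators=n, max_depth=depth)
            model.fit(X_train, y_train)
            mlflow.log_param("n_estimators", n)
            mlflow.log_param("max_depth", depth)
            accuracy = accuracy_score(y_test, model.predict(X_test))
            mlflow.log_metric("accuracy", accuracy)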
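The two hyperparameter lists define a grid of nine configurations. As a minimal sketch using only the standard library, the grid can be enumerated up front with itertools.product, which makes it easy to see how many runs an experiment will produce before any model is trained. The variable name grid is chosen here for illustration.

```python
from itertools import product

n_estimators = [10, 50, 100]
max_depth = [5, 10, 20]

# One dictionary per combination; each would become one MLflow run.
grid = [
    {"n_estimators": n, "max_depth": depth}
    for n, depth in product(n_estimators, max_depth)
]

for config in grid:
    print(config)
```

Each dictionary can be passed to the classifier as RandomForestClassifier(**config) and logged in a single call with mlflow.log_params(config).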

End-to-end project

The project's MLproject file exposes the two hyperparameters as parameters of its entry point:

YAML:

conda_env: conda.yaml

entry_points:
  train:
    parameters:
      n_estimators: {type: int, default: 100}
      max_depth: {type: int, default: 6}
    command: "python train.py {n_estimators} {max_depth}"

Step 1: Create the conda environment

Create a file named conda.yaml with the following content:

YAML:

name: wine_quality
dependencies:
  - python=3.7
  - pip
  - scikit-learn
  - pandas
  - mlflow

Then create the conda environment with the following command:

Shell:

conda env create -f conda.yaml

Step 2: Implement the training script

Create a file named train.py and add the following script, which implements the training logic:

Python:

import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

def load_data():
    # Placeholder: load and preprocess the wine quality data here and
    # return X_train, X_test, y_train, y_test
    raise NotImplementedError("Replace with code that loads the wine quality data")

X_train, X_test, y_train, y_test = load_data()

n_estimators = [100, 200, 300]
max_depth = [6, 8, 10]

for n in n_estimators:
    for depth in max_depth:
        with mlflow.start_run():
            # Train model
            model = RandomForestClassifier(n_estimators=n, max_depth=depth)
            model.fit(X_train, y_train)

            # Log parameters and metrics
            mlflow.log_param("n_estimators", n)
            mlflow.log_param("max_depth", depth)
            predictions = model.predict(X_test)
            accuracy = accuracy_score(y_test, predictions)
            mlflow.log_metric("accuracy", accuracy)

            # Log model
            mlflow.sklearn.log_model(model, "model")

This script trains several random forest classifiers with different hyperparameters and logs the results of each. Replace the body of load_data() with code that loads and preprocesses the actual wine quality data.

YAML:

conda_env: conda.yaml

entry_points:
  main:
    parameters:
      n_estimators: {type: int, default: 100}
      max_depth: {type: int, default: 10}
    command: "python train.py --n_estimators {n_estimators} --max_depth {max_depth}"

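The train.py script that trains the example model and logs its results is now in place. Next, create the following files so that the project can be executed as an MLflow run:

The conda.yaml file specifies the conda environment.
The train.py file serves as the entry point.
In addition, modify the existing load_data.py and winequality_dataset.py files to correct errors in their path specifications.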

Step 3: Define the conda environment

Create conda.yaml to specify the environment's dependencies:

YAML:

name: wine_quality_env
channels:
  - defaults
dependencies:
  - python=3.8
  - scikit-learn
  - pandas
  - mlflow

Step 4: Write the training script

Create the train.py script to train the model and log the results:

Python:

import argparse
import pandas as pd
import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

def main(n_estimators, max_depth):
    # Load data
    data = pd.read_csv("data/winequality-red.csv", sep=";")
    X = data.drop("quality", axis=1)
    y = data["quality"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )

    # Train model
    model = RandomForestClassifier(n_estimators=n_estimators, max_depth=max_depth)
    model.fit(X_train, y_train)

    # Log parameters and metrics
    with mlflow.start_run():
        mlflow.log_param("n_estimators", n_estimators)
        mlflow.log_param("max_depth", max_depth)
        predictions = model.predict(X_test)
        accuracy = accuracy_score(y_test, predictions)
        mlflow.log_metric("accuracy", accuracy)

        # Log model
        mlflow.sklearn.log_model(model, "model")

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--n_estimators", type=int, default=100)
    parser.add_argument("--max_depth", type=int, default=10)
    args = parser.parse_args()
    main(args.n_estimators, args.max_depth)

Step 5: Run the project

Run the project with the mlflow run command:

Shell:

mlflow run . -P n_estimators=200 -P max_depth=15
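The -P values are substituted into the entry-point command, so the training script receives them as ordinary command-line flags. The sketch below isolates that argument-parsing step so it can be tried without MLflow. It assumes the flags are named --n_estimators and --max_depth, as in the MLproject file, and parse_args is a helper name introduced here for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse the hyperparameters passed on the command line."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--n_estimators", type=int, default=100)
    parser.add_argument("--max_depth", type=int, default=10)
    return parser.parse_args(argv)


# Equivalent to: python train.py --n_estimators 200 --max_depth 15
args = parse_args(["--n_estimators", "200", "--max_depth", "15"])
print(args.n_estimators, args.max_depth)
```

Because both arguments declare type=int, the values arrive as integers and can be passed straight to RandomForestClassifier.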

Step 6: Register and deploy the model

After running the project, you can register the model and deploy it:

Python:

from mlflow.tracking import MlflowClient

client = MlflowClient()
run_id = "<run_id>"  # Replace with the ID of the run that logged the model
model_uri = f"runs:/{run_id}/model"

# Create the registered model, then add the run's model as a new version
model_details = client.create_registered_model("WineQualityModel")
client.create_model_version(
    name="WineQualityModel",
    source=model_uri,
    run_id=run_id,
)

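Serve the registered model with the following command. The port is set to 5001 to match the URL used in step 7:

Shell:

mlflow models serve -m models:/WineQualityModel/1 --port 5001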

Step 7: Make predictions

Predictions are made by sending HTTP requests to the served model. Here is how to do this with the requests library:

Python:

import requests
import json

url = "http://127.0.0.1:5001/invocations"
# MLflow 2.x scoring servers expect the table under the "dataframe_split" key
data = {
    "dataframe_split": {
        "columns": [
            "fixed acidity", "volatile acidity", "citric acid", "residual sugar",
            "chlorides", "free sulfur dioxide", "total sulfur dioxide", "density",
            "pH", "sulphates", "alcohol"
        ],
        "data": [[7.4, 0.7, 0.0, 1.9, 0.076, 11.0, 34.0, 0.9978, 3.51, 0.56, 9.4]]
    }
}

response = requests.post(
    url,
    data=json.dumps(data),
    headers={"Content-Type": "application/json"}
)

print(response.json())
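In MLflow 2.x, the scoring server expects tabular input wrapped under a key such as dataframe_split, with the column names and the row values listed separately. The following sketch builds such a request body from plain dictionaries using only the standard library. build_payload is a hypothetical helper, and only three of the eleven wine features are included to keep the example short.

```python
import json


def build_payload(rows: list) -> str:
    """Serialize feature rows (dicts) into a split-orient JSON body."""
    columns = list(rows[0])  # column order follows the first row's keys
    data = [[row[col] for col in columns] for row in rows]
    return json.dumps({"dataframe_split": {"columns": columns, "data": data}})


# Shortened sample; a real request must contain every feature the model was trained on.
sample = {"fixed acidity": 7.4, "volatile acidity": 0.7, "alcohol": 9.4}
payload = build_payload([sample])
print(payload)
```

The resulting string can be sent as the body of the POST request to /invocations with the Content-Type header set to application/json.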

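Conclusion

This guide demonstrated MLflow through a series of examples and one comprehensive project. With this foundation, you can use MLflow to make the management of your machine learning projects more efficient and more capable. The project provided here can serve as a starting point for your own projects and ideas. Note that the material here is a condensed version of the official documentation. For more complete information, see the official MLflow guide, which covers the key concepts and offers useful Python examples.

Original title: A Comprehensive Guide to MLflow for Machine Learning Lifecycle Management, by Harsh Daiya

Link: https://dzone.com/articles/from-novice-to-advanced-in-mlflow-a-comprehensive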