如何将机器学习模型部署到生产环境？-51CTO.COM

译者 | 布加迪

审校 | 重楼

开发机器学习模型只完成了一半工作。除非部署到生产环境、提供业务价值，否则模型仍然毫无用处。

知道如何部署自己的模型已成为任何数据科学家的一项基本技能，许多雇主已经要求我们能做到这一点。因此，对于任何级别的数据科学家来说，学习如何将模型部署到生产环境大有助益。

本文探讨如何将机器学习模型部署到生产环境中。

机器学习模型准备

首先，准备好部署到生产环境中的模型。我们先为整个教程设置虚拟环境。你可以通过在终端中使用以下代码来实现这一点。

python -m venv myvirtualenv1.

在安装并激活虚拟环境之后，你需要安装所需的软件包。创建requirements.txt文件，并用下面的库列表填充它。

pandas
scikit-learn
fastapi
pydantic
uvicorn
streamlit1.
2.
3.
4.
5.
6.

在requirements.txt准备就绪之后，我们必须使用以下代码来安装它们。

pip install -r requirements.txt1.

一切准备就绪，我们将开始开发机器学习模型。在本教程中，我们将使用来自Kaggle的糖尿病数据。把数据放在数据文件夹中。

然后，在app文件夹中创建一个名为train_model.py的文件。在train_model.py中，我们将使用下面的代码训练机器学习模型。

import pandas as pd
import joblib
from sklearn.linear_model import LogisticRegression

data = pd.read_csv("data\\diabetes.csv")
X = data.drop('Outcome', axis =1)
y = data['Outcome']
model = LogisticRegression()

model.fit(X, y)
joblib.dump(model, 'models\\logreg_model.joblib')1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
11.

你可以根据自己的喜好更改数据集的位置和模型路径。我将把模型放入到模型的文件夹中。

我们将跳过所有的数据准备和模型评估，因为本文的目的是将模型部署到生产环境中。当模型准备就绪后，我们将准备部署模型了。

模型部署

在本节中，我们将为模型预测创建API，并使用Docker部署它们，同时使用Streamlit前端测试它们。

首先，确保你已经安装了Docker桌面，我们将在本地测试它。

接下来，在app文件夹中创建一个名为main.py的文件，并用以下代码填充该文件以生成API。

from fastapi import FastAPI
from pydantic import BaseModel
import joblib
import pandas as pd

# Load the logistic regression model
model = joblib.load('../models/logreg_model.joblib')

# Define the input data model
class DiabetesData(BaseModel):
 Pregnancies: int
 Glucose: int
 BloodPressure: int
 SkinThickness: int
 Insulin: int
 BMI: float
 DiabetesPedigreeFunction: float
 Age: int
app = FastAPI()

# Define prediction endpoint
@app.post("/predict")
def predict(data: DiabetesData):
 input_data = {
 'Pregnancies': [data.Pregnancies],
 'Glucose': [data.Glucose],
 'BloodPressure': [data.BloodPressure],
 'SkinThickness': [data.SkinThickness],
 'Insulin': [data.Insulin],
 'BMI': [data.BMI],
 'DiabetesPedigreeFunction': [data.DiabetesPedigreeFunction],
 'Age': [data.Age]
 }
 input_df = pd.DataFrame(input_data)

 # Make a prediction
 prediction = model.predict(input_df)
 result = "Diabetes" if prediction[0] == 1 else "Not Diabetes"
 return {"prediction": result}1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
11.
12.
13.
14.
15.
16.
17.
18.
19.
20.
21.
22.
23.
24.
25.
26.
27.
28.
29.
30.
31.
32.
33.
34.
35.
36.
37.
38.
39.

此外，我们会有一个前端web来试一试我们部署的API模型。为此，在app文件夹中创建一个名为frontend.py的文件。然后，用以下代码填充它们。

import streamlit as st
import requests
import json

API_URL = "http://localhost:8000/predict"

st.title("Diabetes Prediction App")
st.write("Enter the details below to make a prediction.")

pregnancies = st.number_input("Pregnancies", min_value=0, step=1)
glucose = st.number_input("Glucose", min_value=0, step=1)
blood_pressure = st.number_input("Blood Pressure", min_value=0, step=1)
skin_thickness = st.number_input("Skin Thickness", min_value=0, step=1)
insulin = st.number_input("Insulin", min_value=0, step=1)
bmi = st.number_input("BMI", min_value=0.0, step=0.1)
diabetes_pedigree_function = st.number_input("Diabetes Pedigree Function", min_value=0.0, step=0.1)
age = st.number_input("Age", min_value=0, step=1)

if st.button("Predict"):
 input_data = {
 "Pregnancies": pregnancies,
 "Glucose": glucose,
 "BloodPressure": blood_pressure,
 "SkinThickness": skin_thickness,
 "Insulin": insulin,
 "BMI": bmi,
 "DiabetesPedigreeFunction": diabetes_pedigree_function,
 "Age": age
 }

 response = requests.post(API_URL, data=json.dumps(input_data), headers={"Content-Type": "application/json"})

 if response.status_code == 200:
 prediction = response.json().get("prediction", "No prediction")
 st.success(f"Prediction: {prediction}")
 else:
 st.error("Error in making prediction. Please check your input data and try again.")1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
11.
12.
13.
14.
15.
16.
17.
18.
19.
20.
21.
22.
23.
24.
25.
26.
27.
28.
29.
30.
31.
32.
33.
34.
35.
36.
37.

当一切准备就绪后，我们将创建Docker文件作为模型部署的基础。你应该在文件中填写下面的代码。

FROM python:3.9-slim

WORKDIR /app

COPY app /app
COPY models /models

RUN pip install --no-cache-dir --upgrade pip && \
 pip install --no-cache-dir -r requirements.txt

EXPOSE 8000 8501

CMD ["sh", "-c", "uvicorn main:app --host 0.0.0.0 --port 8000 & streamlit run frontend.py --server.port=8501 --server.enableCORS=false"]1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
11.
12.
13.

我们将创建Docker文件已准备就绪的映像，然后通过容器部署模型。为此，在终端中运行以下代码来构建映像。

docker build -t diabetes-prediction-app .1.

上面的代码为我们的模型容器创建了Docker映像。然后，我们将使用以下代码为模型部署制作API。

docker run -d -p 8000:8000 -p 8501:8501 --name diabetes-prediction-container diabetes-prediction-app1.

一切准备就绪后，确保容器运行并使用下面的地址来访问前端。

http://localhost:8501/

你应该会看到如下图所示的前端。

如果一切顺利，恭喜你！你刚刚将机器学习模型部署到了生产环境中。

结论

在本文中，我们介绍了使用FastAPI和Docker将模型部署到生产环境中的简单方法。

当然，从维护模型和监测生产环境中模型的过程中，仍然有很多东西需要学习。但愿本文对你有所帮助！

原文标题：A Guide to Deploying Machine Learning Models to Production，作者：Cornellius Yudha Wijaya