TensorFlow for Machine Intelligence: Deploying Models in Production - 51CTO.COM

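Now that we have seen how to build and train a range of models with TensorFlow, from basic machine learning models to complex deep learning networks, the next question is how to put a trained model into production so that other applications can use it. This article covers that in detail. It is excerpted from Chapter 7 of TensorFlow for Machine Intelligence.

This article builds a simple web app that lets a user upload an image and runs the Inception model on it to classify the image automatically.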

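Setting up a TensorFlow Serving development environment

Docker image

TensorFlow Serving is the tool for building servers that let users consume our models in production. There are two ways to use it during development: install all the dependencies and tools by hand and build from source, or use a Docker image. We use the latter here because it is easier and cleaner, and it also allows development on platforms other than Linux.

If you are not familiar with Docker images, think of one as a lightweight virtual machine image that does not carry the cost of running a full operating system inside it. If Docker is not yet installed on your development machine, install it first; the installation steps are at https://docs.docker.com/engine/installation/.

To use the Docker image, you can start from the provided file (https://github.com/tensorflow/serving/blob/master/tensorflow_serving/tools/docker/Dockerfile.devel), a configuration file for building the image locally. To use it, run the following command: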

docker build --pull -t $USER/tensorflow-serving-devel \
    https://raw.githubusercontent.com/tensorflow/serving/master/tensorflow_serving/tools/docker/Dockerfile.devel

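Note that downloading all the dependencies after running this command can take quite a while.

Once the command has finished, run a container from the image with the following command: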

docker run -v $HOME:/mnt/home -p 9999:9999 -it $USER/tensorflow-serving-devel

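This command mounts your home directory at /mnt/home inside the container and drops you into a terminal there. That is useful because you can edit the code directly with your preferred IDE or editor and use the container only to run the build tools. The command also exposes port 9999 so that you can reach it from the host machine; the server we build later will use it.

Type exit to leave the container terminal and stop the container. Run the command above again whenever you need to start it.

Bazel workspace

Because TensorFlow Serving programs are written in C++, they should be built with Google's Bazel build tool. We will run Bazel from inside the container we just created.

Bazel manages third-party dependencies at the code level, and it downloads and builds them automatically as long as they are also built with Bazel. To define which third-party dependencies our project supports, we must define a WORKSPACE file in the root directory of the project repository.

The dependency we need is the TensorFlow Serving repository. In our example, the TensorFlow Models repository contains the code for the Inception model.

Unfortunately, at the time of writing, TensorFlow Serving could not be referenced directly as a Git repository through Bazel, so it has to be included in the project as a Git submodule: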

# On the local machine
mkdir ~/serving_example
cd ~/serving_example
git init
git submodule add https://github.com/tensorflow/serving.git tf_serving
git submodule update --init --recursive

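Next we use the local_repository rule in the WORKSPACE file to define the third-party dependencies as locally stored files. We also need to initialize TensorFlow's dependencies with the tf_workspace rule imported from the project: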

# Bazel WORKSPACE file
workspace(name = "serving")

local_repository(
    name = "tf_serving",
    path = __workspace_dir__ + "/tf_serving",
)

local_repository(
    name = "org_tensorflow",
    path = __workspace_dir__ + "/tf_serving/tensorflow",
)

load('//tf_serving/tensorflow/tensorflow:workspace.bzl', 'tf_workspace')
tf_workspace("tf_serving/tensorflow/", "@org_tensorflow")

bind(
    name = "libssl",
    actual = "@boringssl_git//:ssl",
)

bind(
    name = "zlib",
    actual = "@zlib_archive//:zlib",
)

# Only needed when importing the Inception model
local_repository(
    name = "inception_model",
    path = __workspace_dir__ + "/tf_serving/tf_models/inception",
)
 
Finally, run ./configure for TensorFlow from inside the container:

# Inside the Docker container
cd /mnt/home/serving_example/tf_serving/tensorflow
./configure

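Exporting the trained model

Once the model is trained and ready to be evaluated, its graph and variable values have to be exported so that it can be used in production.

The model's graph should differ from its training version, because it must take its input from a placeholder and run a single inference step on it to compute the output. For the Inception example, and for image recognition models in general, we want the input to be a string holding a JPEG-encoded image, which makes it easy to send from the consuming app. This is quite different from reading training input from TFRecord files.

The general form for defining the input looks like this: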

def convert_external_inputs(external_x):
    # Transform the external input into the input format required for inference
    ...

def inference(x):
    # From the original model...
    ...

external_x = tf.placeholder(tf.string)
x = convert_external_inputs(external_x)
y = inference(x)

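The code above defines a placeholder for the input and calls a function that converts the external input represented by the placeholder into the format the original inference model expects. For example, we need to convert the JPEG string into the image format required by the Inception model. Finally, it calls the original model's inference method to obtain the result from the converted input.

For the Inception model, for example, the methods look like this: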

import tensorflow as tf
from tensorflow_serving.session_bundle import exporter
from inception import inception_model

def convert_external_inputs(external_x):
    # Transform the external input into the input format required for inference.
    # Decode the JPEG string into a pixel tensor with components in [0, 1].
    image = tf.image.convert_image_dtype(
        tf.image.decode_jpeg(external_x, channels=3), tf.float32)
    # Resize the image to the width and height the model expects.
    images = tf.image.resize_bilinear(tf.expand_dims(image, 0), [299, 299])
    # Rescale the pixel values to the [-1, 1] range the model requires.
    # (tf.sub and tf.mul were renamed tf.subtract and tf.multiply in TensorFlow 1.0.)
    images = tf.mul(tf.sub(images, 0.5), 2)
    return images

def inference(images):
    logits, _ = inception_model.inference(images, 1001)
    return logits
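The value-range arithmetic in convert_external_inputs can be checked by hand. The following is a plain-Python sketch rather than part of the export script: the helper names to_unit_range and to_model_range are made up for illustration, and it assumes 8-bit pixel input, which tf.image.convert_image_dtype scales by 1/255 when converting to float.

```python
def to_unit_range(pixel):
    """Map an 8-bit pixel value (0..255) to a float in [0, 1]."""
    return pixel / 255.0


def to_model_range(value):
    """Map a value in [0, 1] to [-1, 1]: subtract 0.5, then multiply by 2."""
    return (value - 0.5) * 2.0


# Black, mid-gray, and white pixels after both steps.
for pixel in (0, 128, 255):
    print(pixel, to_model_range(to_unit_range(pixel)))
```

Black maps to -1 and white to 1, so mid-gray lands close to 0, which is the centered range the model expects.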

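The inference method needs every parameter to have a value. We restore those values from a training checkpoint. As you may recall from earlier chapters, we saved training checkpoint files periodically. Those files contain the parameters learned up to that point, so training progress is not lost if something goes wrong.

When training ends, the last checkpoint saved contains the most recently updated model parameters, which is the version we want to use in production.

To restore the checkpoint, use the following code: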

import sys

saver = tf.train.Saver()

with tf.Session() as sess:
    # Restore the variables from the training checkpoint files.
    ckpt = tf.train.get_checkpoint_state(sys.argv[1])
    if ckpt and ckpt.model_checkpoint_path:
        saver.restore(sess, sys.argv[1] + "/" + ckpt.model_checkpoint_path)
    else:
        print("Checkpoint file not found")
        raise SystemExit
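To make the lookup in the code above less opaque, here is a rough plain-Python sketch of what resolving the latest checkpoint amounts to. It assumes the checkpoint directory contains the small text file named checkpoint that the saver writes, with a model_checkpoint_path entry. The function name latest_checkpoint_path and the file names in the demo are hypothetical, and this is not how TensorFlow itself parses the file.

```python
import os
import re
import tempfile


def latest_checkpoint_path(checkpoint_dir):
    """Return the checkpoint path recorded in `checkpoint_dir`, or None."""
    state_file = os.path.join(checkpoint_dir, "checkpoint")
    if not os.path.isfile(state_file):
        return None
    with open(state_file) as f:
        match = re.search(r'^model_checkpoint_path:\s*"([^"]+)"',
                          f.read(), re.MULTILINE)
    if match is None:
        return None
    # os.path.join returns the second argument unchanged when it is absolute.
    return os.path.join(checkpoint_dir, match.group(1))


# Demo with a hand-written state file (hypothetical file names).
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "checkpoint"), "w") as f:
    f.write('model_checkpoint_path: "model.ckpt-1000"\n')
    f.write('all_model_checkpoint_paths: "model.ckpt-1000"\n')
print(latest_checkpoint_path(demo_dir))
```

Using os.path.join instead of string concatenation keeps the result valid whether the recorded path is relative or absolute.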

For the Inception model, a pretrained checkpoint can be downloaded from http://download.tensorflow.org/models/image/imagenet/inception-v3-2016-03-01.tar.gz.

# Inside the Docker container
cd /tmp
curl -O http://download.tensorflow.org/models/image/imagenet/inception-v3-2016-03-01.tar.gz
tar -xzf inception-v3-2016-03-01.tar.gz

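Finally, the model is exported with the tensorflow_serving.session_bundle.exporter.Exporter class. We create an instance of it by passing in a saver instance. Then we create the model's signature with exporter.classification_signature. The signature specifies which tensor is the input_tensor and which are the output tensors. The outputs consist of classes_tensor, which holds the list of output class names, and scores_tensor, which holds the score (or probability) the model assigns to each class. For a model with a large number of classes, you would normally configure it to return only the classes selected by tf.nn.top_k, that is, the K classes with the highest scores, in descending order.

The last step is to apply the signature by calling exporter.Exporter.init, and then to export the model with the export method, which takes an output path, a model version number, and the session object.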

import time

# These statements run inside the `with tf.Session() as sess:` block shown above,
# after the checkpoint has been restored.
scores, class_ids = tf.nn.top_k(y, NUM_CLASS_TO_RETURN)
# For simplicity we return only the class IDs; they should be given proper names separately.
classes = tf.contrib.lookup.index_to_string(
    tf.to_int64(class_ids),
    mapping=tf.constant([str(i) for i in range(1001)]))

model_exporter = exporter.Exporter(saver)
signature = exporter.classification_signature(
    input_tensor=external_x, classes_tensor=classes, scores_tensor=scores)
model_exporter.init(default_graph_signature=signature,
                    init_op=tf.initialize_all_tables())
model_exporter.export(sys.argv[1] + "/export", tf.constant(time.time()), sess)
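For readers who want to see what the top-K selection does to the scores, the following is a small plain-Python sketch of the same idea. It illustrates the behavior described for tf.nn.top_k (the K highest scores in descending order, together with their indices) and is not TensorFlow's implementation; the function name top_k and the sample scores are made up.

```python
def top_k(scores, k):
    """Return (values, indices) of the k largest scores, highest first."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return [scores[i] for i in ranked], ranked


# Four hypothetical class scores; keep the best two.
scores = [0.05, 0.70, 0.10, 0.15]
values, class_ids = top_k(scores, 2)
# As in the export code, the class "names" are just the IDs as strings.
classes = [str(i) for i in class_ids]
print(classes, values)
```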

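Because the Exporter class depends on auto-generated code, the exporter has to be run with bazel inside the Docker container.

To do that, save the code as export.py in the bazel workspace we set up earlier. We also need a BUILD file with a build rule, along these lines: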

# BUILD file
py_binary(
    name = "export",
    srcs = [
        "export.py",
    ],
    deps = [
        "@tf_serving//tensorflow_serving/session_bundle:exporter",
        "@org_tensorflow//tensorflow:tensorflow_py",
        # Only needed when exporting the Inception model
        "@inception_model//inception",
    ],
)

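The exporter can then be run inside the container with the following commands:

# Inside the Docker container
cd /mnt/home/serving_example
bazel run :export /tmp/inception-v3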

This creates the export in /tmp/inception-v3/export/{current_timestamp}/ from the checkpoint files extracted into /tmp/inception-v3.

Note that the first run takes some time, because TensorFlow has to be compiled.

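Defining the server interface

Next we need to create a server for the exported model.

TensorFlow Serving uses gRPC, a binary protocol built on HTTP/2. gRPC supports many languages for creating servers and auto-generating client stubs. Because TensorFlow is C++-based, we need to define our server in C++. Fortunately, the server-side code is fairly short.

To use gRPC, we must define the service contract in a protocol buffer, which is both the IDL (interface definition language) and the binary encoding used by gRPC. Let's define our service. As mentioned in the export section, we want the service to have a method that takes a JPEG-encoded image string to classify as input and returns a list of inferred classes ordered by score.

Such a service is defined in a classification_service.proto file along these lines: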

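syntax = "proto3";

message ClassificationRequest {
    // JPEG-encoded image string
    bytes input = 1;
}

message ClassificationResponse {
    repeated ClassificationClass classes = 1;
}

message ClassificationClass {
    string name = 1;
    float score = 2;
}

service ClassificationService {
    rpc classify(ClassificationRequest) returns (ClassificationResponse);
}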

The same interface works for any kind of service that takes an image, an audio clip, or a piece of text.
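One way to picture the shape of these messages is as plain data classes. The Python 3 sketch below mirrors the three message types for illustration only; it is not the code the protocol buffer compiler generates, and the sample payload and class names are made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ClassificationClass:
    name: str
    score: float


@dataclass
class ClassificationRequest:
    # Opaque payload: JPEG bytes here, but audio or text bytes would fit as well.
    input: bytes


@dataclass
class ClassificationResponse:
    classes: List[ClassificationClass] = field(default_factory=list)


request = ClassificationRequest(input=b"\xff\xd8 fake jpeg bytes")
response = ClassificationResponse()
response.classes.append(ClassificationClass(name="42", score=0.87))
print(len(request.input), response.classes[0].name)
```

Because the request carries nothing but bytes, the contract itself does not care what kind of media is being classified.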

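To take structured input such as a database record, the ClassificationRequest message has to change. For example, to build a classification service for the Iris dataset, it would be encoded like this: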

message ClassificationRequest {
    float petalWidth = 1;
    float petalHeight = 2;
    float sepalWidth = 3;
    float sepalHeight = 4;
}

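The proto compiler turns this proto file into the corresponding class definitions for the client and the server. To use the protobuf compiler, add a new rule to the BUILD file, similar to: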

load("@protobuf//:protobuf.bzl", "cc_proto_library")

cc_proto_library(
    name = "classification_service_proto",
    srcs = ["classification_service.proto"],
    cc_libs = ["@protobuf//:protobuf"],
    protoc = "@protobuf//:protoc",
    default_runtime = "@protobuf//:protobuf",
    use_grpc_plugin = 1,
)

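Note the load at the top of the snippet. It imports the cc_proto_library rule definition from the externally imported protobuf library, and that rule is then used to define a build rule for the proto file. Run the build with bazel build :classification_service_proto and inspect the result in bazel-genfiles/classification_service.grpc.pb.h: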

...
class ClassificationService {
...
    class Service : public ::grpc::Service {
    public:
        Service();
        virtual ~Service();
        virtual ::grpc::Status classify(::grpc::ServerContext* context,
            const ::ClassificationRequest* request,
            ::ClassificationResponse* response);
    };

ClassificationService::Service is the interface we have to implement with the inference logic. We can also look at the definitions of the request and response messages in bazel-genfiles/classification_service.pb.h:

...
class ClassificationRequest : public ::google::protobuf::Message {
    ...
    const ::std::string& input() const;
    void set_input(const ::std::string& value);
    ...
}

class ClassificationResponse : public ::google::protobuf::Message {
    ...
    const ::ClassificationClass& classes() const;
    void set_allocated_classes(::ClassificationClass* classes);
    ...
}

class ClassificationClass : public ::google::protobuf::Message {
    ...
    const ::std::string& name() const;
    void set_name(const ::std::string& value);
    float score() const;
    void set_score(float value);
    ...
}

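As you can see, the proto definitions have become a C++ class interface for each type. Their implementations are generated as well, so they can be used directly.

Implementing the inference server

To implement ClassificationService::Service, we need to load the exported model and call inference on it. This is done through a SessionBundle object, which is created from the exported model. It contains a TF session object with the fully loaded graph, along with the metadata that includes the classification signature defined in the export tool.

To create a SessionBundle object from the path of the exported files, we can define a convenience function that handles the boilerplate: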

#include <iostream>
#include <memory>
#include <string>

#include <grpc++/grpc++.h>
#include "classification_service.grpc.pb.h"

#include "tensorflow_serving/servables/tensorflow/session_bundle_factory.h"

using namespace std;
using namespace tensorflow::serving;
using namespace grpc;

unique_ptr<SessionBundle> createSessionBundle(const string& pathToExportFiles) {
    SessionBundleConfig session_bundle_config = SessionBundleConfig();
    unique_ptr<SessionBundleFactory> bundle_factory;
    SessionBundleFactory::Create(session_bundle_config, &bundle_factory);

    unique_ptr<SessionBundle> sessionBundle;
    bundle_factory->CreateSessionBundle(pathToExportFiles, &sessionBundle);

    return sessionBundle;
}

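This code uses a SessionBundleFactory to create a SessionBundle object configured to load the exported model from the path given by pathToExportFiles. It returns a unique pointer to the SessionBundle instance it created.

Next we define the implementation of the service, ClassificationServiceImpl. The class takes the SessionBundle instance as a parameter and uses it for inference: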

class ClassificationServiceImpl final : public ClassificationService::Service {

private:
    unique_ptr<SessionBundle> sessionBundle;

public:
    ClassificationServiceImpl(unique_ptr<SessionBundle> sessionBundle) :
        sessionBundle(move(sessionBundle)) {};

    Status classify(ServerContext* context, const ClassificationRequest* request,
                    ClassificationResponse* response) override {

        // Load the classification signature
        ClassificationSignature signature;
        const tensorflow::Status signatureStatus =
            GetClassificationSignature(sessionBundle->meta_graph_def, &signature);

        if (!signatureStatus.ok()) {
            return Status(StatusCode::INTERNAL, signatureStatus.error_message());
        }

        // Convert the protobuf input into the inference input tensor
        tensorflow::Tensor input(tensorflow::DT_STRING, tensorflow::TensorShape());
        input.scalar<string>()() = request->input();

        vector<tensorflow::Tensor> outputs;

        // Run inference
        const tensorflow::Status inferenceStatus = sessionBundle->session->Run(
            {{signature.input().tensor_name(), input}},
            {signature.classes().tensor_name(), signature.scores().tensor_name()},
            {},
            &outputs);

        if (!inferenceStatus.ok()) {
            return Status(StatusCode::INTERNAL, inferenceStatus.error_message());
        }

        // Convert the inference output tensors into the protobuf output
        for (int i = 0; i < outputs[0].vec<string>().size(); ++i) {
            ClassificationClass* classificationClass = response->add_classes();
            classificationClass->set_name(outputs[0].flat<string>()(i));
            classificationClass->set_score(outputs[1].flat<float>()(i));
        }

        return Status::OK;
    }
};

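The implementation of the classify method has four steps:

Load the ClassificationSignature stored in the model's export metadata with GetClassificationSignature. The signature maps the (logical) name of the input tensor to the real name of the tensor that receives the image, and the (logical) names of the output tensors to the real names of the graph tensors that hold the inference results.
Copy the JPEG-encoded image string from the request parameter into the tensor that inference will run on.
Run inference. This takes the TF session object from sessionBundle and runs it once, passing in the input tensor and the names of the output tensors.
Copy the results from the output tensors into the response output parameter, in the shape specified by the ClassificationResponse message.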
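The same four steps can be traced in a few lines of Python 3 with the TensorFlow parts replaced by stand-ins. Everything here is hypothetical scaffolding for illustration: FakeSession, the tensor names in SIGNATURE, and the canned outputs are not part of TensorFlow Serving.

```python
class FakeSession:
    """Stand-in for the TF session: returns canned outputs for known tensor names."""

    def run(self, feeds, fetches):
        assert "jpeg_input:0" in feeds  # the graph needs its input tensor fed
        canned = {"classes:0": ["3", "1"], "scores:0": [0.9, 0.1]}
        return [canned[name] for name in fetches]


# Step 1: the signature maps logical names to real tensor names (made up here).
SIGNATURE = {"input": "jpeg_input:0", "classes": "classes:0", "scores": "scores:0"}


def classify(session, signature, jpeg_bytes):
    # Step 2: place the request payload under the real input tensor name.
    feeds = {signature["input"]: jpeg_bytes}
    # Step 3: run the session once, asking for both output tensors.
    classes, scores = session.run(feeds, [signature["classes"], signature["scores"]])
    # Step 4: copy the outputs into the response structure.
    return [{"name": n, "score": s} for n, s in zip(classes, scores)]


print(classify(FakeSession(), SIGNATURE, b"fake jpeg"))
```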

The last piece of code is the boilerplate that sets up the gRPC server and creates the ClassificationServiceImpl instance, configured with the SessionBundle object.

int main(int argc, char** argv) {

    if (argc < 3) {
        cerr << "Usage: server <port> /path/to/export/files" << endl;
        return 1;
    }

    const string serverAddress(string("0.0.0.0:") + argv[1]);
    const string pathToExportFiles(argv[2]);

    unique_ptr<SessionBundle> sessionBundle = createSessionBundle(pathToExportFiles);

    ClassificationServiceImpl classificationServiceImpl(move(sessionBundle));

    ServerBuilder builder;
    builder.AddListeningPort(serverAddress, grpc::InsecureServerCredentials());
    builder.RegisterService(&classificationServiceImpl);

    unique_ptr<Server> server = builder.BuildAndStart();
    cout << "Server listening on " << serverAddress << endl;

    server->Wait();

    return 0;
}

To compile this code, define a rule for it in the BUILD file:

cc_binary(
    name = "server",
    srcs = [
        "server.cc",
    ],
    deps = [
        ":classification_service_proto",
        "@tf_serving//tensorflow_serving/servables/tensorflow:session_bundle_factory",
        "@grpc//:grpc++",
    ],
)

With this in place, the inference server can be run from the container with the command bazel run :server 9999 /tmp/inception-v3/export/{timestamp}.
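Before moving on to the client, it can be handy to confirm from the host that something is listening on the published port. The following Python 3 sketch uses only the standard library; the helper name is_port_open is made up, and a successful connection only shows that the port accepts TCP connections, not that a gRPC service is answering on it.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# 9999 is the port published by the docker run command earlier.
print(is_port_open("127.0.0.1", 9999))
```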

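Client application

Because gRPC is based on HTTP/2, it may eventually be possible to call gRPC-based services directly from the browser. But until mainstream browsers support the required HTTP/2 features and Google releases a browser-side JavaScript gRPC client, a web app should access the inference service through a server-side component.

Next we build a simple Python web server on top of BaseHTTPServer. It handles the uploaded image file, sends it to the inference service for processing, and returns the inference result as plain text.

So that an image can be sent to the inference server for classification, the server responds to GET requests with a simple form. The code is as follows: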

from BaseHTTPServer import HTTPServer, BaseHTTPRequestHandler
import cgi
import classification_service_pb2
from grpc.beta import implementations

class ClientApp(BaseHTTPRequestHandler):

    def do_GET(self):
        self.respond_form()

    def respond_form(self, response=""):

        form = """
        <html><body>
        <h1>Image classification service</h1>
        <form enctype="multipart/form-data" method="post">
        <div>Image: <input type="file" name="file" accept="image/jpeg"></div>
        <div><input type="submit" value="Upload"></div>
        </form>
        %s
        </body></html>
        """

        response = form % response

        self.send_response(200)
        self.send_header("Content-type", "text/html")
        self.send_header("Content-length", len(response))
        self.end_headers()
        self.wfile.write(response)
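The handler above is Python 2 code. On Python 3, where BaseHTTPServer became http.server and wfile expects bytes, the response has to be encoded before it is measured and written. The sketch below shows only that part, as a plain function that can be tried without starting a server; the name build_form_response is made up and the form markup is copied from the handler.

```python
FORM = """<html><body>
<h1>Image classification service</h1>
<form enctype="multipart/form-data" method="post">
<div>Image: <input type="file" name="file" accept="image/jpeg"></div>
<div><input type="submit" value="Upload"></div>
</form>
%s
</body></html>
"""


def build_form_response(message=""):
    """Return (headers, body_bytes) for the upload form with `message` embedded."""
    body = (FORM % message).encode("utf-8")
    headers = {
        "Content-Type": "text/html; charset=utf-8",
        # Content-Length counts bytes, so it is computed after encoding.
        "Content-Length": str(len(body)),
    }
    return headers, body


headers, body = build_form_response("<div>Response: none yet</div>")
print(headers["Content-Length"], len(body))
```

Inside a Python 3 handler, each header would be passed to send_header and the body to wfile.write.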

To call inference from the web app server, we need the Python protocol buffer client for ClassificationService. To generate it, run the protocol buffer compiler for Python:

pip install grpcio cython grpcio-tools
python -m grpc.tools.protoc -I. --python_out=. --grpc_python_out=. classification_service.proto

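This generates the classification_service_pb2.py file, which contains the stub used to call the service.

When the server receives a POST request, it parses the submitted form and uses it to create a ClassificationRequest object. It then sets up a channel to the classification server and submits the request to it. Finally, it renders the classification response as HTML and sends it back to the user.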

    # Method of the ClientApp class above
    def do_POST(self):

        form = cgi.FieldStorage(
            fp=self.rfile,
            headers=self.headers,
            environ={
                'REQUEST_METHOD': 'POST',
                'CONTENT_TYPE': self.headers['Content-Type'],
            })

        request = classification_service_pb2.ClassificationRequest()
        request.input = form['file'].file.read()

        channel = implementations.insecure_channel("127.0.0.1", 9999)
        stub = classification_service_pb2.beta_create_ClassificationService_stub(channel)
        response = stub.classify(request, 10)  # 10 secs timeout

        self.respond_form("<div>Response: %s</div>" % response)
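The handler above drops the text form of the response message straight into the page. A slightly friendlier rendering, sketched here in Python 3 under the assumption that the response has already been reduced to (name, score) pairs, would list one class per line and escape the names before they reach the HTML. The function name render_results and the sample data are made up.

```python
import html


def render_results(classes):
    """Render (name, score) pairs as an HTML list, escaping the class names."""
    rows = "".join(
        "<li>%s: %.3f</li>" % (html.escape(name), score)
        for name, score in classes
    )
    return "<div>Response:<ul>%s</ul></div>" % rows


print(render_results([("283", 0.871), ("282", 0.054)]))
```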

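To run the server, use the command python client.py from outside the container. Then point a browser at http://localhost:8080 to reach its UI. Upload an image and see what the inference result looks like.

Preparing for production

Before wrapping up, we will look at how to take the classification server to production.

First, copy the compiled server files to a permanent location inside the container and clean up all the temporary build files: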

# Inside the container
mkdir /opt/classification_server
cd /mnt/home/serving_example
cp -R bazel-bin/. /opt/classification_server
bazel clean

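Now, outside the container, we have to commit its state to a new Docker image, which essentially creates a snapshot that records the changes to its virtual file system.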

# Outside the container
docker ps
# Get the container ID
docker commit <container id>

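The image can then be pushed to your preferred Docker hosting cloud and served from there.

Summary

In this article we learned how to serve trained models, how to export them, and how to build a fast, lightweight server that runs them. We also learned how to create a simple web app that uses these models, given the full toolset for consuming TensorFlow models from other apps.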