Image Recognition Based on a Nearest-Neighbor Classifier - 51CTO.COM

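Introduction

This example uses the nearest-neighbor classifier to implement simple image recognition on HarmonyOS. In practice, a nearest-neighbor classifier is not very accurate for image recognition (its error rate is fairly high). The author wrote this example to get a feel for how the nearest-neighbor classifier works and to explore how image data can be processed in HarmonyOS.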

Source Code Download

Gitee source code link

Development Environment Requirements

DevEco Studio version: DevEco Studio 3.1 Release
HarmonyOS SDK version: API version 9

Project Requirements

API9
Stage model

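Main Text

Image Recognition Algorithm Based on a Nearest-Neighbor Classifier

First, note that the image classifier in this example operates on grayscale images. A nearest-neighbor classifier for image recognition is very similar to the example mentioned earlier, except that in image recognition the gray value of each pixel of a sample is one feature dimension. Suppose the images the classifier handles are 100px by 100px. The corresponding feature space is then 10000-dimensional, and after feature extraction each image sample can be represented as a vector in that 10000-dimensional space. Although the dimension of the feature space is much higher, the classification algorithm is essentially unchanged. As in the earlier example, an image is classified according to the nearest neighbor of the test sample, so we need to compute the distance in the high-dimensional space between the test sample's vector and the vector of each trained sample. The distance is again measured with the Euclidean norm.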
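Written out for a test-sample vector x and a trained-sample vector y in an n-dimensional feature space, the Euclidean distance takes the standard form below. The symbols x, y and n are notation introduced here for illustration; n = 10000 for 100px by 100px images.

```latex
d(\mathbf{x}, \mathbf{y}) = \lVert \mathbf{x} - \mathbf{y} \rVert_2 = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}
```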

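Once the nearest neighbor of the test sample has been found, we can classify the image according to it: the sample is assigned to the same class as its nearest neighbor.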
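The idea can be tried out on tiny vectors before touching any image API. The following is a minimal sketch in plain TypeScript, not the module's actual code; the function names euclideanDistance and nearestNeighborIndex and the 4-pixel toy "images" are assumptions made up for illustration.

```typescript
// Euclidean distance between two feature vectors of equal length.
function euclideanDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('vectors must have the same dimension')
  }
  let sum = 0
  for (let i = 0; i < a.length; i++) {
    const diff = a[i] - b[i]
    sum += diff * diff
  }
  return Math.sqrt(sum)
}

// Returns the index of the trained vector closest to the sample,
// or -1 when the trained set is empty.
function nearestNeighborIndex(trained: number[][], sample: number[]): number {
  let best = -1
  let bestDistance = Infinity
  for (let k = 0; k < trained.length; k++) {
    const d = euclideanDistance(trained[k], sample)
    if (d < bestDistance) {
      bestDistance = d
      best = k
    }
  }
  return best
}

// Hypothetical 4-pixel grayscale "images" (one gray value per pixel).
const trainedSet: number[][] = [
  [0, 0, 0, 0],         // all black
  [255, 255, 255, 255], // all white
  [0, 255, 0, 255],     // alternating stripes
]

// A slightly noisy striped image should be closest to the third entry.
const nearest = nearestNeighborIndex(trainedSet, [10, 240, 5, 250])
```

The same two loops scale directly to 10000-dimensional vectors; only the vector length changes.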

Code Structure

─entry/src/main
   ├─ module.json5
   ├─ resources
   │  ├─ zh_CN
   │  ├─ rawfile
   │  │  ├─ p1.png
   │  │  ├─ p2.png
   │  │  ├─ p3.png
   │  │  ├─ p4.png
   │  │  ├─ p5.png
   │  │  ├─ p6.png
   │  │  ├─ s1.png
   │  │  └─ s2.png
   │  ├─ en_US
   │  └─ base
   └─ ets
      ├─ XL_Modules
      │  └─ XL_Image_NNC.ts
      ├─ pages
      │  └─ Index.ets
      └─ entryability
         └─ EntryAbility.ts

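Image Decoding Flow

Image decoding means decoding an archived image in a supported format into a unified PixelMap, so that it can be displayed or processed in an application or in the system.

In this example, both the images used to train the classifier and the images to be tested must be placed in the Demo's rawfile directory in advance. To obtain an image resource, we only need to know its file name (a string) and can then fetch it through the resource management module resourceManager. The statements below take as an example an image resource named 'p1.png', 100px by 100px in size, in the rawfile directory.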

Get the resourceManager.

let context = getContext(this)
const resourceMgr = context.resourceManager

Get the ArrayBuffer of p1.png in the rawfile folder.

let Data = await resourceMgr.getRawFileContent('p1.png')
let buffer = Data.buffer

Create an ImageSource instance.

const imageSource = image.createImageSource(buffer)

Create a PixelMap instance.

const pixelMap = await imageSource.createPixelMap();

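Because we need to process the image, we also create an ArrayBuffer instance (the buffer object) and read the data of the PixelMap instance into it. After that, buffer is an array (or vector) in BGRA_8888 format that holds the image data of p1.png.

let dim = 100*100*4   // dimension: 100 x 100 pixels, 4 bytes per pixel
let buffer = new ArrayBuffer(dim)
await pixelMap.readPixelsToBuffer(buffer)
.catch(err => {
  console.error(`err: ${err}`)
})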

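Because buffer is an array in BGRA_8888 format, elements 4i, 4i+1, 4i+2 and 4i+3 of buffer (i ∈ [0, dim/4-1]) hold the blue (B), green (G), red (R) and alpha (A) components of the i-th pixel, respectively. Since image recognition in this example works on grayscale images, we still need to convert the BGRA_8888 array into an array of gray values.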

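Converting an RGB image to grayscale is not difficult: a weighted sum of each pixel's blue, green and red components gives that pixel's gray value, where gray value = (red channel * 0.299) + (green channel * 0.587) + (blue channel * 0.114). The weights reflect the human eye's different sensitivity to different colors.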
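The conversion can be sketched on its own, independent of any HarmonyOS API. This is a minimal illustration assuming the BGRA byte order described above; the function name bgraToGray and the two sample pixels are hypothetical, and rounding with Math.round is a choice made here (assigning a fractional value to a Uint8Array would otherwise drop the fractional part).

```typescript
// Weights quoted in the text above.
const R_WEIGHT = 0.299
const G_WEIGHT = 0.587
const B_WEIGHT = 0.114

// Converts a BGRA_8888 byte vector (4 bytes per pixel) into a
// grayscale vector (1 byte per pixel). The alpha byte is ignored.
function bgraToGray(bgra: Uint8Array): Uint8Array {
  const pixelCount = Math.floor(bgra.length / 4)
  const gray = new Uint8Array(pixelCount)
  for (let p = 0; p < pixelCount; p++) {
    const b = bgra[4 * p]     // blue component of pixel p
    const g = bgra[4 * p + 1] // green component of pixel p
    const r = bgra[4 * p + 2] // red component of pixel p
    gray[p] = Math.round(R_WEIGHT * r + G_WEIGHT * g + B_WEIGHT * b)
  }
  return gray
}

// Two hypothetical pixels in BGRA order: pure blue, then white.
const twoPixels = new Uint8Array([255, 0, 0, 255, 255, 255, 255, 255])
const grayValues = bgraToGray(twoPixels)
```

Pure blue contributes only through the 0.114 weight, so it maps to a dark gray, while white stays at the maximum value.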

Key Code

XL_Image_NNC.ts:

import image from '@ohos.multimedia.image';
import common from '@ohos.app.ability.common';

// Size of the images being processed (width and height are equal)
const OPERATION_SIZE = 100

/*
 * Function: Get_NumberType_Array_MinValue_Index
 * Description: returns the index of the smallest element in the input number Array (0 for an empty array)
 */
function Get_NumberType_Array_MinValue_Index(arr:Array<number>):number{

  let location:number = 0

  for(let i = 0; i < arr.length; ++i){
    if(arr[i]<arr[location]){
      location = i
    }
  }

  return location

}

// Capability interface of the nearest-neighbor classifier
interface I_Nearest_Neighbor_Classifier{

  train(context:common.Context,Train_Data:Array<string>):Promise<void>

  identify(test_data:string):Promise<number>

}

/*
 * Class: XL_Image_NNC
 * Description: image recognition module based on a nearest-neighbor classifier
 */
class XL_Image_NNC implements I_Nearest_Neighbor_Classifier{

  // Log tag
  private TAG:string = '------[XL_Image_NNC] '

  // Length of the vector for a BGRA_8888 image
  private dim_rgb:number = 4*(OPERATION_SIZE**2)

  // Length of the vector for a grayscale image
  private dim_gray:number = OPERATION_SIZE**2

  // Weights of the red, green and blue channel values in the gray value when an RGB image vector is converted to a grayscale one; R_Weight + G_Weight + B_Weight = 1
  private R_Weight:number = 0.299
  private G_Weight:number = 0.587
  private B_Weight:number = 0.114

  // Set of image vectors added so far (BGRA_8888)
  private Trained_Data_RGB:Array<Uint8Array> = []

  // Set of image vectors added so far (grayscale)
  private Trained_Data_Gray:Array<Uint8Array> = []

  // Array that stores the distances (Euclidean norm)
  private Distance_Array:Array<number> = []

  // Holds the resource management module
  private resourceMgr = null

  /*
   * Method: train
   * Description: fills the classifier with data so that it can perform supervised pattern recognition
   * Parameters: context: context object of the UIAbility  Train_Data: set of images to train on (the images must be stored in the rawfile directory in advance)
   */
  public async train(context:common.Context,Train_Data:Array<string>) {

    // Get the ResourceManager (resource management module) through context
    this.resourceMgr = context.resourceManager

    // Iterate over and process the input images
    for (const item of Train_Data) {

      // Use resourceMgr.getRawFileContent (passing the image file name as a string) to get the Uint8Array of an image resource in the rawfile directory
      let rawData = await this.resourceMgr.getRawFileContent(item)

      // Create an ImageSource instance from the Uint8Array obtained above
      let imageSource = image.createImageSource(rawData.buffer)

      // Create a pixel map from the ImageSource instance
      let pixelMap = await imageSource.createPixelMap()

      // Read the pixel map into a newly created ArrayBuffer
      let buffer = new ArrayBuffer(this.dim_rgb)
      await pixelMap.readPixelsToBuffer(buffer)
        .catch(err => {
          console.error(this.TAG+`err: ${err}`)
        })

      // Finally, store the RGB image vector carried by the ArrayBuffer in Trained_Data_RGB (the vector set)
      this.Trained_Data_RGB.push(new Uint8Array(buffer))

    }

    // Convert the collected RGB image vectors into grayscale image vectors
    for(const element of this.Trained_Data_RGB){

      let GrayScaleVector = new Uint8Array(this.dim_gray)
      let index:number = 0

      // Walk through the elements of the RGB image vector
      for(let i = 0; i < element.length; i++){
        if((i+1)%4 == 0){

          // i is the alpha byte of a pixel; in BGRA_8888 order, i-3 is B, i-2 is G and i-1 is R.
          // Take the weighted sum of the pixel's R, G and B channel values to get the gray value
          const grayScale = this.B_Weight*element[i-3]+this.G_Weight*element[i-2]+this.R_Weight*element[i-1]

          // Store it (assignment to a Uint8Array drops the fractional part)
          GrayScaleVector[index++] = grayScale
        }
      }

      // Finally, store the grayscale image vector carried by GrayScaleVector in Trained_Data_Gray (the vector set)
      this.Trained_Data_Gray.push(GrayScaleVector)

    }


  }

  /*
   * Method: identify
   * Description: performs supervised pattern recognition based on the data collected so far, and returns the index in Trained_Data_Gray of the input sample's nearest neighbor
   * Parameters: test_data: image to be identified (the image must be stored in the rawfile directory)
   */
  public async identify(test_data:string):Promise<number>{

    // Rule out the abnormal case
    if(this.resourceMgr == null){
      console.error(this.TAG+'Please train the image data before identifying')
      return -1
    }

    // Get the Uint8Array of an image in the rawfile directory
    let rawData = await this.resourceMgr.getRawFileContent(test_data)

    // Create an ImageSource instance from the Uint8Array obtained above
    let imageSource = image.createImageSource(rawData.buffer)

    // Create a pixel map from the ImageSource instance
    let pixelMap = await imageSource.createPixelMap()

    // Read the pixel map into a newly created buffer
    let buffer = new ArrayBuffer(this.dim_rgb)
    await pixelMap.readPixelsToBuffer(buffer)
      .catch(err => {
        console.error(this.TAG+`err: ${err}`)
      })

    let Sample_RGB = new Uint8Array(buffer)

    let Sample_Gray = new Uint8Array(this.dim_gray)

    let index:number = 0

    // Convert the RGB image vector into a grayscale image vector (same BGRA_8888 byte order as in train)
    for(let i = 0; i < Sample_RGB.length; i++){
      if((i+1)%4 == 0){
        const grayScale = this.B_Weight*Sample_RGB[i-3]+this.G_Weight*Sample_RGB[i-2]+this.R_Weight*Sample_RGB[i-1]
        Sample_Gray[index++] = grayScale
      }
    }

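    // Reset the distance array
    this.Distance_Array = []

    // Compute the distance in the high-dimensional space between the vector of the image under test and each trained image vector (Euclidean norm), i.e. (Σ(A[i] - B[i])^2)^0.5, i ∈ [0, dim_gray) ∩ N
    for(const item of this.Trained_Data_Gray){

      let distance:number = 0

      // Distance (Euclidean norm) between the sample and the trained data in the dim_gray-dimensional vector space
      for(let i = 0; i < this.dim_gray; i++){
        distance += (Sample_Gray[i]-item[i])**2
      }
      distance = distance**0.5

      this.Distance_Array.push(distance)
      console.info(this.TAG+'distance: '+distance)

    }

    // Get the index of the smallest element of Distance_Array and return it; this is the index in Trained_Data_Gray of the sample's nearest neighbor
    return Get_NumberType_Array_MinValue_Index(this.Distance_Array)

  }

}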

// Export this module
export default new XL_Image_NNC()

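As in previous installments, the author prefers to bundle newly developed functionality into a single ts file and export it, which makes it easier to manage and maintain. In this module the functionality lives in the class XL_Image_NNC: the train method feeds training data to the classifier (no iterative process is involved, but let us call it "training" anyway), and the identify method classifies an image based on the trained data.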
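Note that identify returns an index into the training set rather than a class name, so the caller still has to map that index to a label. One possible way to do this is sketched below. The helper labelForIndex and the label values 'A' and 'B' are hypothetical, since the article does not say what the p1.png to p6.png images depict; only the file names and the train/identify calls in the comments come from the module above.

```typescript
// Maps the index returned by identify() to a class label.
// Falls back to 'unknown' for -1 (not trained yet) or any out-of-range index.
function labelForIndex(index: number, labels: string[]): string {
  if (index < 0 || index >= labels.length) {
    return 'unknown'
  }
  return labels[index]
}

// Training images, in the order they would be passed to train().
const trainFiles: string[] = ['p1.png', 'p2.png', 'p3.png', 'p4.png', 'p5.png', 'p6.png']

// Hypothetical labels, one per training image, in the same order.
const trainLabels: string[] = ['A', 'A', 'A', 'B', 'B', 'B']

// Inside the app, the index would come from the module, roughly:
//   await XL_Image_NNC.train(context, trainFiles)
//   const index = await XL_Image_NNC.identify('s1.png')
// Here a fixed index stands in for that result.
const exampleLabel = labelForIndex(4, trainLabels)
```

Keeping the labels in a parallel array works only as long as the order matches the array passed to train.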

To learn more about open source, visit:

51CTO Open Source Basic Software Community

https://ost.51cto.com
