Recognizing Emoji with a YOLOv5 Model!

Date: 2021-11-08 15:31:40

Keywords: model


[Introduction] Author: Yan Yongqiang | Source: Datawhale

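This article trains YOLOv5 to recognize hand gestures and displays the emoji that corresponds to each recognized gesture, as shown in the figure below:

The overall approach is outlined below. Note: this article contains the complete working code, which is long. We suggest reading the text that describes the approach first and bookmarking the code for later.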

I. Training YOLOv5 on the Dataset

1. Installing the environment dependencies

This tutorial uses YOLOv5 release v3.1.

Download the source code with git clone, then install the dependencies with pip install -r requirements.txt (the official requirements are python>=3.8 and torch>=1.6).

My environment: Ubuntu 16.04; CUDA 10.2; cuDNN 7.6.5; torch 1.6.0; Python 3.8.
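Before installing anything else, it can help to confirm that the interpreter and PyTorch meet the stated minimums. The following is a minimal sketch, not part of the YOLOv5 repository: the helper names parse_version and meets_minimum are hypothetical, and only the leading numeric part of a version string is compared, so a suffix such as +cu102 is ignored.

```python
import re
import sys


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError("no numeric version in %r" % text)
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """True if the installed version is at least the minimum (both given as strings)."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    python_version = "%d.%d" % sys.version_info[:2]
    print("python", python_version,
          "ok" if meets_minimum(python_version, "3.8") else "too old")
    try:
        import torch  # only available once requirements.txt has been installed
        print("torch", torch.__version__,
              "ok" if meets_minimum(torch.__version__, "1.6") else "too old")
    except ImportError:
        print("torch is not installed")
```

Comparing tuples of integers rather than raw strings avoids the classic mistake of treating "1.10" as older than "1.6".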

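2. Preparing the gesture recognition dataset

The gesture dataset has been uploaded to the open-source data platform Graviti, together with the complete code.

Gesture dataset: https://gas.graviti.cn/dataset/datawhale/HandPose?utm_medium=0831datawhale

Note: the code is in the discussion section of the dataset page.

2.1 Collecting and annotating the dataset

Code for capturing the gesture images: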

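import os

import cv2


def main():
    total_pics = 1000                    # number of frames to save
    out_dir = "hand_images"
    os.makedirs(out_dir, exist_ok=True)  # cv2.imwrite does not create missing folders

    cap = cv2.VideoCapture(0)            # default webcam
    pic_no = 0

    while pic_no < total_pics:
        ret, frame = cap.read()
        if not ret:                      # no frame available, stop capturing
            break
        frame = cv2.flip(frame, 1)       # mirror the frame horizontally
        cv2.imwrite(os.path.join(out_dir, str(pic_no) + ".jpg"), frame)
        cv2.imshow("Capturing gesture", frame)
        cv2.waitKey(10)
        pic_no += 1

    cap.release()
    cv2.destroyAllWindows()


if __name__ == "__main__":
    main()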

Create a VOC2012 folder under the yolov5 directory (the name is your own choice). Its structure follows the VOC dataset layout:

VOC2012
../Annotations      # the XML annotation file for each dataset image
../images           # the images
../ImageSets/Main   # the four files train.txt, test.txt, val.txt and trainval.txt, which list the names (without file extension) of the images in the training, test, validation and train+validation sets
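The layout above can be created by hand, or with a few lines of Python. This is a minimal sketch under the assumption that the three folder names are exactly those listed; the function name make_voc_layout is hypothetical, and the demo writes into a temporary directory rather than a real yolov5 checkout.

```python
import os
import tempfile


def make_voc_layout(root):
    """Create the Annotations, images and ImageSets/Main folders under root."""
    subdirs = ["Annotations", "images", os.path.join("ImageSets", "Main")]
    created = []
    for sub in subdirs:
        path = os.path.join(root, sub)
        os.makedirs(path, exist_ok=True)  # no error if the folder already exists
        created.append(path)
    return created


if __name__ == "__main__":
    # throwaway location for the demo; pass your own yolov5/VOC2012 path instead
    with tempfile.TemporaryDirectory() as scratch:
        for path in make_voc_layout(os.path.join(scratch, "VOC2012")):
            print(path)
```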

	


Contents of the VOC2012 folder:

	




	


The Annotations folder holds the XML files (annotated with labelImg):

	




	


The images folder corresponds to JPEGImages in the VOC dataset format:

	




	


The Main subfolder of ImageSets holds the txt files that define the training, test and validation split. They are generated by the following script:

	


# coding:utf-8

import os
import random
import argparse

parser = argparse.ArgumentParser()
# Path to the XML annotations; change it for your own data (usually the Annotations folder)
parser.add_argument('--xml_path', default='C:\\Users\\Lenovo\\Desktop\\hand_datasets\\VOC2012\\Annotations\\', type=str, help='input xml label path')
# Output folder for the split lists; use ImageSets/Main under your own dataset
parser.add_argument('--txt_path', default='C:\\Users\\Lenovo\\Desktop\\hand_datasets\\VOC2012\\ImageSets\\Main\\', type=str, help='output txt label path')
opt = parser.parse_args()

trainval_percent = 1.0   # share of all samples used for train+val; the rest go to test (none here)
train_percent = 0.99     # share of train+val used for training
xmlfilepath = opt.xml_path
txtsavepath = opt.txt_path
total_xml = os.listdir(xmlfilepath)
if not os.path.exists(txtsavepath):
    os.makedirs(txtsavepath)

num = len(total_xml)
list_index = range(num)
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(list_index, tv)
train = random.sample(trainval, tr)

file_trainval = open(os.path.join(txtsavepath, 'trainval.txt'), 'w')
file_test = open(os.path.join(txtsavepath, 'test.txt'), 'w')
file_train = open(os.path.join(txtsavepath, 'train.txt'), 'w')
file_val = open(os.path.join(txtsavepath, 'val.txt'), 'w')

for i in list_index:
    name = total_xml[i][:-4] + '\n'   # file name without the .xml extension
    if i in trainval:
        file_trainval.write(name)
        if i in train:
            file_train.write(name)
        else:
            file_val.write(name)
    else:
        file_test.write(name)

file_trainval.close()
file_train.close()
file_val.close()
file_test.close()

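Running the script generates the following txt files in the Main folder:

2.2 Generating labels in YOLO training format

Convert the XML annotations to YOLO's txt format. In this format each image has one txt file, and each line of the file describes one object in the form class, x_center, y_center, width, height, as shown in the figure below: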
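To make the format concrete, here is a small worked example that encodes one pixel box as a YOLO label line and decodes it again. It is a sketch only: the function names are hypothetical, the image size and box are made up for illustration, and it uses the plain center formula with no pixel offset. Class index 4 is ok_hand in the class list used in this article.

```python
def box_to_yolo(class_id, xmin, ymin, xmax, ymax, img_w, img_h):
    """Encode a pixel box as 'class x_center y_center width height', normalized to 0..1."""
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return "%d %.6f %.6f %.6f %.6f" % (class_id, x_center, y_center, width, height)


def yolo_to_box(line, img_w, img_h):
    """Decode a YOLO label line back to (class_id, xmin, ymin, xmax, ymax) in pixels."""
    parts = line.split()
    class_id = int(parts[0])
    x_center, y_center, width, height = (float(v) for v in parts[1:5])
    xmin = (x_center - width / 2.0) * img_w
    ymin = (y_center - height / 2.0) * img_h
    xmax = (x_center + width / 2.0) * img_w
    ymax = (y_center + height / 2.0) * img_h
    return class_id, xmin, ymin, xmax, ymax


if __name__ == "__main__":
    # a 320x240 box centered in a 640x480 image
    line = box_to_yolo(4, 160, 120, 480, 360, 640, 480)
    print(line)                          # 4 0.500000 0.500000 0.500000 0.500000
    print(yolo_to_box(line, 640, 480))   # (4, 160.0, 120.0, 480.0, 360.0)
```

Decoding a few label lines this way and drawing the boxes on the image is a quick check that the conversion did what was intended.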

	




	


Create a voc_label.py file that generates the txt labels for the training, validation and test sets:

	


# -*- coding: utf-8 -*-
import xml.etree.ElementTree as ET
import os

sets = ['train', 'val', 'test']
# 11 classes; change these to your own
classes = ["four_fingers", "hand_with_fingers_splayed", "index_pointing_up", "little_finger", "ok_hand", "raised_fist", "raised_hand", "sign_of_the_horns", "three", "thumbup", "victory_hand"]
abs_path = os.getcwd()
print(abs_path)


def convert(size, box):
    # size = (image width, image height); box = (xmin, xmax, ymin, ymax) in pixels.
    # Returns the box center, width and height normalized by the image size.
    dw = 1. / (size[0])
    dh = 1. / (size[1])
    x = (box[0] + box[1]) / 2.0 - 1
    y = (box[2] + box[3]) / 2.0 - 1
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return x, y, w, h


def convert_annotation(image_id):
    in_file = open('/home/yanyq/Ryan/yolov5/VOC2012/Annotations/%s.xml' % (image_id), encoding='UTF-8')
    out_file = open('/home/yanyq/Ryan/yolov5/VOC2012/labels/%s.txt' % (image_id), 'w')
    tree = ET.parse(in_file)
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)
    for obj in root.iter('object'):
        difficult = obj.find('difficult').text
        cls = obj.find('name').text
        # skip unknown classes and objects marked as difficult
        if cls not in classes or int(difficult) == 1:
            continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text), float(xmlbox.find('ymin').text),
             float(xmlbox.find('ymax').text))
        b1, b2, b3, b4 = b
        # clamp annotations that extend past the image border
        if b2 > w:
            b2 = w
        if b4 > h:
            b4 = h
        b = (b1, b2, b3, b4)
        bb = convert((w, h), b)
        out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')
    in_file.close()
    out_file.close()


for image_set in sets:
    if not os.path.exists('/home/yanyq/Ryan/yolov5/VOC2012/labels/'):
        os.makedirs('/home/yanyq/Ryan/yolov5/VOC2012/labels/')
    image_ids = open('/home/yanyq/Ryan/yolov5/VOC2012/ImageSets/Main/%s.txt' % (image_set)).read().strip().split()
    # one list file per split, holding the absolute path of every image in it
    list_file = open('%s.txt' % (image_set), 'w')
    for image_id in image_ids:
        list_file.write(abs_path + '/images/%s.jpg\n' % (image_id))
        convert_annotation(image_id)
    list_file.close()



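Running the script produces a labels folder and three txt files that list the dataset. The labels folder holds the YOLO-format annotation file for each image, and train.txt, test.txt and val.txt hold the absolute paths of the images in each split.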
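A quick sanity check at this point is to confirm that every image listed in a split file has a matching label file. The sketch below assumes the convention, as I understand the YOLOv5 data loader, that the label path is the image path with the images folder swapped for labels and the extension changed to .txt. The function names are hypothetical.

```python
import os


def label_path_for(image_path):
    """Map .../images/name.jpg to .../labels/name.txt (assumed YOLOv5 convention)."""
    marker = os.sep + "images" + os.sep
    swapped = image_path.replace(marker, os.sep + "labels" + os.sep, 1)
    return os.path.splitext(swapped)[0] + ".txt"


def missing_labels(list_file):
    """Return the image paths in list_file whose label file does not exist."""
    missing = []
    with open(list_file) as handle:
        for line in handle:
            image_path = line.strip()
            if image_path and not os.path.isfile(label_path_for(image_path)):
                missing.append(image_path)
    return missing
```

Calling missing_labels("train.txt") from the folder where the list files were written should return an empty list if the conversion covered every image.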

	


Contents of the three txt files:

	




	


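2.3 Configuration files

1) Dataset configuration

In the data folder of the yolov5 directory, create a file named Emoji.yaml (the name is your own choice). It points to the training and validation split files train.txt and val.txt (both generated by voc_label.py). Its contents are as follows: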
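A dataset file of this kind normally carries four keys: train, val, nc (the number of classes) and names. The sketch below builds such a file as text. The two list-file paths are placeholders, not the author's actual paths, and the function name build_data_yaml is hypothetical. The class names are the 11 used in voc_label.py.

```python
CLASSES = [
    "four_fingers", "hand_with_fingers_splayed", "index_pointing_up",
    "little_finger", "ok_hand", "raised_fist", "raised_hand",
    "sign_of_the_horns", "three", "thumbup", "victory_hand",
]


def build_data_yaml(train_list, val_list, names):
    """Return the text of a YOLOv5 dataset YAML with train, val, nc and names keys."""
    lines = [
        "train: %s" % train_list,
        "val: %s" % val_list,
        "nc: %d" % len(names),
        "names: [%s]" % ", ".join("'%s'" % name for name in names),
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # placeholder paths; point these at the train.txt and val.txt written by voc_label.py
    print(build_data_yaml("VOC2012/train.txt", "VOC2012/val.txt", CLASSES), end="")
```

Deriving nc from the length of the names list keeps the two values from drifting apart when classes are added or removed.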

	




	


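2) Model configuration file

When training a YOLO model, you can usually cluster your own annotated boxes to obtain the anchor (prior) boxes, which makes the fullest use of the annotated samples. Here we simply use the default values.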
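For readers who do want to cluster their own boxes, the idea can be sketched as k-means over the (width, height) pairs of the annotations, assigning each box to the anchor it overlaps most. This is an illustrative pure-Python version, not the routine YOLOv5 ships; the function names and the deterministic initialization are my own choices. YOLOv5s uses 9 anchors, so k would be 9 on a real dataset.

```python
def wh_iou(box, anchor):
    """IoU of two boxes given as (width, height), both placed at the same corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iterations=100):
    """Cluster (width, height) pairs into k anchors using highest IoU as the assignment rule."""
    ordered = sorted(boxes, key=lambda b: b[0] * b[1])
    # deterministic start: k boxes spread evenly across the area-sorted list
    anchors = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: wh_iou(box, anchors[i]))
            groups[best].append(box)
        updated = []
        for anchor, group in zip(anchors, groups):
            if group:
                updated.append((sum(b[0] for b in group) / len(group),
                                sum(b[1] for b in group) / len(group)))
            else:
                updated.append(anchor)  # keep an anchor that attracted no boxes
        if updated == anchors:          # assignments have stopped changing
            break
        anchors = updated
    return sorted(anchors, key=lambda a: a[0] * a[1])
```

The input would be the pixel widths and heights of all annotated boxes, scaled to the training image size. IoU is used instead of Euclidean distance so that large boxes do not dominate the clustering.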

	


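Next, choose a model. YOLOv5 provides s, m, l and x versions with progressively larger architectures, so training and inference time increase accordingly; here we choose the s version. Open yolov5s.yaml in the models folder under yolov5 and edit it as shown in the figure below. We keep the default anchors, so only the class count nc needs to change; since we have 11 classes, set it to 11: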

	




	


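That completes the custom dataset and the configuration files. The next step is to train the model.

3. Model training

3.1 Downloading the pretrained models

The weights folder in the yolov5 source directory provides a script, download_weights.sh, for downloading the s, m, l and x models. Running it downloads the pretrained weights for all four.

3.2 Training the model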

	




	



	


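The parameters above are explained as follows:

epochs: the number of times the whole dataset is iterated over during training; lower it if your GPU is weak.
batch-size: how many images are processed before each weight update (the mini-batch of gradient descent); lower it if your GPU is weak.
cfg: the configuration file that stores the model structure.
data: the file that describes the training and test data.
img-size: input image width and height; lower it if your GPU is weak.
rect: rectangular training.
resume: resume training from the most recently saved model.
nosave: save only the final checkpoint.
notest: test only the final epoch.
evolve: evolve the hyperparameters.
bucket: gsutil bucket.
cache-images: cache images to speed up training.
weights: path to the weights file.
name: renames results.txt to results_name.txt.
device: CUDA device, e.g. 0 or 0,1,2,3 or cpu.
adam: use the Adam optimizer.
multi-scale: multi-scale training, img-size +/- 50%.
single-cls: train as a single-class dataset.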
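As an illustration of the multi-scale option, the sketch below draws a training size within roughly +/- 50% of img-size and rounds it down to a multiple of the model stride. It is modeled on my understanding of what train.py does rather than copied from it; the function name and the default stride of 32 are assumptions.

```python
import random


def sample_multiscale_size(img_size, stride=32, rng=random):
    """Pick a size within roughly +/- 50% of img_size, rounded down to a multiple of stride."""
    low = int(img_size * 0.5)
    high = int(img_size * 1.5) + stride  # randrange excludes the upper bound
    return rng.randrange(low, high) // stride * stride


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    print([sample_multiscale_size(640, rng=rng) for _ in range(5)])
```

With img-size 640 the sampled sizes range from 320 to 960, which is why multi-scale training needs noticeably more GPU memory than the nominal image size suggests.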

	


To train, just run the training command:

	


$ python train.py  --data Emoji.yaml --cfg yolov5s.yaml --weights weights/yolov5s.pt --batch-size 64 --device "0,1,2,3" --epochs 200 --img-size 640

Settings such as device and batch-size should be adjusted to your own machine.

	




	






	


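4. Model testing

Evaluating a model means measuring its performance on an annotated test or validation set. In object detection the most commonly used metric is mAP. The test.py file in the yolov5 folder specifies the dataset configuration file and the trained model, as follows: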
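For readers new to mAP, the two building blocks are the IoU between a predicted box and a ground-truth box, and the average precision (AP) of one class, computed as the area under its precision-recall curve. mAP is the mean of AP over all classes. The sketch below shows both pieces in simplified form; the function names are hypothetical and it does not reproduce the exact interpolation used by test.py.

```python
def box_iou(a, b):
    """IoU of two boxes given as (xmin, ymin, xmax, ymax)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, num_ground_truth):
    """Area under the precision-recall curve for one class.

    detections is a list of (confidence, is_true_positive) pairs, where a
    detection counts as a true positive if it matched a ground-truth box
    (for example with IoU >= 0.5).
    """
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    precisions, recalls = [], []
    true_positives = 0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            true_positives += 1
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_ground_truth)
    # replace each precision with the best precision at any higher recall
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, previous_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap
```

For example, with two ground-truth objects and three detections ranked true, false, true, the recall reaches 0.5 at precision 1 and 1.0 at precision 2/3, giving an AP of 5/6.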

	




	


Test the model with the following command:

	


python test.py --data data/Emoji.yaml --weights runs/train/exp2/weights/best.pt --augment



Model test results:

Test result images:

	




	



II. Converting the YOLOv5 Model

1. Installing the dependencies

	


pip install onnx coremltools onnx-simplifier

2. Exporting the ONNX model

	


python models/export.py --weights runs/train/exp2/weights/best.pt --img 640 --batch 1



	


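This produces three files next to best.pt: best.mlmodel, best.onnx and best.torchscript.pt. We only need best.onnx, which can be opened directly in Netron to inspect the model structure.

3. Simplifying the model with onnx-simplifier

Why simplify?

After training a deep learning model in PyTorch or TensorFlow, you sometimes need to convert it to ONNX. The exported graph often contains nodes, such as Cast and Identity nodes, that may not be needed. Simplifying the graph removes them, which makes it easier to convert the model to on-device deployment formats such as ncnn or MNN, or to deploy it with TensorRT.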

	


python -m onnxsim best.onnx yolov5-best-sim.onnx



	


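This produces the simplified model yolov5-best-sim.onnx.

III. Converting YOLOv5 to an ncnn Model

1. Converting ONNX to .param and .bin

Starting from the yolov5-best-sim.onnx model generated above, we use onnx2ncnn.exe, a tool that ships with ncnn, to generate two files: yolov5s.param and yolov5s.bin. You have to build the tool yourself; I built it on Windows, and the Linux executable works as well.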

	


On Windows, press Win+R, run cmd, and enter the following in the command window:

	


onnx2ncnn.exe yolov5-best-sim.onnx yolov5s.param yolov5s.bin 



	




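During conversion, ncnn reports the unsupported layers shown in the figure above. The next step is to edit the param file and replace the unsupported layers with supported ones.

2. Editing .param to remove the unsupported layers

To remove the unsupported layers, open the generated yolov5s.param file. The lines near the top that need to be deleted are marked in red. (Note that we trained with YOLOv5 v3.1; other versions may differ.)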

	




	


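The result of the edit is shown in the green and red boxes below. Because 10 layers were removed, the counts on the second line become 191 228. The 10 removed layers are replaced by a single YoloV5Focus layer. In that layer, images is the name of the input blob and 207 is the name of the output blob, chosen to match the input of the convolution layer that follows.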
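Hand-editing the layer list makes it easy to leave the header counts out of date. The sketch below re-counts them from the text of a .param file, relying on the plain-text layout as I understand it: a magic number on the first line, then "layer_count blob_count", then one layer per line in the form type, name, input count, output count, input blob names, output blob names, parameters. The function name is hypothetical.

```python
def check_param_counts(param_text):
    """Compare the header counts of an ncnn .param text with its layer lines.

    Returns (declared_layers, actual_layers, declared_blobs, actual_blobs).
    """
    lines = [ln for ln in param_text.splitlines() if ln.strip()]
    # lines[0] is the magic number, lines[1] is "layer_count blob_count"
    declared_layers, declared_blobs = (int(v) for v in lines[1].split())
    layer_lines = lines[2:]
    blobs = set()
    for line in layer_lines:
        fields = line.split()
        n_in, n_out = int(fields[2]), int(fields[3])
        # output blob names follow the input blob names
        blobs.update(fields[4 + n_in:4 + n_in + n_out])
    return declared_layers, len(layer_lines), declared_blobs, len(blobs)


if __name__ == "__main__":
    # tiny made-up file; only "images" and "207" come from the article
    sample = (
        "7767517\n"
        "3 3\n"
        "Input images 0 1 images\n"
        "YoloV5Focus focus 1 1 images 207\n"
        "Convolution Conv_41 1 1 207 208 0=32 1=3\n"
    )
    print(check_param_counts(sample))  # (3, 3, 3, 3)
```

If the declared and actual numbers differ after an edit, update the second line of the file before loading it.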

	




	


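Changing the network's output shape:

If you test the edited network with ncnn/examples/yolov5, the image is covered with a mess of spurious boxes. Fixing this requires editing the output part of the network. Keeping the output names unchanged, set 0=-1 in the Reshape layers so that the final output shape is no longer fixed. The exact places to edit, before and after, are shown in the figures below.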
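The same edit can be scripted. The sketch below rewrites parameter 0 to -1 on every line whose layer type is Reshape. Applying it to all Reshape lines is an assumption on my part; compare the result with the figures before using it. The function name and the sample layer names are hypothetical.

```python
import re


def relax_reshape_lines(param_text):
    """Set parameter 0 to -1 on every Reshape layer line of an ncnn .param text."""
    out = []
    for line in param_text.splitlines():
        if line.split()[:1] == ["Reshape"]:
            # match "0=<int>" only as a whole field, so "10=..." is left alone
            line = re.sub(r"(?<!\S)0=-?\d+", "0=-1", line)
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = "Reshape Reshape_1 1 1 in out 0=6400 1=16 2=3\n"
    print(relax_reshape_lines(sample), end="")
```

Only the Reshape lines change; every other layer line, including its own 0= parameter, is passed through untouched.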

	




	




	



3. ncnn test code in C++

Below is the complete code in C++. We suggest scrolling to the end first and reading the overall approach.

	


#include
#include
#include "iostream"
//#include
//#include < ctime >
//#include
//#include

// ncnn
#include "ncnn/layer.h"
#include "ncnn/net.h"
#include "ncnn/benchmark.h"
//#include "gpu.h"

#include "opencv2/core/core.hpp"
#include "opencv2/highgui/highgui.hpp"
#include
#include "opencv2/opencv.hpp"

using namespace std;
using namespace cv;

static ncnn::UnlockedPoolAllocator g_blob_pool_allocator;
static ncnn::PoolAllocator g_workspace_pool_allocator;

static ncnn::Net yolov5;

class YoloV5Focus : public ncnn::Layer
{
public:
    YoloV5Focus()
    {
        one_blob_only = true;
    }

    virtual int forward(const ncnn::Mat


                                