11. YOLOv3 Training and Detection: Odds and Ends

Date: 2022-06-16

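I covered training configuration and detection in the previous post. This one collects a few problems you may run into and how to solve them. Most of this has already been worked out by other people; I am only summarizing it here.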

1. Saving and visualizing the training log

1.1. Saving the training log

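The training log is the large stream of output printed during training. Plotting its key values is an effective way to summarize an experiment, so that is what we will do. Saving the log needs no change to the YOLO code: it is enough to extend the training command. The last argument is the log file, which receives everything printed to the terminal.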

./darknet detector train cfg/voc.data cfg/yolov3-voc.cfg darknet53.conv.74 2>&1 | tee train_yolov3.log

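The log is saved to train_yolov3.log. The name and location are up to you. I suggest creating a log folder for it; the parsing and plotting scripts written below can live in the same folder.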
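If you would rather launch training from a script than from the shell, the same "merge stderr into stdout, show it, and save it" behavior can be sketched in Python. This is only an illustration under assumptions: `run_and_log` is a hypothetical helper name, not part of darknet, and the darknet command in the comment is simply the one used in this post.

```python
import subprocess
import sys


def run_and_log(cmd, log_path):
    """Run cmd, echo its combined stdout/stderr to the terminal, and save it to log_path.

    Hypothetical helper: a rough Python equivalent of `cmd 2>&1 | tee log_path`.
    """
    with open(log_path, "w") as log, subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout, like 2>&1
        text=True,
    ) as proc:
        for line in proc.stdout:
            sys.stdout.write(line)  # still visible in the terminal
            log.write(line)         # and saved to the log file
    return proc.returncode


# Example call (assumes darknet and the files from this post exist):
# run_and_log(["./darknet", "detector", "train", "cfg/voc.data",
#              "cfg/yolov3-voc.cfg", "darknet53.conv.74"],
#             "train_yolov3.log")
```

The key point is `stderr=subprocess.STDOUT`: without it, anything the program writes to stderr would reach the terminal but not the log file.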

1.2. Parsing the training log

First, here is what the log for one complete batch looks like:

Loaded: 0.000026 seconds
Region 82 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.461636, .5R: -nan, .75R: -nan,  count: 0
Region 94 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.495116, .5R: -nan, .75R: -nan,  count: 0
Region 106 Avg IOU: 0.263516, Class: 0.436704, Obj: 0.495635, No Obj: 0.416553, .5R: 0.000000, .75R: 0.000000,  count: 2
Region 82 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.462496, .5R: -nan, .75R: -nan,  count: 0
Region 94 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.494657, .5R: -nan, .75R: -nan,  count: 0
Region 106 Avg IOU: 0.268727, Class: 0.621400, Obj: 0.168319, No Obj: 0.416158, .5R: 0.000000, .75R: 0.000000,  count: 2
Region 82 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.462955, .5R: -nan, .75R: -nan,  count: 0
Region 94 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.493331, .5R: -nan, .75R: -nan,  count: 0
Region 106 Avg IOU: 0.207399, Class: 0.466373, Obj: 0.332663, No Obj: 0.417906, .5R: 0.000000, .75R: 0.000000,  count: 2
Region 82 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.462174, .5R: -nan, .75R: -nan,  count: 0
Region 94 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.492877, .5R: -nan, .75R: -nan,  count: 0
Region 106 Avg IOU: 0.194398, Class: 0.463323, Obj: 0.273619, No Obj: 0.424027, .5R: 0.000000, .75R: 0.000000,  count: 2
Region 82 Avg IOU: 0.311682, Class: 0.484914, Obj: 0.280431, No Obj: 0.460396, .5R: 0.000000, .75R: 0.000000,  count: 1
Region 94 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.494058, .5R: -nan, .75R: -nan,  count: 0
Region 106 Avg IOU: 0.243448, Class: 0.593246, Obj: 0.383726, No Obj: 0.421082, .5R: 0.000000, .75R: 0.000000,  count: 1
Region 82 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.462786, .5R: -nan, .75R: -nan,  count: 0
Region 94 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.495880, .5R: -nan, .75R: -nan,  count: 0
Region 106 Avg IOU: 0.176802, Class: 0.792245, Obj: 0.179281, No Obj: 0.416623, .5R: 0.000000, .75R: 0.000000,  count: 2
Region 82 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.461665, .5R: -nan, .75R: -nan,  count: 0
Region 94 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.496222, .5R: -nan, .75R: -nan,  count: 0
Region 106 Avg IOU: 0.243663, Class: 0.242757, Obj: 0.534085, No Obj: 0.417660, .5R: 0.000000, .75R: 0.000000,  count: 2
Region 82 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.460757, .5R: -nan, .75R: -nan,  count: 0
Region 94 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.492573, .5R: -nan, .75R: -nan,  count: 0
Region 106 Avg IOU: 0.089885, Class: 0.332247, Obj: 0.467275, No Obj: 0.418509, .5R: 0.000000, .75R: 0.000000,  count: 2
Region 82 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.458823, .5R: -nan, .75R: -nan,  count: 0
Region 94 Avg IOU: 0.262965, Class: 0.556296, Obj: 0.354470, No Obj: 0.494544, .5R: 0.000000, .75R: 0.000000,  count: 2
Region 106 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.420531, .5R: -nan, .75R: -nan,  count: 0
Region 82 Avg IOU: 0.125970, Class: 0.210711, Obj: 0.619777, No Obj: 0.458084, .5R: 0.000000, .75R: 0.000000,  count: 1
Region 94 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.494639, .5R: -nan, .75R: -nan,  count: 0
Region 106 Avg IOU: 0.183248, Class: 0.069611, Obj: 0.418807, No Obj: 0.415397, .5R: 0.000000, .75R: 0.000000,  count: 1
Region 82 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.460093, .5R: -nan, .75R: -nan,  count: 0
Region 94 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.495465, .5R: -nan, .75R: -nan,  count: 0
Region 106 Avg IOU: 0.415323, Class: 0.665443, Obj: 0.395756, No Obj: 0.418683, .5R: 0.500000, .75R: 0.000000,  count: 2
Region 82 Avg IOU: 0.281197, Class: 0.619936, Obj: 0.135923, No Obj: 0.462208, .5R: 0.000000, .75R: 0.000000,  count: 1
Region 94 Avg IOU: 0.142975, Class: 0.232742, Obj: 0.140729, No Obj: 0.493011, .5R: 0.000000, .75R: 0.000000,  count: 1
Region 106 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.419366, .5R: -nan, .75R: -nan,  count: 0
Region 82 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.460545, .5R: -nan, .75R: -nan,  count: 0
Region 94 Avg IOU: 0.056978, Class: 0.266799, Obj: 0.624412, No Obj: 0.494370, .5R: 0.000000, .75R: 0.000000,  count: 1
Region 106 Avg IOU: 0.503086, Class: 0.510277, Obj: 0.578751, No Obj: 0.417475, .5R: 1.000000, .75R: 0.000000,  count: 1
Region 82 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.462025, .5R: -nan, .75R: -nan,  count: 0
Region 94 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.494696, .5R: -nan, .75R: -nan,  count: 0
Region 106 Avg IOU: 0.081392, Class: 0.318970, Obj: 0.406135, No Obj: 0.415770, .5R: 0.000000, .75R: 0.000000,  count: 2
Region 82 Avg IOU: 0.474609, Class: 0.742155, Obj: 0.686746, No Obj: 0.462689, .5R: 0.000000, .75R: 0.000000,  count: 1
Region 94 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.494386, .5R: -nan, .75R: -nan,  count: 0
Region 106 Avg IOU: 0.045405, Class: 0.582467, Obj: 0.359547, No Obj: 0.417340, .5R: 0.000000, .75R: 0.000000,  count: 1
Region 82 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.459882, .5R: -nan, .75R: -nan,  count: 0
Region 94 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.495080, .5R: -nan, .75R: -nan,  count: 0
Region 106 Avg IOU: 0.190005, Class: 0.338868, Obj: 0.491059, No Obj: 0.419754, .5R: 0.000000, .75R: 0.000000,  count: 2
1: 1066.143311, 1066.143311 avg, 0.000000 rate, 4.240781 seconds, 32 images

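The pattern is clear: we can use keywords to pull different kinds of lines into separate files and then process those. Lines containing the keyword IOU carry the IOU information, and lines containing nan must be dropped. Lines containing images are the per-batch loss lines, so we save those as well.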

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Thu Nov 29 15:50:53 2018
@author: zhxing
"""
# Extract the loss lines and the IOU lines from a YOLOv3 training log.
def extract_log(log_file, new_log_file, key_word):
    with open(log_file, 'r') as f, open(new_log_file, 'w') as train_log:
        for line in f:
            if 'Syncing' in line:      # multi-GPU sync messages; with a single GPU this check is optional
                continue
            if 'nan' in line:          # drop lines containing nan
                continue
            if key_word in line:       # keep lines containing the keyword
                train_log.write(line)
    
extract_log('train_yolov3.log','DJI_yolov3_train_loss.txt','images')
extract_log('train_yolov3.log','DJI_yolov3_train_iou.txt','IOU')

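That gives us two txt files, from which we can extract the values to plot. The plotting code is also largely based on someone else's post (YOLOV3可视化); I added some comments to make it easier to read and modify.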
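To make the structure of a loss line concrete, here is a minimal sketch that parses one such line with a regular expression instead of splitting on commas. The names `LOSS_LINE` and `parse_loss_line` are made up for this illustration, and the pattern assumes the line format shown in the sample log above.

```python
import re

# Matches the per-batch summary line, for example:
# "1: 1066.143311, 1066.143311 avg, 0.000000 rate, 4.240781 seconds, 32 images"
LOSS_LINE = re.compile(
    r"^\s*(\d+):\s*([\d.]+),\s*([\d.]+) avg,\s*([\d.eE+-]+) rate,"
    r"\s*([\d.]+) seconds,\s*(\d+) images"
)


def parse_loss_line(line):
    """Return the fields of a loss line as a dict, or None if the line does not match."""
    m = LOSS_LINE.match(line)
    if m is None:
        return None
    batch, loss, avg, rate, seconds, images = m.groups()
    return {
        "batch": int(batch),       # batch (iteration) number
        "loss": float(loss),       # loss of this batch
        "avg": float(avg),         # running average loss
        "rate": float(rate),       # current learning rate
        "seconds": float(seconds), # time spent on this batch
        "images": int(images),     # images seen so far
    }
```

Unlike the keyword filter, this also recovers the batch number at the start of the line, which the pandas scripts below discard.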

1.3. Visualizing the loss

Let's plot the average loss.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Thu Nov 29 16:17:24 2018

@author: zhxing
"""

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
#%matplotlib inline

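lines = 16000       # number of rows in the loss file covered by the skiprows filter
# on_bad_lines='skip' replaces error_bad_lines=False, which was removed in pandas 2.0
# (on pandas older than 1.3, use error_bad_lines=False instead).
result = pd.read_csv('DJI_yolov3_train_loss.txt', skiprows=[x for x in range(lines) if ((x%10!=9) |(x<1000))] ,on_bad_lines='skip', names=['loss', 'avg', 'rate', 'seconds', 'images'])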
result.head()

#print(result)

result['loss']=result['loss'].str.split(' ').str.get(1)
result['avg']=result['avg'].str.split(' ').str.get(1)
result['rate']=result['rate'].str.split(' ').str.get(1)
result['seconds']=result['seconds'].str.split(' ').str.get(1)
result['images']=result['images'].str.split(' ').str.get(1)
result.head()
result.tail()

#print(result.head())
# print(result.tail())
# print(result.dtypes)
'''
print(result['loss'])
print(result['avg'])
print(result['rate'])
print(result['seconds'])
print(result['images'])
'''
result['loss']=pd.to_numeric(result['loss'])
result['avg']=pd.to_numeric(result['avg'])
result['rate']=pd.to_numeric(result['rate'])
result['seconds']=pd.to_numeric(result['seconds'])
result['images']=pd.to_numeric(result['images'])
result.dtypes


fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.plot(result['avg'].values,label='avg_loss')
#ax.plot(result['loss'].values,label='loss')
ax.legend(loc='best')
ax.set_title('The loss curves')
ax.set_xlabel('batches*10')
fig.savefig('avg_loss',dpi=600)
#fig.savefig('loss')

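If the data is fine, the plot appears quickly. The data-reading function, pd.read_csv, is fairly fiddly; the official documentation covers it if you are interested, though I did not read it through.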

avg_loss.png
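The least obvious part of the script above is the skiprows expression, so here is a small sketch of which rows it keeps. The function names and the `warmup` and `step` parameters are hypothetical, introduced only to name the constants 1000 and 10 from the script.

```python
def rows_to_skip(lines, warmup=1000, step=10):
    """Row indices dropped by the skiprows expression in the loss script."""
    return [x for x in range(lines) if (x % step != step - 1) or (x < warmup)]


def rows_kept(total_rows, lines, warmup=1000, step=10):
    """Row indices that pandas would still read from a file with total_rows rows."""
    skipped = set(rows_to_skip(lines, warmup, step))
    return [x for x in range(total_rows) if x not in skipped]


# Within the first `lines` rows, only every 10th row after the first 1000 survives.
print(rows_kept(1030, 1030))
# Rows at or beyond `lines` are never in the skip list, so they are all read.
print(rows_kept(25, 20, warmup=0))
```

In other words, the first 1000 batches are dropped (their loss is very large and would flatten the rest of the curve), and after that one row in ten is plotted, which is why the x axis is labeled batches*10. If `lines` is smaller than the real number of rows in the file, the remaining rows are read without any thinning, so it is worth setting `lines` to at least the file length.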

1.4. Visualizing the IOU

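This works much like the loss plot, but the extracted file carries no batch information, and because the lines containing nan were dropped, you cannot see exactly how the IOU changes with the batch number. You can still see the rough trend of the IOU as the number of batches grows.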

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Thu Nov 29 16:23:11 2018

@author: zhxing
"""

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
#%matplotlib inline
 
lines = 16000    # adjust to the number of rows in DJI_yolov3_train_iou.txt
# Each YOLOv3 Region line has seven comma-separated fields (.5R and .75R are separate),
# so seven column names are needed. With only six names, pandas treats the first field
# as the index and every label shifts by one column.
# on_bad_lines='skip' replaces error_bad_lines=False, which was removed in pandas 2.0.
columns = ['Region Avg IOU', 'Class', 'Obj', 'No Obj', '.5R', '.75R', 'count']
result = pd.read_csv('DJI_yolov3_train_iou.txt', skiprows=[x for x in range(lines) if (x%10==0 or x%10==9) ] ,on_bad_lines='skip', names=columns)
result.head()

# Keep only the value after "name: " in each field, then convert it to a number.
for col in columns:
    result[col] = pd.to_numeric(result[col].str.split(': ').str.get(1))
result.head()
result.tail()

# print(result.head())
# print(result.tail())
# print(result.dtypes)
print(result['Region Avg IOU'])
result.dtypes

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.plot(result['Region Avg IOU'].values,label='Region Avg IOU')
# ax.plot(result['Class'].values,label='Class')
# ax.plot(result['Obj'].values,label='Obj')
# ax.plot(result['No Obj'].values,label='No Obj')
# ax.plot(result['.5R'].values,label='.5R')
# ax.plot(result['.75R'].values,label='.75R')
# ax.plot(result['count'].values,label='count')
ax.legend(loc='best')
# ax.set_title('The Region Avg IOU curves')
ax.set_title('The Region Avg IOU curves')
ax.set_xlabel('batches')
# fig.savefig('Avg IOU')
fig.savefig('Region Avg IOU')

Figure:

Region Avg IOU.png
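As noted at the start of this section, the extracted IOU file has lost the batch numbers. One possible way around that, sketched here under assumptions rather than taken from the original scripts, is to read the full training log and attach each Region line to the batch summary line that follows it. `mean_iou_per_batch` is a hypothetical helper, and it takes a plain unweighted mean of the non-nan Avg IOU values in each batch (it does not weight by count).

```python
import re

# A Region line with a numeric IOU; "-nan" does not match and is skipped.
REGION = re.compile(r"Region \d+ Avg IOU: ([\d.]+),")
# The per-batch summary line, e.g. "1: 1066.143311, 1066.143311 avg, ..."
BATCH = re.compile(r"^\s*(\d+): [\d.]+, [\d.]+ avg,")


def mean_iou_per_batch(lines):
    """Return {batch_number: mean Avg IOU of that batch's non-nan Region lines}."""
    result = {}
    pending = []  # IOU values seen since the previous batch summary line
    for line in lines:
        region = REGION.search(line)
        if region:
            pending.append(float(region.group(1)))
            continue
        batch = BATCH.match(line)
        if batch:
            if pending:  # batches with only nan Region lines are left out
                result[int(batch.group(1))] = sum(pending) / len(pending)
            pending = []
    return result


# Example call (assumes the log file from section 1.1 exists):
# with open("train_yolov3.log") as f:
#     iou_by_batch = mean_iou_per_batch(f)
```

Plotting the values of the returned dict against its keys gives an IOU curve with a true batch axis, at the cost of averaging away the per-layer detail.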

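This gives a rough picture. I started training at 2:30 yesterday afternoon and expected it to take five or six hours, but by bedtime the loss was still hovering around 0.04, so I left the machine running overnight. I have had a bad cold these past few days and have been staying in my dorm; by the time I got up it was already past 9:30. The training log showed that the loss had settled at about 0.02 and stopped decreasing, so I stopped training.

Detection works reasonably well, with noticeably higher accuracy than the previous run trained on 150 images. Accuracy against sky and tree backgrounds is already high, but a white drone in front of a white building is genuinely hard to detect. I will see later whether there is a better approach.

The IOU plot does show a trend: the IOU eventually settles at roughly 0.8 to 1. Judging from the video, though, the accuracy of the detection boxes is only so-so, and I do not know whether there is a good way to improve it further. That is for another time; right now I just hope this cold clears up soon.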


To be continued!