(1) Building a plain neural network with TensorFlow for MNIST handwritten digit recognition and prediction

Date: 2022-06-22

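1 Building the neural network

1.0 Network structure

Figure 1.0 Neural network

1.2 Structure analysis

【Input layer】

The input has shape (1, 784). The 1 is the number of samples: the figure traces a single image through the network (the training code feeds batches of BATCH_SIZE images, in which case the first dimension is the batch size). The 784 is the image dimension: each $28 \times 28 \times 1$ image is flattened into a single vector for convenient storage. To display an image, reshape it back to its original size; see the blog post "MNIST handwritten digit dataset analysis".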
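The flattening described above can be sketched in plain Python. This is a minimal illustration rather than the article's code: `unflatten` and `flatten` are hypothetical helper names, and the list `flat` is only a stand-in for one 784-value MNIST image. With a NumPy array the same step is usually a single `reshape(28, 28)` call.

```python
def unflatten(pixels, height=28, width=28):
    """Split a flat list of height*width pixel values into rows."""
    if len(pixels) != height * width:
        raise ValueError(
            "expected %d values, got %d" % (height * width, len(pixels)))
    return [pixels[r * width:(r + 1) * width] for r in range(height)]


def flatten(rows):
    """Inverse of unflatten: concatenate the rows into one flat list."""
    return [value for row in rows for value in row]


# Stand-in for one MNIST image: 784 values, here simply 0..783.
flat = list(range(784))
image = unflatten(flat)

print(len(image), len(image[0]))   # 28 28
print(image[1][0])                 # 28, the first pixel of the second row
print(flatten(image) == flat)      # True
```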

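【First hidden layer】

The weight matrix of the first hidden layer has shape (784, 500): each of the 500 neurons has 784 weights (one per input pixel) and one bias, so the layer has 500 biases.

【Second hidden layer】

The weight matrix of the second layer (layer2 in the code, which produces the output values) has shape (500, 10): each of the 10 neurons has 500 weights and one bias, so the layer has 10 biases.

【Output layer】

The output has shape (1, 10): a vector of 10 values, one per class, because the handwritten digits range from 0 to 9.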
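As a quick check of the shapes listed above, the following sketch follows a (batch, 784) input through the two fully connected layers and counts the trainable parameters. The layer sizes come from the article; the helper names `dense_shapes`, `param_count` and `output_shape` are made up for this illustration.

```python
# Layer sizes taken from the article's configuration.
INPUT_NODE = 784
LAYER1_NODE = 500
OUTPUT_NODE = 10


def dense_shapes(n_in, n_out):
    """Return (weight shape, bias shape) of one fully connected layer."""
    return (n_in, n_out), (n_out,)


def param_count(n_in, n_out):
    """Number of weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


def output_shape(batch, layer_sizes):
    """Follow a (batch, features) input through a stack of dense layers."""
    shape = (batch, layer_sizes[0])
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        assert shape[1] == n_in  # inner dimensions must match for matmul
        shape = (shape[0], n_out)
    return shape


sizes = [INPUT_NODE, LAYER1_NODE, OUTPUT_NODE]
layer1_params = param_count(INPUT_NODE, LAYER1_NODE)    # 784*500 + 500
layer2_params = param_count(LAYER1_NODE, OUTPUT_NODE)   # 500*10 + 10
total_params = layer1_params + layer2_params

print(dense_shapes(INPUT_NODE, LAYER1_NODE))   # ((784, 500), (500,))
print(output_shape(1, sizes))                  # (1, 10)
print(total_params)                            # 397510
```

The same function gives (100, 10) for a batch of 100 images, which is the shape used during training.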

2 Network structure: source code

【Demo】

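import os
from os import path
import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder, tf.contrib, tf.Session)
from tensorflow.examples.tutorials.mnist import input_data
# Used by the prediction code in section 4.
from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt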

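# Neural network parameters.
# INPUT_NODE: input dimension (28 x 28 pixels)
# OUTPUT_NODE: output dimension (digits 0 to 9)
# LAYER1_NODE: number of nodes in the hidden layer
INPUT_NODE = 784
OUTPUT_NODE = 10
LAYER1_NODE = 500
# Training settings.
# BATCH_SIZE: number of samples per batch
# LEARNING_RATE_BASE: initial learning rate
# LEARNING_RATE_DECAY: learning-rate decay rate
# REGULARAZTION_RATE: regularization coefficient
# TRAING_STEPS: number of training steps
# MOVING_AVERAGE_DECAY: moving-average decay rate
BATCH_SIZE = 100
LEARNING_RATE_BASE = 0.8
LEARNING_RATE_DECAY = 0.99
REGULARAZTION_RATE = 0.0001
TRAING_STEPS = 30000
MOVING_AVERAGE_DECAY = 0.99
# Directory and file name for the saved model.
MODEL_SAVE_PATH = "./new_models"
MODEL_NAME = "model.ckpt"

# Make sure the model directory exists; create it if it does not.
if not os.path.exists(MODEL_SAVE_PATH):
	os.makedirs(MODEL_SAVE_PATH)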

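def get_weight_variable(shape, regularizer):
	'''Create a weight variable.

	shape: shape of the weight matrix
	regularizer: regularization function, or None to skip regularization
	Returns:
	the weight variable
	'''
	weights = tf.get_variable("weights", shape,
		initializer=tf.truncated_normal_initializer(stddev=0.1))
	if regularizer is not None:
		# Collect the regularization term so it can be added to the loss later.
		tf.add_to_collection('losses', regularizer(weights))
	return weights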

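def inference(input_tensor, regularizer):
	'''Forward pass of the neural network.

	input_tensor: input data tensor
	regularizer: regularization function, or None
	Returns:
	the network output (logits, before softmax)
	'''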
	with tf.variable_scope('layer1'):
		weights = get_weight_variable([INPUT_NODE, LAYER1_NODE], regularizer)
		biases = tf.get_variable("biases", [LAYER1_NODE],
			initializer=tf.constant_initializer(0.0))
		layer1 = tf.nn.relu(tf.matmul(input_tensor, weights) + biases)

	with tf.variable_scope('layer2'):
		weights = get_weight_variable(
			[LAYER1_NODE, OUTPUT_NODE], regularizer)
		biases = tf.get_variable(
			"biases", [OUTPUT_NODE],
			initializer=tf.constant_initializer(0.0))
		layer2 = tf.matmul(layer1, weights) + biases
	return layer2
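To make the forward pass above concrete, here is a framework-free sketch of the same computation: one hidden layer with ReLU followed by a linear output layer. The sizes (3 inputs, 2 hidden units, 2 outputs) and all numbers are hypothetical toy values chosen so the result can be checked by hand; `dense` and `inference_toy` are illustrative names, not TensorFlow functions.

```python
def relu(values):
    """Element-wise max(0, v)."""
    return [max(0.0, v) for v in values]


def dense(x, weights, biases):
    """Compute x @ weights + biases for a single sample.

    x: list of n_in values
    weights: n_in rows, each with n_out values
    biases: list of n_out values
    """
    return [
        sum(x[i] * weights[i][j] for i in range(len(x))) + biases[j]
        for j in range(len(biases))
    ]


def inference_toy(x, w1, b1, w2, b2):
    """Hidden layer with ReLU, then a linear output layer (logits)."""
    hidden = relu(dense(x, w1, b1))
    return dense(hidden, w2, b2)


# Hypothetical toy network: 3 inputs, 2 hidden units, 2 outputs.
x = [1.0, 2.0, 3.0]
w1 = [[1.0, -1.0],
      [0.0, 1.0],
      [1.0, -1.0]]
b1 = [0.0, 0.0]
w2 = [[1.0, 2.0],
      [3.0, 4.0]]
b2 = [0.5, -0.5]

logits = inference_toy(x, w1, b1, w2, b2)
print(logits)  # [4.5, 7.5]
```

The hidden pre-activations are [4.0, -2.0]; ReLU zeroes the negative one, and the output layer then yields the two logits.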

3 Training and testing

3.1 Loading the data

【Demo】

def main(argv=None):
	'''Load the MNIST data and start training.'''
	mnist = input_data.read_data_sets("./mnist_data", one_hot=True)
	model_file = path.join(MODEL_SAVE_PATH, MODEL_NAME)
	train(mnist, model_file, True)
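`read_data_sets` is called with `one_hot=True`, so each label arrives as a 10-element vector with a single 1 at the position of the digit. A minimal sketch of that encoding, and of how an argmax recovers the digit, is shown here; the helpers `one_hot` and `argmax` are illustrative and are not part of the MNIST loader.

```python
def one_hot(label, num_classes=10):
    """Encode an integer label as a vector with a single 1.0."""
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec


def argmax(values):
    """Index of the largest value (the first one if there are ties)."""
    return max(range(len(values)), key=lambda i: values[i])


encoded = one_hot(7)
print(encoded)          # [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
print(argmax(encoded))  # 7
```

This is why the training code applies `tf.argmax(y_, 1)` before passing the labels to the sparse cross-entropy function, which expects class indices rather than one-hot vectors.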

3.2 Training and saving the model

【Demo】

def train(mnist, model_file, restore_model):
	'''Build the training graph, train the network and save checkpoints.

	mnist: dataset object returned by input_data.read_data_sets
	model_file: checkpoint path prefix, e.g. ./new_models/model.ckpt
	restore_model: if True, resume from an existing checkpoint when one is found
	'''
	# Placeholders for the input images and the one-hot labels.
	x = tf.placeholder(
		tf.float32, [None, INPUT_NODE], name='x_input')
	y_ = tf.placeholder(
		tf.float32, [None, OUTPUT_NODE], name='y_input')
	regularizer = tf.contrib.layers.l2_regularizer(REGULARAZTION_RATE)

	# Network output (logits); stored in a collection so it can be found after reloading.
	y = inference(x, regularizer)
	tf.add_to_collection("y_pre", y)
	global_step = tf.Variable(0, trainable=False)
	# Exponential moving average of the trainable variables.
	variable_averages = tf.train.ExponentialMovingAverage(
		MOVING_AVERAGE_DECAY, global_step)
	variables_averages_op = variable_averages.apply(
		tf.trainable_variables())
	# Loss: mean cross entropy plus the L2 regularization terms.
	cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
		logits=y, labels=tf.argmax(y_, 1))
	cross_entropy_mean = tf.reduce_mean(cross_entropy)
	loss = cross_entropy_mean + tf.add_n(tf.get_collection('losses'))
	# Learning rate with exponential decay; decay_steps is the number of batches per epoch.
	learning_rate = tf.train.exponential_decay(
		LEARNING_RATE_BASE,
		global_step,
		mnist.train.num_examples / BATCH_SIZE,
		LEARNING_RATE_DECAY)
	# The chained call is wrapped in parentheses so it can span two lines.
	train_step = (tf.train.GradientDescentOptimizer(learning_rate)
		.minimize(loss, global_step=global_step))
	with tf.control_dependencies([train_step, variables_averages_op]):
		train_op = tf.no_op(name='train')

	# Saver for writing and restoring checkpoints.
	saver = tf.train.Saver()
	with tf.Session() as sess:
		tf.global_variables_initializer().run()
		if path.isfile(model_file+".meta") and restore_model:
			# Resume from the previously trained model if one exists.
			print("Reloading model file before training.")
			saver.restore(sess, model_file)
		else:
			print("No model restored; start training.")

		# Training loop. This demo runs 1001 steps; use TRAING_STEPS for a full run.
		for i in range(1001):
			xs, ys = mnist.train.next_batch(BATCH_SIZE)
			_, loss_value, step = sess.run([train_op, loss, global_step],
				feed_dict={x: xs, y_: ys})

			# Every 1000 steps, report the loss on the current batch and save the model.
			if i % 1000 == 0:
				print("After %d training step(s), loss on training "
					"batch is %g." % (step, loss_value))
				saver.save(sess, os.path.join(MODEL_SAVE_PATH, MODEL_NAME))
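The learning-rate schedule used in `train` can be reproduced without TensorFlow. The sketch below follows the documented formula of `tf.train.exponential_decay`, `learning_rate * decay_rate ** (global_step / decay_steps)`. It assumes the default MNIST split with 55,000 training images (so `decay_steps` is 550); that number and the function name `decayed_learning_rate` are assumptions for illustration.

```python
LEARNING_RATE_BASE = 0.8
LEARNING_RATE_DECAY = 0.99
BATCH_SIZE = 100
NUM_TRAIN_EXAMPLES = 55000  # assumption: default training split size


def decayed_learning_rate(global_step, staircase=False):
    """Exponentially decayed learning rate for a given step.

    With staircase=False (what the article's call uses by default) the rate
    decays continuously; with staircase=True it drops once per decay_steps.
    """
    decay_steps = NUM_TRAIN_EXAMPLES / BATCH_SIZE  # batches per epoch
    if staircase:
        exponent = global_step // decay_steps
    else:
        exponent = global_step / decay_steps
    return LEARNING_RATE_BASE * LEARNING_RATE_DECAY ** exponent


# Learning rate at the start, after one epoch, and at the end of training.
for step in (0, 550, 30000):
    print(step, decayed_learning_rate(step))
```

After one epoch (550 steps) the rate has dropped by a factor of 0.99, from 0.8 to about 0.792.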

4 Loading the model and predicting

4.1 Loading the model parameters and the model structure

【Demo】

def load_model(images_num):
	'''Load the saved graph and parameters, then predict test images.

	images_num: number of test images to process
	Prints the true and predicted digits (e.g. 7) and plots the confusion matrix.
	'''
	mnist = input_data.read_data_sets("./mnist_data", one_hot=True)
	# Image data and the corresponding one-hot labels.
	images_data = [mnist.test.images[i] for i in range(images_num)]
	images_label = [mnist.test.labels[i] for i in range(images_num)]
	# Load the graph structure from the .meta file.
	saver = tf.train.import_meta_graph("./new_models/model.ckpt.meta")
	# Find the latest checkpoint.
	model_params = tf.train.latest_checkpoint("./new_models")
	# Predicted labels and true labels (the true labels come from the test set,
	# although the variable is named train_labels).
	predict_labels = []
	train_labels = []
	with tf.Session() as sess:
		# Restore the parameters once, then look up the output and input tensors.
		saver.restore(sess, model_params)
		g = tf.get_default_graph()
		y_pre = g.get_collection("y_pre")[0]
		x = g.get_tensor_by_name("x_input:0")
		for i in range(images_num):
			# tf.argmax(..., 1) needs shape (1, n); each stored label has shape (n,),
			# so add a leading axis first.
			train_label = tf.expand_dims(images_label[i], [0])
			train_label = tf.argmax(train_label, 1)
			train_label = sess.run(train_label)
			# Extract the digit from [n].
			train_labels.append(train_label[0])
			images = tf.expand_dims(images_data[i], [0])
			images = sess.run(images)
			print("images shape: {}".format(images.shape))
			# The prediction has shape (1, 10).
			pre = sess.run(y_pre, feed_dict={x: images})
			print("prediction: {}".format(pre))
			pre_value = pre[0]
			print("Extract value from predicted result: {}".format(pre_value))
			sum_pre = sum(pre_value.tolist())
			print("sum predicted value: {}".format(sum_pre))
			# The index of the largest value is the predicted digit, e.g. for
			# [[-2.0685697 -3.6447546  4.450996   7.5821013 -4.280662  -1.9156142
			#   -7.415148  10.431478  -4.3287787  1.8845916]]
			# the largest value is 10.43 at index 7, so the predicted digit is 7.
			pre_num = tf.argmax(pre, 1)
			pre_num = sess.run(pre_num)
			predict_labels.append(pre_num[0])
			print("predicted value's shape: {}".format(pre.shape))
		print("train data labels: {}".format(train_labels))
		print("predicted number: {}".format(predict_labels))
		# Confusion matrix.
		conf_mx = confusion_matrix(train_labels, predict_labels)
		print("confusion matrixs: \n {}".format(conf_mx))
		if not os.path.exists("./images"):
			os.makedirs("./images")
		plt.matshow(conf_mx, cmap=plt.cm.gray)
		plt.savefig("./images/confusion_matrix.png", format="png")
		plt.show()
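The network returns raw logits (no softmax is applied in `inference`), so the sum that `load_model` prints is a sum of logits, not of probabilities. The sketch below takes the example output quoted in the code comment, picks the predicted digit with an argmax, and shows how a softmax would turn the logits into probabilities. The `softmax` and `argmax` helpers are written here for illustration only.

```python
import math


def softmax(logits):
    """Convert logits to probabilities; subtracting the max avoids overflow."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


# Example network output quoted in the article's load_model code.
logits = [-2.0685697, -3.6447546, 4.450996, 7.5821013, -4.280662,
          -1.9156142, -7.415148, 10.431478, -4.3287787, 1.8845916]

probs = softmax(logits)
print(argmax(logits))        # 7
print(argmax(probs))         # 7, softmax does not change the ranking
print(round(sum(probs), 6))  # 1.0
```

Because softmax preserves the ordering of the values, taking the argmax of the logits directly, as the article does, gives the same digit.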

【Result】

train data labels: [7, 2, 1, 0, 4, 1, 4, 9, 5, 9]
predicted number: [7, 2, 1, 0, 4, 1, 4, 9, 5, 9]
confusion matrixs: 
 [[1 0 0 0 0 0 0]
 [0 2 0 0 0 0 0]
 [0 0 1 0 0 0 0]
 [0 0 0 2 0 0 0]
 [0 0 0 0 1 0 0]
 [0 0 0 0 0 1 0]
 [0 0 0 0 0 0 2]]

Figure 4.1 Confusion matrix plot

【Analysis】

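(1) The 10 samples cover seven distinct labels, so the confusion matrix has shape (7, 7). The labels are 0, 1, 2, 4, 5, 7, 9, with counts 1, 2, 1, 2, 1, 1, 2 respectively.

(2) All nonzero entries of the confusion matrix lie on the main diagonal, which means every image was classified correctly.

(3) The squares on the diagonal differ in color: the lighter the square, the more samples fall under that label. Classes 1, 3 and 6 (indexed from 0, i.e. labels 1, 4 and 9) are white because they hold the most samples, 2 each; the gray squares hold 1 sample each.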
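The matrix in the result above can be reproduced with a few lines of plain Python. This is a simplified stand-in for `sklearn.metrics.confusion_matrix`, assuming (as scikit-learn does when no `labels` argument is given) that the rows and columns are the sorted labels that actually occur; `confusion_matrix_py` is a hypothetical name.

```python
def confusion_matrix_py(y_true, y_pred):
    """Count (true label, predicted label) pairs.

    Rows are true labels and columns are predicted labels, both ordered by
    the sorted set of labels that appear in either list.
    """
    labels = sorted(set(y_true) | set(y_pred))
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return labels, matrix


# Labels and predictions from the result shown in the article.
true_labels = [7, 2, 1, 0, 4, 1, 4, 9, 5, 9]
pred_labels = [7, 2, 1, 0, 4, 1, 4, 9, 5, 9]

labels, matrix = confusion_matrix_py(true_labels, pred_labels)
print(labels)  # [0, 1, 2, 4, 5, 7, 9]
for row in matrix:
    print(row)
```

Only seven distinct digits occur among the ten samples, which is why the matrix is 7 x 7 rather than 10 x 10.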

4.2 Loading the model parameters

【Demo】

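# Rebuild the network structure in code (instead of importing the .meta file),
# so that only the parameters need to be restored from the checkpoint.
g_params = tf.Graph()
with g_params.as_default():
	x = tf.placeholder(tf.float32, [None, INPUT_NODE], name="x_input")
	y_ = tf.placeholder(tf.float32, [None, OUTPUT_NODE], name='y_input')
	regularizer = tf.contrib.layers.l2_regularizer(REGULARAZTION_RATE)
	y = inference(x, regularizer)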

def load_model_only_with_params(images_num):
	mnist = input_data.read_data_sets("./mnist_data", one_hot=True)
	images_data = [mnist.test.images[i] for i in range(images_num)]
	images_label = [mnist.test.labels[i] for i in range(images_num)]
	predict_labels = []
	train_labels = []
	with tf.Session(graph=g_params) as sess:
		saver = tf.train.Saver()
		ckpt = tf.train.get_checkpoint_state("./new_models")
		model_path = ckpt.model_checkpoint_path
		saver.restore(sess, model_path)
		for i in range(images_num):
			train_label = tf.expand_dims(images_label[i], [0])
			train_label = tf.argmax(train_label, 1)
			train_label = sess.run(train_label)
			# Extract the value from a list such as [7].
			train_labels.append(train_label[0])
			images = tf.expand_dims(images_data[i], [0])
			images = sess.run(images)
			# print("images shape: {}".format(images.shape))
			pre = sess.run(y, feed_dict={x: images})
			pre_value = pre[0]
			print("Extract value from predicted result: {}".format(pre_value))
			sum_pre = sum(pre_value.tolist())
			print("sum predicted value: {}".format(sum_pre))
			# Get the digit corresponding to the largest output value.
			pre_num = tf.argmax(pre, 1)
			pre_num = sess.run(pre_num)
			predict_labels.append(pre_num[0])
		print("train data labels: {}".format(train_labels))
		print("predicted number: {}".format(predict_labels))
		conf_mx = confusion_matrix(train_labels, predict_labels)
		print("confusion matrixs: \n {}".format(conf_mx))
		if not os.path.exists("./images"):
			os.makedirs("./images")
		plt.matshow(conf_mx, cmap=plt.cm.gray)
		# font is the FontProperties object defined in the complete program (section 5.1).
		plt.title("训练数据和预测数据的混淆矩阵", fontproperties=font)
		plt.savefig("./images/confusion_matrix.png", format="png")
		plt.show()

5 Complete program and visualizing the network structure

5.1 Source code

GitHub link

【Demo】

import os
from os import path
import tensorflow as tf 
from tensorflow.examples.tutorials.mnist import input_data
from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt
from matplotlib.font_manager import FontProperties
font = FontProperties(fname="/usr/share/fonts/truetype/arphic/ukai.ttc")

INPUT_NODE = 784
OUTPUT_NODE = 10
LAYER1_NODE = 500
BATCH_SIZE = 100

LEARNING_RATE_BASE = 0.8
LEARNING_RATE_DECAY = 0.99
REGULARAZTION_RATE = 0.0001
TRAING_STEPS = 30000
MOVING_AVERAGE_DECAY = 0.99
MODEL_SAVE_PATH = "./new_models"
MODEL_NAME = "model.ckpt"
LOG_DIR = "./logs"

if not os.path.exists(MODEL_SAVE_PATH):
	os.makedirs(MODEL_SAVE_PATH)

if not os.path.exists(LOG_DIR):
	os.makedirs(LOG_DIR)

def get_weight_variable(shape, regularizer):
	weights = tf.get_variable("weights", shape,
		initializer=tf.truncated_normal_initializer(stddev=0.1))
	if regularizer is not None:
		tf.add_to_collection('losses', regularizer(weights))
	return weights

def inference(input_tensor, regularizer):
	with tf.variable_scope('layer1'):
		weights = get_weight_variable([INPUT_NODE, LAYER1_NODE], regularizer)
		biases = tf.get_variable("biases", [LAYER1_NODE],
			initializer=tf.constant_initializer(0.0))
		layer1 = tf.nn.relu(tf.matmul(input_tensor, weights) + biases)

	with tf.variable_scope('layer2'):
		weights = get_weight_variable(
			[LAYER1_NODE, OUTPUT_NODE], regularizer)
		biases = tf.get_variable(
			"biases", [OUTPUT_NODE],
			initializer=tf.constant_initializer(0.0))
		layer2 = tf.matmul(layer1, weights) + biases

	return layer2

def train(mnist, model_file, restore_model):
	x = tf.placeholder(
		tf.float32, [None, INPUT_NODE], name='x_input')
	y_ = tf.placeholder(
		tf.float32, [None, OUTPUT_NODE], name='y_input')
	regularizer = tf.contrib.layers.l2_regularizer(REGULARAZTION_RATE)
	y = inference(x, regularizer)
	tf.add_to_collection("y_pre", y)
	global_step = tf.Variable(0, trainable=False)
	variable_averages = tf.train.ExponentialMovingAverage(
		MOVING_AVERAGE_DECAY, global_step)
	variables_averages_op = variable_averages.apply(
		tf.trainable_variables())
	cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
		logits=y, labels=tf.argmax(y_, 1))
	cross_entropy_mean = tf.reduce_mean(cross_entropy)
	loss = cross_entropy_mean + tf.add_n(tf.get_collection('losses'))
	tf.summary.scalar("losses", loss)
	merged = tf.summary.merge_all()
	learning_rate = tf.train.exponential_decay(
		LEARNING_RATE_BASE,
		global_step,
		mnist.train.num_examples / BATCH_SIZE,
		LEARNING_RATE_DECAY)
	# The chained call is wrapped in parentheses so it can span two lines.
	train_step = (tf.train.GradientDescentOptimizer(learning_rate)
		.minimize(loss, global_step=global_step))
	with tf.control_dependencies([train_step, variables_averages_op]):
		train_op = tf.no_op(name='train')


	saver = tf.train.Saver()
	with tf.Session() as sess:
		tf.global_variables_initializer().run()
		summary_writer = tf.summary.FileWriter(LOG_DIR, sess.graph)
		if path.isfile(model_file+".meta") and restore_model:
			print("Reloading model file before training.")
			saver.restore(sess, model_file)
		else:
			print("No model restored; start training.")
		for i in range(TRAING_STEPS):
			xs, ys = mnist.train.next_batch(BATCH_SIZE)
			summary, _, loss_value, step = sess.run([merged, train_op, loss, global_step],
				feed_dict={x: xs, y_: ys})

			if i % 100 == 0:
				# Report the loss on the current batch.
				print("After %d training step(s), loss on training "
					"batch is %g." % (step, loss_value))
			saver.save(sess, os.path.join(MODEL_SAVE_PATH, MODEL_NAME))
			summary_writer.add_summary(summary, i)
	summary_writer.close()

def main(argv=None):
	mnist = input_data.read_data_sets("./mnist_data", one_hot=True)
	model_file = path.join(MODEL_SAVE_PATH, MODEL_NAME)
	train(mnist, model_file, True)

def load_model(images_num):
	mnist = input_data.read_data_sets("./mnist_data", one_hot=True)
	images_data = [mnist.test.images[i] for i in range(images_num)]
	images_label = [mnist.test.labels[i] for i in range(images_num)]
	saver = tf.train.import_meta_graph("./new_models/model.ckpt.meta")
	model_params = tf.train.latest_checkpoint("./new_models")
	predict_lables = []
	train_labels = []
	with tf.Session() as sess:
		# Restore the parameters once, then look up the output and input tensors.
		saver.restore(sess, model_params)
		g = tf.get_default_graph()
		y_pre = g.get_collection("y_pre")[0]
		x = g.get_tensor_by_name("x_input:0")
		for i in range(images_num):
			train_label = tf.expand_dims(images_label[i], [0])
			train_label = tf.argmax(train_label, 1)
			train_label = sess.run(train_label)
			train_labels.append(train_label[0])
			images = tf.expand_dims(images_data[i], [0])
			images = sess.run(images)
			print("images shape: {}".format(images.shape))
			pre = sess.run(y_pre, feed_dict={x: images})
			print("prediction: {}".format(pre))
			pre_value = pre[0]
			print("Extract value from predicted result: {}".format(pre_value))
			sum_pre = sum(pre_value.tolist())
			print("sum predicted value: {}".format(sum_pre))
			pre_num = tf.argmax(pre, 1)
			pre_num = sess.run(pre_num)
			predict_lables.append(pre_num[0])
			# print("train data labels: {}".format(train_labels))
			print("predicted value's shape: {}".format(pre.shape))
			# print("predicted number: {}".format(pre_num[0]))
		print("train data labels: {}".format(train_labels))
		print("predicted number: {}".format(predict_lables))
		conf_mx = confusion_matrix(train_labels, predict_lables)
		print("confusion matrixs: \n {}".format(conf_mx))
		if not os.path.exists("./images"):
			os.makedirs("./images")
		plt.matshow(conf_mx, cmap=plt.cm.gray)
		plt.title("训练数据和预测数据的混淆矩阵", fontproperties=font)
		plt.savefig("./images/confusion_matrix.png", format="png")
		plt.show()

if __name__ == '__main__':
	# Uncomment the next line to train the model.
	# tf.app.run()
	# Load the saved model and predict the first 10 test images.
	load_model(10)

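5.2 Visualizing the neural network

Figure 5.1 Neural network visualization

6 Training results

6.1 Loss

Figure 6.1 Loss

6.2 Prediction

Figure 6.2 Prediction results

Blog: MNIST handwritten digit dataset analysis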

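7 Summary

No. | Key point | Description
1 | Neural network design | Design the network for the training task. The most important part is the input and output dimensions: in this example the input has shape (100, 784) and the output has shape (100, 10) for a batch of 100 images. Next comes the design of the layers themselves: the number and size of the hidden layers determine the shapes of the weights.
2 | Initialization | Both the weights and the biases are initialized: the weights with truncated_normal_initializer(stddev=0.1), the biases with constant_initializer(0.0).
3 | Regularization | To prevent overfitting, the weights are regularized. The article treats model complexity as determined by the weights, so only the weights (not the biases) are penalized; the values of the weights directly affect prediction accuracy.
4 | Moving-average model | Improves the robustness of the model, that is, its generalization (prediction performance on the test set), and controls how quickly the model parameters are updated.
5 | Data handling | The data must be converted to the expected shape. The input placeholder has shape (None, 784), while a single source image is a vector of 784 values and a single label a vector of 10 values, so a single sample needs a leading batch axis. next_batch already returns data in batch shape.
6 | Confusion matrix | The confusion matrix is a tool for evaluating model results and can be used to judge the quality of a classifier.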
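The moving-average model in row 4 keeps a "shadow" copy of each variable that trails the trained value. The sketch below follows the documented behavior of `tf.train.ExponentialMovingAverage` when a step counter is passed as `num_updates`: the effective decay is `min(decay, (1 + num_updates) / (10 + num_updates))`, so the average tracks the variable quickly early in training and slows down later. The variable values used here are hypothetical, and the function names are made up for illustration.

```python
MOVING_AVERAGE_DECAY = 0.99


def effective_decay(num_updates, decay=MOVING_AVERAGE_DECAY):
    """Decay actually applied at a given step (small early on, capped at decay)."""
    return min(decay, (1.0 + num_updates) / (10.0 + num_updates))


def ema_update(shadow, value, num_updates):
    """One update: shadow = d * shadow + (1 - d) * value."""
    d = effective_decay(num_updates)
    return d * shadow + (1.0 - d) * value


# Hypothetical example: the shadow starts at 0.0 and the variable stays at 10.0.
shadow = 0.0
for step in range(3):
    shadow = ema_update(shadow, 10.0, step)
    print(step, effective_decay(step), shadow)

# Late in training the cap applies and the decay is exactly 0.99.
print(effective_decay(10000))  # 0.99
```

With a decay of 0.99 the cap takes over after roughly 900 steps; before that the shadow value follows the variable much more closely.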


References

1 https://blog.csdn.net/Xin_101/article/details/89373524

2 https://blog.csdn.net/Xin_101/article/details/85013734

3 https://blog.csdn.net/Xin_101/article/details/82987227

4 https://scikit-learn.org/stable/auto_examples/semi_supervised/plot_label_propagation_digits_active_learning.html#sphx-glr-auto-examples-semi-supervised-plot-label-propagation-digits-active-learning-py

5 https://scikit-learn.org/stable/modules/generated/sklearn.metrics.confusion_matrix.html