[TensorFlow 2.0] Three Ways to Train a Model
Date: 2022-07-23
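There are three main ways to train a model: the built-in fit method, the built-in train_on_batch method, and a custom training loop.
Note: fit_generator is not recommended in tf.keras; its functionality is already covered by fit.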
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras import *
# Print a separator line followed by the current time (HH:MM:SS)
@tf.function
def printbar():
    ts = tf.timestamp()
    today_ts = ts%(24*60*60)

    # +8 shifts the hour from UTC to UTC+8
    hour = tf.cast(today_ts//3600+8,tf.int32)%tf.constant(24)
    minute = tf.cast((today_ts%3600)//60,tf.int32)
    second = tf.cast(tf.floor(today_ts%60),tf.int32)

    def timeformat(m):
        # Left-pad single-digit values with a zero
        if tf.strings.length(tf.strings.format("{}",m))==1:
            return(tf.strings.format("0{}",m))
        else:
            return(tf.strings.format("{}",m))

    timestring = tf.strings.join([timeformat(hour),timeformat(minute),
                timeformat(second)],separator = ":")
    tf.print("=========="*8,end = "")
    tf.print(timestring)

MAX_LEN = 300
BATCH_SIZE = 32
(x_train,y_train),(x_test,y_test) = datasets.reuters.load_data()
x_train = preprocessing.sequence.pad_sequences(x_train,maxlen=MAX_LEN)
x_test = preprocessing.sequence.pad_sequences(x_test,maxlen=MAX_LEN)
MAX_WORDS = x_train.max()+1
CAT_NUM = y_train.max()+1

# Trailing backslashes continue each pipeline onto the next line
ds_train = tf.data.Dataset.from_tensor_slices((x_train,y_train)) \
          .shuffle(buffer_size = 1000).batch(BATCH_SIZE) \
          .prefetch(tf.data.experimental.AUTOTUNE).cache()

ds_test = tf.data.Dataset.from_tensor_slices((x_test,y_test)) \
          .shuffle(buffer_size = 1000).batch(BATCH_SIZE) \
          .prefetch(tf.data.experimental.AUTOTUNE).cache()
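
When only maxlen is given, pad_sequences uses the Keras defaults padding='pre', truncating='pre' and value=0, so short sequences are zero-padded on the left and long ones lose their leading tokens. The NumPy sketch below mimics that behaviour for illustration; pad_pre is a hypothetical helper written for this note, not a Keras function.

```python
import numpy as np

def pad_pre(sequences, maxlen, value=0):
    """Hypothetical stand-in for pad_sequences with its default settings."""
    out = np.full((len(sequences), maxlen), value, dtype="int32")
    for i, seq in enumerate(sequences):
        if len(seq) == 0:
            continue
        trunc = seq[-maxlen:]          # keep only the last maxlen tokens
        out[i, -len(trunc):] = trunc   # right-align, leaving `value` on the left
    return out

demo = pad_pre([[1, 2], [3, 4, 5, 6, 7]], maxlen=4)
print(demo)
```

With MAX_LEN = 300, every Reuters article therefore becomes a row of exactly 300 word ids.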
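1. The built-in fit method
This method is very powerful: it can train on NumPy arrays, tf.data.Dataset objects, and Python generators.
Callbacks can also be passed to it to implement complex control logic over the training process.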
tf.keras.backend.clear_session()

def create_model():
    model = models.Sequential()
    model.add(layers.Embedding(MAX_WORDS,7,input_length=MAX_LEN))
    model.add(layers.Conv1D(filters = 64,kernel_size = 5,activation = "relu"))
    model.add(layers.MaxPool1D(2))
    model.add(layers.Conv1D(filters = 32,kernel_size = 3,activation = "relu"))
    model.add(layers.MaxPool1D(2))
    model.add(layers.Flatten())
    model.add(layers.Dense(CAT_NUM,activation = "softmax"))
    return(model)

def compile_model(model):
    model.compile(optimizer=optimizers.Nadam(),
                loss=losses.SparseCategoricalCrossentropy(),
                metrics=[metrics.SparseCategoricalAccuracy(),
                         metrics.SparseTopKCategoricalAccuracy(5)])
    return(model)

model = create_model()
model.summary()
model = compile_model(model)

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
embedding (Embedding)        (None, 300, 7)            216874
_________________________________________________________________
conv1d (Conv1D)              (None, 296, 64)           2304
_________________________________________________________________
max_pooling1d (MaxPooling1D) (None, 148, 64)           0
_________________________________________________________________
conv1d_1 (Conv1D)            (None, 146, 32)           6176
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 73, 32)            0
_________________________________________________________________
flatten (Flatten)            (None, 2336)              0
_________________________________________________________________
dense (Dense)                (None, 46)                107502
=================================================================
Total params: 332,856
Trainable params: 332,856
Non-trainable params: 0
_________________________________________________________________
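
The parameter counts in the summary can be checked by hand. The sketch below recomputes them in plain Python. The vocabulary size 30982 is not stated in the text; it is inferred here from 216874 / 7, so treat it as an assumption.

```python
# Shapes are read off the summary above; vocab_size is inferred (216874 / 7).
vocab_size, embed_dim, seq_len, num_classes = 30982, 7, 300, 46

embedding = vocab_size * embed_dim             # one 7-dimensional vector per word id
conv1 = 5 * embed_dim * 64 + 64                # kernel_size * in_channels * filters + biases
len1 = (seq_len - 5 + 1) // 2                  # unpadded convolution (296), then pooling by 2
conv2 = 3 * 64 * 32 + 32
len2 = (len1 - 3 + 1) // 2                     # unpadded convolution (146), then pooling by 2
dense = len2 * 32 * num_classes + num_classes  # Flatten yields len2 * 32 inputs

total = embedding + conv1 + conv2 + dense
print(embedding, conv1, conv2, dense, total)
```

Roughly two thirds of the parameters sit in the embedding layer and most of the rest in the final dense layer; the two convolutions are comparatively small.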
history = model.fit(ds_train,validation_data = ds_test,epochs = 10)
Epoch 1/10
281/281 [==============================] - 8s 28ms/step - loss: 1.9854 - sparse_categorical_accuracy: 0.4876 - sparse_top_k_categorical_accuracy: 0.7488 - val_loss: 1.6438 - val_sparse_categorical_accuracy: 0.5841 - val_sparse_top_k_categorical_accuracy: 0.7636
Epoch 2/10
281/281 [==============================] - 8s 28ms/step - loss: 1.4446 - sparse_categorical_accuracy: 0.6294 - sparse_top_k_categorical_accuracy: 0.8037 - val_loss: 1.5316 - val_sparse_categorical_accuracy: 0.6126 - val_sparse_top_k_categorical_accuracy: 0.7925
Epoch 3/10
281/281 [==============================] - 8s 28ms/step - loss: 1.1883 - sparse_categorical_accuracy: 0.6906 - sparse_top_k_categorical_accuracy: 0.8549 - val_loss: 1.6185 - val_sparse_categorical_accuracy: 0.6278 - val_sparse_top_k_categorical_accuracy: 0.8019
Epoch 4/10
281/281 [==============================] - 8s 28ms/step - loss: 0.9406 - sparse_categorical_accuracy: 0.7546 - sparse_top_k_categorical_accuracy: 0.9057 - val_loss: 1.7211 - val_sparse_categorical_accuracy: 0.6153 - val_sparse_top_k_categorical_accuracy: 0.8041
Epoch 5/10
281/281 [==============================] - 8s 29ms/step - loss: 0.7207 - sparse_categorical_accuracy: 0.8108 - sparse_top_k_categorical_accuracy: 0.9404 - val_loss: 1.9749 - val_sparse_categorical_accuracy: 0.6233 - val_sparse_top_k_categorical_accuracy: 0.7996
Epoch 6/10
281/281 [==============================] - 8s 28ms/step - loss: 0.5558 - sparse_categorical_accuracy: 0.8540 - sparse_top_k_categorical_accuracy: 0.9643 - val_loss: 2.2560 - val_sparse_categorical_accuracy: 0.6269 - val_sparse_top_k_categorical_accuracy: 0.7947
Epoch 7/10
281/281 [==============================] - 8s 28ms/step - loss: 0.4438 - sparse_categorical_accuracy: 0.8916 - sparse_top_k_categorical_accuracy: 0.9781 - val_loss: 2.4731 - val_sparse_categorical_accuracy: 0.6238 - val_sparse_top_k_categorical_accuracy: 0.7965
Epoch 8/10
281/281 [==============================] - 8s 29ms/step - loss: 0.3710 - sparse_categorical_accuracy: 0.9086 - sparse_top_k_categorical_accuracy: 0.9837 - val_loss: 2.6960 - val_sparse_categorical_accuracy: 0.6175 - val_sparse_top_k_categorical_accuracy: 0.7939
Epoch 9/10
281/281 [==============================] - 8s 28ms/step - loss: 0.3201 - sparse_categorical_accuracy: 0.9203 - sparse_top_k_categorical_accuracy: 0.9894 - val_loss: 3.1160 - val_sparse_categorical_accuracy: 0.6193 - val_sparse_top_k_categorical_accuracy: 0.7898
Epoch 10/10
281/281 [==============================] - 8s 28ms/step - loss: 0.2827 - sparse_categorical_accuracy: 0.9262 - sparse_top_k_categorical_accuracy: 0.9922 - val_loss: 2.9516 - val_sparse_categorical_accuracy: 0.6264 - val_sparse_top_k_categorical_accuracy: 0.7974
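
In the log above, val_loss reaches its minimum at epoch 2 and rises afterwards while the training loss keeps falling, which suggests overfitting. A callback is one way to act on that. The sketch below uses tf.keras.callbacks.EarlyStopping on a tiny synthetic dataset; the random data and the toy model are assumptions made so the example runs quickly, not the Reuters setup from this article.

```python
import numpy as np
import tensorflow as tf

# Synthetic data: 256 samples, 8 features, 3 random classes (illustration only)
rng = np.random.default_rng(0)
x = rng.normal(size=(256, 8)).astype("float32")
y = rng.integers(0, 3, size=(256,))

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Stop once val_loss has not improved for 2 epochs and keep the best weights
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=2, restore_best_weights=True)

history = model.fit(x, y, validation_split=0.25, epochs=20,
                    callbacks=[early_stop], verbose=0)
epochs_run = len(history.history["val_loss"])
print("epochs run:", epochs_run)
```

For the model in this section, the same callback could be passed as model.fit(ds_train, validation_data = ds_test, epochs = 10, callbacks = [early_stop]).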
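2. The built-in train_on_batch method
Compared with fit, this built-in method is more flexible: it lets you control training at the batch level without going through callbacks.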
tf.keras.backend.clear_session()

def create_model():
    model = models.Sequential()
    model.add(layers.Embedding(MAX_WORDS,7,input_length=MAX_LEN))
    model.add(layers.Conv1D(filters = 64,kernel_size = 5,activation = "relu"))
    model.add(layers.MaxPool1D(2))
    model.add(layers.Conv1D(filters = 32,kernel_size = 3,activation = "relu"))
    model.add(layers.MaxPool1D(2))
    model.add(layers.Flatten())
    model.add(layers.Dense(CAT_NUM,activation = "softmax"))
    return(model)

def compile_model(model):
    model.compile(optimizer=optimizers.Nadam(),
                loss=losses.SparseCategoricalCrossentropy(),
                metrics=[metrics.SparseCategoricalAccuracy(),
                         metrics.SparseTopKCategoricalAccuracy(5)])
    return(model)

model = create_model()
model.summary()
model = compile_model(model)

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
embedding (Embedding)        (None, 300, 7)            216874
_________________________________________________________________
conv1d (Conv1D)              (None, 296, 64)           2304
_________________________________________________________________
max_pooling1d (MaxPooling1D) (None, 148, 64)           0
_________________________________________________________________
conv1d_1 (Conv1D)            (None, 146, 32)           6176
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 73, 32)            0
_________________________________________________________________
flatten (Flatten)            (None, 2336)              0
_________________________________________________________________
dense (Dense)                (None, 46)                107502
=================================================================
Total params: 332,856
Trainable params: 332,856
Non-trainable params: 0
_________________________________________________________________

def train_model(model,ds_train,ds_valid,epoches):
    for epoch in tf.range(1,epoches+1):
        model.reset_metrics()

        # Lower the learning rate in the later epochs
        if epoch == 5:
            model.optimizer.lr.assign(model.optimizer.lr/2.0)
            tf.print("Lowering optimizer Learning Rate...\n\n")

        for x, y in ds_train:
            train_result = model.train_on_batch(x, y)

        # reset_metrics=False accumulates the metrics over all validation batches
        for x, y in ds_valid:
            valid_result = model.test_on_batch(x, y,reset_metrics=False)

        if epoch%1 ==0:
            printbar()
            tf.print("epoch = ",epoch)
            print("train:",dict(zip(model.metrics_names,train_result)))
            print("valid:",dict(zip(model.metrics_names,valid_result)))
            print("")

train_model(model,ds_train,ds_test,10)

================================================================================11:49:43
epoch = 1
train: {'loss': 2.0567171573638916, 'sparse_categorical_accuracy': 0.4545454680919647, 'sparse_top_k_categorical_accuracy': 0.6818181872367859}
valid: {'loss': 1.6894209384918213, 'sparse_categorical_accuracy': 0.5605521202087402, 'sparse_top_k_categorical_accuracy': 0.7617987394332886}
================================================================================11:49:53
epoch = 2
train: {'loss': 1.4644863605499268, 'sparse_categorical_accuracy': 0.6363636255264282, 'sparse_top_k_categorical_accuracy': 0.7727272510528564}
valid: {'loss': 1.5152910947799683, 'sparse_categorical_accuracy': 0.6157613396644592, 'sparse_top_k_categorical_accuracy': 0.7938557267189026}
================================================================================11:50:01
epoch = 3
train: {'loss': 1.0017579793930054, 'sparse_categorical_accuracy': 0.7727272510528564, 'sparse_top_k_categorical_accuracy': 0.9545454382896423}
valid: {'loss': 1.5588842630386353, 'sparse_categorical_accuracy': 0.6228851079940796, 'sparse_top_k_categorical_accuracy': 0.8058770895004272}
================================================================================11:50:10
epoch = 4
train: {'loss': 0.6004871726036072, 'sparse_categorical_accuracy': 0.9090909361839294, 'sparse_top_k_categorical_accuracy': 1.0}
valid: {'loss': 1.7447566986083984, 'sparse_categorical_accuracy': 0.6233303546905518, 'sparse_top_k_categorical_accuracy': 0.8174532651901245}
Lowering optimizer Learning Rate...
================================================================================11:50:19
epoch = 5
train: {'loss': 0.3866238594055176, 'sparse_categorical_accuracy': 0.9545454382896423, 'sparse_top_k_categorical_accuracy': 1.0}
valid: {'loss': 1.8871253728866577, 'sparse_categorical_accuracy': 0.6308993697166443, 'sparse_top_k_categorical_accuracy': 0.816117525100708}
================================================================================11:50:28
epoch = 6
train: {'loss': 0.27341774106025696, 'sparse_categorical_accuracy': 0.9545454382896423, 'sparse_top_k_categorical_accuracy': 1.0}
valid: {'loss': 2.0595862865448, 'sparse_categorical_accuracy': 0.6273375153541565, 'sparse_top_k_categorical_accuracy': 0.8089937567710876}
================================================================================11:50:37
epoch = 7
train: {'loss': 0.1923554539680481, 'sparse_categorical_accuracy': 0.9545454382896423, 'sparse_top_k_categorical_accuracy': 1.0}
valid: {'loss': 2.2238168716430664, 'sparse_categorical_accuracy': 0.6251112818717957, 'sparse_top_k_categorical_accuracy': 0.8085485100746155}
================================================================================11:50:46
epoch = 8
train: {'loss': 0.12688547372817993, 'sparse_categorical_accuracy': 0.9545454382896423, 'sparse_top_k_categorical_accuracy': 1.0}
valid: {'loss': 2.3778438568115234, 'sparse_categorical_accuracy': 0.6175423264503479, 'sparse_top_k_categorical_accuracy': 0.8072128295898438}
================================================================================11:50:55
epoch = 9
train: {'loss': 0.08024053275585175, 'sparse_categorical_accuracy': 0.9545454382896423, 'sparse_top_k_categorical_accuracy': 1.0}
valid: {'loss': 2.501840829849243, 'sparse_categorical_accuracy': 0.6135351657867432, 'sparse_top_k_categorical_accuracy': 0.8081033229827881}
================================================================================11:51:04
epoch = 10
train: {'loss': 0.05211604759097099, 'sparse_categorical_accuracy': 1.0, 'sparse_top_k_categorical_accuracy': 1.0}
valid: {'loss': 2.61771559715271, 'sparse_categorical_accuracy': 0.6126446723937988, 'sparse_top_k_categorical_accuracy': 0.8085485100746155}
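
The train accuracies above move in steps of 1/22 (for example 0.4545 is about 10/22). A likely explanation, assuming the TF 2.0-era default reset_metrics=True of train_on_batch and the 8,982 training samples of the Reuters dataset, is that train_result reflects only the last batch of each epoch, whereas the validation metrics accumulate over all batches because reset_metrics=False is passed. The arithmetic can be checked in plain Python:

```python
import math

n_train, batch_size = 8982, 32   # 8982 is the assumed Reuters training-set size

steps = math.ceil(n_train / batch_size)           # batches per epoch
last_batch = n_train - (steps - 1) * batch_size   # samples left for the final batch

print(steps, last_batch)
print(10 / last_batch)   # compare with the epoch-1 train accuracy in the log
```

Under that reading, the train figures in this log are much noisier than those reported by fit, which averages over the whole epoch, so the two are not directly comparable.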
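3. Custom training loop
A custom training loop does not require compiling the model. The optimizer applies the gradients of the loss function directly to update the parameters, which gives the greatest flexibility.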
tf.keras.backend.clear_session()

def create_model():
    model = models.Sequential()
    model.add(layers.Embedding(MAX_WORDS,7,input_length=MAX_LEN))
    model.add(layers.Conv1D(filters = 64,kernel_size = 5,activation = "relu"))
    model.add(layers.MaxPool1D(2))
    model.add(layers.Conv1D(filters = 32,kernel_size = 3,activation = "relu"))
    model.add(layers.MaxPool1D(2))
    model.add(layers.Flatten())
    model.add(layers.Dense(CAT_NUM,activation = "softmax"))
    return(model)

model = create_model()
model.summary()

optimizer = optimizers.Nadam()
loss_func = losses.SparseCategoricalCrossentropy()

# Running metrics, reset at the end of every epoch
train_loss = metrics.Mean(name='train_loss')
train_metric = metrics.SparseCategoricalAccuracy(name='train_accuracy')

valid_loss = metrics.Mean(name='valid_loss')
valid_metric = metrics.SparseCategoricalAccuracy(name='valid_accuracy')

@tf.function
def train_step(model, features, labels):
    # Record the forward pass so the loss can be differentiated
    with tf.GradientTape() as tape:
        predictions = model(features,training = True)
        loss = loss_func(labels, predictions)
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))

    train_loss.update_state(loss)
    train_metric.update_state(labels, predictions)

@tf.function
def valid_step(model, features, labels):
    # Inference mode: no gradient tape and no parameter update
    predictions = model(features,training = False)
    batch_loss = loss_func(labels, predictions)
    valid_loss.update_state(batch_loss)
    valid_metric.update_state(labels, predictions)

def train_model(model,ds_train,ds_valid,epochs):
    for epoch in tf.range(1,epochs+1):

        for features, labels in ds_train:
            train_step(model,features,labels)

        for features, labels in ds_valid:
            valid_step(model,features,labels)

        logs = 'Epoch={},Loss:{},Accuracy:{},Valid Loss:{},Valid Accuracy:{}'

        if epoch%1 ==0:
            printbar()
            tf.print(tf.strings.format(logs,
            (epoch,train_loss.result(),train_metric.result(),valid_loss.result(),valid_metric.result())))
            tf.print("")

        train_loss.reset_states()
        valid_loss.reset_states()
        train_metric.reset_states()
        valid_metric.reset_states()

train_model(model,ds_train,ds_test,10)

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
embedding (Embedding)        (None, 300, 7)            216874
_________________________________________________________________
conv1d (Conv1D)              (None, 296, 64)           2304
_________________________________________________________________
max_pooling1d (MaxPooling1D) (None, 148, 64)           0
_________________________________________________________________
conv1d_1 (Conv1D)            (None, 146, 32)           6176
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 73, 32)            0
_________________________________________________________________
flatten (Flatten)            (None, 2336)              0
_________________________________________________________________
dense (Dense)                (None, 46)                107502
=================================================================
Total params: 332,856
Trainable params: 332,856
Non-trainable params: 0
_________________________________________________________________
================================================================================11:52:04
Epoch=1,Loss:2.02564383,Accuracy:0.464707196,Valid Loss:1.68035507,Valid Accuracy:0.55921638
================================================================================11:52:11
Epoch=2,Loss:1.48306167,Accuracy:0.612781107,Valid Loss:1.52322364,Valid Accuracy:0.606411397
================================================================================11:52:18
Epoch=3,Loss:1.20491719,Accuracy:0.677243352,Valid Loss:1.56225574,Valid Accuracy:0.624666095
================================================================================11:52:25
Epoch=4,Loss:0.944778264,Accuracy:0.749387681,Valid Loss:1.7202934,Valid Accuracy:0.620658934
================================================================================11:52:32
Epoch=5,Loss:0.701866329,Accuracy:0.817635298,Valid Loss:1.97179747,Valid Accuracy:0.61843276
================================================================================11:52:39
Epoch=6,Loss:0.531810164,Accuracy:0.866844773,Valid Loss:2.25338316,Valid Accuracy:0.605075717
================================================================================11:52:46
Epoch=7,Loss:0.425013304,Accuracy:0.896236897,Valid Loss:2.47035336,Valid Accuracy:0.601068556
================================================================================11:52:53
Epoch=8,Loss:0.355143964,Accuracy:0.915609,Valid Loss:2.67822,Valid Accuracy:0.591718614
================================================================================11:53:00
Epoch=9,Loss:0.30812338,Accuracy:0.92785573,Valid Loss:2.86121941,Valid Accuracy:0.583704352
================================================================================11:53:07
Epoch=10,Loss:0.275565386,Accuracy:0.934535742,Valid Loss:2.99354172,Valid Accuracy:0.579252
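
The core of train_step is the pair tape.gradient and optimizer.apply_gradients. A minimal sketch on a single scalar variable shows one update step in isolation. The toy loss w squared and the plain SGD optimizer are assumptions chosen for readability; the article itself uses Nadam on the full model.

```python
import tensorflow as tf

w = tf.Variable(3.0)
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

# Record the computation of the loss so it can be differentiated
with tf.GradientTape() as tape:
    loss = w ** 2                        # toy loss, d(loss)/dw = 2w

grad = tape.gradient(loss, w)
optimizer.apply_gradients([(grad, w)])   # SGD step: w <- w - 0.1 * grad

grad_value = float(grad.numpy())
w_value = float(w.numpy())
print(grad_value, w_value)
```

train_step does the same thing with a list of variables: zip(gradients, model.trainable_variables) pairs each gradient with the variable it updates.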
References:
Open-source e-book: https://lyhue1991.github.io/eat_tensorflow2_in_30_days/
GitHub project: https://github.com/lyhue1991/eat_tensorflow2_in_30_days