Keras

Keras is a high-level neural network API written in pure Python.
It can run with TensorFlow, CNTK, or Theano as its backend.

Because Keras is a high-level API wrapped around mainstream neural network frameworks, its greatest strength is development speed.
Apart from a very small number of highly flexible features that it cannot express well, Keras handles the vast majority of tasks in the most painless way possible, which also makes it very friendly for newcomers to neural network frameworks.

However, precisely because Keras is a re-wrapped high-level API, its execution efficiency is inevitably somewhat lower.
Even so, Keras remains an excellent framework.

(Note: Keras 2.3.0 is the last multi-backend major release of Keras. From that version on, Keras is bundled into TensorFlow 2, and the Keras team officially recommends import tensorflow.keras in place of import keras.)

Linear Regression

Linear regression: given a number of points in a 2D plane, find the linear relationship between them.
The problem is ideal both for cementing your understanding of basic neural network theory and for getting started with any deep learning framework.
We will use it below as an introduction to Keras 2.4 (TensorFlow 2.3.0 backend).

Constructing the Dataset

First, build a dataset with NumPy; visualized, it looks like the figure below.

```python
def createData():
    X = np.linspace(-1, 1, 200)
    np.random.shuffle(X)
    Y = 0.5 * X + 2 + np.random.normal(0, 0.05, (200,))

    X = np.expand_dims(X, 1)  # make X.shape (200, 1): 200 one-dimensional samples
    Y = np.expand_dims(Y, 1)

    X_Train, Y_Train = X[:160], Y[:160]
    X_Test, Y_Test = X[160:], Y[160:]

    return X_Train, Y_Train, X_Test, Y_Test

X_Train, Y_Train, X_Test, Y_Test = createData()
```
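The expand_dims call above is just a reshape: it turns the flat (200,) array into a (200, 1) column, i.e. 200 samples with one feature each, which is the shape a Dense layer expects. A quick check on a smaller array:

```python
import numpy as np

x = np.linspace(-1, 1, 5)     # shape (5,): a flat array of 5 values
x_col = np.expand_dims(x, 1)  # shape (5, 1): 5 samples with 1 feature each

print(x.shape, x_col.shape)  # (5,) (5, 1)
```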

[figure 2_point: scatter plot of the generated data]

Building the Network

We know the target function of linear regression has the form $y=wx+b$.
Let w correspond to a neuron's connection weight and b to its bias; it is easy to see that one neuron each in the input and output layers is all we need.
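Before handing this to Keras, it may help to see what that single neuron actually computes. The sketch below implements the same model in plain NumPy: an MSE loss with vanilla gradient-descent updates on w and b (the learning rate 0.1 and 500 steps here are arbitrary choices for this illustration, not Keras's 'sgd' defaults):

```python
import numpy as np

# the same data-generating process as createData()
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 200)
y = 0.5 * x + 2 + rng.normal(0, 0.05, 200)

w, b, lr = 0.0, 0.0, 0.1
for _ in range(500):
    y_hat = w * x + b
    grad_w = 2 * np.mean((y_hat - y) * x)  # d(MSE)/dw
    grad_b = 2 * np.mean(y_hat - y)        # d(MSE)/db
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)  # approaches the true 0.5 and 2
```

This loop is, conceptually, what model.fit will do for us below, minus the bookkeeping.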

Here we build the network with Keras's Sequential class, i.e. the sequential model.
As the name suggests, its forward pass runs through the layers in the order they were added with add.

After adding the layers you need, call compile to specify the loss function and the optimizer, and the network is complete.
An optimizer specified by string, as here, uses its default parameters; you can instead instantiate an optimizer from keras.optimizers to set the parameters yourself.

```python
from keras.models import Sequential
from keras.layers import Dense, InputLayer

# define a sequential model
model = Sequential()

# Dense is a fully connected layer; units is the number of neurons,
# input_shape is the input tensor shape and is only needed on the first layer
model.add(Dense(units=1, input_shape=(1,), name='Dense1'))

# loss function and optimizer
model.compile(loss='mse', optimizer='sgd')
```

We can inspect the assembled network.
The None in the output shape (None, 1) stands for the batch_size.

```python
print(model.summary())

'''
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense (Dense)                (None, 1)                 2
=================================================================
Total params: 2
Trainable params: 2
Non-trainable params: 0
_________________________________________________________________
None
'''
```
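The Param # column is easy to verify by hand: a Dense layer holds units * input_dim weights plus units biases, so this one-in, one-out layer has exactly 2 parameters. A tiny helper (hypothetical, not part of Keras) makes the formula concrete:

```python
def dense_param_count(input_dim, units):
    # weights: input_dim * units, biases: units
    return units * (input_dim + 1)

print(dense_param_count(1, 1))    # 2, matching the summary above
print(dense_param_count(784, 32)) # 25120
```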

Training and Testing the Network

Keras can train directly on NumPy arrays, which is one of its major conveniences.

Pass the input and output data together with batch_size and epochs to the model's fit method to train.
When training finishes, fit returns a History object; its History.history attribute is a dict containing the loss and metrics at the end of every epoch.

```python
print("Training...")
hist = model.fit(X_Train, Y_Train, batch_size=40, epochs=200, verbose=1)
# verbose=1 prints the training log

print(hist.history['loss'])

''' training log
Training...
Epoch 1/200
4/4 [==============================] - 0s 499us/step - loss: 4.3215
Epoch 2/200
4/4 [==============================] - 0s 628us/step - loss: 3.7308
Epoch 3/200
...
Epoch 199/200
4/4 [==============================] - 0s 470us/step - loss: 0.0025
Epoch 200/200
4/4 [==============================] - 0s 498us/step - loss: 0.0025
'''
```

Keras also supports training one batch at a time by hand:
the train_on_batch method runs a single gradient update on the data it is given and returns that batch's loss.

```python
# another way to train
for step in range(301):
    loss = model.train_on_batch(X_Train, Y_Train)
```
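Note that the loop above passes the entire training set to train_on_batch at every step, so each "batch" is really the whole dataset. To train on genuine minibatches you would slice the arrays yourself; a small hypothetical helper (plain NumPy, not part of Keras) could look like this:

```python
import numpy as np

def iterate_batches(X, Y, batch_size):
    # yield successive (X, Y) minibatches; the last one may be smaller
    for start in range(0, len(X), batch_size):
        yield X[start:start + batch_size], Y[start:start + batch_size]

X = np.arange(10).reshape(-1, 1)
Y = 2 * X
sizes = [len(xb) for xb, yb in iterate_batches(X, Y, 4)]
print(sizes)  # [4, 4, 2]
```

Each (xb, yb) pair could then be fed to model.train_on_batch(xb, yb).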

Finally, testing:

```python
print("Testing...")
loss = model.evaluate(X_Test, Y_Test, batch_size=40)
print("Testing Cost: ", loss)

'''
Testing...
1/1 [==============================] - 0s 964us/step - loss: 0.0024
Testing Cost:  0.0023815783206373453
'''
```

Of course, we can also read the trained parameters directly with get_weights(),
and call the predict method to run inference.

```python
k, b = model.layers[0].get_weights()
print("k: {}, b: {}".format(k, b))

Y_Pred = model.predict(X_Test)
plt.scatter(X_Test, Y_Test)
plt.plot(X_Test, Y_Pred, 'red')
plt.show()
```

[figure 2_regress: fitted regression line over the test points]

Complete Code

```python
import numpy as np
from keras.models import Sequential  # for Keras >= 2.3.0, importing via tensorflow.keras is recommended
from keras.layers import Dense, InputLayer
from keras.utils import plot_model
import matplotlib.pyplot as plt

def createData():
    X = np.linspace(-1, 1, 200)
    np.random.shuffle(X)
    Y = 0.5 * X + 2 + np.random.normal(0, 0.05, (200,))

    X = np.expand_dims(X, 1)  # make X.shape (200, 1): 200 one-dimensional samples
    Y = np.expand_dims(Y, 1)

    X_Train, Y_Train = X[:160], Y[:160]
    X_Test, Y_Test = X[160:], Y[160:]

    return X_Train, Y_Train, X_Test, Y_Test

X_Train, Y_Train, X_Test, Y_Test = createData()

# build the network
model = Sequential()
model.add(Dense(units=1, input_shape=(1,)))  # units was called output_dim in older versions

# loss function and optimizer
model.compile(loss='mse', optimizer='sgd')

print(model.summary())

print("Training...")
hist = model.fit(X_Train, Y_Train, batch_size=40, epochs=200, verbose=1)

print(hist.history['loss'])

# another way to train
# for step in range(301):
#     cost = model.train_on_batch(X_Train, Y_Train)

print("Testing...")
cost = model.evaluate(X_Test, Y_Test, batch_size=40)
print("Testing Cost: ", cost)

k, b = model.layers[0].get_weights()
print("k: {}, b: {}".format(k, b))

Y_Pred = model.predict(X_Test)
plt.scatter(X_Test, Y_Test)
plt.plot(X_Test, Y_Pred, 'red')
plt.show()

# plot_model(model, to_file='regression.png', show_shapes=True)
```

Classification

Fetching and Preprocessing the MNIST Dataset

MNIST is a classic dataset of handwritten digit images, and Keras ships with a loader for it.
It contains a training set of 60000 images and a test set of 10000 images,
all of size 28x28 and single-channel grayscale.

Since this is a classification task, the original labels need to be converted to one-hot encoding;
for example, the label 5 becomes [0, 0, 0, 0, 0, 1, 0, 0, 0, 0].
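The conversion itself is simple; a plain-NumPy sketch of what to_categorical produces (assuming integer labels in the range [0, num_classes)):

```python
import numpy as np

def to_one_hot(labels, num_classes):
    # each label k becomes a row of zeros with a 1 at index k
    out = np.zeros((len(labels), num_classes))
    out[np.arange(len(labels)), labels] = 1
    return out

print(to_one_hot([5, 1], 10)[0])  # [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
```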

```python
from keras.datasets import mnist
from keras.utils import np_utils

# on first use this downloads MNIST to '~/.keras/datasets/'
# X_train shape (60000, 28, 28), Y_train shape (60000, )
(X_train, Y_train), (X_test, Y_test) = mnist.load_data()

# flatten each image matrix to a vector and normalize the pixel values
X_train = X_train.reshape(X_train.shape[0], -1) / 255
X_test = X_test.reshape(X_test.shape[0], -1) / 255

# convert the integer labels to one-hot encoding; num_classes is the number of classes
Y_train = np_utils.to_categorical(Y_train, num_classes=10)
Y_test = np_utils.to_categorical(Y_test, num_classes=10)
```

Building the Network

Again we use the Sequential class; the first fully connected layer is followed by a relu activation.
Because the labels have been converted to one-hot encoding, the output layer has ten neurons and uses a softmax activation.

This time, for the optimizer, we instantiate the Adam class and set the learning rate ourselves.
Since this is a multi-class task, the loss function is the cross-entropy loss.

Note that compile takes an extra metrics argument: it specifies measurements of the model that do not take part in learning and are only reported during training.

```python
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.optimizers import Adam

model = Sequential()
model.add(Dense(32, input_dim=784))
model.add(Activation('relu'))
model.add(Dense(10))
model.add(Activation('softmax'))

model.compile(optimizer=Adam(lr=0.002),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```
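For reference, the three mathematical ingredients used here — the relu activation, the softmax output, and the categorical cross-entropy loss — can each be written in a few lines of NumPy (a sketch of the math, not Keras's actual implementation):

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))  # shift by max for numerical stability
    return e / e.sum(axis=-1, keepdims=True)

def categorical_crossentropy(y_true, y_pred):
    # mean over samples of -sum(y_true * log(y_pred))
    return -np.mean(np.sum(y_true * np.log(y_pred + 1e-12), axis=-1))

print(softmax(np.array([[1.0, 1.0]])))  # [[0.5 0.5]]
```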

Training and Testing the Network

The training log now also shows accuracy, the metric we specified in compile.

```python
print('Training...')
model.fit(X_train, Y_train, epochs=2, batch_size=32, verbose=1)

print('Testing...')
loss, accuracy = model.evaluate(X_test, Y_test, batch_size=32, verbose=1)

print('test loss: ', loss)
print('test accuracy: ', accuracy)

'''
Training...
Epoch 1/2
1875/1875 [==============================] - 2s 838us/step - loss: 0.3021 - accuracy: 0.9124
Epoch 2/2
1875/1875 [==============================] - 2s 871us/step - loss: 0.1681 - accuracy: 0.9510
Testing...
313/313 [==============================] - 0s 953us/step - loss: 0.1570 - accuracy: 0.9528
test loss:  0.15704774856567383
test accuracy:  0.9527999758720398
'''
```
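The accuracy reported in the log is simply the fraction of samples whose predicted class (the argmax of the softmax output) matches the true class (the argmax of the one-hot label); in plain NumPy:

```python
import numpy as np

def categorical_accuracy(y_true, y_pred):
    # fraction of rows where the predicted class equals the true class
    return np.mean(np.argmax(y_pred, axis=1) == np.argmax(y_true, axis=1))

y_true = np.array([[0, 1, 0], [1, 0, 0]])
y_pred = np.array([[0.1, 0.8, 0.1], [0.3, 0.6, 0.1]])
print(categorical_accuracy(y_true, y_pred))  # 0.5: first sample right, second wrong
```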

Complete Code

```python
# handwritten digit recognition

import numpy as np
from keras.datasets import mnist
from keras.utils import np_utils
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.optimizers import Adam
from keras.utils import plot_model

# downloads MNIST to '~/.keras/datasets/'
# X_train shape (60000, 28, 28), Y_train shape (60000, )
# X_test shape (10000, 28, 28), Y_test shape (10000, )

(X_train, Y_train), (X_test, Y_test) = mnist.load_data()

# flatten each image matrix to a vector and normalize the pixel values
X_train = X_train.reshape(X_train.shape[0], -1) / 255
X_test = X_test.reshape(X_test.shape[0], -1) / 255

# convert the integer labels to one-hot encoding; num_classes is the number of classes
# e.g. 5 becomes [0, 0, 0, 0, 0, 1, 0, 0, 0, 0] (indices start at 0)
Y_train = np_utils.to_categorical(Y_train, num_classes=10)
Y_test = np_utils.to_categorical(Y_test, num_classes=10)

# build the network

# method-1
# model = Sequential([
#     Dense(32, input_dim=784),
#     Activation('relu'),
#     Dense(10),
#     Activation('softmax'),
# ])

# method-2
model = Sequential()
model.add(Dense(32, input_dim=784))
model.add(Activation('relu'))
model.add(Dense(10))
model.add(Activation('softmax'))

model.compile(optimizer=Adam(lr=0.002),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

print('Training...')
model.fit(X_train, Y_train, epochs=2, batch_size=32, verbose=1)

print('Testing...')
loss, accuracy = model.evaluate(X_test, Y_test, batch_size=32, verbose=1)

print('test loss: ', loss)
print('test accuracy: ', accuracy)

# plot_model(model, to_file='model.png', show_shapes=True)
```

Other Commonly Used Keras Features

Saving models:

```python
from keras.models import load_model

model.save('./save/model.h5')                  # save model structure and weights
model.save_weights('./save/model_weights.h5')  # save only the weights

model = load_model('./save/model.h5')            # load the whole model
model.load_weights('./save/model_weights.h5')    # load weights; the network must be identical to the saved one
```

Visualizing models, e.g. the classification network above:

```python
from keras.utils import plot_model
plot_model(model, to_file='model.png', show_shapes=True)
```

[figure model.png: the plotted network structure]