In practice, for a typical deep-learning task the tunable work falls mainly into data selection, data preprocessing, model-architecture adjustment, and training-parameter tuning. The common knobs are:
1. Data-related: dataset selection, dataset preprocessing, and the ratio/split between training and test sets.
2. Model- and hyperparameter-related: model design and adjustment, learning rate, epochs, batch_size, activation function, dropout, normalization, loss function, and optimizer / gradient-descent variant.
In deep-learning settings explicit feature selection is usually unnecessary, though item 1 above still takes real effort; this post covers only item 2.
Model design and adjustment: the depth of the network and the number of neurons in each layer.
learning rate: the step size of parameter updates; smaller is slower but finer, larger is faster but may overshoot the global optimum and fail to converge, so pick a suitable value.
epoch: the number of full passes over the entire training set.
batch_size: the number of samples used per update step; the larger it is, the faster each epoch trains, and the number of parameter updates is roughly epochs * training samples / batch_size.
Activation function: sigmoid, relu, softmax, etc.
dropout: one way to prevent overfitting, by randomly dropping neurons during training; note that with classic dropout at rate p, the weights are scaled by 1 - p at test time (Keras's Dropout layer uses inverted dropout, scaling by 1/(1 - p) during training instead, so no test-time rescaling is needed).
Normalization/regularization: L1/L2 regularization, min-max normalization, etc.
Loss function: Euclidean distance (MSE), maximum likelihood, cross-entropy, etc.
Optimizer / gradient-descent variant: SGD, Adam, etc.
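Two of the points above can be made concrete with a small pure-NumPy sketch (illustrative only, not Keras internals; all names and numbers here are made up for the example): the parameter-update count implied by epochs and batch_size, and the classic dropout test-time rescaling by 1 - p.

```python
import numpy as np

# Update-step arithmetic: with n training samples, each epoch runs
# ceil(n / batch_size) parameter updates.
n, batch_size, epochs = 10000, 100, 20
updates = epochs * int(np.ceil(n / batch_size))
print(updates)  # 2000

# Classic dropout: drop each unit with probability p at training time...
rng = np.random.default_rng(0)
p = 0.5
a = rng.random(100000)            # pretend activations
mask = rng.random(a.shape) >= p   # keep each unit with probability 1 - p
a_train = a * mask
# ...and at test time keep all units but scale by 1 - p, so the expected
# activation matches what the network saw during training.
a_test = a * (1 - p)
print(np.isclose(a_train.mean(), a_test.mean(), rtol=0.05))  # True
```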
Below are two Keras handwritten-digit (MNIST) examples; the commented-out lines show some of the tuning alternatives mentioned above.
import numpy as np
import keras
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.optimizers import SGD, Adam
from keras.utils import np_utils
from keras.datasets import mnist
def load_data():
    (x_train, y_train), (x_test, y_test) = mnist.load_data()
    number = 10000
    x_train = x_train[0:number]
    x_train = x_train.reshape(number, 28*28)
    x_train = x_train.astype('float32')
    y_train = y_train[0:number]
    y_train = np_utils.to_categorical(y_train, 10)
    x_test = x_test.reshape(x_test.shape[0], 28*28)
    x_test = x_test.astype('float32')
    y_test = np_utils.to_categorical(y_test, 10)
    # normalize values into [0, 1]
    #x_train = x_train
    x_train = x_train / 255
    #x_test = x_test
    x_test = x_test / 255
    return (x_train, y_train), (x_test, y_test)
(x_train, y_train), (x_test, y_test) = load_data()
model = Sequential()
#model.add(Dense(input_dim=28*28, units=689, activation='sigmoid'))
model.add(Dense(input_dim=28*28, units=689, activation='relu'))
model.add(Dropout(0.7))
#model.add(Dense(units=689, activation='sigmoid'))
model.add(Dense(units=689, activation='relu'))
model.add(Dropout(0.7))
#model.add(Dense(units=689, activation='sigmoid'))
model.add(Dense(units=689, activation='relu'))
model.add(Dropout(0.7))
'''
for _ in range(10):
    model.add(Dense(units=689, activation='sigmoid'))
'''
model.add(Dense(units=10, activation='softmax'))
#model.compile(loss='mse', optimizer=SGD(lr=0.1), metrics=['accuracy'])
#model.compile(loss='categorical_crossentropy', optimizer=Adam(lr=0.1), metrics=['accuracy'])
model.compile(loss='categorical_crossentropy', optimizer=SGD(lr=0.1), metrics=['accuracy'])
#model.fit(x_train, y_train, batch_size=10000, epochs=20)
model.fit(x_train, y_train, batch_size=100, epochs=20)
result = model.evaluate(x_test, y_test)
print('TEST ACC: ', result[1])
#result = model.predict(x_test)
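For reference, the categorical_crossentropy loss compiled above can be written out in a few lines of NumPy (an illustrative sketch of the formula, not the Keras implementation; the helper name and example numbers are made up):

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-12):
    """Mean over samples of -sum_k y_true[k] * log(y_pred[k])."""
    y_pred = np.clip(y_pred, eps, 1.0)
    return -np.mean(np.sum(y_true * np.log(y_pred), axis=1))

# One-hot ground truth vs. two candidate softmax outputs: the more
# confident (and correct) prediction gets the lower loss.
y_true = np.array([[0.0, 1.0, 0.0]])
confident = np.array([[0.05, 0.9, 0.05]])
uncertain = np.array([[0.3, 0.4, 0.3]])
print(categorical_crossentropy(y_true, confident))  # ~0.105
print(categorical_crossentropy(y_true, uncertain))  # ~0.916
```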
An alternative way to write the same thing:
import numpy as np
import random
import keras
import matplotlib.pyplot as plt
from keras.datasets import mnist
from keras.models import Sequential, Model
from keras.layers.core import Dense, Dropout, Activation
from keras.optimizers import RMSprop, SGD, Adam
from keras.utils import np_utils
(x_train, y_train), (x_test, y_test) = mnist.load_data()
print(x_train.shape, y_train.shape)
print(x_test.shape, y_test.shape)
x_train = x_train.reshape(x_train.shape[0], -1)
x_train = x_train.astype('float32')
x_test = x_test.reshape(x_test.shape[0], -1)
x_test = x_test.astype('float32')
# normalization to [0, 1]
x_train = x_train / 255
x_test = x_test / 255
# training settings
batch_size = 128
number_epochs = 10
# convert class vectors to one-hot vector
y_train = np_utils.to_categorical(y_train, num_classes=10)
y_test = np_utils.to_categorical(y_test, num_classes=10)
# model definition
model = Sequential()
# hidden layer, 1st fully-connected layer
model.add(Dense(units=512, input_shape=(28*28,)))
model.add(Activation('relu'))
model.add(Dropout(0.2))
# hidden layer, 2nd fully-connected layer
model.add(Dense(units=512))
model.add(Activation('relu'))
model.add(Dropout(0.2))
# output layer
model.add(Dense(units=10))
model.add(Activation('softmax'))
# print the model summary
model.summary()
# start compilation
model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
# start training
history = model.fit(x_train, y_train, epochs=number_epochs, batch_size=batch_size)
# print the history
print('model params: ', history.params)
# start evaluation
evl_result = model.evaluate(x_test, y_test)
print(model.metrics_names[0], ':', evl_result[0], model.metrics_names[1], ':', evl_result[1])
# select one picture for previewing
x_test_0 = x_test[0, :].reshape(1, 28*28)
y_test_0 = y_test[0, :]
plt.imshow(x_test_0.reshape(28, 28))
plt.show()
# start prediction
prediction = model.predict(x_test_0[:])
print('truth :', np.argmax(y_test_0), 'prediction :', prediction[0], 'dnn predict :', np.argmax(prediction[0]))
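The final np.argmax step above simply picks the class with the largest softmax probability. A tiny NumPy sketch of that decision rule, independent of the trained model (the logits here are hypothetical):

```python
import numpy as np

def softmax(z):
    z = z - z.max()          # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

logits = np.array([1.0, 3.0, 0.5])   # hypothetical output-layer scores
probs = softmax(logits)
print(probs.sum())        # sums to 1 (up to float rounding)
print(np.argmax(probs))   # 1 -- the predicted class
```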