Linear Regression: Theory and Code

Step 1: model hypothesis, i.e., how to choose a model; here we pick a one-dimensional linear model
$y = w \cdot x + b$

Step 2: model evaluation, i.e., how to judge which model is better; this is usually done with a loss function
$L(w,b)=\sum_{i=1}^{n}\left(\widehat{y_{i}}-(w \cdot x_{i}+b)\right)^2$, where $\widehat{y_{i}}$ is the true value of the $i$-th training example
Step 3: model optimization, i.e., how to find the best model; gradient descent is one such method and is the one used here
$\frac{\partial L(w,b)}{\partial w}=\sum_{i=1}^{n}2\left(\widehat{y_{i}}-(w \cdot x_{i}+b)\right)\cdot(-x_{i})$
$\frac{\partial L(w,b)}{\partial b}=\sum_{i=1}^{n}2\left(\widehat{y_{i}}-(w \cdot x_{i}+b)\right)\cdot(-1)$
$w^{j+1}\leftarrow w^{j}-\eta\left.\frac{\partial L(w,b)}{\partial w}\right|_{w=w^{j},\,b=b^{j}}$
$b^{j+1}\leftarrow b^{j}-\eta\left.\frac{\partial L(w,b)}{\partial b}\right|_{w=w^{j},\,b=b^{j}}$
Step 4: further adjustments, as needed
1. Combine multiple models.
2. Use more input features, e.g., the input x goes from one dimension to d dimensions; the Step 1 hypothesis then becomes $y=\sum_{i=1}^{d}w_{i}\cdot x_{ji}+b$, where a single input is $x_{j}=[x_{j1},x_{j2},\dots,x_{jd}]$.
3. Add regularization. For the multi-dimensional linear model, L1 regularization changes the Step 2 loss to $L(w,b)=\sum_{j}\left(\widehat{y_{j}}-(\sum_{i} w_{i}\cdot x_{ji}+b)\right)^2+\lambda\sum_{i}\vert w_{i}\vert$,
and L2 regularization changes it to $L(w,b)=\sum_{j}\left(\widehat{y_{j}}-(\sum_{i} w_{i}\cdot x_{ji}+b)\right)^2+\lambda\sum_{i} w_{i}^2$.
4. Optionally use adaptive-gradient methods (e.g., Adagrad) or stochastic gradient descent.
5. With multi-dimensional features, scale the features to a common range (feature scaling), e.g., $x_{j}^{i}\leftarrow \frac{x_{j}^{i}-m_{i}}{\sigma_{i}}$, where $x_{j}^{i}$ is the $i$-th feature of the $j$-th training example, $m_{i}$ is the mean of the $i$-th feature over all training data, and $\sigma_{i}$ is its standard deviation. Points 2, 3, and 5 are illustrated in the sketch after this list.
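
As a rough illustration of points 2, 3, and 5, here is a minimal NumPy sketch of multi-dimensional linear regression trained by gradient descent with feature standardization and an L2 penalty. The toy data, the `lam` value, and the variable names are assumptions made for this sketch only; they are not part of the demos below.

#! /usr/bin/python
# -*- coding:utf-8 -*-

import numpy as np

# toy multi-dimensional data, made up for illustration only
np.random.seed(0)
x_data = np.random.rand(100, 3) * np.array([1.0, 10.0, 100.0])  # 100 examples, 3 features on very different scales
true_weight = np.array([2.0, -0.5, 0.03])
y_data = x_data.dot(true_weight) + 1.5 + 0.1 * np.random.randn(100)

# point 5: feature scaling, x <- (x - mean) / std, computed per feature
mean = x_data.mean(axis=0)
std = x_data.std(axis=0)
x_scaled = (x_data - mean) / std

# points 2 and 3: multi-dimensional model y = sum_i(w_i * x_i) + b, trained by
# gradient descent on the L2-regularized squared-error loss
weight = np.zeros(3)
bias = 0.0
learning_rate = 0.01
iteration = 5000
lam = 0.1  # regularization strength, arbitrary value for the sketch

for i in range(iteration):
    delt = y_data - (x_scaled.dot(weight) + bias)
    # gradients of sum(delt^2) + lam * sum(w^2), averaged over the examples
    weight_grad = (-2.0 * x_scaled.T.dot(delt) + 2.0 * lam * weight) / len(y_data)
    bias_grad = -2.0 * np.sum(delt) / len(y_data)
    weight -= learning_rate * weight_grad
    bias -= learning_rate * bias_grad

print(weight, bias)

For mini-batch stochastic gradient descent (point 4), the only change would be to compute `delt` and the gradients on a randomly sampled subset of the rows at each iteration instead of the full data.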

Taking the first three steps as an example, the code is as follows.

Raw Python demo

#! /usr/bin/python
# -*- coding:utf-8 -*-

import numpy as np
import matplotlib.pyplot as plt
from pylab import mpl

# to support chinese
plt.rcParams['font.sans-serif'] = ['SimHei']
# make '-' show correct
plt.rcParams['axes.unicode_minus'] = False


# train data
x_data = [338., 333., 328., 207., 226., 25., 179., 60., 208., 606.]
y_data = [640., 633., 619., 393., 428., 27., 193., 66., 226., 1591.]

# init
bias = -120
weight = -4
learning_rate = 1
iteration = 10000

# for show
bias_history = []
weight_history = []

# Adagrad-style accumulators: running sums of squared gradients,
# used below for per-parameter adaptive learning rates
learning_rate_bias = 0
learning_rate_weight = 0

# train
for i in range(iteration):
    bias_grad = 0.0
    weight_grad = 0.0
    for j in range(len(x_data)):
        bias_grad += 2.0 * (y_data[j] - weight * x_data[j] - bias) * (-1.0)
        weight_grad += 2.0 * (y_data[j] - weight * x_data[j] - bias) * (-1.0 * x_data[j])

    # accumulate squared gradients (Adagrad)
    learning_rate_bias += bias_grad ** 2
    learning_rate_weight += weight_grad ** 2

    # update parameters with adaptive step sizes
    bias -= learning_rate / np.sqrt(learning_rate_bias) * bias_grad
    weight -= learning_rate / np.sqrt(learning_rate_weight) * weight_grad

    # add to history for showing
    bias_history.append(bias)
    weight_history.append(weight)

# for show
x = np.arange(-200, -100, 1)
y = np.arange(-5, 5, 0.1)
X, Y = np.meshgrid(x, y)
Z = np.zeros((len(y), len(x)))
# fill Z with the average squared error over the (bias, weight) grid,
# so the contour actually shows the loss surface
for m in range(len(x)):
    for n in range(len(y)):
        b_try = x[m]
        w_try = y[n]
        Z[n][m] = 0.0
        for k in range(len(x_data)):
            Z[n][m] += (y_data[k] - w_try * x_data[k] - b_try) ** 2
        Z[n][m] /= len(x_data)

plt.contourf(X, Y, Z, 50, alpha=0.5, cmap=plt.get_cmap('jet'))
plt.plot([-188.4], [2.67], 'x', ms=12, mew=3, color='red')
plt.plot(bias_history, weight_history, 'o-', ms=3, lw=1.5, color='black')
plt.xlim(-200, -100)
plt.ylim(-5, 5)
plt.xlabel(r'$bias$')
plt.ylabel(r'$weight$')
plt.title("Linear Regression")
plt.show()

NumPy demo

#! /usr/bin/python
# -*- coding:utf-8 -*-

import numpy as np
import matplotlib.pyplot as plt
from pylab import mpl

# to support chinese
plt.rcParams['font.sans-serif'] = ['SimHei']
# make '-' show correct
plt.rcParams['axes.unicode_minus'] = False


# train data
x_data = np.array([338., 333., 328., 207., 226., 25., 179., 60., 208., 606.])
y_data = np.array([640., 633., 619., 393., 428., 27., 193., 66., 226., 1591.])

# init
bias = -120
weight = -4
learning_rate = 1
iteration = 10000

# for show
bias_history = []
weight_history = []

# Adagrad-style accumulators: running sums of squared gradients,
# used below for per-parameter adaptive learning rates
learning_rate_bias = 0
learning_rate_weight = 0

# train
for i in range(iteration):
    delt = y_data - (weight * x_data + bias)
    loss = np.dot(delt, delt)  # current squared-error loss (not used below, kept for reference)
    bias_grad = -2.0 * np.sum(delt)
    weight_grad = -2.0 * np.dot(delt, x_data)

    # accumulate squared gradients (Adagrad)
    learning_rate_bias += bias_grad ** 2
    learning_rate_weight += weight_grad ** 2

    # update parameters with adaptive step sizes
    bias -= learning_rate / np.sqrt(learning_rate_bias) * bias_grad
    weight -= learning_rate / np.sqrt(learning_rate_weight) * weight_grad

    # add to history for showing
    bias_history.append(bias)
    weight_history.append(weight)

# for show
x = np.arange(-200, -100, 1)
y = np.arange(-5, 5, 0.1)
X, Y = np.meshgrid(x, y)
Z = np.zeros((len(y), len(x)))
# fill Z with the average squared error over the (bias, weight) grid,
# so the contour actually shows the loss surface
for m in range(len(x)):
    for n in range(len(y)):
        Z[n][m] = np.mean((y_data - y[n] * x_data - x[m]) ** 2)

plt.contourf(X, Y, Z, 50, alpha=0.5, cmap=plt.get_cmap('jet'))
plt.plot([-188.4], [2.67], 'x', ms=12, mew=3, color='red')
plt.plot(bias_history, weight_history, 'o-', ms=3, lw=1.5, color='black')
plt.xlim(-200, -100)
plt.ylim(-5, 5)
plt.xlabel(r'$bias$')
plt.ylabel(r'$weight$')
plt.title("Linear Regression")
plt.show()

TensorFlow demo (TensorFlow 1.x API)

import numpy as np
import tensorflow as tf

# train data
x_data = np.random.rand(100).astype(np.float32)
y_data = x_data * 0.3 + 0.2

# create tf structure
weight = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
bias = tf.Variable(tf.zeros([1]))
learning_rate = 0.5
iteration = 1000

# model
y = weight * x_data + bias

loss = tf.reduce_mean(tf.square(y_data-y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate)
train = optimizer.minimize(loss)

# init
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)

    for step in range(iteration):
        sess.run(train)

        if step % 20 == 0:
            print(step, sess.run(weight), sess.run(bias))
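
The demo above uses the TensorFlow 1.x graph API (tf.Session, tf.train.GradientDescentOptimizer), which is not available by default in TensorFlow 2.x. As a minimal sketch, assuming a TensorFlow 2.x environment, the same fit can be written with eager execution and tf.GradientTape roughly as follows; this is an assumed translation, not part of the original demo.

import numpy as np
import tensorflow as tf

# train data, same as the 1.x demo
x_data = np.random.rand(100).astype(np.float32)
y_data = x_data * 0.3 + 0.2

weight = tf.Variable(tf.random.uniform([1], -1.0, 1.0))
bias = tf.Variable(tf.zeros([1]))
optimizer = tf.keras.optimizers.SGD(learning_rate=0.5)

for step in range(1000):
    with tf.GradientTape() as tape:
        y = weight * x_data + bias
        loss = tf.reduce_mean(tf.square(y_data - y))
    # compute d(loss)/d(weight), d(loss)/d(bias) and apply one SGD step
    grads = tape.gradient(loss, [weight, bias])
    optimizer.apply_gradients(zip(grads, [weight, bias]))

    if step % 20 == 0:
        print(step, weight.numpy(), bias.numpy())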
