Why does my linear regression model converge to a non-zero gradient value?

data-mining linear-regression gradient-descent
2022-03-02 21:24:06

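I have coded up a basic 2D linear regression model (using gradient descent), but it does not seem to work the way it should.

What I expect is that m and c should approach 4 and 3 respectively, and that the gradients with respect to m and c should tend to 0. What actually happens is that the gradient of c approaches a non-zero value, and c itself approaches a value that depends on the number of epochs (about 0.5 with 100 epochs).

However, if I look at the plot of c, it does rise slowly over time.

The code is here: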

import random, math
import matplotlib.pyplot as plt

def linreg(x, y):
    """ Performs linear regression: input x, output y. """
    n = float(len(x))
    m = random.random()
    c = random.random()
    dm, dc = [], []
    rate = 0.00001
    epoch = 100
    for run in range(epoch):
        d_m = 0
        d_c = 0
        for i in range(len(x)):
            d_m += (y[i] - m*x[i] - c)*x[i]
            d_c += (y[i] - m*x[i] - c)
        d_m *= -2/n
        d_c *= -2/n
        m -= d_m * rate
        c -= d_c * rate
        dm.append(d_m)
        dc.append(d_c)
    return m, c, dm, dc

x = [i for i in range(400)]
y = [4*i + 3 for i in x]


m, c, dm, dc = linreg(x, y)

print(m, c)

plt.grid()
plt.scatter(x, y)
plt.plot(x, [m*i + c for i in x], color='red')
plt.show()

plt.grid()
plt.plot([i for i in range(len(dm))], dm)
plt.plot([i for i in range(len(dc))], dc, color='red')
plt.show()
1 Answer

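The x values are far too large for this setup. If you standardize x and raise the learning rate, it converges easily. The usual assumption is that the inputs are roughly standard-normal (zero mean, unit variance), so we transform them to match.

Here is the code (note that I only subtract the mean from x and divide by its standard deviation, then change the rate to 0.05):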
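One way to see why the raw x values cause trouble is to look at the curvature of the mean-squared-error loss in (m, c). The sketch below is an illustration and not part of the original code: the helper name mse_hessian and the variables x_raw and x_scaled are made up for this example. It computes the Hessian of the loss and its eigenvalues for the raw and the standardized inputs. Gradient descent is only stable when the rate is below 2 divided by the largest eigenvalue, and the error along the direction of the smallest eigenvalue shrinks by a factor of (1 - rate * eigenvalue) per epoch.

```python
import numpy as np

def mse_hessian(x):
    """Hessian of mean((y - m*x - c)**2) with respect to (m, c)."""
    x = np.asarray(x, dtype=float)
    return 2.0 * np.array([[np.mean(x * x), np.mean(x)],
                           [np.mean(x),     1.0]])

x_raw = np.arange(400)
x_scaled = (x_raw - x_raw.mean()) / x_raw.std()

# eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
eig_raw = np.linalg.eigvalsh(mse_hessian(x_raw))
eig_scaled = np.linalg.eigvalsh(mse_hessian(x_scaled))

for name, eig in [("raw", eig_raw), ("standardized", eig_scaled)]:
    print(name, "eigenvalues:", eig, "largest stable rate:", 2.0 / eig[-1])
```

For the raw inputs the largest eigenvalue is on the order of 1e5 while the smallest is about 0.5, so a rate small enough to keep m stable (such as 0.00001) makes almost no progress along the slow direction in 100 epochs, which is mostly c. After standardizing, both eigenvalues are equal, so a single rate such as 0.05 works well for both parameters.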

import random, math
import matplotlib.pyplot as plt
import numpy as np


def linreg(x, y):
    """ Performs linear regression: input x, output y. """
    n = float(len(x))
    m = random.random()
    c = random.random()
    dm, dc = [], []
    rate = 0.05
    epoch = 100
    for run in range(epoch):
        d_m = 0
        d_c = 0
        for i in range(len(x)):
            d_m += (y[i] - m*x[i] - c)*x[i]
            d_c += (y[i] - m*x[i] - c)
        d_m *= -2/n
        d_c *= -2/n
        m -= d_m * rate
        c -= d_c * rate
        dm.append(d_m)
        dc.append(d_c)
    return m, c, dm, dc


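x = list(range(400))
# Standardize x: subtract the mean and divide by the standard deviation.
x_mean = np.mean(x)
x_std = np.std(x)
x = [(i - x_mean) / x_std for i in x]
# Note: y is built from the standardized x, so the true slope and intercept
# in this scaled space are exactly 4 and 3.
y = [4*i + 3 for i in x]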

m, c, dm, dc = linreg(x, y)

print(m, c)

plt.grid()
plt.scatter(x, y)
plt.plot(x, [m*i + c for i in x], color='red')
plt.show()

plt.grid()
plt.plot([i for i in range(len(dm))], dm)
plt.plot([i for i in range(len(dc))], dc, color='red')
plt.show()

Output in my case:

3.999898446167822 2.999938454995635
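Note that the code above builds y from the already standardized x, which is why m and c land on 4 and 3 directly. If y comes from the original x instead (as in the question), the fitted coefficients are in standardized units and have to be mapped back. The following is a minimal sketch under that assumption; fit_gd, mu, sigma, m_z and c_z are illustrative names, and the loop is a vectorized NumPy variant of the same batch gradient descent rather than the exact code from the answer.

```python
import numpy as np

def fit_gd(x, y, rate=0.05, epochs=500):
    """Batch gradient descent on mean squared error for y ~ m*x + c."""
    m, c = 0.0, 0.0
    for _ in range(epochs):
        resid = y - (m * x + c)            # residuals at the current (m, c)
        d_m = -2.0 * np.mean(resid * x)    # d(loss)/dm
        d_c = -2.0 * np.mean(resid)        # d(loss)/dc
        m -= rate * d_m
        c -= rate * d_c
    return m, c

x_raw = np.arange(400, dtype=float)
y = 4.0 * x_raw + 3.0                      # targets defined on the ORIGINAL x

mu, sigma = x_raw.mean(), x_raw.std()
z = (x_raw - mu) / sigma                   # standardized inputs

m_z, c_z = fit_gd(z, y)                    # coefficients in standardized units

# y = m_z * (x - mu) / sigma + c_z  =>  slope and intercept in original units
m_orig = m_z / sigma
c_orig = c_z - m_z * mu / sigma
print(m_orig, c_orig)
```

The conversion follows from substituting z = (x - mu) / sigma into y = m_z * z + c_z and collecting the terms in x.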