I am trying to understand how LSTMs relate to the intuition behind classical state-space models (e.g., the Kalman filter). The code below simulates a simple univariate linear state-space + observation model; the simulated observations are then fed as a sequence into an LSTM with the goal of estimating the underlying state. In theory, I believe an LSTM should be able to learn this mapping relatively easily, but with my implementation that does not appear to be the case. Feeding the training examples back through the fitted model yields state estimates that are not what I would call "bad" estimates of the true state, but they do not seem very good either, and I would expect a traditional Kalman filter to do better.
For reference, I based my LSTM implementation on the one described in this excellent tutorial: https://machinelearningmastery.com/timedistributed-layer-for-long-short-term-memory-networks-in-python/
import numpy as np
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense, LSTM, TimeDistributed
import sklearn.metrics
# generate a state-space model output
def genSimLinear():
    N = 100
    X = np.zeros((1, N, 1))   # true state
    y = np.zeros((1, N, 1))   # noisy measurement
    v_sigma = 1               # process noise std
    w_sigma = 1               # observation noise std
    V = np.random.randn(X.shape[0], N) * v_sigma
    W = np.random.randn(y.shape[0], N) * w_sigma
    for it in range(1, N):
        X[0, it, 0] = X[0, it-1, 0] + V[0, it]   # random-walk state
        y[0, it, 0] = X[0, it, 0] + W[0, it]     # state + observation noise
    state = X[0, 1:, 0].reshape(1, N-1, 1)
    measurement = y[0, 1:, 0].reshape(1, N-1, 1)
    return state, measurement
# lstm model
def model():
    m = Sequential()
    m.add(LSTM(1, input_shape=(None, 1), return_sequences=True))
    m.add(TimeDistributed(Dense(1)))
    m.compile(loss='mean_squared_error', optimizer='sgd')
    return m
# run analysis
def workflow():
    # simulated raw state-space data
    trueState, measurement = genSimLinear()
    m = model()
    # fit the LSTM to map noisy measurements to the underlying state
    m.fit(measurement, trueState, epochs=500)
    stateEstimate = m.predict(measurement, batch_size=1)
    plt.figure(figsize=(20, 20))
    plt.plot(trueState[0, :, 0], 'r', label='state')
    plt.plot(measurement[0, :, 0], 'g', label='measurement')
    plt.plot(stateEstimate[0, :, 0], 'b', label='stateEstimateLSTM')
    plt.grid()
    plt.legend()
    plt.title('MSE: ' + str(sklearn.metrics.mean_squared_error(
        trueState[0, :, 0], stateEstimate[0, :, 0])))
    plt.show()
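For comparison, here is a minimal sketch of a scalar Kalman filter for this exact random-walk-plus-noise model. The function name `kalman_filter` and the `q`/`r` arguments are my own illustrative names, and I am assuming unit process and observation noise variances to match `genSimLinear`:

```python
import numpy as np

def kalman_filter(y, q=1.0, r=1.0):
    """Scalar Kalman filter for x_t = x_{t-1} + v_t, y_t = x_t + w_t,
    with process noise variance q and observation noise variance r."""
    n = len(y)
    xhat = np.zeros(n)   # filtered state estimates
    x, p = 0.0, 1.0      # posterior state mean and variance
    for t in range(n):
        # predict: state transition is identity, so only the variance grows
        p_pred = p + q
        # update: fold in measurement y[t] via the Kalman gain
        k = p_pred / (p_pred + r)
        x = x + k * (y[t] - x)
        p = (1.0 - k) * p_pred
        xhat[t] = x
    return xhat
```

Since q = r = 1 matches the simulated model, this filter should be the optimal linear estimator here, so its MSE against the true state gives a natural lower-bound baseline to hold the LSTM's estimates against.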