I am trying to understand how LSTMs relate to the intuition behind classical state-space models (e.g., the Kalman filter). The code below simulates a simple univariate linear state-space + observation model; the simulated observations are then fed as a sequence into an LSTM with the goal of estimating the underlying state. In theory, I believe an LSTM should be able to learn this mapping relatively easily, but with my implementation that does not appear to be the case. Feeding the training examples back through the fitted model yields state estimates that are not what I would call "bad" estimates of the true state, but they do not seem very good either, and I would expect a traditional Kalman filter to do better.
For reference, I based my LSTM implementation on the one described in this excellent tutorial: https://machinelearningmastery.com/timedistributed-layer-for-long-short-term-memory-networks-in-python/
import numpy as np
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense, LSTM, TimeDistributed
import sklearn.metrics
# generate a state-space model output
def genSimLinear():
    N = 100
    X = np.zeros((1, N, 1))   # true state
    y = np.zeros((1, N, 1))   # noisy measurement
    v_sigma = 1               # process noise std
    w_sigma = 1               # observation noise std
    V = np.random.randn(X.shape[0], N) * v_sigma
    W = np.random.randn(y.shape[0], N) * w_sigma
    for it in range(1, N):
        X[0, it, 0] = X[0, it-1, 0] + V[0, it]   # random-walk state
        y[0, it, 0] = X[0, it, 0] + W[0, it]     # state + observation noise
    state = X[0, 1:, 0].reshape(1, N-1, 1)
    measurement = y[0, 1:, 0].reshape(1, N-1, 1)
    return state, measurement
# lstm model
def model():
    m = Sequential()
    m.add(LSTM(1, input_shape=(None, 1), return_sequences=True))
    m.add(TimeDistributed(Dense(1)))
    m.compile(loss='mean_squared_error', optimizer='sgd')
    return m
# run analysis
def workflow():
    # simulated raw state-space data
    trueState, measurement = genSimLinear()
    m = model()
    # fit the LSTM to map noisy measurements to the underlying state
    m.fit(measurement, trueState, epochs=500)
    stateEstimate = m.predict(measurement, batch_size=1)
    plt.figure(figsize=(20, 20))
    plt.plot(trueState[0, :, 0], 'r', label='state')
    plt.plot(measurement[0, :, 0], 'g', label='measurement')
    plt.plot(stateEstimate[0, :, 0], 'b', label='stateEstimateLSTM')
    plt.grid()
    plt.legend()
    plt.title('MSE: ' + str(sklearn.metrics.mean_squared_error(
        trueState[0, :, 0], stateEstimate[0, :, 0])))
    plt.show()
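For comparison, here is a minimal sketch of a scalar Kalman filter for this exact random-walk-plus-noise model. The function name `kalman_filter` and the `q`/`r` arguments are my own illustrative names, and I am assuming unit process and observation noise variances to match `genSimLinear`:

```python
import numpy as np

def kalman_filter(y, q=1.0, r=1.0):
    """Scalar Kalman filter for x_t = x_{t-1} + v_t, y_t = x_t + w_t,
    with process noise variance q and observation noise variance r."""
    n = len(y)
    xhat = np.zeros(n)   # filtered state estimates
    x, p = 0.0, 1.0      # posterior state mean and variance
    for t in range(n):
        # predict: state transition is identity, so only the variance grows
        p_pred = p + q
        # update: fold in measurement y[t] via the Kalman gain
        k = p_pred / (p_pred + r)
        x = x + k * (y[t] - x)
        p = (1.0 - k) * p_pred
        xhat[t] = x
    return xhat
```

Since q = r = 1 matches the simulated model, this filter should be the optimal linear estimator here, so its MSE against the true state gives a natural lower-bound baseline to hold the LSTM's estimates against.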