I'm trying to train an LSTM to do sentiment analysis on the IMDb reviews dataset.
As input to the word-embedding layer, I convert each review into a list of indices (the indices of its words in the vocabulary). I also considered converting the text into a one-hot/count matrix, but that ends up as a huge sparse matrix (should I be worried about this?).
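For context, the preprocessing looks roughly like this (a minimal sketch: using keras.datasets.imdb is my assumption, and the constants are derived from the model summary below, e.g. vocab_size = 201764 / 4 = 50441):

from keras.datasets import imdb
from keras.preprocessing.sequence import pad_sequences

vocab_size = 50441              # derived: 201,764 embedding params / 4 dims (see summary below)
word_embed_vector_size = 4      # embedding dimension, from the summary below
sentence_len_max = 1422         # matches the Embedding output shape in the summary

# imdb.load_data already returns each review as a list of word indices
(term_idx_train, y_train), (term_idx_test, y_test) = imdb.load_data(num_words=vocab_size)

# pad/truncate every review to a fixed length so the Embedding layer gets rectangular input
term_idx_train = pad_sequences(term_idx_train, maxlen=sentence_len_max)
term_idx_test = pad_sequences(term_idx_test, maxlen=sentence_len_max)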
Here is how I build the network architecture:
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense

model = Sequential()
# map each word index to a dense vector of size word_embed_vector_size
model.add(Embedding(
    input_dim=vocab_size,
    output_dim=word_embed_vector_size,
    input_length=sentence_len_max,
))
model.add(LSTM(units=1))
model.add(Dense(1, activation='softmax'))
model.compile(loss='binary_crossentropy', optimizer='adam',
              metrics=['accuracy', 'binary_accuracy'])
model.summary()
Here is the model summary:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
embedding_2 (Embedding)      (None, 1422, 4)           201764
_________________________________________________________________
lstm_2 (LSTM)                (None, 1)                 24
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 2
=================================================================
Total params: 201,790
Trainable params: 201,790
Non-trainable params: 0
_________________________________________________________________
Now, when I try to train the model, the accuracy stays stuck at 50%:
losses = model.fit(
    x=term_idx_train,
    y=y_train,
    epochs=epochs,          # 10, per the log below
    batch_size=batch_size,
    validation_split=0.01,
)
Here is the output for each epoch:
Epoch 1/10
25000/25000 [==============================] - 1148s 46ms/step - loss: 7.9712 - acc: 0.5000 - binary_accuracy: 0.5000
Epoch 2/10
25000/25000 [==============================] - 1156s 46ms/step - loss: 7.9712 - acc: 0.5000 - binary_accuracy: 0.5000
Epoch 3/10
25000/25000 [==============================] - 1149s 46ms/step - loss: 7.9712 - acc: 0.5000 - binary_accuracy: 0.5000
Epoch 4/10
25000/25000 [==============================] - 1110s 44ms/step - loss: 7.9712 - acc: 0.5000 - binary_accuracy: 0.5000
Epoch 5/10
16800/25000 [===================>..........] - ETA: 6:10 - loss: 7.9816 - acc: 0.4993 - binary_accuracy: 0.4993
Changing the activation function to sigmoid and the number of LSTM units to 32 doesn't help either.
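For reference, the changed model looked roughly like this (my reconstruction from the description above, not the exact code):

model = Sequential()
model.add(Embedding(
    input_dim=vocab_size,
    output_dim=word_embed_vector_size,
    input_length=sentence_len_max,
))
model.add(LSTM(units=32))                  # 32 units instead of 1
model.add(Dense(1, activation='sigmoid'))  # sigmoid instead of softmax
model.compile(loss='binary_crossentropy', optimizer='adam',
              metrics=['accuracy', 'binary_accuracy'])

The output after 1 epoch: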
Train on 24750 samples, validate on 250 samples
Epoch 1/1
24750/24750 [==============================] - 1186s 48ms/step - loss: 0.6932 - acc: 0.5022 - binary_accuracy: 0.5022 - val_loss: 0.6951 - val_acc: 0.0000e+00 - val_binary_accuracy: 0.0000e+00
Epoch 00001: val_loss improved from inf to 0.69513, saving model to sentiment_model
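(The "Epoch 00001: val_loss improved ..." line comes from a ModelCheckpoint callback passed to fit; it isn't shown in the snippet above, but it is roughly:)

from keras.callbacks import ModelCheckpoint

# save the model whenever validation loss improves
checkpoint = ModelCheckpoint('sentiment_model', monitor='val_loss',
                             save_best_only=True, verbose=1)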
Looking at the LSTM's predictions, I see:
count    25000.000000
mean         0.499023
std          0.000013
min          0.499010
25%          0.499010
50%          0.499010
75%          0.499010
max          0.499443
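(Those statistics come from something like the following; the exact call is my assumption:)

import pandas as pd

# distribution of the predicted probabilities over the training set
preds = model.predict(term_idx_train)
print(pd.Series(preds.ravel()).describe())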
Any idea why this is happening, and how can I fix it?