Accuracy improves, but val_acc oscillates in my ConvNet. What does this mean?

data-mining neural-network tensorflow hyperparameter-tuning performance convolutional-neural-network
2022-02-15 22:23:38

I am trying to classify some images with my ConvNet model. They are malware images and (I think) do not contain complex features, so I expected the model to learn to classify them easily. You can see a summary of my network topology here:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_17 (Conv2D)           (None, 128, 128, 32)      1312      
_________________________________________________________________
conv2d_18 (Conv2D)           (None, 125, 119, 32)      40992     
_________________________________________________________________
max_pooling2d_9 (MaxPooling2 (None, 62, 59, 32)        0         
_________________________________________________________________
conv2d_19 (Conv2D)           (None, 62, 59, 32)        40992     
_________________________________________________________________
conv2d_20 (Conv2D)           (None, 59, 50, 32)        40992     
_________________________________________________________________
max_pooling2d_10 (MaxPooling (None, 29, 25, 32)        0         
_________________________________________________________________
flatten_5 (Flatten)          (None, 23200)             0         
_________________________________________________________________
dense_9 (Dense)              (None, 64)                1484800   
_________________________________________________________________
batch_normalization_5 (Batch (None, 64)                256       
_________________________________________________________________
activation_5 (Activation)    (None, 64)                0         
_________________________________________________________________
dropout_5 (Dropout)          (None, 64)                0         
_________________________________________________________________
dense_10 (Dense)             (None, 8)                 520       
=================================================================
Total params: 1,609,864
Trainable params: 1,609,736
Non-trainable params: 128
_________________________________________________________________
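As a sanity check, the parameter counts in the summary can be re-derived by hand. The kernel size, input shape, and bias settings below are my inferences from the output shapes (128→125 and 128→119 under valid padding imply a 4×10 kernel; 1,484,800 = 23200×64 exactly implies dense_9 has no bias, which is common before BatchNormalization), not something stated in the post:

```python
# Reproduce the "Param #" column of the summary above.
# Assumptions (inferred, not stated): grayscale 128x128x1 input,
# 4x10 kernels everywhere, no bias on dense_9.

def conv2d_params(kh, kw, in_ch, filters):
    # (kh * kw * in_ch) weights per filter, plus one bias per filter
    return kh * kw * in_ch * filters + filters

def dense_params(n_in, n_out, bias=True):
    return n_in * n_out + (n_out if bias else 0)

counts = {
    "conv2d_17": conv2d_params(4, 10, 1, 32),               # 1,312
    "conv2d_18": conv2d_params(4, 10, 32, 32),              # 40,992
    "conv2d_19": conv2d_params(4, 10, 32, 32),              # 40,992
    "conv2d_20": conv2d_params(4, 10, 32, 32),              # 40,992
    "dense_9": dense_params(29 * 25 * 32, 64, bias=False),  # 1,484,800
    "batch_norm_5": 4 * 64,   # gamma, beta, moving mean, moving variance
    "dense_10": dense_params(64, 8),                        # 520
}

print(sum(counts.values()))  # 1609864 -- matches "Total params: 1,609,864"
```

The BatchNormalization entry also explains the 128 non-trainable parameters: of its 4 × 64 values, only gamma and beta (2 × 64 = 128) are trained; the moving mean and variance are not.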

However, after a while, the training accuracy keeps (slowly) improving while val_accuracy starts to fluctuate, and the gap between training and validation accuracy grows. I am using cross-validation on a dataset in which all classes are equally distributed. You can see some of the training output below:

Epoch 20/80
240/240 [==============================] - 64s 267ms/step - loss: 0.1247 - acc: 0.9654 - val_loss: 0.3417 - val_acc: 0.9270
Epoch 21/80
240/240 [==============================] - 66s 275ms/step - loss: 0.1030 - acc: 0.9700 - val_loss: 0.3560 - val_acc: 0.9220
Epoch 22/80
240/240 [==============================] - 76s 316ms/step - loss: 0.1085 - acc: 0.9671 - val_loss: 0.3471 - val_acc: 0.9100
Epoch 23/80
240/240 [==============================] - 66s 274ms/step - loss: 0.0804 - acc: 0.9787 - val_loss: 0.4013 - val_acc: 0.9060
Epoch 24/80
240/240 [==============================] - 64s 267ms/step - loss: 0.1004 - acc: 0.9725 - val_loss: 0.4071 - val_acc: 0.8920
Epoch 25/80
240/240 [==============================] - 64s 266ms/step - loss: 0.0859 - acc: 0.9754 - val_loss: 0.4733 - val_acc: 0.9110
Epoch 26/80
240/240 [==============================] - 64s 267ms/step - loss: 0.0980 - acc: 0.9717 - val_loss: 0.3792 - val_acc: 0.9120
Epoch 27/80
240/240 [==============================] - 64s 267ms/step - loss: 0.0807 - acc: 0.9775 - val_loss: 0.4354 - val_acc: 0.9100
Epoch 28/80
240/240 [==============================] - 64s 266ms/step - loss: 0.0806 - acc: 0.9754 - val_loss: 0.4109 - val_acc: 0.9070
Epoch 29/80
240/240 [==============================] - 64s 266ms/step - loss: 0.0652 - acc: 0.9821 - val_loss: 0.4318 - val_acc: 0.9070
Epoch 30/80
240/240 [==============================] - 64s 265ms/step - loss: 0.0604 - acc: 0.9825 - val_loss: 0.4095 - val_acc: 0.9190

Is it overfitting, or is it failing to learn because my network is not deep enough / dense_10 does not have enough neurons? I have tried two or three dense layers after the convolutional layers, but at some point it overfits anyway.

1 Answer

If the training loss keeps improving while the validation loss stagnates or gets worse, that is a sign of overfitting — which is exactly what your log shows: loss falls from 0.12 to 0.06 while val_loss drifts up from 0.34.

It means your model is still learning patterns from the training data, so the number of layers/units is certainly not too low. But those patterns do not generalize: they are not present in your validation data.

To combat this, you can increase regularization, stop training earlier (early stopping), or even reduce the number of units/layers in your network.
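In Keras, early stopping is available out of the box as the `EarlyStopping` callback (e.g. `EarlyStopping(monitor="val_loss", patience=3, restore_best_weights=True)`). The rule it implements is simple enough to sketch in plain Python; the `patience` value here is illustrative, and the loss values are taken from the training log in the question:

```python
def early_stopping_epoch(val_losses, patience=3):
    """Index of the epoch where training stops: the first epoch at which
    val_loss has failed to improve for `patience` consecutive epochs."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, wait = loss, 0  # new best: reset the patience counter
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return len(val_losses) - 1  # ran to the end without triggering

# val_loss from epochs 20-30 in the question
log = [0.3417, 0.3560, 0.3471, 0.4013, 0.4071, 0.4733,
       0.3792, 0.4354, 0.4109, 0.4318, 0.4095]
print(early_stopping_epoch(log))  # 3 -> stop at epoch 23; best was epoch 20
```

With `restore_best_weights=True`, Keras would also roll the model back to the epoch-20 weights, i.e. before the validation loss started climbing.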