Is there any way to improve the validation accuracy of this image recognition model?

Data Mining  Machine Learning  Python  Image Recognition
2022-02-22 06:03:35

I'm really new to machine learning. This model is supposed to distinguish between rock, paper, and scissors.

import tensorflow as tf
import zipfile, os

local_zip = 'rockpaperscissors.zip'
zip_ref = zipfile.ZipFile(local_zip, 'r')
zip_ref.extractall('/tmp')
zip_ref.close()

import splitfolders
base_dir = '/tmp/rockpaperscissors/rps-cv-images'
splitfolders.ratio(base_dir, output = '/tmp/rockpaperscissors', seed = 1337, ratio = (.6, .4))

train_dir = os.path.join('/tmp/rockpaperscissors', 'train')
val_dir = os.path.join('/tmp/rockpaperscissors', 'val')

rock_dir = os.path.join(base_dir, 'rock')
paper_dir = os.path.join(base_dir, 'paper')
scissors_dir = os.path.join(base_dir, 'scissors')

from sklearn.model_selection import train_test_split

train_rock_dir, val_rock_dir = train_test_split(os.listdir(rock_dir), test_size=0.4)
train_paper_dir, val_paper_dir = train_test_split(os.listdir(paper_dir), test_size=0.4)
train_scissors_dir, val_scissors_dir = train_test_split(os.listdir(scissors_dir), test_size=0.4)

train_rock = os.path.join(train_dir, 'rock')
train_paper = os.path.join(train_dir, 'paper')
train_scissors = os.path.join(train_dir, 'scissors')

val_rock = os.path.join(val_dir, 'rock')
val_paper = os.path.join(val_dir, 'paper')
val_scissors = os.path.join(val_dir, 'scissors')

from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
                rescale=1./255,
                rotation_range=20,
                horizontal_flip=True,
                shear_range = 0.2,
                fill_mode = 'nearest')

validation_datagen = ImageDataGenerator(
                rescale=1./255,
                rotation_range=20,
                horizontal_flip=True,
                shear_range = 0.2,
                fill_mode = 'nearest')

train_generator = train_datagen.flow_from_directory(
               train_dir,
               target_size=(150, 150),
               batch_size=4,
               class_mode='categorical')

validation_generator = validation_datagen.flow_from_directory(
               val_dir,
               target_size=(150, 150),
               batch_size=4,
               class_mode='categorical')

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, (3,3), padding = 'same', activation='relu', input_shape=(150, 150, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3,3), padding = 'same', activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Conv2D(64, (3,3), padding = 'same', activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Conv2D(128, (3,3), padding = 'same', activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Conv2D(128, (3,3), padding = 'same', activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dropout(0.6),
    tf.keras.layers.Dense(512, activation='elu'),
    tf.keras.layers.Dense(3, activation='softmax')
 ])

from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
es = EarlyStopping(
    monitor = 'val_accuracy',
    min_delta = 0.0001,
    patience = 10,
    verbose = 1,
    mode = 'max'
)

red = ReduceLROnPlateau(
    monitor = 'val_accuracy',
    factor = 0.3,
    patience = 10,
    verbose = 0,
    mode = 'auto',
    min_delta = 0.0001,
    cooldown = 4,
    min_lr = 10e-7
)

es2 = EarlyStopping(
    monitor = 'val_loss',
    min_delta = 0.0001,
    patience = 10,
    verbose = 1,
    mode = 'min'
)

red2 = ReduceLROnPlateau(
    monitor = 'val_loss',
    factor = 0.2,
    patience = 10,
    verbose = 0,
    mode = 'auto',
    min_delta = 0.0001,
    cooldown = 4,
    min_lr = 10e-7
)

cb_list = [es, red, es2, red2]

model.compile(loss=tf.keras.losses.CategoricalCrossentropy(),
              optimizer=tf.keras.optimizers.Adam(),
              metrics=['accuracy'])

model.fit(
      train_generator,
      steps_per_epoch = 25,
      epochs=50, 
      validation_data = validation_generator,
      validation_steps = 5,
      callbacks = cb_list,
      shuffle = True,
      verbose=2)

Here are the training results.

Epoch 1/50
25/25 - 6s - loss: 1.2190 - accuracy: 0.3600 - val_loss: 1.2458 - val_accuracy: 0.2000
Epoch 2/50
25/25 - 5s - loss: 1.1036 - accuracy: 0.3200 - val_loss: 1.0900 - val_accuracy: 0.4500
Epoch 3/50
25/25 - 5s - loss: 1.0969 - accuracy: 0.3700 - val_loss: 1.0796 - val_accuracy: 0.4000
Epoch 4/50
25/25 - 5s - loss: 1.1017 - accuracy: 0.3100 - val_loss: 1.0607 - val_accuracy: 0.4000
Epoch 5/50
25/25 - 5s - loss: 1.0996 - accuracy: 0.4100 - val_loss: 1.1253 - val_accuracy: 0.2000
Epoch 6/50
25/25 - 5s - loss: 1.1087 - accuracy: 0.3200 - val_loss: 1.1006 - val_accuracy: 0.3000
Epoch 7/50
25/25 - 5s - loss: 1.1033 - accuracy: 0.2900 - val_loss: 1.0933 - val_accuracy: 0.4000
Epoch 8/50
25/25 - 5s - loss: 1.0941 - accuracy: 0.3700 - val_loss: 1.0923 - val_accuracy: 0.3500
Epoch 9/50
25/25 - 5s - loss: 1.0937 - accuracy: 0.4300 - val_loss: 1.1142 - val_accuracy: 0.4000
Epoch 10/50
25/25 - 5s - loss: 0.9846 - accuracy: 0.5100 - val_loss: 1.0261 - val_accuracy: 0.4000
Epoch 11/50
25/25 - 5s - loss: 0.9833 - accuracy: 0.6000 - val_loss: 0.6188 - val_accuracy: 0.8000
Epoch 12/50
25/25 - 5s - loss: 0.7028 - accuracy: 0.6700 - val_loss: 0.5505 - val_accuracy: 0.8000
Epoch 13/50
25/25 - 5s - loss: 0.7925 - accuracy: 0.7400 - val_loss: 0.7147 - val_accuracy: 0.7000
Epoch 14/50
25/25 - 5s - loss: 0.7215 - accuracy: 0.7500 - val_loss: 0.3434 - val_accuracy: 0.8500
Epoch 15/50
25/25 - 5s - loss: 0.5115 - accuracy: 0.8200 - val_loss: 0.4057 - val_accuracy: 0.8500
Epoch 16/50
25/25 - 5s - loss: 0.5340 - accuracy: 0.7400 - val_loss: 0.2904 - val_accuracy: 0.9000
Epoch 17/50
25/25 - 5s - loss: 0.4614 - accuracy: 0.8200 - val_loss: 0.6332 - val_accuracy: 0.8000
Epoch 18/50
25/25 - 5s - loss: 0.4115 - accuracy: 0.8200 - val_loss: 0.2076 - val_accuracy: 0.9500
Epoch 19/50
25/25 - 5s - loss: 0.5848 - accuracy: 0.7700 - val_loss: 0.2534 - val_accuracy: 0.9500
Epoch 20/50
25/25 - 5s - loss: 0.2918 - accuracy: 0.9000 - val_loss: 0.1854 - val_accuracy: 0.8500
Epoch 21/50
25/25 - 5s - loss: 0.4666 - accuracy: 0.8000 - val_loss: 0.2347 - val_accuracy: 0.9000
Epoch 22/50
25/25 - 5s - loss: 0.4083 - accuracy: 0.8100 - val_loss: 0.4281 - val_accuracy: 0.8500
Epoch 23/50
25/25 - 5s - loss: 0.5365 - accuracy: 0.8100 - val_loss: 0.3296 - val_accuracy: 0.8500
Epoch 24/50
25/25 - 5s - loss: 0.5211 - accuracy: 0.8000 - val_loss: 0.1931 - val_accuracy: 0.9500
Epoch 25/50
25/25 - 5s - loss: 0.3975 - accuracy: 0.8200 - val_loss: 0.2931 - val_accuracy: 0.8500
Epoch 26/50
25/25 - 5s - loss: 0.4249 - accuracy: 0.8600 - val_loss: 0.2481 - val_accuracy: 0.8500
Epoch 27/50
25/25 - 5s - loss: 0.3410 - accuracy: 0.8900 - val_loss: 0.4368 - val_accuracy: 0.8000
Epoch 28/50
25/25 - 5s - loss: 0.2647 - accuracy: 0.8600 - val_loss: 0.1591 - val_accuracy: 0.9500
Epoch 00028: early stopping
<tensorflow.python.keras.callbacks.History at 0x7fb704b6b080>

Is there any way to further improve val_accuracy, or the accuracy in general?

There are a lot of different opinions online. One source suggests adding more layers, while another suggests reducing them. Why is that?

I've tried running it several times, and even though I know shuffling is enabled, I get strangely different results (this run's best was 0.95, another run gave 0.50 or 0.75, and so on). Shouldn't I get roughly the same result each time?

And how can I check the overall accuracy of this model?

Apologies in advance if there are any unnecessary functions, variables, or mistakes in the code.

1 Answer

Look at your training loss versus your validation loss. As you can see, your model's validation accuracy is generally quite a bit higher than its training accuracy, and the accuracy fluctuates noticeably from epoch to epoch. I also don't think your model architecture is well tuned yet; work on making it a better fit for your data. I would say your model is underfitting the data, so either build a more complex model or use a pre-built, proven model such as ResNet, VGG, etc.
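
A rough sketch of the transfer-learning route, using VGG16 purely as an illustration (the frozen base, the 150x150 input chosen to match your generators, the dropout rate, and the learning rate are all assumptions, not values from your setup; also note that VGG16 is normally paired with its own preprocess_input rather than a plain 1/255 rescale):

import tensorflow as tf

# Sketch only: frozen pretrained VGG16 base with a small classification head.
base = tf.keras.applications.VGG16(
    weights='imagenet',       # pretrained ImageNet weights
    include_top=False,        # drop the original ImageNet classifier
    input_shape=(150, 150, 3))
base.trainable = False        # freeze the convolutional base at first

model = tf.keras.models.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(3, activation='softmax')  # rock / paper / scissors
])

model.compile(loss='categorical_crossentropy',
              optimizer=tf.keras.optimizers.Adam(1e-4),  # small LR is a guess
              metrics=['accuracy'])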

Also, since your epochs run quickly, I assume your dataset is small. If that is the case, try to find more data or apply data augmentation.
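
For example, a heavier train-time augmentation setup might look like the sketch below (the specific ranges are illustrative guesses, not tuned values). Augmentation is usually applied only to the training generator; keeping the validation generator to a plain rescale also makes val_accuracy less noisy:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Illustrative augmentation; the ranges are assumptions, not tuned values.
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=30,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.2,
    shear_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')

# Validation images are typically only rescaled, not augmented.
validation_datagen = ImageDataGenerator(rescale=1./255)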

Judging from the training run, I think a better model could learn more from your data. At the very least, it does not look like it is converging to a good optimum, since the accuracy and loss fluctuate so much.

If that is still the case, increase the number of epochs for your new model and keep the history of the training run so you can pick the best epoch, i.e. the one with the highest validation accuracy.
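
A minimal sketch of that idea, keeping the History object and saving the weights of the best epoch (the filename and the epoch budget are arbitrary choices here):

import numpy as np
import tensorflow as tf

# Save the weights of the epoch with the highest validation accuracy.
checkpoint = tf.keras.callbacks.ModelCheckpoint(
    'best_model.h5',            # arbitrary filename
    monitor='val_accuracy',
    mode='max',
    save_best_only=True,
    verbose=1)

history = model.fit(
    train_generator,
    epochs=100,                 # larger epoch budget, chosen arbitrarily
    validation_data=validation_generator,
    callbacks=[checkpoint])

# Pick the best epoch from the recorded history.
best_epoch = int(np.argmax(history.history['val_accuracy'])) + 1
print('Best epoch:', best_epoch,
      'best val_accuracy:', max(history.history['val_accuracy']))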