Why does my Keras model always predict the same class / how can I improve this model's accuracy?

data-mining keras multiclass-classification
2022-02-12 21:37:32

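First time posting here. I'm working on a multi-class image classification project and wrote a Python script that uses Keras to train a model with transfer learning. To my frustration, the model always predicts the same class. I have reduced the problem to 3 image classes (I'm using the Kaggle food image dataset, with 800 training samples and 800 validation samples per class, and the images reformatted) and tried different optimizers, but the predictions still collapse to the same class, and the model only reaches an accuracy of ~0.2563 after 25 training epochs. I've posted the code below. How can I improve the accuracy of this script and fix the same-predicted-class problem?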

import pandas as pd
import numpy as np
import os
import keras
import matplotlib.pyplot as plt
from keras.layers import Dense, GlobalAveragePooling2D
from keras.preprocessing import image
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Model
from keras import optimizers
from keras import applications
from keras.applications.vgg16 import preprocess_input

img_classes = 3

base_model = applications.VGG16(weights='imagenet', include_top=False)

x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
x = Dense(1024, activation='relu')(x)
x = Dense(512, activation='relu')(x)
preds = Dense(img_classes, activation='softmax')(x)

model = Model(inputs=base_model.input, outputs=preds)

for i, layer in enumerate(model.layers):
    print(i, layer.name)

for layer in model.layers[:25]:
    layer.trainable = False

train_datagen = ImageDataGenerator(rescale=1./255,
                                   rotation_range=40,
                                   width_shift_range=0.2,
                                   height_shift_range=0.2,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True,
                                   fill_mode='nearest',
                                   preprocessing_function=preprocess_input)

train_generator = train_datagen.flow_from_directory('./food-101/bigtrain',
                                                    target_size=(128, 128),
                                                    color_mode='rgb',
                                                    classes=['apple_pie', 'churros', 'miso_soup'],
                                                    batch_size=1,
                                                    class_mode='categorical',
                                                    shuffle=True)

val_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest',
    preprocessing_function=preprocess_input,)

val_generator = val_datagen.flow_from_directory(
    './food-101/bigval',
    target_size=(128, 128),
    classes=['apple_pie', 'churros', 'miso_soup'],
    batch_size=1,
    class_mode='categorical',
    shuffle=True)

# model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])

model.compile(optimizer=optimizers.SGD(lr=0.00001,
                                       momentum=0.9,
                                       decay=0.0001,
                                       nesterov=True), loss='categorical_crossentropy', metrics=['accuracy'])

batch_size = 1


validation_steps = 64 // batch_size
step_size_train = train_generator.n//train_generator.batch_size

model.fit_generator(generator=train_generator,
                    steps_per_epoch=step_size_train,
                    epochs=25,
                    validation_data=val_generator,
                    validation_steps=validation_steps)

model.save('./test_try_vgg_9.h5')

Example prediction results:

Classes: apple_pie, churros, miso_soup

miso soup
[0.3202575  0.48074356 0.19899891] rmsprop 
[0.45246536 0.4505403  0.09699439] sgd

churros
[0.37473327 0.35784692 0.2674198 ] rmsprop
[0.4145825  0.465228   0.12018944] sgd

Here is the prediction script:

from keras.models import load_model
from keras import optimizers
from keras.preprocessing import image
import numpy as np
from keras.applications.vgg16 import preprocess_input

# dimensions of our images
img_width, img_height = 512, 512

# load model
model = load_model('./test_try_vgg_9.h5')

# predicting images
img = image.load_img('./food-101/training/apple_pie/551535.jpg')
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

pred = model.predict(x)
print("Probability: ")
print(pred[0])

1 Answer

You have set every parameter in the model to non-trainable.

You can easily check this by printing the model summary:

model.summary() # This command prints the summary of the model

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, None, None, 3)     0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, None, None, 64)    1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, None, None, 64)    36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, None, None, 64)    0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, None, None, 128)   73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, None, None, 128)   147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, None, None, 128)   0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, None, None, 256)   295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, None, None, 256)   590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, None, None, 256)   590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, None, None, 256)   0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, None, None, 512)   1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, None, None, 512)   0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, None, None, 512)   0         
_________________________________________________________________
global_average_pooling2d_1 ( (None, 512)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 1024)              525312    
_________________________________________________________________
dense_2 (Dense)              (None, 1024)              1049600   
_________________________________________________________________
dense_3 (Dense)              (None, 512)               524800    
_________________________________________________________________
dense_4 (Dense)              (None, 3)                 1539      
=================================================================
Total params: 16,815,939
Trainable params: 0
Non-trainable params: 16,815,939
_________________________________________________________________

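As you can see at the bottom of the summary, you have 0 trainable weights. The model has only 24 layers, so the slice [:25] freezes all of them, including the new Dense layers. To get what you want, change the for loop that sets each layer's trainable attribute from [:25] to [:20]: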

for layer in model.layers[:20]:
    layer.trainable = False
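
As a rough sanity check, the parameter counts you should expect after this change can be worked out from the summary itself. The sketch below only does arithmetic on the numbers printed above; the helper name dense_param_count is made up for illustration, and the input sizes assume each Dense layer is fed by the layer listed just before it.

```python
# Parameter counts copied from the model.summary() output above.
dense_params = {
    "dense_1": 525312,
    "dense_2": 1049600,
    "dense_3": 524800,
    "dense_4": 1539,
}
total_params = 16815939


def dense_param_count(n_in, n_out):
    """Weights plus biases of a fully connected layer (hypothetical helper)."""
    return (n_in + 1) * n_out


# Cross-check the summary against the Dense layer shapes (512 -> 1024 -> 1024 -> 512 -> 3).
expected = {
    "dense_1": dense_param_count(512, 1024),
    "dense_2": dense_param_count(1024, 1024),
    "dense_3": dense_param_count(1024, 512),
    "dense_4": dense_param_count(512, 3),
}

# With model.layers[:20] frozen, only the four Dense layers stay trainable.
trainable = sum(dense_params.values())
frozen = total_params - trainable

print(expected == dense_params)  # True
print(trainable)                 # 2101251
print(frozen)                    # 14714688
```

If the summary printed after the fix shows roughly 2.1 million trainable parameters and about 14.7 million non-trainable ones, the new head is being trained while the VGG16 base stays frozen.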

model.layers[20] is your first new layer (named "dense_1" in the summary):

model.layers[20].name
'dense_1'
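
To see why the slice boundary matters without loading the model, here is a minimal sketch in plain Python. It only mimics the layer list from the summary as strings; the function still_trainable is a hypothetical helper, not part of Keras.

```python
# Layer names in the order model.summary() lists them (24 layers in total).
layer_names = (
    ["input_1"]
    + ["block1_conv1", "block1_conv2", "block1_pool"]
    + ["block2_conv1", "block2_conv2", "block2_pool"]
    + ["block3_conv1", "block3_conv2", "block3_conv3", "block3_pool"]
    + ["block4_conv1", "block4_conv2", "block4_conv3", "block4_pool"]
    + ["block5_conv1", "block5_conv2", "block5_conv3", "block5_pool"]
    + ["global_average_pooling2d_1"]
    + ["dense_1", "dense_2", "dense_3", "dense_4"]
)


def still_trainable(names, freeze_upto):
    """Return the layers left trainable after freezing names[:freeze_upto]."""
    return names[freeze_upto:]


print(len(layer_names))                  # 24
print(still_trainable(layer_names, 25))  # []
print(still_trainable(layer_names, 20))  # ['dense_1', 'dense_2', 'dense_3', 'dense_4']
```

Because Python slices silently clamp to the list length, [:25] on a 24-element list raises no error; it simply selects every layer. Looping over base_model.layers instead of a hard-coded index is one way to avoid this kind of off-by-a-few mistake.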