How can I get the same input and output shape in an autoencoder?

artificial-intelligence convolutional-neural-network tensorflow keras autoencoder
2021-11-01 10:43:32

I am building a denoising autoencoder. I want the output image to have the same shape as the input image.

Here is my architecture:

input_img = Input(shape=(IMG_HEIGHT, IMG_WIDTH, 1))  

x = Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)



x = Conv2D(32, (3, 3), activation='relu', padding='valid')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)


# decodedSize = K.int_shape(decoded)[1:]

# x_size = K.int_shape(input_img)
# decoded = Reshape(decodedSize, input_shape=decodedSize)(decoded)


autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')

My input shape is 1169x827.

This is the Keras output:

Model: "model_6"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_7 (InputLayer)         [(None, 1169, 827, 1)]    0         
_________________________________________________________________
conv2d_30 (Conv2D)           (None, 1169, 827, 32)     320       
_________________________________________________________________
max_pooling2d_12 (MaxPooling (None, 585, 414, 32)      0         
_________________________________________________________________
conv2d_31 (Conv2D)           (None, 585, 414, 64)      18496     
_________________________________________________________________
max_pooling2d_13 (MaxPooling (None, 293, 207, 64)      0         
_________________________________________________________________
conv2d_32 (Conv2D)           (None, 291, 205, 32)      18464     
_________________________________________________________________
up_sampling2d_12 (UpSampling (None, 582, 410, 32)      0         
_________________________________________________________________
conv2d_33 (Conv2D)           (None, 582, 410, 32)      9248      
_________________________________________________________________
up_sampling2d_13 (UpSampling (None, 1164, 820, 32)     0         
_________________________________________________________________
conv2d_34 (Conv2D)           (None, 1162, 818, 1)      289       
===============================================================

How can I get the same input and output shape?

2 Answers

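If you look at the Keras output, there are several steps where pixels are lost:

Max pooling on an odd-sized dimension never round-trips exactly: with padding='same' the size is rounded up (1169 becomes 585), so the later 2x upsampling cannot land back on the original size. A Conv2D with a 3x3 kernel also loses 2 pixels per dimension, although, confusingly, this does not seem to happen in the downsampling steps (those convolutions use padding='same', while the first decoder convolution uses padding='valid').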
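The arithmetic behind that observation can be checked without building the model. The following is a minimal sketch in plain Python, not Keras code: the helper names (`conv2d`, `max_pool`, `up_sample`, `trace`) are made up for illustration, and it assumes stride-1 convolutions and the padding modes exactly as written in the question's code.

```python
import math


def conv2d(n, kernel=3, padding="same"):
    """Output length of a stride-1 convolution along one dimension."""
    return n if padding == "same" else n - kernel + 1


def max_pool(n, pool=2):
    """Output length of a pooling layer with padding='same' (rounds up)."""
    return math.ceil(n / pool)


def up_sample(n, factor=2):
    """Output length of an upsampling layer."""
    return n * factor


# Layer sequence copied from the question's code (assumed, not inspected).
LAYERS = [
    lambda n: conv2d(n, padding="same"),
    max_pool,
    lambda n: conv2d(n, padding="same"),
    max_pool,
    lambda n: conv2d(n, padding="valid"),  # the only 'valid' convolution
    up_sample,
    lambda n: conv2d(n, padding="same"),
    up_sample,
    lambda n: conv2d(n, padding="same"),
]


def trace(n):
    """Return the size of one spatial dimension after every layer."""
    sizes = [n]
    for layer in LAYERS:
        n = layer(n)
        sizes.append(n)
    return sizes


print(trace(1169))  # height
print(trace(827))   # width
```

Following the posted code, the output comes out at 1164 x 820 instead of 1169 x 827. The posted summary ends at 1162 x 818, which is what the same arithmetic gives if the last convolution also used padding='valid'.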

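Intuitively, the simplest fix is to pad the original image with enough border pixels to compensate for the pixels lost in the various layers. I can't work out the exact amount right now, but I suspect that rounding each dimension up to a multiple of 4 should take care of the two max pooling layers. For denoising, you could simply replicate the outermost pixels into the border, possibly with some low-pass filtering to avoid artifacts.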
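As a rough check of the multiple-of-4 idea, here is a small plain-Python sketch that computes how much padding each dimension would need and confirms that two 2x poolings followed by two 2x upsamplings then return to the padded size. The names `pad_to_multiple` and `round_trip` are hypothetical helpers, and the check assumes every convolution uses padding='same', so that only pooling and upsampling change the size.

```python
import math


def pad_to_multiple(size, multiple=4):
    """Return (before, after) padding that rounds `size` up to a multiple."""
    target = math.ceil(size / multiple) * multiple
    total = target - size
    before = total // 2
    return before, total - before


def round_trip(n, n_pools=2):
    """Size after n_pools 2x 'same' poolings, then n_pools 2x upsamplings."""
    for _ in range(n_pools):
        n = math.ceil(n / 2)
    return n * 2 ** n_pools


height, width = 1169, 827
pad_h = pad_to_multiple(height)   # (top, bottom)
pad_w = pad_to_multiple(width)    # (left, right)
padded_h = height + sum(pad_h)
padded_w = width + sum(pad_w)

print(pad_h, pad_w, padded_h, padded_w)
print(round_trip(height), round_trip(padded_h))  # unpadded vs padded height
```

In Keras, pairs like these could be passed as `((top, bottom), (left, right))` to a `ZeroPadding2D` layer in front of the encoder and to a `Cropping2D` layer after the decoder. Zero padding is only one option: replicating the edge pixels, as suggested above, would have to be done on the images before they are fed to the model.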

I don't know whether this is the right way to do it, but I solved the problem.

After the code above, I added:

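import tensorflow as tf
from tensorflow.keras import backend as K

# Input shape without the batch dimension: (height, width, channels)
img_size = K.int_shape(input_img)[1:]

# Resize the decoder output to the input's height and width
resized_image_tensor = tf.image.resize(decoded, list(img_size[:2]))

autoencoder = Model(input_img, resized_image_tensor)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')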

I use tf.image.resize to make the shape of the reconstructed image match the shape of the input image.

Hope this helps.