Why does the discriminator loss in my GAN quickly drop to 10^-4 while the generator loss sits at 5+?

data-mining deep-learning tensorflow convolutional-neural-network generative-models
2022-03-12 22:05:23

I am building a generative adversarial network (GAN) to generate artificial trading cards, but I am new to the field. The problem I keep running into is that my discriminator, even though it is the weaker network (in terms of learnable parameters), sees its loss drop to around 10^-4. In contrast, the generator loss climbs from 5+ to above 10 within the first few epochs. On top of that, the discriminator's accuracy on both real and fake images hits 100% almost immediately, with at most a 2% difference between the two.

My current generator model:

# Imports shared by all of the snippets below
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.layers import (BatchNormalization, Conv2D, Conv2DTranspose,
                                     Dense, Dropout, Flatten, LeakyReLU, Reshape)

def generator_model():

    model = tf.keras.Sequential()

    # First Dense Layer
    model.add(Dense(8*8*64, input_dim=100)) #input_shape=(100,)))
    model.add(BatchNormalization())
    model.add(LeakyReLU())

    model.add(Reshape((8, 8, 64)))

    # First Conv2DTranspose Layer
    model.add(Conv2DTranspose(128, (5, 5), strides=(1, 1), padding='same', use_bias=False))
    model.add(LeakyReLU())

    # Second Conv2DTranspose Layer
    model.add(Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same', use_bias=False))
    model.add(LeakyReLU())

    # Third Conv2DTranspose Layer
    model.add(Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same', use_bias=False))
    model.add(LeakyReLU())

    # Fourth Conv2DTranspose Layer
    model.add(Conv2DTranspose(32, (5, 5), strides=(2, 2), padding='same', use_bias=False))
    model.add(LeakyReLU())

    # Fifth Conv2DTranspose Layer
    model.add(Conv2DTranspose(3, (3, 3), strides=(1, 1), padding='same', use_bias=False, activation='tanh'))

    return model
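
As a quick sanity check (just an illustrative snippet, not part of my training code), the generator's output shape can be inspected to confirm it matches the 64x64x3 images the discriminator expects:

gen = generator_model()
gen.summary()               # the final Conv2DTranspose layer should report (None, 64, 64, 3)
print(gen.output_shape)     # expected: (None, 64, 64, 3)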

My current discriminator model:

def discriminator_model():

    model = tf.keras.Sequential()

    # First Conv2D Layer
    model.add(Conv2D(64, (5, 5), strides=(1, 1), padding='same', input_shape=[64, 64, 3]))
    model.add(LeakyReLU())
    model.add(Dropout(0.3))

    # Second Conv2D Layer
    model.add(Conv2D(64, (5, 5), strides=(2, 2), padding='same'))
    model.add(LeakyReLU())
    model.add(Dropout(0.3))

    # Third Conv2D Layer
    model.add(Conv2D(64, (5, 5), strides=(2, 2), padding='same'))
    model.add(LeakyReLU())
    model.add(Dropout(0.3))

    # Fourth Conv2D Layer
    model.add(Conv2D(32, (5, 5), strides=(2, 2), padding='same'))
    model.add(LeakyReLU())
    model.add(Dropout(0.3))

    # Flatten the Output and Give Binary Output via Sigmoid Activation Function
    model.add(Flatten())
    model.add(Dense(1, activation='sigmoid'))

    optimizer = tf.keras.optimizers.Adam(learning_rate=2e-4, beta_1=0.5)

    # Compile the Discriminator Model
    model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])

    return model
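
Since I claim above that the discriminator is the weaker network, this is roughly how I compared the two (illustrative snippet; the exact counts depend on the layer configurations above):

# Compare the number of learnable parameters in both networks
print('Generator parameters:    ', generator_model().count_params())
print('Discriminator parameters:', discriminator_model().count_params())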

My current GAN model:

def gan_model(generator, discriminator):
    GAN = tf.keras.Sequential()
    discriminator.trainable = False

    GAN.add(generator)
    GAN.add(discriminator)

    optimizer = tf.keras.optimizers.Adam(learning_rate=2e-4, beta_1=0.5)

    GAN.compile(loss='binary_crossentropy', optimizer=optimizer)

    return GAN

My current GAN training method (I wonder whether this is a poorly constructed training step):

def training_gan(gan_model, discriminator, generator, batch_size=256, epochs=100, epoch_steps=468, noise_dim=100):
    # Training the model by enumerating epochs 
    for epoch in range(0,epochs): 
        for step in range(0, epoch_steps):
            # Generating fake images 
            X_fake, y_fake = generate_img_using_model(generator, noise_dim, batch_size)
            # Generating real images 
            X_real, y_real = generate_real_images(batch_size)
            # Creating training set
            X_batch = np.concatenate([X_real, X_fake], axis = 0)
            y_batch = np.concatenate([y_real, y_fake], axis = 0)      
            # Training the discriminator
            d_loss, _ = discriminator.train_on_batch(X_batch, y_batch)
            # Generating noise input for the generator 
            X_gan = np.random.randn(noise_dim * batch_size)
            X_gan = X_gan.reshape(batch_size, noise_dim)
            y_gan = np.ones((batch_size, 1))
            # Training the GAN model using the generated noise 
            gan_loss = gan_model.train_on_batch(X_gan,y_gan)
            # Report the training progress
            report_progress(epoch=epoch, step=step, d_loss=d_loss, gan_loss=gan_loss, noise_dim=noise_dim, epoch_steps=epoch_steps)
        # Report the progress on the full epoch
        report_progress(epoch=epoch, step=step, d_loss=d_loss, gan_loss=gan_loss, noise_dim=noise_dim, epoch_steps=epoch_steps, gan_model=gan_model, generator=generator, discriminator=discriminator, eoe=True)
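
For completeness, the helper functions "generate_real_images" and "generate_img_using_model" are not shown above; they look roughly like the following (simplified sketch, assuming the real training images live in a NumPy array "dataset" that is already scaled to [-1, 1]):

def generate_real_images(n_samples):
    # Sample n_samples real images and label them as real (1)
    idx = np.random.randint(0, dataset.shape[0], n_samples)
    X = dataset[idx]
    y = np.ones((n_samples, 1))
    return X, y

def generate_img_using_model(generator, noise_dim, n_samples):
    # Generate n_samples fake images from random noise and label them as fake (0)
    noise = np.random.randn(n_samples, noise_dim)
    X = generator.predict(noise, verbose=0)
    y = np.zeros((n_samples, 1))
    return X, y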

My current "report_progress" method:

def report_progress(epoch, step, d_loss, gan_loss, noise_dim = None, epoch_steps= None, gan_model=None, generator=None, discriminator=None, n_samples=100, eoe= False):
    if eoe and step == (epoch_steps-1):
        # Report a full epoch training performance
        # Sample some real images from the training set
        X_real, y_real = generate_real_images(n_samples)
        # Measure the accuracy of the discriminator on real sampled images
        _ , acc_real = discriminator.evaluate(X_real, y_real, verbose=0)
        # Generates fake examples
        X_fake, y_fake = generate_img_using_model(generator, noise_dim, n_samples)
        # evaluate discriminator on fake images
        _, acc_fake = discriminator.evaluate(X_fake, y_fake, verbose=0)
        # summarize discriminator performance
        # plot images
        plt.figure(figsize=(20, 12), dpi=64)
        for i in range(10 * 10):
            # define subplot
            plt.subplot(10, 10, 1 + i)
            # turn off axis
            plt.axis('off')
            # plot raw pixel data
            plt.imshow(upscale_image(X_fake[i, :, :, :])) #, cmap='gray_r')
            #plt.show()
        filename = 'generated_examples_epoch%04d.png' % (epoch+1)
        plt.savefig(filename)
        print('Discriminator Accuracy on real images: %.0f%%, on fake images: %.0f%%' % (acc_real*100, acc_fake*100))
        # save the generator model tile file
        filename = 'generator_epochs/generator_model_%04d.h5' % epoch
        generator.save(filename)
        filename = 'discriminator_epochs/discriminator_model_%04d.h5' % epoch
        discriminator.save(filename)
        filename = 'GAN_epochs/GAN_model_%04d.h5' % epoch
        gan_model.save(filename)
    elif step % 10 == 0:
        # Report a single step training performance 
        print(f"[Epoch {epoch}, Step {step}] d_loss = {round(d_loss, 4)} | gan_loss = {round(gan_loss, 4)}")

Note: All of the code above comes largely from a tutorial that I no longer have a link to.

Additional questions:

  • Why is "use_bias" set to "False" in all of the "Conv2DTranspose" layers in so many tutorials?
  • Why is a "tanh" activation used in the final "Conv2DTranspose" layer?
  • Why is "discriminator.trainable" set to "False" in the GAN model?

Any advice and/or recommended reading on the fundamentals of building generator and discriminator networks that work well together would also be greatly appreciated.

1 Answer

I can give you an answer to two of the three additional questions. The activation of the last transposed-convolution layer is most likely set to tanh because the input data is also preprocessed to lie in the (-1, 1) range, which ensures that the real and the generated images have similar distributions. If the last layer used a different activation (for example a sigmoid), the ranges of the two distributions would differ considerably, which would make it very easy for the discriminator to decide whether an image is real or fake.
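
For example, a typical preprocessing step that pairs with the tanh output looks something like this (illustrative sketch, assuming the training images are uint8 arrays in [0, 255]):

import numpy as np

def scale_to_tanh_range(images_uint8):
    # Map pixel values from [0, 255] to [-1, 1] so that real images share
    # the value range of the generator's tanh output
    return (images_uint8.astype(np.float32) - 127.5) / 127.5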

As for the second question I can answer, the one about setting the "trainable" attribute of the discriminator to "False" inside the "gan_model" function: this appears to control when the discriminator gets trained, namely when "discriminator.train_on_batch" is called, but not when the combined model is trained via "gan.train_on_batch" (see also this answer on a GitHub issue).
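
In Keras, whether a layer's weights get updated is determined by its trainable flag at the time each model is compiled, so the usual pattern is roughly the following (sketch based on the code in the question):

discriminator = discriminator_model()        # compiled while its weights are still trainable
generator = generator_model()
gan = gan_model(generator, discriminator)    # trainable is set to False before gan is compiled

# discriminator.train_on_batch(...) updates the discriminator, because its own
# compile() captured trainable=True; gan.train_on_batch(...) only updates the
# generator, because the discriminator was frozen when the combined model was compiled.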