Suppose I have a network with the following parameters:
a fully convolutional network for semantic segmentation
loss = weighted binary cross-entropy (but it could be any loss function, it doesn't matter)
5 classes - the inputs are images and the ground truths are binary masks
batch size = 16
Now, I know that the loss is calculated in the following way: binary cross-entropy is applied to each pixel in the image with respect to each class. So essentially, each pixel will have 5 loss values.
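To make the shapes concrete, here is a minimal numpy sketch of that per-pixel, per-class BCE step (the sizes and names are just for illustration, this is not Keras internals):

import numpy as np

H, W, num_classes = 4, 4, 5   # tiny "image" for illustration
y_true = np.random.randint(0, 2, (H, W, num_classes)).astype(float)
y_pred = np.random.uniform(0.01, 0.99, (H, W, num_classes))

# elementwise binary cross-entropy: one loss value per pixel per class
pixel_class_loss = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
print(pixel_class_loss.shape)   # (4, 4, 5) -> 5 loss values for every pixel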
What happens after this step?
When I train my network, it prints only a single loss value per epoch. There are multiple levels of loss accumulation that must happen to produce that single value, and how it happens is not at all clear from the documentation/code.
What gets combined first - (1) the loss values across the classes (e.g. the 5 values, one per class, are combined for each pixel) and then across all the pixels in the image, or (2) across all the pixels in the image for each individual class, and then all the class losses are combined?
How exactly do these different combinations happen - where is it summed / where is it averaged?
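One thing I did notice while experimenting (plain numpy, just my own check): if both reductions are plain unweighted means, orders (1) and (2) give the exact same scalar, so the distinction only really matters once sums or per-class weights are involved:

import numpy as np

pixel_class_loss = np.random.rand(4, 4, 5)   # (H, W, classes)

order1 = pixel_class_loss.mean(axis=-1).mean()      # (1) classes first, then pixels
order2 = pixel_class_loss.mean(axis=(0, 1)).mean()  # (2) pixels first, then classes
print(np.allclose(order1, order2))                  # True: identical for plain means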
Keras's binary_crossentropy averages over axis=-1. So is that an average across all the pixels of all the classes, or an average across all the classes, or is it both?
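A small check with the backend (assuming Keras 2.x with channels last) suggests that only the last axis is averaged away at this point, i.e. binary_crossentropy turns a (batch, H, W, classes) tensor into a (batch, H, W) per-pixel loss:

import numpy as np
from keras import backend as K

y_true = K.constant(np.random.randint(0, 2, (2, 4, 4, 5)).astype('float32'))
y_pred = K.constant(np.random.uniform(0.01, 0.99, (2, 4, 4, 5)).astype('float32'))

elementwise = K.binary_crossentropy(y_true, y_pred)  # shape (2, 4, 4, 5)
per_pixel = K.mean(elementwise, axis=-1)             # what losses.binary_crossentropy returns
print(K.int_shape(elementwise), K.int_shape(per_pixel))  # (2, 4, 4, 5) (2, 4, 4)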
To put it differently: how are the losses for the different classes combined to produce a single loss value for an image?
This isn't explained in the documentation at all, and would be very useful to know for people doing any kind of multi-class prediction with Keras, regardless of network type. Here is the link to the start of the Keras code where one first passes in the loss function.
The closest thing I could find to an explanation is this, from the Keras documentation:
loss: String (name of objective function) or objective function. See losses. If the model has multiple outputs, you can use a different loss on each output by passing a dictionary or a list of losses. The loss value that will be minimized by the model will then be the sum of all individual losses.
So does that mean the losses for each class in the image are simply summed?
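If I read engine/training.py correctly (this is my own reading of Keras 2.x, not something stated in the docs), that "sum" only applies across multiple model outputs; within a single output, whatever tensor the loss function returns is simply averaged over all remaining axes. A sketch in plain numpy of what I think happens:

import numpy as np

batch, H, W, C = 16, 256, 256, 5
elementwise = np.random.rand(batch, H, W, C)  # per-pixel, per-class BCE values

per_pixel = elementwise.mean(axis=-1)  # done inside the loss function (axis=-1)
scalar = per_pixel.mean()              # Keras then takes K.mean over batch + pixels
print(scalar)                          # the single number reported per batch

So for a single-output model like mine there would be no summing at all, just nested means - is that right?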
Here is some example code for someone to try out. It's a basic implementation borrowed from Kaggle and modified for multi-label prediction:
import tensorflow as tf
from keras import backend as K
from keras.layers import (Input, Lambda, Conv2D, MaxPooling2D,
                          Conv2DTranspose, concatenate)
from keras.models import Model

# Build U-Net model
num_classes = 5
IMG_DIM = 256
IMG_CHAN = 3
weights = {0: 1, 1: 1, 2: 1, 3: 1, 4: 1000}  # chose an extreme value just to check for any reaction

# BCE_loss, dice_coef and mean_iou come from the BCE-DICE implementation linked below

def weighted_loss(weightsList):
    def lossFunc(true, pred):
        axis = -1  # if channels last
        # axis = 1  # if channels first

        # build a per-pixel weight map: each pixel gets the weight of its
        # (argmax) ground-truth class
        classSelectors = K.argmax(true, axis=axis)
        classSelectors = [K.equal(tf.cast(i, tf.int64), tf.cast(classSelectors, tf.int64))
                          for i in range(len(weightsList))]
        classSelectors = [K.cast(x, K.floatx()) for x in classSelectors]
        # look up the weight value for each class (weightsList is a dict
        # keyed by class index, so take its values, not its keys)
        weights = [sel * w for sel, w in zip(classSelectors, weightsList.values())]

        weightMultiplier = weights[0]
        for i in range(1, len(weights)):
            weightMultiplier = weightMultiplier + weights[i]

        loss = BCE_loss(true, pred) - (1 + dice_coef(true, pred))
        loss = loss * weightMultiplier
        return loss
    return lossFunc

inputs = Input((IMG_DIM, IMG_DIM, IMG_CHAN))
s = Lambda(lambda x: x / 255)(inputs)

c1 = Conv2D(8, (3, 3), activation='relu', padding='same')(s)
c1 = Conv2D(8, (3, 3), activation='relu', padding='same')(c1)
p1 = MaxPooling2D((2, 2))(c1)

c2 = Conv2D(16, (3, 3), activation='relu', padding='same')(p1)
c2 = Conv2D(16, (3, 3), activation='relu', padding='same')(c2)
p2 = MaxPooling2D((2, 2))(c2)

c3 = Conv2D(32, (3, 3), activation='relu', padding='same')(p2)
c3 = Conv2D(32, (3, 3), activation='relu', padding='same')(c3)
p3 = MaxPooling2D((2, 2))(c3)

c4 = Conv2D(64, (3, 3), activation='relu', padding='same')(p3)
c4 = Conv2D(64, (3, 3), activation='relu', padding='same')(c4)
p4 = MaxPooling2D(pool_size=(2, 2))(c4)

c5 = Conv2D(128, (3, 3), activation='relu', padding='same')(p4)
c5 = Conv2D(128, (3, 3), activation='relu', padding='same')(c5)

u6 = Conv2DTranspose(64, (2, 2), strides=(2, 2), padding='same')(c5)
u6 = concatenate([u6, c4])
c6 = Conv2D(64, (3, 3), activation='relu', padding='same')(u6)
c6 = Conv2D(64, (3, 3), activation='relu', padding='same')(c6)

u7 = Conv2DTranspose(32, (2, 2), strides=(2, 2), padding='same')(c6)
u7 = concatenate([u7, c3])
c7 = Conv2D(32, (3, 3), activation='relu', padding='same')(u7)
c7 = Conv2D(32, (3, 3), activation='relu', padding='same')(c7)

u8 = Conv2DTranspose(16, (2, 2), strides=(2, 2), padding='same')(c7)
u8 = concatenate([u8, c2])
c8 = Conv2D(16, (3, 3), activation='relu', padding='same')(u8)
c8 = Conv2D(16, (3, 3), activation='relu', padding='same')(c8)

u9 = Conv2DTranspose(8, (2, 2), strides=(2, 2), padding='same')(c8)
u9 = concatenate([u9, c1], axis=3)
c9 = Conv2D(8, (3, 3), activation='relu', padding='same')(u9)
c9 = Conv2D(8, (3, 3), activation='relu', padding='same')(c9)

outputs = Conv2D(num_classes, (1, 1), activation='sigmoid')(c9)

model = Model(inputs=[inputs], outputs=[outputs])
model.compile(optimizer='adam', loss=weighted_loss(weights), metrics=[mean_iou])
model.summary()
The actual BCE-DICE loss function can be found here.
Motivation for the question: based on the above code, the total validation loss of the network after 20 epochs is around 1%; however, the mean intersection-over-union scores for the first 4 classes are each above 95%, while for the last class it is 23%. This clearly indicates that the 5th class is doing poorly. However, this loss in accuracy is not reflected in the loss at all. That means the individual losses for a sample are being combined in a way that completely negates the huge loss we see for the 5th class, so that when the per-sample losses are combined over the batch, the total is still really low. I'm not sure how to reconcile this information.
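For what it's worth, here is a toy numpy calculation of the dilution effect I suspect (the pixel counts are made up, just to show the mechanism): if the 5th class occupies only a small fraction of the pixels, its large per-pixel loss barely moves the overall mean:

import numpy as np

H, W = 256, 256
per_pixel_loss = np.full((H, W), 0.001)  # well-learned classes: tiny loss everywhere
rare_pixels = 500                        # assume the 5th class covers only 500 pixels
per_pixel_loss.flat[:rare_pixels] = 2.0  # large loss on those rare pixels

print(per_pixel_loss.mean())  # ~0.016: the huge 5th-class loss is averaged away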