
Keras classification - object detection

Author: 惬听风吟jyy_802 | 2023-09-06 13:11


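I am using Keras and Python to do classification first and then object detection. I have trained a cat/dog classifier to over 80% accuracy, which is good enough for now. My question is: how do I detect a cat or a dog in an input image? I am completely confused. I want to use my own weights, not pretrained ones from the internet.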

Here is my current code:

from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
import numpy as np
import matplotlib.pyplot as plt

#########################################################################################################
#VALUES
# dimensions of our images.
img_width, img_height = 150, 150

train_data_dir = 'data/train'
validation_data_dir = 'data/validation'
nb_train_samples = 2000 #1000 cats/dogs
nb_validation_samples = 800 #400cats/dogs
nb_epoch = 50
#########################################################################################################

#MODEL
model = Sequential()
model.add(Convolution2D(32, 3, 3, input_shape=(3, img_width, img_height)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Convolution2D(32, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Convolution2D(64, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])


# this is the augmentation configuration we will use for training
train_datagen = ImageDataGenerator(
        rescale=1./255,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True)
##########################################################################################################
#TEST AUGMENTATION
img = load_img('data/train/cats/cat.0.jpg')  # this is a PIL image
x = img_to_array(img)  # this is a Numpy array with shape (3, 150, 150)
x = x.reshape((1,) + x.shape)  # this is a Numpy array with shape (1, 3, 150, 150)

# the .flow() command below generates batches of randomly transformed images
# and saves the results to the directory given in `save_to_dir`
i = 0
for batch in train_datagen.flow(x, batch_size=1,
                          save_to_dir='data/TEST AUGMENTATION', save_prefix='cat', save_format='jpeg'):
    i += 1
    if i > 20:
        break  # otherwise the generator would loop indefinitely
##########################################################################################################
# this is the augmentation configuration we will use for testing:
# only rescaling
test_datagen = ImageDataGenerator(rescale=1./255)

#PREPARE TRAINING DATA
train_generator = train_datagen.flow_from_directory(
        train_data_dir, #data/train
        target_size=(img_width, img_height),  #RESIZE to 150/150
        batch_size=32,
        class_mode='binary')  # binary_crossentropy loss needs binary labels

#PREPARE VALIDATION DATA
validation_generator = test_datagen.flow_from_directory(
        validation_data_dir,  #data/validation
        target_size=(img_width, img_height), #RESIZE 150/150
        batch_size=32,
        class_mode='binary')


#START model.fit
history = model.fit_generator(
        train_generator, #train data
        samples_per_epoch=nb_train_samples,
        nb_epoch=nb_epoch,
        validation_data=validation_generator,  #validation data
        nb_val_samples=nb_validation_samples)


model.save_weights('savedweights.h5')
# list all data in history
print(history.history.keys())

#ACC VS VAL_ACC
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title('model accuracy ACC VS VAL_ACC')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()
# summarize history for loss
#LOSS VS VAL_LOSS
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss LOSS vs VAL_LOSS')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()


model.load_weights('first_try.h5')

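So now that I have classified cats and dogs, what do I need to do, and how, to feed in an image and find the cat or dog in it with a bounding box? I am completely new to this and am not even sure whether I am approaching the problem the right way. Thanks.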

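Update: Hi, sorry for posting the results so late; I could not work on this for a few days. I am loading an image and reshaping it to (1, 3, 150, 150), because a (150, 150) shape raises an error: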

Exception: Error when checking : expected convolution2d_input_1 to have 4 dimensions, but got array with shape (150L, 150L)

Loading the image:

#load test image
img=load_img('data/prediction/cat.155.jpg')
#reshape to 1,3,150,150
img = np.arange(1* 150 * 150).reshape((1,3,150, 150))
#check shape
print(img.shape)

Then I changed def predict_function(x) to:

def predict_function(x):
    # example of prediction function for simplicity, you
    # should probably use `return model.predict(x)`
    # random.seed(x[0][0])
    # return random.random()
    return model.predict(img)

Now when I run:

best_box = get_best_bounding_box(img, predict_function)
print('best bounding box %r' % (best_box, ))

the output I get is: best bounding box None

So I ran:

model.predict(img)

and got the following:

model.predict(img)
Out[54]: array([[ 0.]], dtype=float32)

So it is not checking whether it is a cat or a dog at all... any ideas?

Note: when def predict_function(x) uses:

    random.seed(x[0][0])
    return random.random()

I do get output: it goes through the boxes and returns the best one.

1> ShmulikA:

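The machine learning model you built and the task you are trying to achieve are not the same. The model solves a classification task, while your goal is to locate an object inside an image, which is an object detection task.

Classification answers a boolean question (cat or dog?), while a detection question has far more than two possible answers, because it must also say where the object is.

What can you do?

I suggest three possibilities:

1. Use a sliding window combined with your model

Define a set of crop-box sizes (e.g. from 20X20 to 160X160) and slide a window of each size over the image. For each window, predict the probability that it contains a dog, and finally take the window with the highest predicted probability.

This generates multiple candidate bounding boxes, and you select the bounding box with the highest probability.

This can be slow, because we need to run predictions on hundreds of samples or more.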
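To get a feel for how many predictions a sliding window needs, the following minimal sketch counts the candidate windows. The helper name `count_windows` is hypothetical, and the image size, window sizes and step are assumptions chosen to match the sliding-window example in this answer.

```python
def count_windows(img_h, img_w, window_sizes, step):
    """Count the square crops a sliding window visits for the given sizes and step."""
    total = 0
    for win_size in window_sizes:
        # number of valid top/left offsets along each axis for this window size
        n_rows = len(range(0, img_h - win_size + 1, step))
        n_cols = len(range(0, img_w - win_size + 1, step))
        total += n_rows * n_cols
    return total


# assumed setup: 256x256 image, window sizes 20, 40, ..., 140, step of 10 pixels
n_windows = count_windows(256, 256, range(20, 160, 20), step=10)
print(n_windows)
```

With these assumed values the count is 2380 crops, each of which costs one forward pass of the classifier. A larger step or fewer window sizes reduces the cost at the price of a coarser box.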

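Another option is to try implementing an RCNN or Faster-RCNN network on top of your network. These networks essentially reduce the number of candidate bounding-box windows that have to be evaluated.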

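Update - sliding window computation example

The following code demonstrates how to implement the sliding window algorithm. You can change the parameters.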

import random
import numpy as np

WINDOW_SIZES = list(range(20, 160, 20))  # 20, 40, ..., 140


def get_best_bounding_box(img, predict_fn, step=10, window_sizes=WINDOW_SIZES):
    best_box = None
    best_box_prob = -np.inf

    # loop window sizes: 20x20, 40x40, ..., 140x140 with the default WINDOW_SIZES
    for win_size in window_sizes:
        for top in range(0, img.shape[0] - win_size + 1, step):
            for left in range(0, img.shape[1] - win_size + 1, step):
                # compute the (top, left, bottom, right) of the bounding box
                box = (top, left, top + win_size, left + win_size)

                # crop the original image
                cropped_img = img[box[0]:box[2], box[1]:box[3]]

                # predict how likely this cropped image is dog and if higher
                # than best save it
                print('predicting for box %r' % (box, ))
                box_prob = predict_fn(cropped_img)
                if box_prob > best_box_prob:
                    best_box = box
                    best_box_prob = box_prob

    return best_box


def predict_function(x):
    # example of prediction function for simplicity, you
    # should probably use `return model.predict(x)`
    # int() is needed because newer Python versions reject NumPy integers as seeds
    random.seed(int(x[0][0]))
    return random.random()


# dummy array of 256X256
img = np.arange(256 * 256).reshape((256, 256))

best_box = get_best_bounding_box(img, predict_function)
print('best bounding box %r' % (best_box, ))

Example output:

predicting for box (0, 0, 20, 20)
predicting for box (0, 10, 20, 30)
predicting for box (0, 20, 20, 40)
...
predicting for box (110, 100, 250, 240)
predicting for box (110, 110, 250, 250)
best bounding box (140, 80, 160, 100)
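The dummy `predict_function` above has to be replaced by one that actually scores each crop it receives. Each crop has a different size, while the model in the question expects input of shape (1, 3, 150, 150) rescaled by 1/255, so the crop must be resized and rearranged first. The following is only a sketch under those assumptions: `make_predict_fn` and `resize_nearest` are hypothetical helpers, the image is assumed to be a (height, width, 3) array, and a plain NumPy nearest-neighbour resize stands in for a proper image-library resize.

```python
import numpy as np


def resize_nearest(crop, out_h, out_w):
    """Nearest-neighbour resize of an (H, W, C) array using NumPy indexing."""
    rows = np.arange(out_h) * crop.shape[0] // out_h
    cols = np.arange(out_w) * crop.shape[1] // out_w
    return crop[rows[:, None], cols]


def make_predict_fn(model_predict, target_size=(150, 150)):
    """Wrap a predict callable so it accepts crops of any size.

    model_predict is assumed to take a (1, 3, H, W) float batch and return
    an array holding a single probability, like the model in the question.
    """
    def predict_fn(crop):
        resized = resize_nearest(crop, *target_size)
        x = resized.astype("float32") / 255.0      # same rescaling as training
        x = x.transpose(2, 0, 1)[np.newaxis, ...]  # (H, W, 3) -> (1, 3, H, W)
        return float(np.asarray(model_predict(x)).ravel()[0])
    return predict_fn
```

Assuming a trained `model` and an image loaded with something like `img = np.asarray(load_img(path))`, the call would be `get_best_bounding_box(img, make_predict_fn(model.predict))`. Note that the prediction must be made on the crop passed in as the argument, not on a global `img`, otherwise every box receives the same score.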

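2. Train a new network for the object detection task

You can take a look at the Pascal dataset (examples here), which contains 20 classes, two of which are cat and dog.

The dataset includes the locations of the objects, which serve as the Y target.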
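As a rough illustration of what a location-based Y target can look like, the sketch below encodes one labelled box as a class id followed by corner coordinates normalised to [0, 1]. The names `CLASS_IDS` and `encode_target`, the layout of the vector and the example box are all assumptions made for illustration, not the dataset's own format.

```python
import numpy as np

CLASS_IDS = {"cat": 0, "dog": 1}  # hypothetical label-to-id mapping


def encode_target(label, box, img_w, img_h):
    """Encode one object as [class_id, xmin, ymin, xmax, ymax], coordinates in [0, 1]."""
    xmin, ymin, xmax, ymax = box
    return np.array(
        [CLASS_IDS[label], xmin / img_w, ymin / img_h, xmax / img_w, ymax / img_h],
        dtype="float32",
    )


# made-up example: a dog occupying pixels (48, 30) to (208, 190) of a 256x256 image
y = encode_target("dog", (48, 30, 208, 190), img_w=256, img_h=256)
print(y)
```

A detection network trained on such targets would learn to predict the box coordinates together with the class, instead of only the class as the classifier in the question does.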

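3. Use an existing network for this task

Last but not least, you can reuse an existing network, or even perform "knowledge transfer" (Keras example here) for your specific task.

Take a look at the convnets-keras library.

So choose the approach that suits you best and update us with the results.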
