I have read the CNN tutorial for TensorFlow and am trying to use the same model for my project. The problem now is the data reading. I have about 25,000 images for training and about 5,000 for testing and validation. The files are in PNG format, and I can read them and convert them into numpy.ndarray.

The CNN example in the tutorial uses a queue to fetch records from the list of files provided. I tried to create my own binary file by reshaping each image into a one-dimensional array and prepending the label value to it, so my data looks like this:
[[1,12,34,24,53,...,105,234,102], [12,112,43,24,52,...,115,244,98], .... ]
Each row of the above array is of length 22501, where the first element is the label.
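Roughly, each row is built like this (a sketch only; it assumes 150x150 grayscale PNGs read with Pillow, and make_row / labelled_paths are just illustrative names):

    import numpy as np
    from PIL import Image

    def make_row(png_path, label):
        # Flatten the 150x150 grayscale image to 22500 uint8 values and
        # prepend the label byte, giving a row of length 22501.
        pixels = np.asarray(Image.open(png_path).convert("L"), dtype=np.uint8).ravel()
        return np.concatenate(([label], pixels)).astype(np.uint8)

    # Hypothetical usage, given (path, label) pairs:
    # rows = [make_row(path, label) for path, label in labelled_paths]
    # images_and_labels_array = np.stack(rows)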
I dumped the data to a file using pickle and tried to read it back with tf.FixedLengthRecordReader, as shown in the example.

I am doing the same thing as in cifar10_input.py to read the binary file and put the records into a record object.

Now, when I read from the file, the label and image values are different. I understand this is because pickle also dumps extra information, such as braces and brackets, into the binary file, which changes the fixed-length record size.

The example above uses filenames and passes them to a queue to fetch the files, and then passes the queue to a reader that reads single records from the files.

I want to know whether I can pass the numpy array defined above, instead of the filenames, to some reader, so that it can fetch records one by one from that array instead of from files.
Probably the easiest way to get your data working with the CNN example code is to use a modified version of read_cifar10() and use it as follows:
1. Write out a binary file containing the contents of your numpy array.
    import numpy as np

    images_and_labels_array = np.array([[...], ...],  # [[1,12,34,24,53,...,102],
                                                       #  [12,112,43,24,52,...,98],
                                                       #  ...]
                                       dtype=np.uint8)

    images_and_labels_array.tofile("/tmp/images.bin")
This file is similar to the format used in the CIFAR-10 data files. You might want to generate multiple files in order to get read parallelism. Note that ndarray.tofile() writes binary data in row-major order with no other metadata; pickling the array would add Python-specific metadata that TensorFlow's parsing routines cannot understand.
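As a quick sanity check, a sketch of reading the file back with np.fromfile (assuming the 22501-byte records written above) to confirm that labels and pixels round-trip exactly:

    import numpy as np

    RECORD_LEN = 22501  # 1 label byte + 22500 image bytes per record.

    # tofile() wrote raw uint8 values in row-major order with no metadata,
    # so fromfile() plus a reshape recovers the original 2-D array.
    recovered = np.fromfile("/tmp/images.bin", dtype=np.uint8).reshape(-1, RECORD_LEN)
    labels, images = recovered[:, 0], recovered[:, 1:]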
2. Write a modified version of read_cifar10() that handles your record format:
    def read_my_data(filename_queue):
      class ImageRecord(object):
        pass
      result = ImageRecord()

      # Dimensions of the images in the dataset.
      label_bytes = 1
      # Set the following constants as appropriate.
      result.height = IMAGE_HEIGHT
      result.width = IMAGE_WIDTH
      result.depth = IMAGE_DEPTH
      image_bytes = result.height * result.width * result.depth
      # Every record consists of a label followed by the image, with a
      # fixed number of bytes for each.
      record_bytes = label_bytes + image_bytes
      assert record_bytes == 22501  # Based on your question.

      # Read a record, getting filenames from the filename_queue.  No
      # header or footer in the binary, so we leave header_bytes
      # and footer_bytes at their default of 0.
      reader = tf.FixedLengthRecordReader(record_bytes=record_bytes)
      result.key, value = reader.read(filename_queue)

      # Convert from a string to a vector of uint8 that is record_bytes long.
      record_bytes = tf.decode_raw(value, tf.uint8)

      # The first bytes represent the label, which we convert from uint8->int32.
      result.label = tf.cast(
          tf.slice(record_bytes, [0], [label_bytes]), tf.int32)

      # The remaining bytes after the label represent the image, which we reshape
      # from [depth * height * width] to [depth, height, width].
      depth_major = tf.reshape(tf.slice(record_bytes, [label_bytes], [image_bytes]),
                               [result.depth, result.height, result.width])
      # Convert from [depth, height, width] to [height, width, depth].
      result.uint8image = tf.transpose(depth_major, [1, 2, 0])

      return result
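If you want to sanity-check the reader before wiring it into the full input pipeline, a sketch along these lines (assuming IMAGE_HEIGHT = IMAGE_WIDTH = 150 and IMAGE_DEPTH = 1) reads and prints a single record:

    import tensorflow as tf

    filename_queue = tf.train.string_input_producer(["/tmp/images.bin"])
    record = read_my_data(filename_queue)

    with tf.Session() as sess:
      coord = tf.train.Coordinator()
      threads = tf.train.start_queue_runners(sess=sess, coord=coord)
      label, image = sess.run([record.label, record.uint8image])
      print(label, image.shape)  # e.g. [3] (150, 150, 1)
      coord.request_stop()
      coord.join(threads)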
3. Modify distorted_inputs() to use your new data set:
    def distorted_inputs(data_dir, batch_size):
      """[...]"""
      filenames = ["/tmp/images.bin"]  # Or a list of filenames if you
                                       # generated multiple files in step 1.
      for f in filenames:
        if not gfile.Exists(f):
          raise ValueError('Failed to find file: ' + f)

      # Create a queue that produces the filenames to read.
      filename_queue = tf.train.string_input_producer(filenames)

      # Read examples from files in the filename queue.
      read_input = read_my_data(filename_queue)
      reshaped_image = tf.cast(read_input.uint8image, tf.float32)

      # [...] (Maybe modify other parameters in here depending on your problem.)
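The elided tail of distorted_inputs() in cifar10_input.py applies random distortions and then batches the examples with tf.train.shuffle_batch(); a rough sketch of how it might look for this data (CROP_HEIGHT, CROP_WIDTH and min_queue_examples are placeholders you would choose for your problem):

    # Randomly crop and flip the image, as in the CIFAR-10 example.
    distorted_image = tf.random_crop(reshaped_image,
                                     [CROP_HEIGHT, CROP_WIDTH, IMAGE_DEPTH])
    distorted_image = tf.image.random_flip_left_right(distorted_image)

    # Fix the static shapes so that shuffle_batch() knows them.
    distorted_image.set_shape([CROP_HEIGHT, CROP_WIDTH, IMAGE_DEPTH])
    read_input.label.set_shape([1])

    # Generate a shuffled batch of images and labels.
    images, label_batch = tf.train.shuffle_batch(
        [distorted_image, read_input.label],
        batch_size=batch_size,
        num_threads=16,
        capacity=min_queue_examples + 3 * batch_size,
        min_after_dequeue=min_queue_examples)
    return images, tf.reshape(label_batch, [batch_size])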
These are minimal steps, given your starting point. It may be more efficient to do the PNG decoding using TensorFlow ops, but that would be a larger change.
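For reference, that more TensorFlow-native approach would decode the PNGs on the fly with TensorFlow ops instead of pre-packing a binary file; a sketch (png_filenames and the label handling are left as assumptions, since they depend on how your labels are stored):

    # Decode PNG files on the fly with TensorFlow ops.
    filename_queue = tf.train.string_input_producer(png_filenames)
    reader = tf.WholeFileReader()
    _, png_contents = reader.read(filename_queue)
    image = tf.image.decode_png(png_contents, channels=1)  # uint8, shape [height, width, 1]
    reshaped_image = tf.cast(image, tf.float32)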
In your question, you specifically asked:

"I want to know whether I can pass the numpy array defined above, instead of the filenames, to some reader, so that it can fetch records one by one from that array instead of from files."
You can feed the numpy array to a queue directly, but it will be a more invasive change to the cifar10_input.py code than the other answer suggests.
As before, let's assume you have the following array from your question:
    import numpy as np

    images_and_labels_array = np.array([[...], ...],  # [[1,12,34,24,53,...,102],
                                                       #  [12,112,43,24,52,...,98],
                                                       #  ...]
                                       dtype=np.uint8)
Then you can define a queue that contains the whole data, as follows:
    q = tf.FIFOQueue(capacity=25000,  # At least as many examples as you will enqueue.
                     dtypes=[tf.uint8, tf.uint8],
                     shapes=[[], [22500]])
    enqueue_op = q.enqueue_many([images_and_labels_array[:, 0],
                                 images_and_labels_array[:, 1:]])
...and then call sess.run(enqueue_op) to populate the queue.
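Once the queue is populated, downstream ops consume it with dequeue() or dequeue_many(); a minimal sketch of pulling a batch out of the queue defined above:

    # Dequeue a batch of 128 examples from the queue.
    label_batch, image_batch = q.dequeue_many(128)
    float_images = tf.cast(image_batch, tf.float32)

    # In a session, after the graph is built:
    # sess.run(enqueue_op)                                   # fill the queue once
    # labels, images = sess.run([label_batch, float_images])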
Another, more efficient, approach would be to feed records to the queue from a parallel thread (see this answer for more details on how that would work):
    # [With q as defined above.]
    label_input = tf.placeholder(tf.uint8, shape=[])
    image_input = tf.placeholder(tf.uint8, shape=[22500])
    enqueue_single_from_feed_op = q.enqueue([label_input, image_input])

    # Then, to enqueue a single example `i` from the array:
    sess.run(enqueue_single_from_feed_op,
             feed_dict={label_input: images_and_labels_array[i, 0],
                        image_input: images_and_labels_array[i, 1:]})
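A sketch of running that feeding loop from a Python thread, so the main thread can keep dequeuing and training (simplified; see the answer referenced above for the full pattern):

    import threading

    def feed_examples(sess):
      # Push every example into the queue, one at a time.
      for i in range(len(images_and_labels_array)):
        sess.run(enqueue_single_from_feed_op,
                 feed_dict={label_input: images_and_labels_array[i, 0],
                            image_input: images_and_labels_array[i, 1:]})

    feeder = threading.Thread(target=feed_examples, args=(sess,))
    feeder.daemon = True
    feeder.start()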
Alternatively, to enqueue a whole batch at a time, which will be more efficient:
    label_batch_input = tf.placeholder(tf.uint8, shape=[None])
    image_batch_input = tf.placeholder(tf.uint8, shape=[None, 22500])
    enqueue_batch_from_feed_op = q.enqueue_many([label_batch_input, image_batch_input])

    # Then, to enqueue a batch of examples `i` through `j-1` from the array:
    sess.run(enqueue_batch_from_feed_op,
             feed_dict={label_batch_input: images_and_labels_array[i:j, 0],
                        image_batch_input: images_and_labels_array[i:j, 1:]})
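For example, to push the whole array into the queue in chunks of 1000 (a sketch using the ops above; the chunk size is arbitrary):

    CHUNK = 1000
    num_examples = images_and_labels_array.shape[0]
    for start in range(0, num_examples, CHUNK):
      end = min(start + CHUNK, num_examples)
      sess.run(enqueue_batch_from_feed_op,
               feed_dict={label_batch_input: images_and_labels_array[start:end, 0],
                          image_batch_input: images_and_labels_array[start:end, 1:]})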