假设我有一个非常小的数据集,只有50个图像.我想重新使用Red Pill教程中的代码,但是在每批训练中将随机变换应用于同一组图像,比如对亮度,对比度等进行随机更改.我只添加了一个函数:
def preprocessImages(x):
retValue = numpy.empty_like(x)
for i in range(50):
image = x[i]
image = tf.reshape(image, [28,28,1])
image = tf.image.random_brightness(image, max_delta=63)
#image = tf.image.random_contrast(image, lower=0.2, upper=1.8)
# Subtract off the mean and divide by the variance of the pixels.
float_image = tf.image.per_image_whitening(image)
float_image_Mat = sess.run(float_image)
retValue[i] = float_image_Mat.reshape((28*28))
return retValue
对旧代码的小改动:
batch = mnist.train.next_batch(50)
for i in range(1000):
#batch = mnist.train.next_batch(50)
if i%100 == 0:
train_accuracy = accuracy.eval(feed_dict={
x:preprocessImages(batch[0]), y_: batch[1], keep_prob: 1.0})
print("step %d, training accuracy %g"%(i, train_accuracy))
train_step.run(feed_dict={x: preprocessImages(batch[0]), y_: batch[1], keep_prob: 0.5})
第一次迭代成功,然后崩溃:
step 0, training accuracy 0.02
W tensorflow/core/common_runtime/executor.cc:1027] 0x117e76c0 Compute status: Invalid argument: ReluGrad input is not finite. : Tensor had NaN values
[[Node: gradients_4/Relu_12_grad/Relu_12/CheckNumerics = CheckNumerics[T=DT_FLOAT, message="ReluGrad input is not finite.", _device="/job:localhost/replica:0/task:0/cpu:0"](add_16)]]
W tensorflow/core/common_runtime/executor.cc:1027] 0x117e76c0 Compute status: Invalid argument: ReluGrad input is not finite. : Tensor had NaN values
[[Node: gradients_4/Relu_13_grad/Relu_13/CheckNumerics = CheckNumerics[T=DT_FLOAT, message="ReluGrad input is not finite.", _device="/job:localhost/replica:0/task:0/cpu:0"](add_17)]]
W tensorflow/core/common_runtime/executor.cc:1027] 0x117e76c0 Compute status: Invalid argument: ReluGrad input is not finite. : Tensor had NaN values
[[Node: gradients_4/Relu_14_grad/Relu_14/CheckNumerics = CheckNumerics[T=DT_FLOAT, message="ReluGrad input is not finite.", _device="/job:localhost/replica:0/task:0/cpu:0"](add_18)]]
Traceback (most recent call last):
File "", line 1, in
File "/media/sf_Data/mnistConv.py", line 69, in
train_step.run(feed_dict={x: preprocessImages(batch[0]), y_: batch[1], keep_prob: 0.5})
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1267, in run
_run_using_default_session(self, feed_dict, self.graph, session)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2763, in _run_using_default_session
session.run(operation, feed_dict)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 345, in run
results = self._do_run(target_list, unique_fetch_targets, feed_dict_string)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 419, in _do_run
e.code)
tensorflow.python.framework.errors.InvalidArgumentError: ReluGrad input is not finite. : Tensor had NaN values
[[Node: gradients_4/Relu_12_grad/Relu_12/CheckNumerics = CheckNumerics[T=DT_FLOAT, message="ReluGrad input is not finite.", _device="/job:localhost/replica:0/task:0/cpu:0"](add_16)]]
Caused by op u'gradients_4/Relu_12_grad/Relu_12/CheckNumerics', defined at:
File "", line 1, in
File "/media/sf_Data/mnistConv.py", line 58, in
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/optimizer.py", line 165, in minimize
gate_gradients=gate_gradients)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/optimizer.py", line 205, in compute_gradients
loss, var_list, gate_gradients=(gate_gradients == Optimizer.GATE_OP))
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gradients.py", line 414, in gradients
in_grads = _AsList(grad_fn(op_wrapper, *out_grads))
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/nn_grad.py", line 107, in _ReluGrad
t = _VerifyTensor(op.inputs[0], op.name, "ReluGrad input is not finite.")
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/nn_grad.py", line 100, in _VerifyTensor
verify_input = array_ops.check_numerics(t, message=msg)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 48, in check_numerics
name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/op_def_library.py", line 633, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1710, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 988, in __init__
self._traceback = _extract_stack()
...which was originally created as op u'Relu_12', defined at:
File "", line 1, in
File "/media/sf_Data/mnistConv.py", line 34, in
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_nn_ops.py", line 506, in relu
return _op_def_lib.apply_op("Relu", features=features, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/op_def_library.py", line 633, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1710, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 988, in __init__
self._traceback = _extract_stack()
这与我使用50个训练样例的个人数据集得到的错误完全相同.