TPU classifier InvalidArgumentError: No OpKernel was registered to support Op 'CrossReplicaSum' with these attrs

I tried to implement a TensorFlow-based Estimator model using the TPUEstimator API, but it fails. During training it hits this error:

InvalidArgumentError (see above for traceback): No OpKernel was registered to support Op 'CrossReplicaSum' with these attrs.  Registered devices: [CPU], Registered kernels: 
[[Node: CrossReplicaSum_5 = CrossReplicaSum[T=DT_FLOAT](gradients/dense_2/BiasAdd_grad/tuple/control_dependency_1)]]

There is also a warning at startup, though I am not sure whether it is related:

WARNING:tensorflow:CrossShardOptimizer should be used within a tpu_shard_context, but got unset number_of_shards. Assuming 1.

Here is the relevant part of the model function:

def model_fn(features, labels, mode, params):
    """A simple NN with two hidden layers of 10 nodes each."""
    input_layer = tf.feature_column.input_layer(features, params['feature_columns'])

    dense1 = tf.layers.dense(inputs=input_layer, units=10, activation=tf.nn.relu, kernel_initializer=tf.glorot_uniform_initializer())
    dense2 = tf.layers.dense(inputs=dense1, units=10, activation=tf.nn.relu, kernel_initializer=tf.glorot_uniform_initializer())
    logits = tf.layers.dense(inputs=dense2, units=4)

    reshaped_logits = tf.reshape(logits, [-1, 1, 4])

    onehot_labels = tf.one_hot(indices=tf.cast(labels, tf.int32), depth=4)

    loss = tf.losses.softmax_cross_entropy(onehot_labels=onehot_labels, logits=reshaped_logits)

    if mode == tf.estimator.ModeKeys.TRAIN:

        optimizer = tf.contrib.tpu.CrossShardOptimizer(tf.train.AdagradOptimizer(learning_rate=0.05))

        train_op = optimizer.minimize(
            loss=loss,
            global_step=tf.train.get_global_step())

I am trying to run the TPUEstimator locally on CPU by setting the --use_tpu flag to False. The TPUEstimator is instantiated and train is called like this:

estimator_classifier = tf.contrib.tpu.TPUEstimator(
    model_fn=model_fn,
    model_dir="/tmp/estimator_classifier_logs",
    config=tf.contrib.tpu.RunConfig(
        session_config=tf.ConfigProto(
            allow_soft_placement=True, log_device_placement=True),
        tpu_config=tf.contrib.tpu.TPUConfig()
    ),
    train_batch_size=DEFAULT_BATCH_SIZE,
    use_tpu=False,
    params={
        'feature_columns': feature_columns
    }
)

tensors_to_log = {"probabilities": "softmax_tensor"}
logging_hook = tf.train.LoggingTensorHook(tensors=tensors_to_log, every_n_iter=50)

estimator_classifier.train(
    input_fn=data_factory.make_tpu_train_input_fn(train_x, train_y, DEFAULT_BATCH_SIZE),
    steps=DEFAULT_STEPS,
    hooks=[logging_hook]
)

What does this error mean, and how can I troubleshoot it?
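
One plausible reading of the error, offered as a hedged note rather than a confirmed diagnosis: CrossShardOptimizer aggregates gradients with the CrossReplicaSum op, and the error reports that only a CPU device is registered with no kernel for that op. The model function above wraps the optimizer unconditionally, so the op is still added to the graph when use_tpu=False. A common pattern is to wrap the optimizer only when actually running on a TPU. The sketch below shows that pattern in plain Python with stand-in classes so it runs without TensorFlow; build_optimizer, FakeOptimizer, FakeCrossShardWrapper, and the 'use_tpu' params key are hypothetical names introduced for illustration, not part of the original code or of the TensorFlow API.

```python
def build_optimizer(base_optimizer, use_tpu, tpu_wrapper):
    """Return base_optimizer, wrapped by tpu_wrapper only when use_tpu is True.

    Hypothetical helper: tpu_wrapper stands for a callable such as
    tf.contrib.tpu.CrossShardOptimizer in the question's setup.
    """
    if use_tpu:
        return tpu_wrapper(base_optimizer)
    return base_optimizer


class FakeOptimizer:
    """Stand-in for an optimizer such as tf.train.AdagradOptimizer."""


class FakeCrossShardWrapper:
    """Stand-in for a TPU-only wrapper; it just remembers what it wraps."""

    def __init__(self, inner):
        self.inner = inner


base = FakeOptimizer()

# Local CPU run: the optimizer is returned untouched, so no TPU-only
# aggregation op would be added to the graph.
cpu_optimizer = build_optimizer(base, use_tpu=False, tpu_wrapper=FakeCrossShardWrapper)

# TPU run: the optimizer is wrapped for cross-replica gradient aggregation.
tpu_optimizer = build_optimizer(base, use_tpu=True, tpu_wrapper=FakeCrossShardWrapper)
```

In the model function from the question, this would presumably amount to passing tf.train.AdagradOptimizer(learning_rate=0.05) as base_optimizer and tf.contrib.tpu.CrossShardOptimizer as tpu_wrapper, with the flag supplied by the caller (for example through an extra, user-added entry in params).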
