I have trained a simple long short-term memory (LSTM) model following this Lasagne recipe: https://github.com/Lasagne/Recipes/blob/master/examples/lstm_text_generation.py
Here is the architecture:
l_in = lasagne.layers.InputLayer(shape=(None, None, vocab_size))

# We now build the LSTM layer which takes l_in as the input layer.
# We clip the gradients at GRAD_CLIP to prevent the problem of exploding gradients.
l_forward_1 = lasagne.layers.LSTMLayer(
    l_in, N_HIDDEN, grad_clipping=GRAD_CLIP,
    nonlinearity=lasagne.nonlinearities.tanh)
l_forward_2 = lasagne.layers.LSTMLayer(
    l_forward_1, N_HIDDEN, grad_clipping=GRAD_CLIP,
    nonlinearity=lasagne.nonlinearities.tanh)

# The l_forward layer creates an output of dimension (batch_size, SEQ_LENGTH, N_HIDDEN).
# Since we are only interested in the final prediction, we isolate that quantity and feed it
# to the next layer. The output of the sliced layer will then be of size (batch_size, N_HIDDEN).
l_forward_slice = lasagne.layers.SliceLayer(l_forward_2, -1, 1)

# The sliced output is then passed through the softmax nonlinearity to create a probability
# distribution over the prediction. The output of this stage is (batch_size, vocab_size).
l_out = lasagne.layers.DenseLayer(l_forward_slice, num_units=vocab_size,
                                  W=lasagne.init.Normal(),
                                  nonlinearity=lasagne.nonlinearities.softmax)

# Theano tensor for the targets
target_values = T.ivector('target_output')

# lasagne.layers.get_output produces a variable for the output of the net
network_output = lasagne.layers.get_output(l_out)

# The loss function is calculated as the mean of the (categorical) cross-entropy
# between the prediction and target.
cost = T.nnet.categorical_crossentropy(network_output, target_values).mean()

# Retrieve all parameters from the network
all_params = lasagne.layers.get_all_params(l_out)

# Compute AdaGrad updates for training
print("Computing updates ...")
updates = lasagne.updates.adagrad(cost, all_params, LEARNING_RATE)

# Theano functions for training and computing cost
print("Compiling functions ...")
train = theano.function([l_in.input_var, target_values], cost,
                        updates=updates, allow_input_downcast=True)
compute_cost = theano.function([l_in.input_var, target_values], cost,
                               allow_input_downcast=True)

# In order to generate text from the network, we need the probability distribution of the
# next character given the state of the network and the input (a seed).
# To produce the probability distribution of the prediction, we compile a function called probs.
probs = theano.function([l_in.input_var], network_output, allow_input_downcast=True)
and I train the model via:
for it in xrange(data_size * num_epochs / BATCH_SIZE):
    try_it_out()  # Generate text using the p^th character as the start.

    avg_cost = 0
    for _ in range(PRINT_FREQ):
        x, y = gen_data(p)

        # print(p)
        p += SEQ_LENGTH + BATCH_SIZE - 1
        if p + BATCH_SIZE + SEQ_LENGTH >= data_size:
            print('Carriage Return')
            p = 0

        avg_cost += train(x, y)
    print("Epoch {} average loss = {}".format(it * 1.0 * PRINT_FREQ / data_size * BATCH_SIZE,
                                              avg_cost / PRINT_FREQ))
How do I save the model so that I don't need to train it again? With scikit-learn I would normally just pickle the model object. However, the analogous process for Theano/Lasagne is not clear to me.
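For context, the scikit-learn workflow I mean is simply the following (clf stands for any fitted estimator):

import pickle

# Save the fitted estimator to disk...
with open('model.pkl', 'wb') as f:
    pickle.dump(clf, f)

# ...and restore it later without retraining.
with open('model.pkl', 'rb') as f:
    clf = pickle.load(f)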
You can save the weights with numpy. Note that lasagne.layers.get_all_param_values expects a layer, so pass the output layer l_out rather than the network_output expression:
np.savez('model.npz', *lasagne.layers.get_all_param_values(l_out))
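As a quick sanity check (not part of the linked example), you can list what np.savez wrote; positional arguments are stored under the names arr_0, arr_1, and so on:

import numpy as np

with np.load('model.npz') as f:
    # Each entry is one parameter array (weights/biases) in network order.
    for i in range(len(f.files)):
        print('arr_%d' % i, f['arr_%d' % i].shape)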
To load them again later, do:
with np.load('model.npz') as f:
    param_values = [f['arr_%d' % i] for i in range(len(f.files))]
lasagne.layers.set_all_param_values(l_out, param_values)
Source: https://github.com/Lasagne/Lasagne/blob/master/examples/mnist.py
As for the model definition itself: one option is certainly to keep the code that builds the network and regenerate the network before setting the pretrained weights, as sketched below.
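A minimal sketch of that approach, assuming a hypothetical build_network helper that wraps the exact layer definitions from your question (with vocab_size, N_HIDDEN and GRAD_CLIP defined as there):

import numpy as np
import theano
import lasagne

def build_network(vocab_size, N_HIDDEN, GRAD_CLIP):
    # Same architecture as in the training script.
    l_in = lasagne.layers.InputLayer(shape=(None, None, vocab_size))
    l_forward_1 = lasagne.layers.LSTMLayer(
        l_in, N_HIDDEN, grad_clipping=GRAD_CLIP,
        nonlinearity=lasagne.nonlinearities.tanh)
    l_forward_2 = lasagne.layers.LSTMLayer(
        l_forward_1, N_HIDDEN, grad_clipping=GRAD_CLIP,
        nonlinearity=lasagne.nonlinearities.tanh)
    l_forward_slice = lasagne.layers.SliceLayer(l_forward_2, -1, 1)
    l_out = lasagne.layers.DenseLayer(
        l_forward_slice, num_units=vocab_size,
        W=lasagne.init.Normal(),
        nonlinearity=lasagne.nonlinearities.softmax)
    return l_in, l_out

# Rebuild the graph with freshly initialised weights, then overwrite them
# with the values stored in model.npz.
l_in, l_out = build_network(vocab_size, N_HIDDEN, GRAD_CLIP)
with np.load('model.npz') as f:
    param_values = [f['arr_%d' % i] for i in range(len(f.files))]
lasagne.layers.set_all_param_values(l_out, param_values)

# Recompile only what is needed for generation.
network_output = lasagne.layers.get_output(l_out)
probs = theano.function([l_in.input_var], network_output,
                        allow_input_downcast=True)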