
gensim getting-started error: No such file or directory: 'text8'


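I am learning about the word2vec and GloVe models in Python, so I am working through a getting-started tutorial.

After running the following code line by line in IDLE 3: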

>>> from gensim.models import word2vec
>>> import logging
>>> logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO)
>>> sentences = word2vec.Text8Corpus('text8')
>>> model = word2vec.Word2Vec(sentences, size=200)

I get this error:

2017-01-13 11:15:41,471 : INFO : collecting all words and their counts
Traceback (most recent call last):
  File "", line 1, in 
    model = word2vec.Word2Vec(sentences, size=200)
  File "/usr/local/lib/python3.5/dist-packages/gensim/models/word2vec.py", line 469, in __init__
    self.build_vocab(sentences, trim_rule=trim_rule)
  File "/usr/local/lib/python3.5/dist-packages/gensim/models/word2vec.py", line 533, in build_vocab
    self.scan_vocab(sentences, progress_per=progress_per, trim_rule=trim_rule)  # initial survey
  File "/usr/local/lib/python3.5/dist-packages/gensim/models/word2vec.py", line 545, in scan_vocab
    for sentence_no, sentence in enumerate(sentences):
  File "/usr/local/lib/python3.5/dist-packages/gensim/models/word2vec.py", line 1536, in __iter__
    with utils.smart_open(self.fname) as fin:
  File "/usr/local/lib/python3.5/dist-packages/smart_open-1.3.5-py3.5.egg/smart_open/smart_open_lib.py", line 127, in smart_open
    return file_smart_open(parsed_uri.uri_path, mode)
  File "/usr/local/lib/python3.5/dist-packages/smart_open-1.3.5-py3.5.egg/smart_open/smart_open_lib.py", line 558, in file_smart_open
    return open(fname, mode)
FileNotFoundError: [Errno 2] No such file or directory: 'text8'

How can I fix this? Thanks in advance for your help.



1> Jim Fasaraki..:

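It looks like you are missing the file used here. Specifically, the code tries to open a file named text8 and cannot find it (hence the FileNotFoundError). Because 'text8' is a relative path, it is looked up in the current working directory of the Python session.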
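To confirm where Python is looking, a quick check along these lines may help. This is only a minimal sketch using the standard library; the helper name describe_path is made up for illustration and is not part of gensim.

```python
import os


def describe_path(fname):
    """Return the absolute path that fname resolves to and whether a file exists there.

    describe_path is a hypothetical helper for illustration, not a gensim function.
    """
    resolved = os.path.abspath(fname)  # relative names resolve against os.getcwd()
    return resolved, os.path.isfile(resolved)


resolved, found = describe_path("text8")
print("working directory:", os.getcwd())
print("'text8' resolves to:", resolved)
print("file exists:", found)
```

If the last line reports False, the corpus file is not in the directory the interpreter was started from, which matches the traceback in the question.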

You can download the file itself from the URL given in the docstring for Text8Corpus:

Docstring:      
Iterate over sentences from the "text8" corpus, unzipped from http://mattmahoney.net/dc/text8.zip .

and make it available locally. Extract the archive, then pass the path of the extracted file as the argument to Text8Corpus:

sentences = word2vec.Text8Corpus('/path/to/text8')
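The download and extraction can also be scripted. The following is a rough sketch using only the standard library, not an official gensim utility: the function name fetch_text8 and the data directory are hypothetical, and it assumes the archive contains a single member named text8, as the docstring suggests.

```python
import urllib.request
import zipfile
from pathlib import Path

# URL quoted in the Text8Corpus docstring above.
TEXT8_URL = "http://mattmahoney.net/dc/text8.zip"


def fetch_text8(data_dir, url=TEXT8_URL):
    """Download and unzip the corpus into data_dir; return the extracted file's path.

    fetch_text8 is a hypothetical helper for illustration, not part of gensim.
    Assumes the archive holds one member named "text8".
    """
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)

    corpus_path = data_dir / "text8"
    if corpus_path.exists():
        return corpus_path  # already extracted, nothing to do

    archive_path = data_dir / "text8.zip"
    if not archive_path.exists():
        urllib.request.urlretrieve(url, str(archive_path))

    with zipfile.ZipFile(archive_path) as archive:
        archive.extract("text8", path=str(data_dir))
    return corpus_path
```

With that in place, the corpus could be loaded with something like sentences = word2vec.Text8Corpus(str(fetch_text8('data'))), where 'data' is an arbitrary directory name chosen for this example.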
