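I am testing an installation script on an AWS Ubuntu 14.04 instance. The instance type is c4.xlarge with a 50 GB EBS volume. Each time I test the installation, I start from a fresh instance that I create.
Consistently, the nltk data download fails to install the panlex_lite package.
Any ideas? (I have attached many lines from the installation so that they match what I see. Sorry for the long listing.)
Thanks,
The commands I run before the nltk data download are: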
sudo apt-get install python3-setuptools -y
sudo apt-get install python3.4-dev -y
# Installing Python packages
sudo easy_install3 pip
sudo easy_install3 inflect
sudo easy_install3 elasticsearch
sudo easy_install3 geopy
sudo easy_install3 geojson
sudo easy_install3 simplejson
sudo easy_install3 python_instagram
sudo easy_install3 flickrapi
sudo easy_install3 oauth
sudo easy_install3 xlrd
sudo easy_install3 pytz
sudo easy_install3 tweepy
sudo easy_install3 BeautifulSoup4
sudo easy_install3 psutil
sudo pip3 install -U nltk
sudo pip3 install -U numpy
sudo python3 -m nltk.downloader all
The last line fails. The log, starting from the point where psutil finishes, is as follows:
Finished processing dependencies for psutil
sudo: unable to resolve host ip-172-30-0-207
The directory '/home/ubuntu/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
The directory '/home/ubuntu/.cache/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
Collecting nltk
Downloading nltk-3.1.tar.gz (1.1MB)
Installing collected packages: nltk
Running setup.py install for nltk
Successfully installed nltk-3.1
sudo: unable to resolve host ip-172-30-0-207
The directory '/home/ubuntu/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
The directory '/home/ubuntu/.cache/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
Collecting numpy
Downloading numpy-1.10.1.tar.gz (4.0MB)
Installing collected packages: numpy
Running setup.py install for numpy
Successfully installed numpy-1.10.1
sudo: unable to resolve host ip-172-30-0-207
[nltk_data] Downloading collection 'all'
[nltk_data] |
[nltk_data] | Downloading package abc to /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/abc.zip.
[nltk_data] | Downloading package alpino to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/alpino.zip.
[nltk_data] | Downloading package biocreative_ppi to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/biocreative_ppi.zip.
[nltk_data] | Downloading package brown to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/brown.zip.
[nltk_data] | Downloading package brown_tei to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/brown_tei.zip.
[nltk_data] | Downloading package cess_cat to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/cess_cat.zip.
[nltk_data] | Downloading package cess_esp to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/cess_esp.zip.
[nltk_data] | Downloading package chat80 to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/chat80.zip.
[nltk_data] | Downloading package city_database to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/city_database.zip.
[nltk_data] | Downloading package cmudict to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/cmudict.zip.
[nltk_data] | Downloading package comparative_sentences to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/comparative_sentences.zip.
[nltk_data] | Downloading package comtrans to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Downloading package conll2000 to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/conll2000.zip.
[nltk_data] | Downloading package conll2002 to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/conll2002.zip.
[nltk_data] | Downloading package conll2007 to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Downloading package crubadan to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/crubadan.zip.
[nltk_data] | Downloading package dependency_treebank to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/dependency_treebank.zip.
[nltk_data] | Downloading package europarl_raw to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/europarl_raw.zip.
[nltk_data] | Downloading package floresta to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/floresta.zip.
[nltk_data] | Downloading package framenet_v15 to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/framenet_v15.zip.
[nltk_data] | Downloading package gazetteers to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/gazetteers.zip.
[nltk_data] | Downloading package genesis to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/genesis.zip.
[nltk_data] | Downloading package gutenberg to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/gutenberg.zip.
[nltk_data] | Downloading package ieer to /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/ieer.zip.
[nltk_data] | Downloading package inaugural to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/inaugural.zip.
[nltk_data] | Downloading package indian to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/indian.zip.
[nltk_data] | Downloading package jeita to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Downloading package kimmo to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/kimmo.zip.
[nltk_data] | Downloading package knbc to /home/ubuntu/nltk_data...
[nltk_data] | Downloading package lin_thesaurus to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/lin_thesaurus.zip.
[nltk_data] | Downloading package mac_morpho to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/mac_morpho.zip.
[nltk_data] | Downloading package machado to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Downloading package masc_tagged to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Downloading package moses_sample to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping models/moses_sample.zip.
[nltk_data] | Downloading package movie_reviews to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/movie_reviews.zip.
[nltk_data] | Downloading package names to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/names.zip.
[nltk_data] | Downloading package nombank.1.0 to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Downloading package nps_chat to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/nps_chat.zip.
[nltk_data] | Downloading package oanc_masc to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Downloading package omw to /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/omw.zip.
[nltk_data] | Downloading package opinion_lexicon to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/opinion_lexicon.zip.
[nltk_data] | Downloading package paradigms to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/paradigms.zip.
[nltk_data] | Downloading package pil to /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/pil.zip.
[nltk_data] | Downloading package pl196x to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/pl196x.zip.
[nltk_data] | Downloading package ppattach to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/ppattach.zip.
[nltk_data] | Downloading package problem_reports to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/problem_reports.zip.
[nltk_data] | Downloading package propbank to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Downloading package ptb to /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/ptb.zip.
[nltk_data] | Downloading package oanc_masc to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Package oanc_masc is already up-to-date!
[nltk_data] | Downloading package product_reviews_1 to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/product_reviews_1.zip.
[nltk_data] | Downloading package product_reviews_2 to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/product_reviews_2.zip.
[nltk_data] | Downloading package pros_cons to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/pros_cons.zip.
[nltk_data] | Downloading package qc to /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/qc.zip.
[nltk_data] | Downloading package reuters to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Downloading package rte to /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/rte.zip.
[nltk_data] | Downloading package semcor to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Downloading package senseval to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/senseval.zip.
[nltk_data] | Downloading package sentiwordnet to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/sentiwordnet.zip.
[nltk_data] | Downloading package sentence_polarity to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/sentence_polarity.zip.
[nltk_data] | Downloading package shakespeare to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/shakespeare.zip.
[nltk_data] | Downloading package sinica_treebank to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/sinica_treebank.zip.
[nltk_data] | Downloading package smultron to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/smultron.zip.
[nltk_data] | Downloading package state_union to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/state_union.zip.
[nltk_data] | Downloading package stopwords to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/stopwords.zip.
[nltk_data] | Downloading package subjectivity to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/subjectivity.zip.
[nltk_data] | Downloading package swadesh to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/swadesh.zip.
[nltk_data] | Downloading package switchboard to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/switchboard.zip.
[nltk_data] | Downloading package timit to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/timit.zip.
[nltk_data] | Downloading package toolbox to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/toolbox.zip.
[nltk_data] | Downloading package treebank to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/treebank.zip.
[nltk_data] | Downloading package twitter_samples to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/twitter_samples.zip.
[nltk_data] | Downloading package udhr to /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/udhr.zip.
[nltk_data] | Downloading package udhr2 to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/udhr2.zip.
[nltk_data] | Downloading package unicode_samples to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/unicode_samples.zip.
[nltk_data] | Downloading package universal_treebanks_v20 to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Downloading package verbnet to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/verbnet.zip.
[nltk_data] | Downloading package webtext to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/webtext.zip.
[nltk_data] | Downloading package wordnet to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/wordnet.zip.
[nltk_data] | Downloading package wordnet_ic to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/wordnet_ic.zip.
[nltk_data] | Downloading package words to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/words.zip.
[nltk_data] | Downloading package ycoe to /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/ycoe.zip.
[nltk_data] | Downloading package rslp to /home/ubuntu/nltk_data...
[nltk_data] | Unzipping stemmers/rslp.zip.
[nltk_data] | Downloading package hmm_treebank_pos_tagger to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping taggers/hmm_treebank_pos_tagger.zip.
[nltk_data] | Downloading package maxent_treebank_pos_tagger to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping taggers/maxent_treebank_pos_tagger.zip.
[nltk_data] | Downloading package universal_tagset to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping taggers/universal_tagset.zip.
[nltk_data] | Downloading package maxent_ne_chunker to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping chunkers/maxent_ne_chunker.zip.
[nltk_data] | Downloading package punkt to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping tokenizers/punkt.zip.
[nltk_data] | Downloading package book_grammars to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping grammars/book_grammars.zip.
[nltk_data] | Downloading package sample_grammars to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping grammars/sample_grammars.zip.
[nltk_data] | Downloading package spanish_grammars to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping grammars/spanish_grammars.zip.
[nltk_data] | Downloading package basque_grammars to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping grammars/basque_grammars.zip.
[nltk_data] | Downloading package large_grammars to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping grammars/large_grammars.zip.
[nltk_data] | Downloading package tagsets to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping help/tagsets.zip.
[nltk_data] | Downloading package snowball_data to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Downloading package bllip_wsj_no_aux to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping models/bllip_wsj_no_aux.zip.
[nltk_data] | Downloading package word2vec_sample to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping models/word2vec_sample.zip.
[nltk_data] | Downloading package panlex_swadesh to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Downloading package mte_teip5 to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/mte_teip5.zip.
[nltk_data] | Downloading package averaged_perceptron_tagger to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping taggers/averaged_perceptron_tagger.zip.
[nltk_data] | Downloading package panlex_lite to
[nltk_data] | /home/ubuntu/nltk_data...
[nltk_data] | Unzipping corpora/panlex_lite.zip.
Error installing package. Retry? [n/y/e]
It is not a disk-space problem either:
Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/xvda1      51466360 6582776  42687092  14% /
none                   4       0         4   0% /sys/fs/cgroup
udev             3824796       8   3824788   1% /dev
tmpfs             765952     360    765592   1% /run
none                5120       0      5120   0% /run/lock
none             3829752       0   3829752   0% /run/shm
none              102400       0    102400   0% /run/user
gprakhar.. 9
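I ran into the same problem while following an old AWS tutorial on sentiment analysis of tweet data. The tutorial uses a bootstrap script to install NLTK and its data on an EMR cluster with the command: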
$ sudo python -m nltk.downloader -d /usr/share/nltk_data all
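When running this command, I hit exactly the same panlex_lite installation problem. Because this is a bootstrap script, the prompt
Error installing package. Retry? [n/y/e]
causes the bootstrap action to fail, and the EMR cluster terminates. :P
I got around the problem by: A) assuming that this package is non-essential, and B) modifying the command to pass 'n' automatically, so that the script does not wait indefinitely.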
$ yes n | sudo python -m nltk.downloader -d /usr/share/nltk_data all
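If the installer is driven from a Python script rather than a shell pipeline, the same trick of answering the prompt in advance can be sketched as below. This is only an illustration: `run_answering_no` is a hypothetical helper name, it assumes Python 3.5 or later for `subprocess.run`, and it supplies a finite number of answers, whereas `yes` produces an endless stream. The child process here is a stand-in that prompts once, not the real downloader.

```python
import subprocess
import sys


def run_answering_no(cmd, answers=100):
    """Run cmd with a finite stream of "n" lines on its stdin.

    Hypothetical helper imitating `yes n | cmd`. Unlike `yes`, the supply
    is finite: a child that asks more than `answers` times reaches
    end-of-file instead of reading another "n".
    """
    return subprocess.run(
        cmd,
        input="n\n" * answers,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )


# Stand-in child process that prompts once and echoes the answer it read.
child = [sys.executable, "-c", "print('Retry? [n/y/e]'); print('got', input())"]
result = run_answering_no(child)
print(result.stdout)
```

For the real case, `cmd` would be the downloader invocation from the command above, passed as a list of arguments.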
Hope this helps.
Update 25 Jan 2016: the dataset named "panlex_lite" still causes the installation to fail.
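Since the workaround already assumes panlex_lite is non-essential, another option is to request each package individually and leave that one out. The following is a minimal sketch, not a tested fix for this exact NLTK version: `download_except` and `download_nltk_data` are hypothetical names, and the sketch assumes that `nltk.downloader.Downloader` exposes `packages()` and `download()` in the installed release and that `quiet=True` is enough to keep `download()` from prompting. The second function needs nltk installed, network access, and write permission on the target directory.

```python
def download_except(package_ids, download, skip=("panlex_lite",)):
    """Call download(pkg_id) for every id not listed in skip.

    Returns three lists: ids that succeeded, ids that failed, ids skipped.
    A download that raises is recorded as failed instead of aborting the run.
    """
    ok, failed, skipped = [], [], []
    for pkg_id in package_ids:
        if pkg_id in skip:
            skipped.append(pkg_id)
            continue
        try:
            succeeded = bool(download(pkg_id))
        except Exception:
            succeeded = False
        (ok if succeeded else failed).append(pkg_id)
    return ok, failed, skipped


def download_nltk_data(download_dir="/usr/share/nltk_data"):
    """Wire download_except to NLTK (assumed API, see the note above)."""
    from nltk.downloader import Downloader

    downloader = Downloader()
    ids = [pkg.id for pkg in downloader.packages()]
    return download_except(
        ids,
        lambda pkg_id: downloader.download(
            pkg_id, download_dir=download_dir, quiet=True
        ),
    )
```

Calling `download_nltk_data()` from the bootstrap script and logging the returned `failed` list would show which packages, if any, still need attention.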