I need to do some text processing with the NLTK module, and I'm getting the following error: AttributeError: 'tuple' object has no attribute 'isdigit'
Does anyone know how to handle this error?
Traceback (most recent call last):
  File "preprocessing-edit.py", line 36, in <module>
    postoks = nltk.tag.pos_tag(tok)
NameError: name 'tok' is not defined

PS C:\Users\moham\Desktop\Presentation> python preprocessing-edit.py
Traceback (most recent call last):
  File "preprocessing-edit.py", line 37, in <module>
    postoks = nltk.tag.pos_tag(tok)
  File "c:\python34\lib\site-packages\nltk-3.1-py3.4.egg\nltk\tag\__init__.py", line 111, in pos_tag
    return _pos_tag(tokens, tagset, tagger)
  File "c:\python34\lib\site-packages\nltk-3.1-py3.4.egg\nltk\tag\__init__.py", line 82, in _pos_tag
    tagged_tokens = tagger.tag(tokens)
  File "c:\python34\lib\site-packages\nltk-3.1-py3.4.egg\nltk\tag\perceptron.py", line 153, in tag
    context = self.START + [self.normalize(w) for w in tokens] + self.END
  File "c:\python34\lib\site-packages\nltk-3.1-py3.4.egg\nltk\tag\perceptron.py", line 153, in <listcomp>
    context = self.START + [self.normalize(w) for w in tokens] + self.END
  File "c:\python34\lib\site-packages\nltk-3.1-py3.4.egg\nltk\tag\perceptron.py", line 228, in normalize
    elif word.isdigit() and len(word) == 4:
AttributeError: 'tuple' object has no attribute 'isdigit'
import nltk

with open("SHORT-LIST.txt", "r", encoding='utf8') as myfile:
    text = myfile.read().replace('\n', '')
#text = "program managment is complicated issue for human workers"

# Used when tokenizing words
sentence_re = r'''(?x)      # set flag to allow verbose regexps
      ([A-Z])(\.[A-Z])+\.?  # abbreviations, e.g. U.S.A.
    | \w+(-\w+)*            # words with optional internal hyphens
    | \$?\d+(\.\d+)?%?      # currency and percentages, e.g. $12.40, 82%
    | \.\.\.                # ellipsis
    | [][.,;"'?():-_`]      # these are separate tokens
'''

lemmatizer = nltk.WordNetLemmatizer()
stemmer = nltk.stem.porter.PorterStemmer()

grammar = r"""
    NBAR:
        {<NN.*|JJ>*<NN.*>}  # Nouns and Adjectives, terminated with Nouns

    NP:
        {<NBAR>}
        {<NBAR><IN><NBAR>}  # Above, connected with in/of/etc...
"""
chunker = nltk.RegexpParser(grammar)

tok = nltk.regexp_tokenize(text, sentence_re)
postoks = nltk.tag.pos_tag(tok)
#print (postoks)

tree = chunker.parse(postoks)

from nltk.corpus import stopwords
stopwords = stopwords.words('english')

def leaves(tree):
    """Finds NP (nounphrase) leaf nodes of a chunk tree."""
    for subtree in tree.subtrees(filter=lambda t: t.label() == 'NP'):
        yield subtree.leaves()

def normalise(word):
    """Normalises words to lowercase and stems and lemmatizes it."""
    word = word.lower()
    word = stemmer.stem_word(word)
    word = lemmatizer.lemmatize(word)
    return word

def acceptable_word(word):
    """Checks conditions for acceptable word: length, stopword."""
    accepted = bool(2 <= len(word) <= 40
                    and word.lower() not in stopwords)
    return accepted

def get_terms(tree):
    for leaf in leaves(tree):
        term = [normalise(w) for w, t in leaf if acceptable_word(w)]
        yield term

terms = get_terms(tree)

with open("results.txt", "w+") as logfile:
    for term in terms:
        for word in term:
            result = word
            logfile.write("%s\n" % str(word))
            # print (word),
# (print)
logfile.close()
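For what it's worth, the failure is reproducible with the standard `re` module alone (which, as far as I can tell, `nltk.regexp_tokenize` builds on): when a pattern contains capturing groups, `findall` returns a tuple of the group contents for each match instead of the matched string, and a tuple has no `isdigit` method. A minimal sketch using a simplified version of the `sentence_re` above:

```python
import re

# Simplified version of sentence_re above: the parentheses are
# *capturing* groups, which changes what findall returns.
pattern = r'([A-Z])(\.[A-Z])+\.?|\w+(-\w+)*'

tokens = re.findall(pattern, "U.S.A. program management")
print(tokens[0])   # ('U', '.A', '') -- a tuple, not a string

try:
    tokens[0].isdigit()
except AttributeError as err:
    print(err)     # 'tuple' object has no attribute 'isdigit'
```

This is the same AttributeError that the perceptron tagger raises when it tries to normalize each "word".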
Ramtin M. Se.. 5
Another simple approach is to take this part:
tok = nltk.regexp_tokenize(text, sentence_re)
postoks = nltk.tag.pos_tag(tok)
and replace it with NLTK's standard word tokenizer:
toks = nltk.word_tokenize(text)
postoks = nltk.tag.pos_tag(toks)
In theory, there should not be much difference in either performance or results.
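If you would rather keep the custom pattern, another option is to rewrite every group as non-capturing with `(?:...)`, so that `findall` (whose semantics `regexp_tokenize` follows, to my knowledge) yields plain strings again. A sketch of the rewritten pattern, using `re` directly:

```python
import re

# Same token pattern as in the question, but with all groups
# made non-capturing via (?: ... )
sentence_re = r'''(?x)             # verbose regexp
      (?:[A-Z])(?:\.[A-Z])+\.?    # abbreviations, e.g. U.S.A.
    | \w+(?:-\w+)*                # words with optional internal hyphens
    | \$?\d+(?:\.\d+)?%?          # currency, e.g. $12.40
    | \.\.\.                      # ellipsis
    | [][.,;"'?():_`-]            # these are separate tokens
'''

toks = re.findall(sentence_re, "U.S.A. program costs $12.40 ...")
print(toks)   # ['U.S.A.', 'program', 'costs', '$12.40', '...']
```

With no capturing groups in the pattern, every token is a string, so `pos_tag` and the perceptron tagger's `normalize` work as expected.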