当前位置:  开发笔记 > 编程语言 > 正文

强制xpath返回一个字符串lxml

如何解决《强制xpath返回一个字符串lxml》经验,为你挑选了1个好方法。

我正在使用lxml,我有一个来自Google学术搜索的报废页面.以下是一个最小的工作示例和我尝试过的事情.

In [56]: seed = "https://scholar.google.com/citations?view_op=search_authors&hl=en&mauthors=label:machine_learning"

In [60]: page = urllib2.urlopen(seed).read()

In [63]: tree = html.fromstring(page)

In [64]: xpath = '(/html/body/div[1]/div[4]/div[2]/div/span/button[2]/@onclick)[1]'

In [65]: tree.xpath(xpath)
#first element returns as list
Out[65]: ["window.location='/citations?view_op\\x3dsearch_authors\\x26hl\\x3den\\x26oe\\x3dASCII\\x26mauthors\\x3dlabel:machine_learning\\x26after_author\\x3dVCoCALPY_v8J\\x26astart\\x3d10'"]         

In [66]: xpath = '(/html/body/div[1]/div[4]/div[2]/div/span/button[2]/@onclick)[2]'

#there is no second element
In [67]: tree.xpath(xpath)
Out[67]: []     

In [70]: xpath = '(/html/body/div[1]/div[4]/div[2]/div/span/button[2]/@onclick)'

#The list contains only one element
In [71]: tree.xpath(xpath)
Out[71]: ["window.location='/citations?view_op\\x3dsearch_authors\\x26hl\\x3den\\x26oe\\x3dASCII\\x26mauthors\\x3dlabel:machine_learning\\x26after_author\\x3dVCoCALPY_v8J\\x26astart\\x3d10'"]         

根据此处的文档,返回值可以是智能字符串,但我无法从xpath函数获取字符串输出.如何编写xpath以便从xpath获取字符串输出



1> Martin Honne..:

您可以使用XPath表达式string(/html/body/div[1]/div[4]/div[2]/div/span/button[2]/@onclick),在这种情况下,您将获得一个简单的字符串值.

推荐阅读
和谐啄木鸟
这个屌丝很懒,什么也没留下!
DevBox开发工具箱 | 专业的在线开发工具网站    京公网安备 11010802040832号  |  京ICP备19059560号-6
Copyright © 1998 - 2020 DevBox.CN. All Rights Reserved devBox.cn 开发工具箱 版权所有