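I have a script that I'd like to keep using, but it looks like I either have to find a workaround for a bug in Python 3 or downgrade back to 2.6, which means downgrading my other scripts as well...
Hopefully someone here has already found a workaround.
The problem is that, because of the changes to bytes and strings in Python 3.0, apparently not all of the library code has been tested.
I have a script that downloads a page from a web server. In Python 2.6 the script passed a username and password as part of the URL, but in Python 3.0 this no longer works.
For example, this: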
    import urllib.request

    url = "http://username:password@server/file"
    urllib.request.urlretrieve(url, "temp.dat")
fails with this exception:
    Traceback (most recent call last):
      File "C:\Temp\test.py", line 5, in <module>
        urllib.request.urlretrieve(url, "test.html");
      File "C:\Python30\lib\urllib\request.py", line 134, in urlretrieve
        return _urlopener.retrieve(url, filename, reporthook, data)
      File "C:\Python30\lib\urllib\request.py", line 1476, in retrieve
        fp = self.open(url, data)
      File "C:\Python30\lib\urllib\request.py", line 1444, in open
        return getattr(self, name)(url)
      File "C:\Python30\lib\urllib\request.py", line 1618, in open_http
        return self._open_generic_http(http.client.HTTPConnection, url, data)
      File "C:\Python30\lib\urllib\request.py", line 1576, in _open_generic_http
        auth = base64.b64encode(user_passwd).strip()
      File "C:\Python30\lib\base64.py", line 56, in b64encode
        raise TypeError("expected bytes, not %s" % s.__class__.__name__)
    TypeError: expected bytes, not str
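Apparently base64 encoding now requires bytes as input (and returns bytes rather than a str), so urlretrieve (or some code inside it), which builds a username:password str and tries to base64-encode it for basic authorization, fails.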
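To illustrate the bytes requirement in isolation, here is a minimal sketch of building a Basic authorization value by hand in Python 3. The credentials are just the placeholders from the URL above, and `auth_header` is a name I made up for illustration; this is not the code urllib itself runs.

```python
import base64

# Hypothetical credentials, matching the placeholder URL in the question.
user_passwd = "username:password"

# base64.b64encode() only accepts bytes in Python 3, so encode the str first.
token = base64.b64encode(user_passwd.encode("utf-8"))

# The result is bytes as well; decode it to get a str usable as a header value.
auth_header = "Basic " + token.decode("ascii")
print(auth_header)
```

Passing `user_passwd` directly to `b64encode` without the `.encode()` step raises a TypeError like the one in the traceback above.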
If I try urlopen instead, like this:
    import urllib.request

    url = "http://username:password@server/file"
    f = urllib.request.urlopen(url)
    contents = f.read()
then it fails with this exception:
    Traceback (most recent call last):
      File "C:\Temp\test.py", line 5, in <module>
        f = urllib.request.urlopen(url);
      File "C:\Python30\lib\urllib\request.py", line 122, in urlopen
        return _opener.open(url, data, timeout)
      File "C:\Python30\lib\urllib\request.py", line 359, in open
        response = self._open(req, data)
      File "C:\Python30\lib\urllib\request.py", line 377, in _open
        '_open', req)
      File "C:\Python30\lib\urllib\request.py", line 337, in _call_chain
        result = func(*args)
      File "C:\Python30\lib\urllib\request.py", line 1082, in http_open
        return self.do_open(http.client.HTTPConnection, req)
      File "C:\Python30\lib\urllib\request.py", line 1051, in do_open
        h = http_class(host, timeout=req.timeout) # will parse host:port
      File "C:\Python30\lib\http\client.py", line 620, in __init__
        self._set_hostport(host, port)
      File "C:\Python30\lib\http\client.py", line 632, in _set_hostport
        raise InvalidURL("nonnumeric port: '%s'" % host[i+1:])
    http.client.InvalidURL: nonnumeric port: 'password@server'
Apparently the URL parsing in this "next-generation URL retrieval library" doesn't know what to do with a username and password in the URL.
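For what it's worth, `urllib.parse` itself can separate the user info from the host; it is the HTTP connection layer that receives the raw `username:password@server` string and trips over it. A rough sketch of pulling the credentials out before the request is made (`split_credentials` is a hypothetical helper name, not part of the standard library):

```python
from urllib.parse import urlsplit, urlunsplit

def split_credentials(url):
    """Return (url_without_credentials, username, password)."""
    parts = urlsplit(url)
    # hostname excludes the user info and port (and is lowercased by urlsplit).
    host = parts.hostname or ""
    if parts.port is not None:
        host = "%s:%d" % (host, parts.port)
    clean = urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))
    return clean, parts.username, parts.password

print(split_credentials("http://username:password@server/file"))
```

The cleaned URL no longer contains an `@`, so the host:port parsing that raised InvalidURL above has nothing to misread. This sketch does not handle bracketed IPv6 hosts.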
What other options do I have?
Straight from the Py3k docs: http://docs.python.org/dev/py3k/library/urllib.request.html#examples
    import urllib.request

    # Create an OpenerDirector with support for Basic HTTP Authentication...
    auth_handler = urllib.request.HTTPBasicAuthHandler()
    auth_handler.add_password(realm='PDQ Application',
                              uri='https://mahler:8092/site-updates.py',
                              user='klem',
                              passwd='kadidd!ehopper')
    opener = urllib.request.build_opener(auth_handler)
    # ...and install it globally so it can be used with urlopen.
    urllib.request.install_opener(opener)
    urllib.request.urlopen('http://www.example.com/login.html')
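The docs example assumes you know the server's realm string. If you don't, one possible variation is to use `HTTPPasswordMgrWithDefaultRealm` and pass `None` as the realm. The sketch below is only that, a sketch: `make_basic_auth_opener` is a hypothetical helper name, and the host and credentials are the placeholders from the question.

```python
import urllib.request

def make_basic_auth_opener(top_level_url, user, password):
    """Build an opener that answers Basic auth challenges under top_level_url."""
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # A realm of None makes the credentials apply to any realm for this URL prefix.
    password_mgr.add_password(None, top_level_url, user, password)
    handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
    return urllib.request.build_opener(handler)

opener = make_basic_auth_opener("http://server/", "username", "password")
# Not executed here because "server" is a placeholder host:
# contents = opener.open("http://server/file").read()
```

Note that the URL passed to `opener.open()` must not contain the `username:password@` part. The handler sends the credentials only after the server responds with a 401 challenge.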