Why such different HTTP response results between 2.5 and 3.0
Brian Allen Vanderburg II
BrianVanderburg2 at aim.com
Mon Feb 2 01:37:13 EST 2009
an00na at gmail.com wrote:
> Below are two semantically equivalent snippets that query the same partial
> HTTP response, for Python 2.5 and Python 3.0 respectively.
> However, the 3.0 version returns a not-quite-right result (msg), a bytes
> object of length 239775, while the 2.5 version returns a good msg, a
> 239733-byte string that is the content of a proper zip file.
> I really can't figure out what's wrong, though I've found some
> "\r\n" segments in the 3.0 msg that are absent from the 2.5 msg.
> Could anyone give me some hints? Thanks in advance.
>
> Code:
>
> # Python 2.5
> import urllib2
> auth_handler = urllib2.HTTPBasicAuthHandler()
> auth_handler.add_password(realm="pluses and minuses",
>                           uri='http://www.pythonchallenge.com/pc/hex/unreal.jpg',
>                           user='butter',
>                           passwd='fly')
> opener = urllib2.build_opener(auth_handler)
>
> req = urllib2.Request('http://www.pythonchallenge.com/pc/hex/unreal.jpg')
> req.add_header('Range', 'bytes=1152983631-')
> res = opener.open(req)
> msg = res.read()
>
> # Python 3.0
> import urllib.request
> auth_handler = urllib.request.HTTPBasicAuthHandler()
> auth_handler.add_password(realm="pluses and minuses",
>                           uri='http://www.pythonchallenge.com/pc/hex/unreal.jpg',
>                           user='butter',
>                           passwd='fly')
> opener = urllib.request.build_opener(auth_handler)
>
> req = urllib.request.Request('http://www.pythonchallenge.com/pc/hex/unreal.jpg')
> req.add_header('Range', 'bytes=1152983631-')
> res = opener.open(req)
> msg = res.read()
From what I can tell, Python 2.5 returns the response body already
decoded, while Python 3.0 returns a bytes object and doesn't decode
it at all. I did a test with urlopen:

In 2.5, http://google.com just gives the regular HTML.
In 3.0, I get some extras at the start and end:
    191d\r\n at the start
    \r\n0\r\n\r\n at the end

Those extras look like HTTP/1.1 chunked transfer-encoding framing: a
hexadecimal chunk-size line (0x191d = 6429 bytes) before the data and a
zero-size chunk marking the end.

In 2.5, these \r\n-delimited markers are removed automatically.
In 3.0, the \r\n pairs are kept.

I hope there is an easy way to decode it as it was in 2.x.
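
As a stopgap, here is a minimal sketch of stripping that framing by hand.
It assumes the 191d\r\n prefix and \r\n0\r\n\r\n suffix seen above are
chunked transfer-encoding markers (a hex size line, the chunk data, then
\r\n, repeated until a zero-size chunk). The name decode_chunked is my
own, not something from the standard library, and it does not handle
trailer headers.

```python
def decode_chunked(raw):
    """Strip HTTP/1.1 chunked transfer-encoding framing from a bytes body.

    Hypothetical helper: assumes `raw` is a complete, well-formed chunked
    body and ignores any trailer headers after the final chunk.
    """
    body = bytearray()
    pos = 0
    while True:
        # Each chunk starts with a hexadecimal size line ending in CRLF.
        eol = raw.index(b"\r\n", pos)
        size_line = raw[pos:eol]
        # Drop optional chunk extensions that may follow a ';'.
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        pos = eol + 2
        if size == 0:
            break  # a zero-size chunk marks the end of the body
        body += raw[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(body)
```

With the code from the original post, this would be applied as
msg = decode_chunked(res.read()), but only when the response really is
chunked; a body without the framing would fail at the int() call.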
Brian Vanderburg II