python3 urlopen(...).read() returns bytes
Glenn G. Chappell
glenn.chappell at gmail.com
Mon Dec 22 16:41:56 EST 2008
I just ran 2to3 on a py2.5 script that does pattern matching on the
text of a web page. The resulting script crashed, because when I did
    f = urllib.request.urlopen(url)
    text = f.read()
then "text" is a bytes object, not a string, and so I can't match my
(string) regexp patterns against it.
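To make the failure concrete, here is a minimal sketch that needs no network access. The "page" value is a made-up stand-in for what f.read() returns, and the title pattern is only an illustration, not the pattern from the original script. It shows the TypeError from a string pattern, and that a bytes pattern does match bytes directly.

```python
import re

# Hypothetical stand-in for what f.read() returns in Python 3: bytes, not str.
page = b"<html><title>Example page</title></html>"

# Matching a str pattern against bytes raises TypeError.
try:
    re.search(r"<title>(.*?)</title>", page)
    str_pattern_error = None
except TypeError as exc:
    str_pattern_error = exc

# A bytes pattern (note the b prefix) can be matched against bytes directly.
match = re.search(rb"<title>(.*?)</title>", page)
title_bytes = match.group(1)  # the captured group is bytes as well
```

So the re module itself still works on the result of read(), as long as the pattern and the data are the same type.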
Of course, this is easy to patch: just do "f.read().decode()" (which
assumes the page is UTF-8).
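A slightly more careful version of that patch might look like the sketch below. It assumes the server reports a charset in its Content-Type header and falls back to UTF-8 when it does not. The helper names decode_body and fetch_text are hypothetical, invented here for illustration, and the sample bytes are made up so the snippet can run without fetching anything.

```python
import re
import urllib.request

def decode_body(raw, charset=None, fallback="utf-8"):
    """Turn the bytes from read() into a str (hypothetical helper)."""
    # Undecodable bytes are replaced rather than raising UnicodeDecodeError.
    return raw.decode(charset or fallback, errors="replace")

def fetch_text(url):
    """Fetch url and return its body as a str (hypothetical helper)."""
    with urllib.request.urlopen(url) as f:
        # The charset declared in the Content-Type header, or None if absent.
        charset = f.headers.get_content_charset()
        return decode_body(f.read(), charset)

# Offline demonstration with made-up Latin-1 bytes instead of a real page.
sample = "<title>Caf\u00e9</title>".encode("latin-1")
text = decode_body(sample, "latin-1")
title = re.search(r"<title>(.*?)</title>", text).group(1)
```

With text as a str, the regexp patterns from the py2.5 script can be used unchanged. Whether falling back to UTF-8 is the right guess depends on the pages being read.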
However, it strikes me as an obvious bug, which ought to be fixed.
That is, read() should return a string, as it did in py2.5.
But apparently others disagree? This was mentioned in issue 3930
( http://bugs.python.org/issue3930 ) back in September '08, but that
issue is now closed, apparently because consistent behavior was
achieved. But I figure consistently bad behavior is still bad.
This change breaks pretty much every Python program that opens a
webpage, doesn't it? 2to3 doesn't catch it, and, in any case, why
should read() return bytes, not string? Am I missing something?
By the way, I'm running Ubuntu 8.10. Doing "python3 --version" prints
"Python 3.0rc1+".