[Python-Dev] What to do for bytes in 2.6?
glyph at divmod.com
Sun Jan 20 02:54:48 CET 2008
On 19 Jan, 07:32 pm, guido at python.org wrote:
>There is no way to know whether that return value means text or data
>(plenty of apps legitimately read text straight off a socket in 2.x),
IMHO, this is a stretch of the word "legitimately" ;-). If you're
reading from a socket, what you're getting are bytes, whether they're
represented by str() or bytes(); correct code in 2.x must currently do a
.decode("ascii") or .decode("charmap") to "legitimately" identify the
result as text of some kind.
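A minimal sketch of that explicit decode step, written so that it runs on 3.x. The helper name read_text, the default encoding and the buffer size are assumptions made for illustration, and it assumes the whole message arrives in a single recv() call:

```python
import socket

def read_text(sock, encoding="ascii", bufsize=4096):
    # recv() hands back raw bytes; nothing here is text yet.
    data = sock.recv(bufsize)
    # Only an explicit decode identifies those bytes as text.
    return data.decode(encoding)

# Usage with a connected pair of local sockets.
left, right = socket.socketpair()
try:
    left.sendall(b"hello")
    message = read_text(right)
finally:
    left.close()
    right.close()
```

Bytes that are not valid in the chosen encoding make decode() raise UnicodeDecodeError rather than silently passing as text.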
Now, ad-hoc code with a fast and loose definition of "text" can still
read arrays of bytes off a socket without specifying an encoding and get
away with it, but that's because Python's unicode implementation has
thus far been very forgiving, not because the data is cleanly text yet.
Why can't we get that warning in -3 mode for something read from a socket,
just the same as for a b"" literal? I've written lots of code that
aggressively rejects str() instances as text, as well as unicode
instances as bytes, and that's in code that still supports 2.3 ;).
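As a rough illustration of that kind of strict checking (not the actual code referred to above; the helper names require_text and require_bytes are hypothetical), something along these lines runs on 2.6 and on 3.3 or later:

```python
text_type = type(u"")   # unicode on 2.x, str on 3.x
bytes_type = type(b"")  # str on 2.x, bytes on 3.x

def require_text(value):
    # Refuse anything that is not already decoded text.
    if not isinstance(value, text_type):
        raise TypeError("expected text, got %s" % type(value).__name__)
    return value

def require_bytes(value):
    # Refuse text where raw bytes are required.
    if not isinstance(value, bytes_type):
        raise TypeError("expected bytes, got %s" % type(value).__name__)
    return value
```

With checks like these, data read straight off a socket fails require_text() until the caller decodes it.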
>Really, the pure aliasing solution is just about optimal in terms of
>bang per buck. :-)
Not that I'm particularly opposed to the aliasing solution, either. It
would still allow writing code that was perfectly useful in 2.6 as well
as 3.0, and it would avoid disturbing code that did checks of type("").
It would just remove an opportunity to get one potentially helpful
warning.