
On 25.04.15 22:27, Markus Unterwaditzer wrote:
On Sat, Apr 25, 2015 at 07:14:57PM +0300, Serhiy Storchaka wrote:
Here is an idea that perhaps will help to prepare Python 2 code for converting to Python 3.
Currently bytes is just an alias of str in Python 2, and the "b" prefix of string literals is ignored. There is no difference between natural strings and bytes. I propose to add a special bit to str instances and to set it for bytes literals and for strings created from binary sources (read from binary files, received from sockets, the result of unicode.encode() and struct.pack(), etc.). With the -3 flag, operations on binary strings that are not allowed for bytes in Python 3 (e.g. encoding, or coercing to unicode) will emit a warning. Unfortunately we can't change the bytes constructor in a minor version; it has to remain an alias of str in 2.7. So the result of bytes() will not be tagged as a binary string.
Maybe it is too late for this.
You can get similar kinds of warnings with unicode-nazi (https://github.com/mitsuhiko/unicode-nazi), so I'm not sure if this would be that helpful.
Not quite. With unicode-nazi, 'foo' == u'foo' emits a warning; with my proposal it doesn't, but b'foo' == u'foo' does. unicode-nazi would produce a lot of warnings in the stdlib and in user code, so my proposal should be more compatible.
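
To make the difference concrete, here is a rough toy model of the tagging idea. It is only a sketch and not how it would be implemented in the interpreter: it runs on Python 3, the class Py2Str and its "binary" attribute are hypothetical names made up for illustration, the built-in str stands in for Python 2's unicode, and the warning category is an assumption.

```python
import warnings


class Py2Str(object):
    """Toy stand-in for a Python 2 str carrying the proposed tag bit."""

    def __init__(self, data, binary=False):
        self.data = data      # the characters of the string
        self.binary = binary  # True for b'' literals and binary sources

    def __eq__(self, other):
        # The built-in str plays the role of Python 2's unicode here.
        if isinstance(other, str):
            if self.binary:
                # Only tagged (binary) strings warn; natural strings stay silent.
                warnings.warn("comparing a binary string with unicode",
                              DeprecationWarning, stacklevel=2)
            return self.data == other
        if isinstance(other, Py2Str):
            return self.data == other.data
        return NotImplemented


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")

    Py2Str("foo") == "foo"               # models 'foo' == u'foo'
    natural_warnings = len(caught)

    Py2Str("foo", binary=True) == "foo"  # models b'foo' == u'foo'
    binary_warnings = len(caught) - natural_warnings
```

In this model the first comparison is silent and only the second one warns, which is the behaviour described above; unicode-nazi would warn on both.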