
On Mon, May 30, 2011 at 12:39 PM, Stephen J. Turnbull <stephen@xemacs.org> wrote:
(However, there are use cases where it is claimed that 'HELO ' is needed both as str and as bytes.)
My current opinion is that all of this still needs more experimentation outside the core before we start fiddling any further with the builtins (we blinked once in the lead-up to 3.0 by allowing bytes and bytearray to retain a lot of string methods that assume an ASCII compatible encoding, and I now have my doubts about the wisdom of even that step).

I don't have a good answer on how to deal with the real world situations where the *use case* blurs the bytes/text distinction (typically by embedding ASCII text inside an otherwise binary protocol), and given the potential to backslide into the bad old days of 8-bit strings, I'm not prepared to guess, either.

3.x has largely cleared the decks to allow a better solution to evolve in this space by making it harder to blur the line accidentally, and decode()/manipulate/encode() already nicely covers many stateless use cases. If it turns out we need another type, or some other API, to deal gracefully with any use cases where that isn't enough, then so be it.

However, I think we need to let the status quo run for a while longer and see what people actually using the current types in production come up with. The bytes/text division in Python 3 is by far the biggest conceptual change between the two languages, so it's going to take some time before we can figure out how many of the problems encountered are real issues with the split model not covering some use cases and how many are just people (including us) taking time to get used to the sharp division between the two worlds.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
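
For reference, a minimal sketch of the decode()/manipulate/encode() pattern mentioned above, applied to one stateless ASCII protocol line. The function name normalize_command and the sample line are hypothetical and chosen only for illustration; they do not come from the thread or from any library.

```python
def normalize_command(line: bytes) -> bytes:
    """Uppercase the verb of an ASCII protocol line (hypothetical helper)."""
    # decode(): cross from bytes to text at the boundary.
    text = line.decode("ascii")

    # manipulate: ordinary str operations on the decoded text.
    verb, sep, rest = text.partition(" ")
    text = verb.upper() + sep + rest

    # encode(): cross back to bytes before handing the data on.
    return text.encode("ascii")


if __name__ == "__main__":
    print(normalize_command(b"helo mail.example.org\r\n"))
```

This only covers the stateless case, where a whole line can be decoded in one step. Data that mixes ASCII text with arbitrary binary content in the same buffer is the harder case discussed in the message.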