[Python-Dev] bytes / unicode

Michael Foord fuzzyman at voidspace.org.uk
Thu Jun 24 13:00:12 CEST 2010


On 24/06/2010 11:58, M.-A. Lemburg wrote:
> Lennart Regebro wrote:
>    
>> On Tue, Jun 22, 2010 at 20:07, James Y Knight<foom at fuhm.net>  wrote:
>>      
>>> Yeah. This is a real issue I have with the direction Python3 went: it pushes
>>> you into decoding everything to unicode early, even when you don't care --
>>>        
>> Well, yes, maybe even if *you* don't care. But often the functions you
>> need to call must care, and then you need to decode to unicode, even
>> if you personally don't care. And in those cases, you should decode as
>> early as possible.
>>
>> In the cases where neither you nor the functions you call care, then
>> you don't have to decode, and you can happily pass binary data from
>> one function to another.
>>
>> So this is not really a question of the direction Python 3 went. It's
>> more a case that some methods that *could* do their transformations in
>> a well defined way on bytes don't, and then force you to decode to
>> unicode. But that's not a problem with direction, it's just a missing
>> feature in the stdlib.
>>      
> The discussion is showing that in at least a few application spaces,
> the stdlib should be able to work on both bytes and Unicode, preferably
> using the same interfaces using polymorphism, i.e.
>
> some_function(bytes) ->  bytes
> some_function(str) ->  str
>
> In Python2 this partially works due to the automatic bytes->str
> conversion (in some cases you get some_function(bytes) ->  str),
> the codec base class implementations being a prime example.
>
> In Python3, things have to be done explicitly and I think we need
> to add a few helpers to make writing such str/bytes interfaces
> easier.
>
> We've already had some suggestions in that area, but probably need
> to collect a few more ideas based on real-life porting attempts.
>
> I'd like to make this a topic at the upcoming language summit
> in Birmingham, if Michael agrees.
>
>    
Yep, it sounds like a great topic for the language summit.
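
For reference, here is a minimal sketch of the polymorphic shape
described above (some_function(bytes) -> bytes, some_function(str) -> str)
as it might look in Python 3. The name strip_comment and the '#' comment
convention are hypothetical, chosen only for illustration; this is not an
existing stdlib helper, just one way such an interface could be written:

```python
def strip_comment(line):
    """Drop a trailing '#' comment and return the same type as the input.

    Hypothetical example function: bytes in -> bytes out, str in -> str out.
    """
    # Pick a marker literal of the same type as the argument, since
    # Python 3 refuses to mix bytes and str implicitly.
    if isinstance(line, bytes):
        marker = b"#"
    elif isinstance(line, str):
        marker = "#"
    else:
        raise TypeError("expected bytes or str, got %s" % type(line).__name__)

    # partition() and rstrip() exist on both bytes and str and return
    # objects of the type they were called on.
    head, _, _ = line.partition(marker)
    return head.rstrip()


print(repr(strip_comment(b"a = 1  # note")))
print(repr(strip_comment("a = 1  # note")))
```

The only type-specific part is choosing the marker literal; everything
after that relies on methods that bytes and str happen to share, which is
roughly the kind of boilerplate a helper could factor out.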

Michael

-- 
http://www.ironpythoninaction.com/


