[Python-Dev] __str__ and unicode
Nick Coghlan
ncoghlan at gmail.com
Wed Dec 6 13:07:43 CET 2006
M.-A. Lemburg wrote:
> On 2006-12-06 10:26, Fredrik Lundh wrote:
>> From what I can tell, __str__ may return a Unicode object, but
>> only if it can be converted to an 8-bit string using the default encoding. Is this
>> on purpose or by accident? Do we have a plan for improving the situation
>> in future 2.X releases?
It has worked that way since at least Python 2.4 (I just tried returning
unicode from __str__ in 2.4.1 and it worked fine). That's the oldest version I
have handy, so I don't know if it was possible in earlier versions.
> This was added to make the transition to all Unicode in 3k easier:
>
> .__str__() may return a string or Unicode object.
>
> .__unicode__() must return a Unicode object.
>
> There is no restriction on the content of the Unicode string
> for .__str__().
It's also the basis for a tweak that was made in 2.5 to permit conversion to a
builtin string in a way that is idempotent for both str and unicode instances via:
    as_builtin_string = '%s' % original
To use the terms from the deferred PEP 349, that conversion mechanism is both
Unicode-safe (unicode doesn't get coerced to str) and str-stable (str doesn't
get coerced to unicode).
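As a rough illustration of those two properties, here is a minimal sketch. The function wrapper and the Label class are hypothetical names introduced only for this example; the '%s' idiom itself is the one shown above. The code is written to run unchanged on Python 2.5+ and on Python 3, where str and unicode are the same type and the type check therefore passes trivially.

```python
def as_builtin_string(original):
    """Hypothetical helper wrapping the '%s' % original idiom."""
    return '%s' % original


class Label(object):
    """Hypothetical class whose __str__ returns a Unicode object."""

    def __str__(self):
        return u'spam'


byte_text = 'spam'       # str on Python 2
unicode_text = u'spam'   # unicode on Python 2, plain str on Python 3

for value in (byte_text, unicode_text):
    result = as_builtin_string(value)
    # str-stable and Unicode-safe: the result has the same type as the input.
    assert type(result) is type(value)
    assert result == value
    # Idempotent: converting an already-converted value changes nothing.
    assert as_builtin_string(result) == result

# An object with a Unicode-returning __str__ goes through the same idiom.
assert as_builtin_string(Label()) == u'spam'
```

The example sticks to ASCII-only text, so it does not exercise the default-encoding restriction discussed earlier in the thread.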
Cheers,
Nick.
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org