
M.-A. Lemburg wrote:
On 2006-12-06 10:26, Fredrik Lundh wrote:
From what I can tell, __str__ may return a Unicode object, but only if it can be converted to an 8-bit string using the default encoding. Is this on purpose or by accident? Do we have a plan for improving the situation in future 2.X releases?
It has worked that way since at least Python 2.4 (I just tried returning unicode from __str__ in 2.4.1 and it worked fine). That's the oldest version I have handy, so I don't know if it was possible in earlier versions.
This was added to make the transition to all Unicode in 3k easier:
.__str__() may return a string or Unicode object.
.__unicode__() must return a Unicode object.
There is no restriction on the content of the Unicode string for .__str__().
It's also the basis for a tweak that was made in 2.5 to permit conversion to a builtin string in a way that is idempotent for both str and unicode instances via:

    as_builtin_string = '%s' % original

To use the terms from the deferred PEP 349, that conversion mechanism is both Unicode-safe (unicode doesn't get coerced to str) and str-stable (str doesn't get coerced to unicode).

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org
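
For anyone who wants to try the '%s' % original idiom described above, here is a minimal sketch. The function wrapper and the variable names byte_text and wide_text are made up for illustration and are not from the thread. It assumes the 2.5 behaviour described above; on an interpreter with a single text type, the two inputs are simply the same type.

```python
def as_builtin_string(original):
    """Apply the '%s' % original idiom; hypothetical wrapper for illustration."""
    # Assumption: with the 2.5 tweak, a unicode operand yields unicode and a
    # str operand yields str, so neither type is coerced into the other.
    return '%s' % original


byte_text = 'abc'    # a str instance
wide_text = u'abc'   # a unicode instance on 2.x

# str-stable: a str input comes back as a str.
print(type(as_builtin_string(byte_text)) is type(byte_text))

# Unicode-safe: a unicode input comes back as unicode.
print(type(as_builtin_string(wide_text)) is type(wide_text))

# Idempotent: converting twice gives the same value as converting once.
once = as_builtin_string(wide_text)
print(as_builtin_string(once) == once)
```

All three checks should print True if the conversion behaves as described.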