[Python-Dev] Revised PEP 349: Allow str() to return unicode strings

Phillip J. Eby pje at telecommunity.com
Tue Aug 23 17:43:02 CEST 2005

At 09:21 AM 8/23/2005 -0600, Neil Schemenauer wrote:
> > then of course, one could change ``unicode.__str__()`` to return
> > ``self``, itself, which should work. but then, why so complicated?
>I think that may be the right fix.

No, it isn't.  Right now str(u"x") coerces the unicode object to an 8-bit 
string, so changing this would be backwards-incompatible with existing 
programs that rely on that coercion.

I think the new builtin is actually the right way to go for both 2.x and 
3.x Pythons.  That is, text() would be a builtin in 2.x, along with a new 
bytes() type, and in 3.x text() could replace the basestring, str and 
unicode types.

I also think that the text() constructor should have a signature of 
'text(ob, encoding="ascii")'.  In the default case, text() can return 8-bit 
strings as long as they are pure ASCII (making the code str-stable *and* 
unicode-safe).  In the non-default case, a unicode object should always be 
returned, making the code unicode-safe but not str-stable.  Allowing text() 
to return non-ASCII 8-bit strings would be an obvious violation of its name: 
it's for text, not bytes.
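
A rough sketch of those semantics, for illustration only.  It is written in 
Python 3 terms, so bytes stands in for the 2.x 8-bit str type and str stands 
in for the 2.x unicode type.  The handling of non-string objects and the 
choice to raise on non-ASCII input in the default case are assumptions, not 
part of the proposal as stated above.

```python
def text(ob, encoding="ascii"):
    """Hypothetical sketch of the proposed text() builtin.

    Python 3 emulation: bytes plays the role of the 2.x 8-bit str,
    and str plays the role of the 2.x unicode type.
    """
    if isinstance(ob, str):
        # Already unicode: return it unchanged in both cases.
        return ob
    if isinstance(ob, bytes):
        # Decoding validates the input; invalid bytes for the given
        # encoding raise UnicodeDecodeError (assumed behaviour).
        decoded = ob.decode(encoding)
        if encoding == "ascii":
            # Default case: a pure-ASCII 8-bit string passes through
            # unchanged, which keeps the code "str-stable".
            return ob
        # Non-default case: always hand back unicode.
        return decoded
    # Assumption: any other object falls back to its string form.
    return str(ob)
```

With this sketch, text(b"abc") returns the 8-bit string unchanged, 
text(b"caf\xc3\xa9", encoding="utf-8") returns a unicode object, and 
text(b"caf\xc3\xa9") raises UnicodeDecodeError rather than returning 
non-ASCII bytes.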
