On Wed, Mar 09, 2005 at 11:10:59AM +0100, M.-A. Lemburg wrote:
The patch implements the PyObject_Text() idea (an API that returns a basestring instance, i.e. string or unicode) and then uses this in '%s' (the string version) to properly propagate to u'%s' (the unicode version).
Maybe we should also expose the C API as suggested in the patch, e.g. as text(obj).
Perhaps the right thing to do is introduce a new format code that means "insert text(obj) instead of str(obj)", e.g. %t. If we do that, though, then we should make "'%s' % u'xyz'" return a string instead of a unicode object. I suspect that would break a lot of code.
OTOH, having %s mean text(obj) instead of str(obj) may work just fine. People who want it to mean str() generally don't have any unicode strings floating around so text() has the same effect. People who are using unicode probably would find text() to be more useful behavior. I think that's why someone hacked PyString_Format to sometimes return unicode strings.
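For illustration only, the text(obj) behaviour discussed above might be modelled in pure Python roughly as follows. This is a sketch under assumptions, not the code from the patch: string_types is a made-up helper name, the Name class is hypothetical, and when run on Python 3 the pair (bytes, str) merely stands in for the 2.x pair (str, unicode).

```python
# Hypothetical pure-Python model of the proposed text(obj); not the patch itself.
try:
    string_types = (str, unicode)   # Python 2: the two basestring types
except NameError:
    string_types = (bytes, str)     # Python 3 stand-ins for the 2.x pair


def text(obj):
    """Return obj as a string-like object without forcing one string type."""
    # Strings of either kind pass through unchanged.
    if isinstance(obj, string_types):
        return obj
    # Otherwise use whatever __str__ produces, as long as it is string-like.
    result = type(obj).__str__(obj)
    if not isinstance(result, string_types):
        raise TypeError("__str__ returned non-string (type %s)"
                        % type(result).__name__)
    return result


class Name(object):
    """Hypothetical class whose __str__ returns non-ASCII text."""

    def __str__(self):
        return u"Andr\xe9"


print(repr(text(Name())))
print(repr(text(42)))
```

The point of the sketch is that text() never converts between the two string types: it returns whichever one it is given or whichever one __str__ produces, whereas str() always insists on a str result.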
Regarding the use of __str__ to return a unicode object: we could introduce a new slot (e.g. __text__) instead. However, I can't see any advantage to that. If someone really wants a str object then they can call str() or PyObject_Str().