[Python-Dev] unicode inconsistency?

Tim Peters tim.peters at gmail.com
Thu Sep 9 21:00:07 CEST 2004


[Martin v. Löwis]
> ...
> For the specific issue, I would maintain that str() should always
> return string objects.

__builtin__.str() always does -- or raises an exception.  Same for
PyObject_Str() and PyObject_Repr().

> I'm not so sure about %s since, as Neil observes, '%s' % unicode_string
> gives a unicode result.

That's because PyString_Format()'s '%s' processing special-cases the
snot out of unicode *inputs*.  All other inputs to '%s' (and '%r') go
thru PyObject_Str() or PyObject_Repr(), and, as above, those never
return a unicode.  In Neil's case, they raise the expected exception,
and there's nothing sane PyString_Format can do about that.
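
A rough sketch of that rule, not code from this thread. It assumes Python 3, where the str/unicode split discussed here no longer exists, so a hypothetical class whose __str__ returns bytes stands in for the "wrong type" case. ReturnsBytes and try_call are names made up for the illustration. The point it shows is only that str() and '%s' share the same conversion step, and that step rejects a result of the wrong type instead of passing it along.

```python
class ReturnsBytes:
    """Hypothetical class whose __str__ returns the wrong type."""

    def __str__(self):
        return b"not a str"  # deliberately not a str object


def try_call(func):
    """Run func(); return ('ok', value) or ('error', exception type name)."""
    try:
        return ("ok", func())
    except Exception as exc:
        return ("error", type(exc).__name__)


obj = ReturnsBytes()

# The builtin checks the type of whatever __str__ handed back.
via_builtin = try_call(lambda: str(obj))

# '%s' formatting goes through the same conversion, so it fails the same way.
via_format = try_call(lambda: "%s" % obj)

print(via_builtin)
print(via_format)
```

Under that assumption both calls report a TypeError, so the formatting code never gets a chance to see the offending object.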

> I can't see any harm by supporting this operation also if __str__ returns
> a Unicode object.

It doesn't sound like a good idea to me, at least in part because it
would be darned messy to implement short of saying "OK, we don't give
a rip anymore about what type of objects PyObject_{Str,Repr} return",
and that would have broader consequences than just letting Neil get
away with whatever he's trying to do with str.__mod__.

