[Python-Dev] unicode inconsistency?

Tim Peters tim.peters at gmail.com
Thu Sep 9 20:44:56 CEST 2004


[Neil Schemenauer]
> Perhaps this is more appropriate for python-list but it looks like a
> bug to me.  Example code:
>
>    class A:
>        def __str__(self):
>            return u'\u1234'
> 
>    '%s' % u'\u1234' # this works
>    '%s' % A() # this doesn't work
> 
> It will work if 'A' subclasses from 'unicode' but should not be
> necessary, IMHO.

You know better than to say "doesn't work".  I assume you mean the
latter raises UnicodeEncodeError.

> Any reason why this shouldn't be fixed?

Didn't we just go thru this, last week or so?  PyObject_Str() never
returns a unicode (it returns a str).  That is, str(A()) raises
UnicodeEncodeError, and that's out of interpolation's hands.  As
Martin said last time, a __str__ method that returns a unicode doesn't
make much sense.
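
A rough sketch of the step that fails, in case it helps.  It assumes the
implicit unicode-to-str conversion uses the ASCII default encoding, and
to_byte_string is a made-up name for illustration, not anything in
CPython:

```python
def to_byte_string(text, encoding='ascii'):
    # Hypothetical helper, not a CPython API: stands in for the implicit
    # unicode -> str conversion, assuming the default encoding is ASCII.
    return text.encode(encoding)

failure = None
try:
    to_byte_string(u'\u1234')
except UnicodeEncodeError as exc:
    # U+1234 has no ASCII representation, so the encode step fails.
    failure = exc

print(type(failure).__name__)
```

Pure-ASCII text such as u'abc' goes through the same step without
complaint, so the error only shows up with non-ASCII data.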

I'm not sure you really mean "it will work if 'A' subclasses from
'unicode'" either:

>>> class A(unicode):
...   def __str__(self):
...     return u'\u1234'
...
>>> '%s' % A()
u''
>>> len(_)
0
>>>

That is, A.__str__ is ignored if A subclasses from unicode.  So
"doesn't blow up" seems more on-target than "works" -- I don't think
you expected an empty Unicode string here.

