unicode bit me
Scott David Daniels
Scott.Daniels at Acm.Org
Sun May 10 02:19:21 EDT 2009
anuraguniyal at yahoo.com wrote:
> class A(object):
>     def __unicode__(self):
>         return u"©au"
>     def __repr__(self):
>         return unicode(self).encode("utf-8")
>     __str__ = __repr__
> a = A()
> u1 = unicode(a)
> u2 = unicode([a])
>
> Now, I am not using print, so it doesn't matter whether stdout can
> print unicode or not.
> My naive question is why the line u2 = unicode([a]) throws
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position
> 1: ordinal not in range(128)
>
> Shouldn't the list class call unicode on its elements?
> I was expecting that; instead, I had to do this:
> u3 = "["+u",".join(map(unicode,[a]))+"]"
Why would you expect that? str([a]) doesn't call str on its elements.
Using our simple expedient:
class B(object):
    def __unicode__(self):
        return u'unicode'
    def __repr__(self):
        return 'repr'
    def __str__(self):
        return 'str'
>>> unicode(B())
u'unicode'
>>> unicode([B()])
u'[repr]'
>>> str(B())
'str'
>>> str([B()])
'[repr]'
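That also accounts for the traceback in the question. In Python 2 a list has no __unicode__ of its own, so unicode([a]) first builds the byte string str([a]) from the repr of each element and then decodes it with the default 'ascii' codec. Below is a rough Python 3 sketch of that implicit step, not the interpreter's actual code; the names element_repr, list_str and joined are made up for illustration.

```python
# Python 3 sketch of the implicit step Python 2 takes inside unicode([a]).
# Assumption: the list is rendered as bytes first (using the repr of each
# element) and only then decoded with the default 'ascii' codec.

# What A.__repr__ returns in the question: UTF-8 bytes b'\xc2\xa9au'.
element_repr = "\u00a9au".encode("utf-8")

# What str([a]) produces in Python 2: brackets around the element's repr.
list_str = b"[" + element_repr + b"]"

try:
    list_str.decode("ascii")  # the implicit decode that fails
    error = None
except UnicodeDecodeError as exc:
    error = exc

# The explicit workaround from the question: convert each element yourself,
# so no byte string is ever decoded behind your back.
joined = "[" + ",".join(["\u00a9au"]) + "]"
```

The failing byte is 0xc2 at position 1, right after the opening bracket, which matches the error message quoted above.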
Now if you ask _why_ a list calls repr on its elements,
the answer is: so that the following is not deceptive:
>>> repr(["a, b", "c"])
"['a, b', 'c']"
With str on the elements this would read [a, b, c], which looks like a
3-element list; with repr it does not.
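To make the ambiguity concrete, here is a small sketch (written for Python 3, although the thread is about Python 2). The function naive_list_str is hypothetical: it is what a list formatter would do if it used str on its elements instead of repr.

```python
def naive_list_str(items):
    """Hypothetical list formatter that uses str() on each element."""
    return "[" + ", ".join(str(item) for item in items) + "]"


two = ["a, b", "c"]       # two elements, the first contains a comma
three = ["a", "b", "c"]   # three elements

# With str() on the elements, two different lists render identically.
assert naive_list_str(two) == naive_list_str(three) == "[a, b, c]"

# The real list formatting uses repr(), so element boundaries stay visible.
assert str(two) == "['a, b', 'c']"
assert str(three) == "['a', 'b', 'c']"
```

The quotes added by repr are what keep the two lists distinguishable.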
--Scott David Daniels
Scott.Daniels at Acm.Org