unicode bit me

anuraguniyal at yahoo.com
Mon May 11 14:14:18 CEST 2009

On May 11, 10:47 am, Terry Reedy <tjre... at udel.edu> wrote:
> anuraguni... at yahoo.com wrote:
> > so unicode(obj) calls __unicode__ on that object
> It will look for the existence of type(ob).__unicode__ ...
> > and if it isn't there __repr__ is used
> According to the below, type(ob).__str__ is tried first.
> > __repr__ of list by default return a str even if __repr__ of element
> > is unicode
> From the fine library manual, built-in functions section:
> (I recommend using it, along with interactive experiments.)
> "repr( object)
> Return a string ..."
> "str( [object])
> Return a string ..."
> "unicode( [object[, encoding [, errors]]])
> Return the Unicode string version of object using one of the following
> modes:
> If encoding and/or errors are given, ...
> If no optional parameters are given, unicode() will mimic the behaviour
> of str() except that it returns Unicode strings instead of 8-bit
> strings. More precisely, if object is a Unicode string or subclass it
> will return that Unicode string without any additional decoding applied.
> For objects which provide a __unicode__() method, it will call this
> method without arguments to create a Unicode string. For all other
> objects, the 8-bit string version or representation is requested and
> then converted to a Unicode string using the codec for the default
> encoding in 'strict' mode.
> "
> 'unicode(somelist)' has no optional parameters, so skip to third
> paragraph.  Somelist is not a unicode instance, so skip to the last
> paragraph.  If you do dir(list) I presume you will *not* see
> '__unicode__' listed.  So skip to the last sentence.
> unicode(somelist) == str(somelist).decode(default,'strict').
> I do not believe str() and repr() are specifically documented for
> builtin classes other than the general description, but you can figure
> that str(collection) or repr(collection) will call repr on the members
> of the collection in order to return a str, as the doc says.
Thanks for the explanation.
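
A quick experiment along those lines, as a sketch only: the class name
Item is made up for illustration, and the code relies on nothing but
str() and repr(), so it should behave the same on 2.x and 3.x. It shows
that the builtin containers format their members with repr(), even when
the container itself is passed to str().

```python
class Item(object):
    """Hypothetical element type with distinct str() and repr() output."""

    def __str__(self):
        return "str-of-item"

    def __repr__(self):
        return "repr-of-item"


item = Item()
as_element = str(item)          # uses Item.__str__
as_list = str([item])           # list formatting uses repr() of each member
as_tuple = repr((item, item))   # tuples format their members the same way

print(as_element)
print(as_list)
print(as_tuple)
```

So a list never consults __str__ (or __unicode__) of its members when it
builds its own string form.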

> (Details are available by experiment.)  str(uni_string) encodes with the
> default encoding, which is 'ascii' by default in 2.x (see
> sys.getdefaultencoding()), and it uses 'strict' errors.
> I would agree that str(some_unicode) could be better documented, like
> unicode(some_str) is.
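
The implicit conversion can be spelled out as an explicit encode call.
This is only a sketch and assumes the default encoding is 'ascii'; the
sample strings are made up for illustration. Written this way it runs on
both 2.x and 3.x.

```python
text = u"caf\xe9"  # contains one non-ASCII character

# Explicit equivalent of the implicit 2.x conversion, assuming the
# default encoding is 'ascii' and the error mode is 'strict'.
try:
    text.encode("ascii", "strict")
    failed = False
except UnicodeEncodeError:
    failed = True

# Pure-ASCII text converts without trouble.
plain = u"cafe".encode("ascii", "strict")

print(failed)
print(plain == b"cafe")
```

In 'strict' mode a single non-ASCII character is enough to raise
UnicodeEncodeError, which is the failure the subject line refers to.
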
> > so my only solution looks like to use my own list class everywhere i
> > use list
> > class mylist(list):
> >     def __unicode__(self):
> >         return u"[" + u", ".join(map(unicode, self)) + u"]"
> Or write a function and use that instead, or, if and when you can,
> switch to 3.x where str and repr accept and produce unicode.
> tjr
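
The "write a function" suggestion could look roughly like the sketch
below. The names list_to_text and text_type are hypothetical, and the
try/except shim is an assumption added so the same code runs on 2.x
(where it picks unicode) and on 3.x (where str is already Unicode).

```python
try:
    text_type = unicode  # Python 2.x
except NameError:
    text_type = str      # Python 3.x, where str is already Unicode


def list_to_text(items):
    """Format a sequence as Unicode text using each member's text form."""
    return u"[" + u", ".join(text_type(item) for item in items) + u"]"


result = list_to_text([u"\xe4", 1, u"b"])
print(result == u"[\xe4, 1, b]")
```

Compared with subclassing list, a plain function works on any existing
list without changing how the lists are created.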
