Why does the "".join(r) do this?

Peter Otten __peter__ at web.de
Thu May 20 12:17:47 EDT 2004


Skip Montanaro wrote:

> Try
> 
>     u"".join(r)
> 
> instead.  I think the join operation is trying to convert the Unicode bits
> in your list of strings to strings by encoding using the default codec,
> which appears to be ASCII.

That is bound to fail as soon as the list contains a str with a non-ASCII
byte, because the implicit decoding uses the ASCII codec as well:

>>> u"".join(["a", "b"])
u'ab'
>>> u"".join(["a", chr(174)])
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0xae in position 0: ordinal not in range(128)
>>>
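
One way around the failure is to decode the byte strings yourself before
joining, so the default codec never gets involved. The following is only a
sketch: the helper name join_as_unicode is made up for illustration, and
the latin-1 default is an assumption (use whatever encoding your data is
really in). It also assumes an interpreter where the name bytes exists
(Python 2.6 or later, where it is an alias for str, or Python 3).

```python
def join_as_unicode(items, encoding="latin-1"):
    """Join a mixed list of byte strings and unicode strings.

    Hypothetical helper. The latin-1 default is an assumption;
    pass the encoding your byte strings actually use.
    """
    decoded = []
    for item in items:
        # Decode byte strings explicitly instead of relying on the
        # implicit ASCII decoding that join() would attempt.
        if isinstance(item, bytes):
            item = item.decode(encoding)
        decoded.append(item)
    return u"".join(decoded)


result = join_as_unicode(["a", b"\xae"])
```

With latin-1 the byte 0xae decodes to U+00AE, so result compares equal to
u"a\xae" and no UnicodeDecodeError is raised.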

Apart from that, "".join() already returns unicode automatically if the
list contains any unicode items:

>>> "".join(["a", u"o"])
u'ao'

Peter



