Why does the "".join(r) do this?

Jim Hefferon jhefferon at smcvt.edu
Thu May 20 20:45:21 EDT 2004


Peter Otten <__peter__ at web.de> wrote
> So why doesn't it just concatenate? Because there is no way of knowing how
> to properly decode chr(174) or any other non-ascii character to unicode:
> 
> >>> chr(174).decode("latin1")
>  u'\xae'
> >>> chr(174).decode("latin2")
>  u'\u017d'
> >>>

Forgive me, Peter, but you've only rephrased my question: I'm going to
decode them later, so why does join insist on decoding them now?  As I
understand it (perhaps this is my error), encoding and decoding are
things you do separately from manipulating the arrays of characters.
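
A minimal sketch of the point under discussion. It is written for a
Python 3 interpreter rather than the Python 2 session quoted above (an
assumption, made so that it runs today), so the byte is spelled
bytes([174]) instead of chr(174). The names raw, as_latin1, as_latin2 and
mixed_error are only illustrative. It shows that the same byte becomes a
different character under each codec, and that a join over mixed text and
bytes cannot produce a single result type without some decoding step.

```python
# One raw byte, the value that chr(174) produced in Python 2.
raw = bytes([174])

# The same byte decodes to different characters depending on the codec.
as_latin1 = raw.decode("latin1")
as_latin2 = raw.decode("latin2")
print(repr(as_latin1), repr(as_latin2))

# Joining text with raw bytes needs one result type. Python 3 refuses to
# guess and raises TypeError; Python 2 tried an implicit ASCII decode.
mixed_error = None
try:
    "".join(["abc", raw])
except TypeError as exc:
    mixed_error = exc
print(type(mixed_error).__name__)
```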

> Use either unicode or str, but don't mix them. That should keep you out of
> trouble.

Well, I got this string as the filename of some kind of Macintosh file
(I'm on Linux, but I'm working with an archive that contains some
pre-OS X Mac files) while calling some os and os.path functions.  So I'm
taking strings from a Python library function, using % to stuff them
into strings that will end up on the web (which should preserve
unicode-type-ness, right?), and then .join-ing them.
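
One way to follow the "don't mix them" advice in this situation might
look like the sketch below: decode every byte-string filename once, up
front, and only then format and join. It is written for Python 3, and
several things in it are assumptions rather than facts from my setup:
the mac_roman encoding (I have not confirmed what the archive actually
uses), the helper name join_names, the sample filenames, and the
<li>...</li> template.

```python
def join_names(raw_names, encoding="mac_roman"):
    """Decode each byte-string filename, then format and join as text.

    The default encoding is a guess for pre-OS X Mac filenames.
    """
    decoded = [name.decode(encoding) for name in raw_names]
    # Every piece is text by now, so % formatting and join never mix types.
    return "".join("<li>%s</li>" % name for name in decoded)


# Hypothetical filenames; the second contains the byte 174 (0xAE).
raw_names = [b"plain.txt", b"logo\xae.gif"]
html = join_names(raw_names)
print(html)
```

If the guessed encoding is wrong, the join still succeeds but the
characters come out wrong, so the encoding has to be checked against the
actual archive.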

I didn't go into the whole story when posting, because I tried to boil
the question down.  Perhaps I should have.

Thanks; I am often struck by how helpful this group is,
Jim


