Why does the "".join(r) do this?
Jim Hefferon
jhefferon at smcvt.edu
Thu May 20 20:45:21 EDT 2004
Peter Otten <__peter__ at web.de> wrote:
> So why doesn't it just concatenate? Because there is no way of knowing how
> to properly decode chr(174) or any other non-ascii character to unicode:
>
> >>> chr(174).decode("latin1")
> u'\xae'
> >>> chr(174).decode("latin2")
> u'\u017d'
> >>>
Forgive me, Peter, but you've only rephrased my question: I'm going to
decode them later, so why does the concatenator insist on decoding
them now? As I understand it (perhaps this is my error), encoding and
decoding are things you do separately from manipulating the arrays of
characters.
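To make the question concrete, here is a minimal sketch of what I take
the join to be doing when the list mixes the two types. It is my reading
of the behavior, not the interpreter's actual code. The result has to be
a single type, so once one piece is unicode every byte string has to be
decoded on the spot, and with no encoding supplied the default ASCII
codec is all there is. The helper name join_mixed and its assumed
parameter are hypothetical, made up for illustration.

```python
def join_mixed(parts, assumed="ascii"):
    """Hypothetical helper: join unicode and byte-string pieces into one
    unicode string, spelling out the decode step a mixed join implies."""
    decoded = []
    for part in parts:
        if isinstance(part, bytes):
            # A byte string carries no encoding of its own, so one has to
            # be assumed here; "ascii" stands in for the default codec.
            part = part.decode(assumed)
        decoded.append(part)
    return u"".join(decoded)


# Pure-ASCII bytes decode without complaint.
print(repr(join_mixed([u"abc", b"def"])))

# The byte 0xAE, i.e. chr(174), is not ASCII, so the decode step fails.
try:
    join_mixed([u"abc", b"\xae"])
except UnicodeDecodeError as exc:
    print("cannot decode: %s" % exc)
```

With assumed="latin1" or assumed="latin2" the same call succeeds and
gives the two different answers Peter showed, which is presumably why
the join will not pick one for me.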
> Use either unicode or str, but don't mix them. That should keep you out of
> trouble.
Well, I got this string as the filename of some kind of Macintosh file
(I'm on Linux but I'm working with an archive that contains some
pre-OS X Mac stuff) while calling some os and os.path functions. So I'm
taking strings from a Python library function (and using % to stuff them
into strings that will end up on the web, which should preserve
unicode-type-ness, right?) and then .join-ing them.
I didn't go into the whole story when posting, because I tried to boil
the question down. Perhaps I should have.
Thanks; I am often struck by how helpful this group is,
Jim