[Tutor] unicode issue? (fwd)

Danny Yoo dyoo at hkn.eecs.berkeley.edu
Wed Mar 22 01:41:29 CET 2006


> A friend of mine showed me where the unicode is showing up but we still
> can't get script to work right. We tried encoding the appropriate
> variable but it still is spitting back the error. Would I just have:
> u'\u201c'.encode('utf8') in my script or should it be
> md['vote-desc'][0].encode('utf8')


[Keeping tutor in CC.  Please do not send replies only to me; you need to
give other people on Tutor the opportunity to answer as well.]


Hi Matt,


I should clarify: when I said:

> What's going on is that one of the strings in the string concatenation
> above contains a Unicode string.

I was a bit imprecise.  What I should really have said was:

    What's going on is that at least one (there could be more!) of the
    strings in the string concatenation is a Unicode string.

I'm positive you have more than one Unicode string in there.  *grin*


Rather than hunt-and-peck for all the places where Unicode is coming from,
it's probably a better approach to just wholesale encode() the whole
string after you do the concatenation and before passing it off to
write().
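
To make that concrete, here is a minimal sketch of the "concatenate first,
encode once" idea.  The names title, desc and path are made up for
illustration (desc stands in for something like md['vote-desc'][0]), and the
output goes to a scratch file rather than to your real one:

```python
import os
import tempfile

# Hypothetical stand-ins for the pieces your script glues together.
title = u'Vote: '
desc = u'\u201cyes\u201d'      # curly quotes, like the u'\u201c' in your error

# Do all the concatenation first, while everything is still Unicode ...
line = title + desc + u'\n'

# ... then encode the finished string exactly once, just before write().
encoded = line.encode('utf8')

# Scratch file opened in binary mode, since we are writing encoded bytes.
path = os.path.join(tempfile.mkdtemp(), 'out.txt')
f = open(path, 'wb')
f.write(encoded)
f.close()
```

The point is that encode() is applied to the result of the concatenation, not
to each piece, so it doesn't matter which of the pieces were Unicode.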

Alternatively, take a close look at the codecs example I showed before near
the bottom of the last message: it handles encoding and decoding for you.
Using the codecs module is probably the better approach here, since
otherwise you have to look at every file write(), and that can be a bit
tiring.
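
For reference, a small sketch of the codecs approach, not necessarily
identical to the earlier example.  The file name and the strings are
assumptions for illustration; codecs.open() itself is the standard library
call, and the file object it returns encodes on write() and decodes on read():

```python
import codecs
import os
import tempfile

# Hypothetical scratch file; substitute the file your script really writes.
path = os.path.join(tempfile.mkdtemp(), 'out.txt')

# Open once with an encoding; every write() is encoded to UTF-8 for us.
f = codecs.open(path, 'w', encoding='utf8')
f.write(u'Vote: ' + u'\u201cyes\u201d' + u'\n')
f.close()

# Reading back through codecs.open() decodes the bytes to Unicode again.
f = codecs.open(path, 'r', encoding='utf8')
text = f.read()
f.close()
```

With this, the only line that has to know about the encoding is the one that
opens the file; the write() calls can stay as they are.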


