[Tutor] Encode problem

Kent Johnson kent37 at tds.net
Mon May 4 22:10:18 CEST 2009


On Mon, May 4, 2009 at 3:54 PM, Sander Sweers <sander.sweers at gmail.com> wrote:
> 2009/5/4 Kent Johnson <kent37 at tds.net>:
>> str.decode() converts a string to a unicode object. unicode.encode()
>> converts a unicode object to a (byte) string. Both of these functions
>> take the encoding as a parameter. When Python is given a string, but
>> it needs a unicode object, or vice-versa, it will encode or decode as
>> needed. The encode or decode will use the system default encoding,
>> which as you have discovered is ascii. If the data being encoded or
>> decoded contains non-ascii characters, you get an error that you are
>> familiar with. These errors indicate that you are not correctly
>> handling encoded data.
>
> Very interesting read Kent!
>
> So if I get it correctly you are saying the join() is joining strings
> of str and unicode type? Then would adding a couple of "print
> type(the_string), the_string" statements before the .join() help find
> which string is not unicode, or is unicode where it shouldn't be?

I think that was the original problem though I haven't seen enough
code to be sure. The current problem is (I think) that he is writing
encoded data to a codec writer that expects unicode input, so it is
trying to convert str to unicode (so it can convert back to str!) and
failing.
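
Here is a minimal sketch of what I mean. I haven't seen his code, so the
sample text, the UTF-8 encoding, and the in-memory buffer standing in for
his output file are all my assumptions. The explicit decode("ascii") call
imitates the implicit conversion described above; it is not something you
would write yourself.

```python
import codecs
import io

# Hypothetical sample data; the real program's strings are unknown.
text = u"caf\xe9"                # a unicode (text) object
encoded = text.encode("utf-8")   # a byte string, already encoded

# Imitate the implicit str -> unicode conversion that happens when
# already-encoded data is passed to a codec writer: the default (ascii)
# codec cannot decode the non-ascii bytes, so it raises an error.
try:
    encoded.decode("ascii")
    implicit_decode_failed = False
except UnicodeDecodeError:
    implicit_decode_failed = True

# The fix: give the codec writer unicode and let it encode exactly once.
buffer = io.BytesIO()            # stands in for the output file
writer = codecs.getwriter("utf-8")(buffer)
writer.write(text)
written = buffer.getvalue()      # the bytes that reached the "file"
```

So the rule of thumb is to either write unicode to a codec writer, or
write already-encoded bytes to a plain file, but not to mix the two.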

Kent

