Webpy and UnicodeDecodeError
Dave Angel
davea at ieee.org
Fri Dec 18 22:18:34 EST 2009
Oscar Del Ben wrote:
> <snip>
>> You'll notice that one of the strings is a unicode one, and another one
>> has the character 0x82 in it. Once join() discovers Unicode, it needs
>> to produce a Unicode string, and by default, it uses the ASCII codec to
>> get it.
>>
>> If you print your 'l' list (bad name, by the way, looks too much like a
>> '1'), you can see which element is Unicode, and which one has the \xb7
>> in position 42. You'll have to decide which is the problem, and solve
>> it accordingly. Was the fact that one of the strings is unicode an
>> oversight? Or did you think that all characters would be 0x7f or less?
>> Or do you want to handle all possible characters, and if so, with what
>> encoding?
>>
>> DaveA
>>
>
> Thanks for your reply DaveA.
>
> Since I'm dealing with file uploads, I guess I should only care about
> those. I understand the fact that I'm trying to concatenate a unicode
> string with a binary, but I don't know how to deal with this. Perhaps
> the uploaded file should be encoded in some way? I don't think this is
> the case though.
>
>
You have to decide what the format of the file is to be. If some of
your data is in bytes and some in Unicode, you have to be explicit about
how you merge them. And that depends on who's going to use the file, and
for what purpose.
Before you try to do a join(), you have to convert the Unicode string(s)
to bytes. Try the encode() method on each Unicode string, where you get
to specify what encoding to use.
In general, you want to use the same encoding for all the bytes in a
given file. But as I just said, that's entirely up to you.
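
Something along these lines might do as a starting point. It is only a
sketch: it assumes you settle on UTF-8, and the names join_as_bytes and
parts, along with the sample header and file bytes, are made up for
illustration rather than taken from your code or from web.py.

```python
# Text type: unicode on Python 2, str on Python 3.
TEXT_TYPE = type(u"")

def join_as_bytes(parts, encoding="utf-8"):
    """Encode every Unicode element explicitly, then join as bytes.

    Hypothetical helper; the encoding default is an assumption.
    """
    encoded = []
    for part in parts:
        if isinstance(part, TEXT_TYPE):
            # Explicit conversion, so join() never falls back to ASCII.
            part = part.encode(encoding)
        encoded.append(part)
    return b"".join(encoded)

# Made-up data: one Unicode header string plus raw uploaded bytes.
parts = [u"Content-Disposition: form-data\r\n\r\n", b"\x89PNG\xb7\x82"]
body = join_as_bytes(parts)
```

The uploaded bytes pass through untouched; only the Unicode pieces are
encoded, so the result is one byte string in one known encoding.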
DaveA