Unicode conversion problem (codec can't decode)
gagsl-py2 at yahoo.com.ar
Fri Apr 4 08:25:29 CEST 2008
On Fri, 04 Apr 2008 01:35:08 -0300, Eric S. Johansson <esj at harvee.org> wrote:
> I'm having a problem (Python 2.4) converting strings with random 8-bit
> characters into an escape form which is 7-bit clean for storage in a
> Here's an example:
> body = meta['mini_body'].encode('unicode-escape')
> when given an 8-bit string, (in meta['mini_body']), the code fragment
> yields the error below.
> 'ascii' codec can't decode byte 0xe1 in position 13: ordinal not in
That happens because the unicode-escape codec expects a unicode object as
input; if you pass a byte string, Python first tries to convert it to
unicode using the default encoding (ascii), and that fails on any byte
above 0x7f.
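
To make the failure concrete, here is a minimal sketch of the step Python 2 performs behind the scenes. The sample string is made up for illustration, and the code uses the b"" prefix and "except ... as" so that it also runs on Python 2.6+ and 3, which means it will not run unchanged on 2.4. Decoding with latin-1 is only one possible explicit choice, not something the original post prescribes.

```python
# Hypothetical 8-bit byte string; 0xe1 is the byte named in the error above.
data = b"some text \xe1"

# Python 2 performs this ASCII decode implicitly before unicode-escape runs.
try:
    data.decode("ascii")
    error = None
except UnicodeDecodeError as exc:
    error = exc  # same kind of error as the one quoted above

# Decoding explicitly with latin-1 cannot fail: every byte value maps to
# exactly one code point, so unicode-escape then has real unicode to work on.
escaped = data.decode("latin-1").encode("unicode_escape")
```

The point is only that the decode step has to be explicit; which encoding to name depends on what the bytes really are.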
> I've read a lot of stuff about Unicode and Python and I'm pretty
> with how you can convert between different encoding types. What I don't
> understand is how to go from a byte string with 8-bit characters to an
> string where 8-bit characters are turned into two character hexadecimal
Almost there: use string-escape instead; it takes a byte string and
returns another byte string in ASCII.
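
On Python 2 the matched pair is simply s.encode('string-escape') and s.decode('string-escape'). That codec does not exist on Python 3, so the sketch below swaps in a different technique with the same effect: a latin-1 decode followed by unicode_escape, and the reverse on the way back. The helper names to_7bit and from_7bit and the sample data are hypothetical, chosen only for this example.

```python
def to_7bit(raw):
    """Turn arbitrary bytes into a pure-ASCII escaped byte string."""
    # latin-1 maps each byte to the code point with the same number,
    # and unicode_escape writes everything outside printable ASCII as \xNN.
    return raw.decode("latin-1").encode("unicode_escape")


def from_7bit(escaped):
    """Inverse of to_7bit: recover the original bytes."""
    return escaped.decode("unicode_escape").encode("latin-1")


# Hypothetical sample with an 8-bit byte, a backslash, and a newline.
original = b"caf\xe1 \\ line\n"
stored = to_7bit(original)      # safe to store in a 7-bit-only field
restored = from_7bit(stored)    # identical to the original bytes
```

Because backslashes are themselves escaped, the round trip is lossless for any byte string, not only for text.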
> I really don't care about the character set used. I'm looking for a
> matched set of operations that converts the string to a seven-bit form
> and back to its original form.
OK, string-escape should work. But which database are you using that can't
handle 8-bit strings?