UTF-8 problem encoding and decoding in Python3

Almar Klein almar.klein at gmail.com
Tue Oct 12 17:28:38 EDT 2010


>>    So if you can, you could make sure to send the file as just bytes,
>>    or if it must be a string, base64 encoded. If this is not possible
>>    you can try the code below to obtain the bytes, not a very fast
>>    solution, but it should work (Python 3):
>>
>>
>>    MAP = {}
>>    for i in range(256):
>>         MAP[tmp] = eval("'\\u%04i'" % i)
>>
> >
> >     # Let's say 'a' is your string
> >     b''.join([MAP[c] for c in a])
> >
>
> I don't know what you're trying to do here.
>
> 1. 'tmp' is the same for every iteration of the 'for' loop.
>
> 2. A Unicode escape sequence expects 4 hexadecimal digits; the 'i'
> format gives a decimal number.
>
> 3. Using 'eval' to make a string this way is the long (and wrong) way
> to do it; chr(i) would have the same effect.
>
> 4. The result of the eval is a string, but you're performing a join
> with a bytestring, hence the exception.


Mmm, you're right. I didn't look at this carefully enough, and then made an
error in copying the source code. Sorry for that ...

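Here's a solution that should work (if I understand your problem correctly),
provided every character in the string has a code point below 256 (otherwise
bytes() raises a ValueError):

    your_bytes = bytes([ord(c) for c in your_string])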
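
To make that a bit more concrete, here is a rough sketch of how it might be
used. It assumes the string really holds UTF-8 data stored one byte per
character; the helper name str_to_bytes and the sample text are made up for
illustration and are not from the original problem.

```python
def str_to_bytes(s):
    """Turn a str whose characters each stand for one byte into bytes."""
    # bytes() raises ValueError if any code point is 256 or higher.
    return bytes([ord(c) for c in s])


# Hypothetical sample: the UTF-8 bytes of a short text that were wrongly
# decoded one byte per character (i.e. as Latin-1).
original = 'caf\u00e9'
mangled = original.encode('utf-8').decode('latin-1')

raw = str_to_bytes(mangled)

# For code points below 256 this gives the same result as encoding
# the string to Latin-1.
assert raw == mangled.encode('latin-1')

# Decoding the recovered bytes as UTF-8 gives the text back.
recovered = raw.decode('utf-8')
print(recovered == original)
```

If the string can contain characters above U+00FF, this approach does not
apply and the data has to be fixed at the point where it was decoded.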

  Almar