UTF-8 problem encoding and decoding in Python3
Almar Klein
almar.klein at gmail.com
Mon Oct 11 04:27:35 EDT 2010
On 10 October 2010 23:01, Hidura <hidura at gmail.com> wrote:
> I am trying to encode a binary file that was uploaded to a server; it is
> extracted from the wsgi.input of the environ and comes as a unicode
> string.
>
Firstly, UTF-8 is not meant to encode arbitrary binary data. But I guess you
could have a Unicode string in which each character's code point (0-255)
represents one byte value. (But it's ugly!)
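To illustrate the point, here is a minimal sketch (the sample bytes are made up for illustration). Arbitrary bytes generally do not decode as UTF-8, whereas the Latin-1 codec maps each byte value 0-255 to the code point with the same number, which gives the "one character per byte" string described above.

```python
# Hypothetical sample: four bytes that are not valid UTF-8
raw = bytes([0xff, 0xfe, 0x00, 0x41])

try:
    raw.decode('utf-8')
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False  # the byte 0xff never appears in valid UTF-8

# Latin-1 maps byte value n to code point n, so any byte sequence decodes
text = raw.decode('latin-1')
code_points = [ord(c) for c in text]  # same numbers as the original bytes
```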
So if you can, make sure the file is sent as just bytes or, if it
must be a string, base64 encoded. If this is not possible, you can try the
code below to obtain the bytes. It is not a very fast solution, but it should
work (Python 3):
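MAP = {}
for i in range(256):
    # Map the character with code point i to the single byte with value i
    MAP[chr(i)] = bytes([i])
# Let's say 'a' is your string
# (a character above code point 255 raises KeyError)
b''.join([MAP[c] for c in a])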
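If the string really does hold one character per byte, the built-in Latin-1 codec should give the same result as the lookup table, since it maps code points 0-255 to the identical byte values. The sketch below assumes that is the case and also shows the base64 route; `payload` and the sample contents are made up for illustration.

```python
import base64

payload = bytes(range(256))      # hypothetical binary file contents
a = payload.decode('latin-1')    # the "one character per byte" string

# Same result as the MAP loop: code point n becomes byte n.
# Raises UnicodeEncodeError if the string has a character above U+00FF.
recovered = a.encode('latin-1')

# Alternative: ship the data as base64 text, which is plain ASCII
as_text = base64.b64encode(payload).decode('ascii')
decoded = base64.b64decode(as_text)
```

Both routes round-trip the data exactly. base64 makes the text roughly a third larger than the raw bytes but survives any text-only channel.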
Cheers,
Almar