Unicode problems, yet again
Fredrik Lundh
fredrik at pythonware.com
Sun Apr 24 03:07:51 EDT 2005
Ivan Voras wrote:
> I have a string fetched from database, in iso8859-2, with 8bit
> characters, and I'm trying to send it over the network, via a socket:
>
> File "E:\Python24\lib\socket.py", line 249, in write
> data = str(data) # XXX Should really reject non-string non-buffers
> UnicodeEncodeError: 'ascii' codec can't encode character u'\u0161' in
> position 123: ordinal not in range(128)
>
> The other end knows it should expect this encoding, so how to send it?
>
> (Does anyone else feel that python's unicode handling is, well...
> suboptimal at least?)

you mean it should be able to automagically infer that you want your
Unicode strings to be shipped in ISO-8859-2 when you write them
to a socket? wouldn't that annoy everyone using more common
encodings, such as ISO-8859-1, UTF-8, and EUC-JP?
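
since nothing can guess the encoding for you, the usual approach is to encode the Unicode string explicitly before handing it to the socket. the sketch below is only an illustration, not code from the original thread: the helper name send_text and the sample word are made up, and it is written in Python 3 syntax (in Python 2.4 the same encode method exists on unicode objects and returns an 8-bit string).

```python
import socket

def send_text(sock, text, encoding="iso8859-2"):
    """Encode text with an explicitly chosen encoding, then send the bytes.

    send_text is a hypothetical helper; the default encoding is the one
    the original poster says the receiving end expects.
    """
    data = text.encode(encoding)  # raises UnicodeEncodeError if a character does not fit
    sock.sendall(data)
    return data

if __name__ == "__main__":
    # a connected pair of sockets stands in for the real network peer
    sender, receiver = socket.socketpair()
    try:
        send_text(sender, "\u0161koda")  # u'\u0161' is the character from the traceback
        # the receiver must decode with the same encoding the sender used
        print(repr(receiver.recv(1024).decode("iso8859-2")))
    finally:
        sender.close()
        receiver.close()
```

the point is that the encoding is chosen by the program, at the boundary where text becomes bytes, instead of being left to the default 'ascii' codec.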

(the only "suboptimal" thing with Python's Unicode system is that it
forces you to learn that text is not just a bunch of bytes. for some
reason, some programmers find that extremely hard -- and
for some reason, the same programmers usually have no problems
understanding that python integers, floats, and other objects are not
just a bunch of bytes. go figure...)
</F>