unicode and socket

Christos TZOTZIOY Georgiou tzot at sil-tec.gr
Thu Mar 3 09:51:48 EST 2005


On 18 Feb 2005 19:10:36 -0800, rumours say that zyqnews at 163.net might have
written:

>It's really funny, I cannot send a unicode stream through socket with
>python while all the other languages as perl,c and java can do it.

I don't know about perl.  What you mean by unicode in C is most probably
wchar_t, which holds Unicode encoded as 'ucs-2' or 'utf-16' (little or big
endian, depending on your platform) or maybe as a 4-byte int, for which I don't
know a Python equivalent.  And I /assume/ that in Java, Unicode strings are
equivalent to 'utf-16' encoded strings on input/output.
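
To make the byte layouts above concrete, here is a minimal sketch.  It assumes
Python 3, where str is Unicode text and encode() returns bytes; the variable
names are only illustrative.  Later Python releases also ship 'utf-32' codecs,
which roughly correspond to the 4-byte representation.

```python
# Sketch, assuming Python 3 (str is Unicode text, encode() yields bytes).
text = "A\u03a9"  # 'A' followed by GREEK CAPITAL LETTER OMEGA (U+03A9)

# 2 bytes per character here; byte order is chosen explicitly.
utf16_le = text.encode("utf-16-le")
utf16_be = text.encode("utf-16-be")

# 4 bytes per character, little endian.
utf32_le = text.encode("utf-32-le")

print(utf16_le)  # b'A\x00\xa9\x03'
print(utf16_be)  # b'\x00A\x03\xa9'
print(utf32_le)  # b'A\x00\x00\x00\xa9\x03\x00\x00'
```

Note the zero bytes in every one of these forms, even for plain ASCII 'A'.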

Perhaps Unicode encoded as 'utf-16' is what you're after.  However, Unicode
encoded as 'utf-8' (as others also suggested) might be what you /should/ be
using, given that this encoding has some attractive properties (no null bytes
and no spurious control characters unless the text itself contains them, etc).
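
A rough sketch of the 'utf-8' route, assuming Python 3 and a platform where
socket.socketpair() is available.  The helpers send_text and recv_all_text are
hypothetical names made up for this example, not part of the socket module.

```python
import socket


def send_text(sock, text):
    """Encode text explicitly as UTF-8 and send every byte (hypothetical helper)."""
    sock.sendall(text.encode("utf-8"))


def recv_all_text(sock):
    """Read until the peer closes its sending side, then decode once as UTF-8."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:  # empty bytes means end of stream
            break
        chunks.append(chunk)
    return b"".join(chunks).decode("utf-8")


# A connected pair of sockets stands in for a real client and server.
a, b = socket.socketpair()
message = "\u0393\u03b5\u03b9\u03ac \u03c3\u03bf\u03c5"  # a Greek greeting

send_text(a, message)
a.shutdown(socket.SHUT_WR)  # signal end of data to the reader
received = recv_all_text(b)

a.close()
b.close()
```

Decoding happens only after all bytes have arrived, since a multi-byte
character may be split across two recv() calls.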

Don't interpret the explicitness that Python requests as a weakness.
-- 
TZOTZIOY, I speak England very best.
"Be strict when sending and tolerant when receiving." (from RFC1958)
I really should keep that in mind when talking with people, actually...


