unicode 3 digit decimal conversion
"Martin v. Löwis"
martin at v.loewis.de
Sun Sep 28 16:03:45 EDT 2003
Rune Hansen wrote:
> Stalker told me to send the letter "ø" as \248 or as xf8 (notice the
> missing "\"). At this point I'm sending
> quoteattr(unicode('string',"iso-8859-1").encode("utf-8")) which is
> neither of the above.(..?).
Correct: UTF-8 works differently. I find it surprising that anybody
actually proposes to send non-ASCII characters using xHH, as this
byte sequence may coincidentally occur in ASCII text as well.
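
To make the difference concrete, here is a minimal sketch in present-day Python 3 (the code quoted above uses the Python 2 unicode() builtin). It compares the Latin-1 and UTF-8 bytes for "ø" with the two escape forms mentioned in the thread. The escape strings are built by hand purely for illustration; what the server actually expects is not known from this thread.

```python
# Python 3 sketch; variable names are illustrative only.
text = "ø"  # U+00F8, decimal 248

latin1_bytes = text.encode("iso-8859-1")  # a single byte, 0xF8
utf8_bytes = text.encode("utf-8")         # two bytes, 0xC3 0xB8

# The two escape conventions mentioned in the thread, built by hand.
decimal_escape = "\\%03d" % ord(text)  # backslash + 3-digit decimal
hex_escape = "x%02x" % ord(text)       # "x" + 2 hex digits, no backslash

print(latin1_bytes, utf8_bytes, decimal_escape, hex_escape)

# Why x-escaping is ambiguous: this made-up, plain-ASCII string already
# contains the characters "xf8", so a receiver cannot tell data from escape.
ambiguous = "boxf8"
print(hex_escape in ambiguous)
```

Neither escape form resembles the UTF-8 bytes, which is why the encoded string is "neither of the above".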
> Anyway, the server is still happy, and the data views correctly in the
> web interface.
It is relatively easy to recognize UTF-8 in the input; it is unlikely
that "real" data look like UTF-8 by coincidence (unlike \-escaping
or x-escaping). So it might be that the server studies the input to
guess the encoding. This is bad style, of course - the protocol should
be clear about encodings (this protocol couldn't be published in an
IETF RFC).
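
As an illustration of this kind of guessing, here is a minimal Python 3 sketch: try a strict UTF-8 decode and fall back to ISO-8859-1 if it fails. The function name guess_decode and the choice of fallback encoding are assumptions for this example; nothing in this thread confirms that the server works this way.

```python
def guess_decode(data):
    """Return (text, encoding_name), preferring strict UTF-8.

    Hypothetical helper: falls back to ISO-8859-1, which accepts any byte.
    """
    try:
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        return data.decode("iso-8859-1"), "iso-8859-1"


# A lone 0xF8 byte (Latin-1 "ø") is not valid UTF-8, so the fallback is used.
print(guess_decode(b"\xf8"))

# The UTF-8 bytes for "ø" decode cleanly.
print(guess_decode("ø".encode("utf-8")))

# The rare coincidence: Latin-1 text "Ã¸" has the same bytes as UTF-8 "ø",
# so the guess goes wrong for it.
print(guess_decode("Ã¸".encode("iso-8859-1")))
```

The last call shows the weakness of guessing: it is unlikely to misfire on real data, but it can, which is why a protocol should state its encoding explicitly.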
> Stalker provides a perl and java API for the telnet server. I don't
> read perl code very well, and the java API is distributed as .class
> files(nothing new there, it's java after all) so I really don't know how
> Stalker is handling it.
Even then, you could only find out what the perl and java clients do -
you couldn't tell, from that, what other options the server might support.
Regards,
Martin