Jython: How to import escaped Unicode and export utf-8?
Martin von Loewis
loewis at informatik.hu-berlin.de
Mon Apr 30 10:22:02 EDT 2001
Maurice Bauhahn <bauhahnm at clara.net> writes:
> It appears that the problem is that programmers have killed a
> substantial part of the Unicode side of Jython. The README.txt file
> which accompanies Jython 2.1.a1
I can't see that anything has been killed here.
> - Text files will pass data read and written through the default
> codecs for the JVM. Binary files will write only the lower eight
> bits of each unicode character.
Sure, this is almost the same as CPython: writing a Unicode object to
a file will encode it with the default encoding. Writing a byte string
to a file will write the bytes.
Since Jython uses Java strings both for Unicode and byte strings, it
has an "extra" byte in each element of a byte string, which is not
written to the file.
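
To make the "lower eight bits" behaviour concrete, here is a minimal sketch. It is written in present-day Python 3 syntax, not the Jython 2.1 discussed here. The helper name low_bytes is made up for illustration. It only imitates the truncation the README describes and is not Jython's actual implementation.

```python
def low_bytes(text):
    """Keep only the low eight bits of each character's code point."""
    return bytes(ord(ch) & 0xFF for ch in text)

# Characters below 0x100 survive unchanged.
plain = low_bytes("abc")

# U+1780 (a Khmer letter) loses its high byte: 0x1780 & 0xFF == 0x80.
truncated = low_bytes("\u1780")
```

So anything above U+00FF is silently mangled on this path, which is why it is only suitable for byte strings.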
In any case, you have to encode Unicode data with an explicit encoding
before writing it to a file.
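
A rough sketch of that advice, again in Python 3 syntax and therefore only an approximation of what Jython 2.1 does. It assumes an ASCII default encoding, and the in-memory buffer stands in for a file opened in binary mode.

```python
import io

text = "caf\u00e9"  # contains one non-ASCII character

# An ASCII default cannot represent U+00E9, so the implicit route fails.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding explicitly yields bytes that can be written as they are.
buf = io.BytesIO()  # stands in for a file opened in binary mode
buf.write(text.encode("utf-8"))
written = buf.getvalue()
```

The explicit encode step decides which bytes reach the file, so neither the default codec nor any truncation gets a chance to interfere.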
> - The \x escape has changed; it now eats two hex characters
> but never more. The behaviour matches CPython 2.0.
>
> Presumably the first item is only referring to the default 'ASCII'...which
> can be changed. The second is, however, disastrous, if I understand it
> properly.
I think you don't understand it properly. Why is it disastrous?
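
For illustration, a small sketch of what the two-digit \x rule means in practice, and of one way to turn escaped text into UTF-8. It uses Python 3 syntax as an assumption, not Jython 2.1, and the sample data is made up. The unicode_escape codec is the standard one shipped with Python.

```python
# \x consumes exactly two hex digits; whatever follows is ordinary text.
two_digit = "\x41bc"  # 'A' followed by 'b' and 'c'

# Code points above 0xFF are written with \u and four hex digits instead.
khmer_ka = "\u1780"

# Escaped text as it might arrive from a file: literal backslash-u sequences.
raw = b"\\u1780\\u17b6"
decoded = raw.decode("unicode_escape")  # now two real characters
utf8 = decoded.encode("utf-8")          # explicit encoding for export
```

The \x change only limits how many digits that one escape reads; \u escapes remain available for everything beyond U+00FF.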
Regards,
Martin