Python and Jython inconsistencies when encoding strings
Samuele Pedroni
pedronis at bluewin.ch
Fri Sep 6 11:43:45 EDT 2002
Martin v. Löwis <loewis at informatik.hu-berlin.de> wrote in message
j4ofbbclfp.fsf at informatik.hu-berlin.de...
> >>> s
> u"\u0153"
>
> Now, U+0153 is LATIN SMALL LIGATURE OE. It so happens that \x9c (what
> the terminal sends) is U+0153 in CP 1252 (which is the ANSI code page
> on your Windows installation). This might be a bug in Java, which
> assumes that bytes sent by the terminal are in the ANSI code page,
> when they are really in the OEM code page.
No, it's more the Jython parser that does that. It can be fixed by running
Jython as
jython -Dpython.console.encoding=cp850
On the other hand, output seems buggy for:
print s.encode("cp850")
[I have reported that on our SF bug tracker]
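To make the ANSI/OEM mismatch described above concrete, here is a small sketch. It assumes a current CPython 3 rather than the 2002 Jython under discussion, and the variable names are purely illustrative. It only shows how the same byte decodes under the two code pages; it does not reproduce the Jython console behaviour itself.

```python
# The single byte the terminal sends, per the quoted message.
raw = b"\x9c"

# Interpreted in the ANSI code page (CP 1252) the byte is U+0153.
as_ansi = raw.decode("cp1252")

# Interpreted in the OEM code page (CP 850) the same byte is a different character.
as_oem = raw.decode("cp850")

print("cp1252:", hex(ord(as_ansi)))
print("cp850: ", hex(ord(as_oem)))
```

Decoding console input with the wrong one of these two code pages is the kind of mix-up that python.console.encoding is meant to correct.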
> > Does anybody know what is causing this inconsistency? Is there any way to
> > avoid it?
>
> Yes. Don't use the console.
Sticking to ASCII there can avoid some troubles :).
regards