unicode(s, enc).encode(enc) == s ?
"Martin v. Löwis"
martin at v.loewis.de
Thu Dec 27 13:37:17 EST 2007
> Given no UnicodeErrors, are there any cases for the following not to
> be True?
>
> unicode(s, enc).encode(enc) == s
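Certainly. ISO-2022 is famous for having ambiguous encodings: several
different byte strings decode to the same Unicode string, but encoding
can only give back one of them. Try these (\x1b(B switches to ASCII,
\x1b(J switches to JIS X 0201-Roman):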
unicode("Hallo","iso-2022-jp")
unicode("\x1b(BHallo","iso-2022-jp")
unicode("\x1b(JHallo","iso-2022-jp")
unicode("\x1b(BHal\x1b(Jlo","iso-2022-jp")
or likewise, for two-byte characters (\x1b$@ selects JIS C 6226-1978,
\x1b$B selects JIS X 0208-1983):
unicode("\x1b$@BB","iso-2022-jp")
unicode("\x1b$BBB","iso-2022-jp")
In iso-2022-jp-3, there are even more ways to encode the same string.
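
For anyone who wants to check this directly, here is a minimal sketch of the round-trip test. It assumes Python 3 and the iso-2022-jp codec bundled with CPython, so bytes.decode(enc) stands in for the Python 2 call unicode(s, enc). The helper name round_trips is only illustrative and not part of any library.

```python
# Python 3 sketch: raw.decode(enc) plays the role of Python 2's unicode(s, enc).
ENC = "iso-2022-jp"


def round_trips(raw, enc=ENC):
    """Return True if decoding and re-encoding reproduces the original bytes."""
    return raw.decode(enc).encode(enc) == raw


# ESC ( B selects ASCII; ESC ( J selects JIS X 0201-Roman.
ascii_variants = [
    b"Hallo",
    b"\x1b(BHallo",
    b"\x1b(JHallo",
    b"\x1b(BHal\x1b(Jlo",
]

# ESC $ @ selects JIS C 6226-1978; ESC $ B selects JIS X 0208-1983.
kanji_variants = [
    b"\x1b$@BB",
    b"\x1b$BBB",
]

for group in (ascii_variants, kanji_variants):
    # Collect the distinct strings the byte sequences decode to.
    decoded = {raw.decode(ENC) for raw in group}
    print(len(decoded), "distinct string(s) from", len(group), "byte sequences")
    for raw in group:
        print("  ", raw, "round-trips:", round_trips(raw))
```

Within each group the byte strings should all decode to one and the same string. Since the encoder emits a single byte sequence per string, at most one input in each group can survive the round trip.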
Regards,
Martin