encoding problem CP850 to ISO_8859_9

P. Alejandro Lopez-Valencia dradul at yahoo.com
Thu May 23 10:52:59 EDT 2002


"v.wehren" <v.wehren at home.nl> wrote in message
news:625H8.29184$48.2236753 at zwoll1.home.nl...
> When trying to convert the encoding of a file originally in CP850 to
> ISO_8859_2 (Latin9), there are some mappings causing a UnicodeError
> which is rather unexpected, especially since the characters are
> available in both encodings (unlike most of the "box drawing" stuff,
> and so on, where a "maps to <undefined>" is logical). The offensive
> characters (plus the values they eventually should be mapped to) are:

I suggest you check your target encodings first and ask again. Your
assumptions are incorrect. ISO-8859-9 is Latin 5 (Western Europe plus
Turkish), and ISO-8859-15 *is* Latin 9 (Western Europe plus the Euro
currency sign). "Latin 9" means the ninth Latin encoding (that is, the
ninth one using the Roman alphabet) among all the character encodings
defined under the ISO-8859 standard.
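
One way to double-check which "Latin N" name goes with which ISO-8859
part is to ask Python's codec registry. This is only a small sketch,
written for Python 3 (not the interpreter current when this message was
posted); the helper name same_codec is made up for illustration.

```python
import codecs


def same_codec(name_a, name_b):
    """Return True if both names resolve to the same registered codec."""
    # codecs.lookup() normalizes aliases such as "latin9" or "l5" and
    # returns a CodecInfo whose .name is the canonical codec name.
    return codecs.lookup(name_a).name == codecs.lookup(name_b).name


# "latin9" is an alias for ISO-8859-15, "latin5" for ISO-8859-9,
# and "latin2" for ISO-8859-2.
for alias in ("latin2", "latin5", "latin9"):
    print(alias, "->", codecs.lookup(alias).name)
```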

Furthermore, the characters you report as missing, Eth and Thorn, *do
not* exist in ISO-8859-2 (Latin 2, Central Europe), nor in ISO-8859-9.
Of the encodings mentioned here, they exist only in ISO-8859-1 (Latin 1,
Western Europe) and ISO-8859-15.
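
A quick way to test which of the encodings mentioned in this thread can
actually represent Eth and Thorn is to try encoding them. The following
is a minimal sketch in Python 3 syntax; the names ICELANDIC, ENCODINGS
and can_encode are illustrative, not part of any library.

```python
# Eth and Thorn, upper and lower case, given as Unicode escapes.
ICELANDIC = "\u00d0\u00f0\u00de\u00fe"

ENCODINGS = ("cp850", "iso8859-1", "iso8859-2", "iso8859-9", "iso8859-15")


def can_encode(text, encoding):
    """Return True if every character of text exists in encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Map each encoding name to whether it can hold all four characters.
report = {enc: can_encode(ICELANDIC, enc) for enc in ENCODINGS}

for enc, ok in report.items():
    print(enc, "ok" if ok else "cannot encode Eth/Thorn")
```

When the target encoding lacks a character, text.encode(encoding,
"replace") substitutes "?" instead of raising an error, at the cost of
losing the original character.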



