More charset troubles (Re: Codecs for ISO 8859-11 (Thai) and 8859-16 (Romanian))

Peter Jacobi peter_jacobi at gmx.net
Tue Aug 3 10:17:15 CEST 2004


Hi Martin, All,

"Martin v. Löwis" <martin at v.loewis.de> wrote:
> Therefore, it would be a protocol violation (strictly speaking)
> if one would use iso-8859-11 in, say, a MIME charset= header.

Strictly speaking, there are some more dark corners to check.
All ISO charsets should, strictly speaking, be qualified by year, and
there have in fact been some prominent changes, e.g. in 8859-7 (Greek).
What to do about them?

Looking around:
- the RFC references a fixed, years-old version
- Unicode mapping files and libiconv track the newest version
- IBM ICU4C provides all versions
- Python (not by design, I assume) has an in-between version, with
some features of the old mapping table (no currency signs) and some
features of the new one (0xA1 -> U+2018, 0xA2 -> U+2019)
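
For anyone who wants to check what a given interpreter actually does, here is a minimal sketch. It assumes a Python 3 interpreter (the bytes-based decoding below does not apply to the 2.x series), and `probe` is just a hypothetical helper name. The byte values chosen are the two quotation-mark positions plus a few positions that, as far as I know, differ between revisions of 8859-7. The result depends on the Python version, so no expected output is shown.

```python
def probe(codec_name, byte_values):
    """Map each byte value to the code point the codec decodes it to.

    Bytes the codec leaves undefined are mapped to None.
    """
    result = {}
    for b in byte_values:
        try:
            ch = bytes([b]).decode(codec_name)
        except UnicodeDecodeError:
            # The codec's mapping table has no entry for this byte.
            result[b] = None
        else:
            result[b] = ord(ch)
    return result


if __name__ == "__main__":
    # Positions picked for illustration: quotation marks and
    # (assumed) currency-sign slots of ISO 8859-7.
    for b, cp in probe("iso8859_7", [0xA1, 0xA2, 0xA4, 0xA5, 0xAA]).items():
        shown = "undefined" if cp is None else "U+%04X" % cp
        print("0x%02X -> %s" % (b, shown))
```

Comparing this output against the Unicode mapping file or against libiconv shows quickly which revision a particular codec follows.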

Weird.

Best Regards,
Peter Jacobi


