[IronPython] IronPython codec names not compatible with CPython
John Machin
sjmachin at lexicon.net
Sun Oct 8 04:54:00 CEST 2006
CPython recognises both 'gbk' and 'cp936', i.e. unicode('some string',
'gbk') does what you'd expect.
IronPython 1.0.1 recognises only 'cp936'.
CPython recognises 'mac_roman', 'mac_greek', etc.
IronPython doesn't.
After a [rare] flash of inspiration, I tried 'cp10000', 'cp10006', etc.,
and IronPython recognises these, which CPython doesn't.
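A quick way to reproduce the comparison above is to run the same probe under each interpreter and diff the output. This is only a sketch: the helper name `recognised` and the list of names are illustrative, and it relies on nothing beyond `codecs.lookup`, which raises `LookupError` for a name the interpreter cannot resolve.

```python
import codecs

def recognised(name):
    """Return True if this interpreter can look up a codec called `name`."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True

# Names taken from the observations above; extend the list as needed.
NAMES = ["gbk", "cp936", "mac_roman", "mac_greek", "cp10000", "cp10006"]

for name in NAMES:
    status = "recognised" if recognised(name) else "LookupError"
    print("%-10s %s" % (name, status))
```

The results depend on the interpreter and version, so no expected output is given here.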
The "differences" document says: """
IronPython's _codecs module implementation is incomplete. There are
several replace_error/lookup_error handlers that IronPython does not
implement.
"""
It is not apparent whether this is intended to mean that the missing error
handlers are the *only* known deficiency.
IronPython Bug #3214 mentions "import encodings" as fixing a
LookupError. Well, you learn something new every day:
1. CPython permits one to import encodings, but it's not documented
AFAICT, and it's *not* necessary in order to use 'gbk', 'mac_roman', etc.
2. After "import encodings", IronPython recognises 'mac_roman' and
'mac_greek', but still not 'gbk'.
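For reference, in CPython the `encodings` package holds the standard codec modules plus an alias table, `encodings.aliases.aliases`, which maps alternative spellings to the canonical module name. The sketch below inspects that table. The helper names `canonical_name` and `aliases_for` are made up for illustration, and whether IronPython ships the same table is an assumption that would need checking.

```python
import encodings.aliases

def canonical_name(name):
    """Return the codec module an alias maps to, or None if it is not an alias.

    Canonical names themselves (e.g. 'gbk') are not keys in the table.
    """
    key = name.lower().replace("-", "_")
    return encodings.aliases.aliases.get(key)

def aliases_for(target):
    """Return every alias in the table that points at `target`."""
    table = encodings.aliases.aliases
    return sorted(alias for alias, module in table.items() if module == target)

for name in ["cp936", "macroman", "cp10000"]:
    print("%-10s -> %r" % (name, canonical_name(name)))

print("aliases for gbk: %s" % ", ".join(aliases_for("gbk")))
```

In CPython, 'cp936' appears in this table as an alias for 'gbk', which is why both spellings work there.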
How much of the above is bug and how much is feature? What is this
mysterious encodings module anyway? Does this mean the CPython test
suite doesn't cover the above cases? Are the equivalences (mac_roman =
cp10000, etc.) correct and official? Should I just dump all of the above
into the IronPython Issue Tracker?
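One way to test whether two names really are equivalent on an interpreter that recognises both is to decode every single-byte value with each and compare the results. This is a rough sketch, not an official check: the function name `same_single_byte_mapping` is hypothetical, it only makes sense for single-byte codecs such as the mac_* family (not for multi-byte ones like gbk), and it returns None when either name is unknown.

```python
import codecs

def same_single_byte_mapping(name_a, name_b):
    """Compare how two codecs decode each of the 256 single-byte values.

    Returns True if all 256 results match, False if any differ,
    and None if either codec name is not recognised.
    """
    try:
        codecs.lookup(name_a)
        codecs.lookup(name_b)
    except LookupError:
        return None
    for value in range(256):
        raw = bytes(bytearray([value]))
        # 'replace' keeps undefined byte values from raising an error.
        if raw.decode(name_a, "replace") != raw.decode(name_b, "replace"):
            return False
    return True

print(same_single_byte_mapping("mac_roman", "cp10000"))
```

Under IronPython the call above would compare the two spellings directly. Under a CPython that does not know 'cp10000' it prints None, so the comparison has to be run where both names resolve.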
Cheers,
John