[I18n-sig] Re: [XML-SIG] Character encodings and expat

M.-A. Lemburg mal@lemburg.com
Sat, 28 Oct 2000 15:54:19 +0200

"Martin v. Loewis" wrote:
> > Yup.  I plan to teach xmlproc the IANA registry, so that this should
> > not be a problem with xmlproc.
> With due respect, I hope this is not the way that it is done. Instead,
> I think codecs.lookup should know the IANA registry. It may be that
> this information comes with PyXML only for now, but it should be
> available to all Python applications. E.g. xml/__init__.py could
> do
> codecs.register(iana_lookup)
> where iana_lookup simply maps encodings to the "normalized" form.

That would be another option (this codec search function design
turns out to be far more useful than originally thought ;-)...
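
A minimal sketch of what such a search function could look like, assuming
a small hand-written alias table. The table contents and the name
IANA_ALIASES are hypothetical placeholders (the "x-demo-..." names are
made up, not registry entries); a real version would be generated from
the IANA registry:

```python
import codecs

# Hypothetical alias table: made-up names mapped to codecs Python already
# ships. A real table would be derived from the IANA character set registry.
IANA_ALIASES = {
    "x_demo_latin1": "latin-1",
    "x_demo_ascii": "ascii",
}

def iana_lookup(name):
    """Codec search function: map an alias to an already known codec.

    Returns None for unknown names so that other registered search
    functions still get a chance to resolve them.
    """
    # The registry passes the name lowercased; normalize separators here
    # as well so the table only needs one spelling per alias.
    key = name.lower().replace("-", "_").replace(" ", "_")
    target = IANA_ALIASES.get(key)
    if target is None:
        return None
    return codecs.lookup(target)

codecs.register(iana_lookup)
```

Once registered, codecs.lookup("X-Demo-Latin1") would resolve to the same
codec as "latin-1", without xmlproc or any other application needing its
own table.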

> I agree with MAL that this should eventually end-up in Python proper.
> In any case, knowing the official aliases should not be restricted to
> xmlproc.

Right. Python's encodings package should know at least about all
common aliases used for the provided codecs.

Do you have a pointer to a list of IANA aliases ?
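
To see which names from such a list are still missing, something along
these lines could be used. This is only a sketch; the helper name
known_to_python and the candidate names are illustrative assumptions,
not taken from the registry:

```python
import codecs

def known_to_python(name):
    """Return True if codecs.lookup() can resolve the encoding name."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True

# Illustrative candidates; the last one is deliberately bogus.
candidates = ["ISO-8859-1", "latin1", "UTF-8", "x-no-such-charset"]
missing = [name for name in candidates if not known_to_python(name)]
```

Names that end up in missing are the ones that would need an entry in
the alias table.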

> > However, it is a problem that Python does not support any of the Far
> > East encodings yet.  Does anyone know if there are any plans to change
> > that?
> Again, I'd see no problem including Tamito Kajiyama's code in PyXML,
> if he wants us to ship it - or we could recommend JapaneseCodecs as a
> valuable addition to PyXML; this package also uses the distutils, so
> it is quite easy to install.

I think it should be distributed as a separate package: the codecs
are useful in a lot of contexts -- not only XML.

Marc-Andre Lemburg
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/