[I18n-sig] IANA names for character set encodings?

M.-A. Lemburg mal@lemburg.com
Sat, 09 Feb 2002 12:19:49 +0100


Tom Emerson wrote:
> 
> M.-A. Lemburg writes:
> > Adding all of them seems overkill though... and cumbersome, e.g.
> > nobody uses names like ANSI_X3.4-1968 -- us-ascii is the
> > common name.
> 
> Sure, but I've seen machine generated markup/documents that make use
> of the ANSI_X3.4-1968 name, particularly those coming out of
> Government agencies.
> 
> If we are going to support the IANA names, then there is no reason not
> to support all of them. Picking and choosing those that we think
> aren't used is asking for a bug report.
> 
> This is a no brainer.

How large would such an alias dictionary be?

Looking at the IANA listing it seems rather lengthy. What I'm
worried about is that Python startup time will get worse for
programs using codecs (I sometimes wish Python had a builtin
on-disk registry where we could put static data like this).

Anyway, if you all think this is a non-issue, fine with me.

Bill, can you parse the IANA listing into a dictionary
definition?
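
A rough sketch of what such a parser might look like. It assumes the
listing is made up of "Name:" records followed by "Alias:" lines, with
trailing annotations such as "[RFC1345,KXS2]" or "(preferred MIME name)"
after the first token. The function name parse_iana_listing and the
SAMPLE text are made up for illustration; the real listing may need
more care.

```python
def parse_iana_listing(text):
    """Map each lowercased name or alias to its record's primary name."""
    aliases = {}
    primary = None
    for line in text.splitlines():
        # Split on the first colon only; names such as
        # "ISO_8859-1:1987" contain colons themselves.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key = key.strip()
        if key not in ("Name", "Alias"):
            continue
        fields = value.split()
        # Records without aliases are assumed to say "Alias: None".
        if not fields or fields[0] == "None":
            continue
        name = fields[0]  # drop trailing annotations
        if key == "Name":
            primary = name
        if primary is not None:
            aliases[name.lower()] = primary
    return aliases


# Hypothetical excerpt in the assumed format, not the real listing.
SAMPLE = """\
Name: ANSI_X3.4-1968                                   [RFC1345,KXS2]
MIBenum: 3
Source: ECMA registry
Alias: iso-ir-6
Alias: US-ASCII (preferred MIME name)
Alias: us

Name: ISO_8859-1:1987                                  [RFC1345,KXS2]
Alias: ISO-8859-1 (preferred MIME name)
Alias: latin1
Alias: None
"""

if __name__ == "__main__":
    table = parse_iana_listing(SAMPLE)
    for alias in sorted(table):
        print(alias, "->", table[alias])
```

Running len() on the result for the full listing would answer the size
question above. The values here are IANA primary names; they would
still have to be mapped onto the names of the Python codecs.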

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/