Programmatically discovering encoding types supported by codecs module

python at python at
Sun Mar 28 12:48:52 CEST 2010


Thank you for your analysis - very interesting. Enjoyed your fromlist
choice of names. I'm still in my honeymoon phase with Python so I only
know the first part :)


----- Original message -----
From: "Gabriel Genellina" <gagsl-py2 at>
To: python-list at
Date: Wed, 24 Mar 2010 19:50:11 -0300
Subject: Re: Programmatically discovering encoding types supported by
codecs module

En Wed, 24 Mar 2010 14:58:47 -0300, <python at> escribió:

>> After looking at how things are done in codecs.c and  
>> encodings/  I think you should enumerate all modules in the  
>> encodings package that define a getregentry function. Aliases come from  
>> encodings.aliases.aliases.
> Thanks for looking into this for me. Benjamin Kaplan made a similar
> observation. My reply to him included the snippet of code we're using to
> generate the actual list of encodings that our software will support
> (thanks to Python's codecs and encodings modules).

I was curious as to whether both methods would give the same results:

py> import glob, os, encodings, encodings.aliases
py> modules = set()
py> for name in glob.glob(os.path.join(encodings.__path__[0], "*.py")):
...   name = os.path.basename(name)[:-3]
...   try: mod = __import__("encodings."+name,  
...   except ImportError: continue
...   if hasattr(mod, 'getregentry'):
...     modules.add(name)
py> fromalias = set(encodings.aliases.aliases.values())
py> fromalias - modules
py> modules - fromalias

'tactis' is listed as an alias target but has no encoding module (?), and
about twenty modules have no alias.
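
For anyone who wants to repeat the comparison as a script, here is a
minimal sketch of the same idea for Python 3. It is not the session
above: it assumes pkgutil.iter_modules and importlib.import_module in
place of glob and __import__, and codec_modules is a hypothetical helper
name. The two differences it prints depend on the Python version and
platform, so no output is shown.

```python
import importlib
import pkgutil
import encodings
import encodings.aliases


def codec_modules():
    """Return names of encodings submodules that define getregentry().

    Hypothetical helper for illustration; not part of the stdlib.
    """
    found = set()
    for info in pkgutil.iter_modules(encodings.__path__):
        try:
            mod = importlib.import_module("encodings." + info.name)
        except ImportError:
            # Some codecs are platform-specific (for example mbcs
            # outside Windows) and fail to import; skip them.
            continue
        if hasattr(mod, "getregentry"):
            found.add(info.name)
    return found


modules = codec_modules()
fromalias = set(encodings.aliases.aliases.values())

# Alias targets that have no codec module.
print(sorted(fromalias - modules))
# Codec modules that no alias points to.
print(sorted(modules - fromalias))
```

Names in the second list can still be used directly by module name; they
simply have no entry in encodings.aliases.aliases.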

Gabriel Genellina

