[Python-Dev] Some thoughts on the codecs...

Mark Hammond mhammond@skippinet.com.au
Wed, 17 Nov 1999 13:57:48 +1100


> You will need to provide a way for a module (in the "codec"
> package) to
> state *beforehand* that it should be loaded for the X, Y, and
...

> The alternative would be to have stub modules like:

Actually, I was thinking even more radically - drop the codec registry
altogether, and use modules with "well-known" names (a slight
precedent, but Python isn't averse to well-known names in general).

e.g.:
iso-8859-1.py:

import unicodec
def encode(...):
  ...
def decode(...):
  ...

iso-8859-2.py:
from iso-8859-1 import *

The codec registry then is trivial, and effectively does not exist
(can't get much more trivial than something that doesn't exist :-):

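def getencoder(encoding):
  # A non-empty fromlist makes __import__ return the submodule itself
  # rather than the top-level "encodings" package.
  mod = __import__("encodings." + encoding, {}, {}, ["encode"])
  return getattr(mod, "encode")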
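
For anyone who wants to try the registry-less lookup, here is a minimal, self-contained sketch in present-day Python. It is only an illustration under stated assumptions: the package name "demo_encodings", the helper make_demo_package, and the body of the codec module are all hypothetical, and importlib.import_module stands in for the "__import__" call above.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical codec module, written to a temporary directory so the
# example is self-contained. The file name is the encoding name itself.
CODEC_SOURCE = '''
def encode(text):
    return text.encode("latin-1")

def decode(data):
    return data.decode("latin-1")
'''


def make_demo_package(root):
    """Create a throwaway package "demo_encodings" holding one codec module."""
    pkg = os.path.join(root, "demo_encodings")
    os.mkdir(pkg)
    open(os.path.join(pkg, "__init__.py"), "w").close()
    with open(os.path.join(pkg, "iso-8859-1.py"), "w") as f:
        f.write(CODEC_SOURCE)


def getencoder(encoding, package="demo_encodings"):
    # import_module returns the submodule itself, and it accepts names
    # that are not valid identifiers, such as "iso-8859-1".
    mod = importlib.import_module(package + "." + encoding)
    return getattr(mod, "encode")


root = tempfile.mkdtemp()
make_demo_package(root)
sys.path.insert(0, root)
importlib.invalidate_caches()

encode = getencoder("iso-8859-1")
print(encode("caf\u00e9"))
```

The "registry" here is just the package directory: adding a codec means dropping a file with the right name into it.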


> I believe that encoding names are legitimate file names, but
> they aren't
> necessarily Python identifiers. That kind of bungs up "import
> codec.iso-8859-1".

Agreed - clients should never need to import them, and codecs that
wish to import other codecs could use "__import__".
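
As a rough sketch of what that could look like today: since "from iso-8859-1 import *" is not valid syntax, the derived module imports its sibling by string name and copies the public names across. Everything here is an assumption made for illustration (the package name "demo_codecs", the module bodies, and the use of importlib.import_module in place of "__import__"), and the derived module only mirrors the structure of the earlier example; it is not a correct ISO 8859-2 codec.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical base codec module.
BASE_SOURCE = '''
def encode(text):
    return text.encode("latin-1")

def decode(data):
    return data.decode("latin-1")
'''

# Hypothetical derived module: imports its sibling by string name and
# re-exports every public name, emulating "from ... import *".
DERIVED_SOURCE = '''
import importlib

_base = importlib.import_module(".iso-8859-1", __package__)
globals().update(
    (name, value) for name, value in vars(_base).items()
    if not name.startswith("_")
)
'''

root = tempfile.mkdtemp()
pkg = os.path.join(root, "demo_codecs")
os.mkdir(pkg)
for filename, source in [
    ("__init__.py", ""),
    ("iso-8859-1.py", BASE_SOURCE),
    ("iso-8859-2.py", DERIVED_SOURCE),
]:
    with open(os.path.join(pkg, filename), "w") as f:
        f.write(source)

sys.path.insert(0, root)
importlib.invalidate_caches()

derived = importlib.import_module("demo_codecs.iso-8859-2")
base = importlib.import_module("demo_codecs.iso-8859-1")
print(derived.encode is base.encode)
```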

Of course, I am not averse to the idea of a registry as well and
having the modules manually register themselves - but it doesn't seem
to buy much, and the logic for getting a codec becomes more complex -
i.e., it needs to determine the module to import, then look in the
registry - if it needs to determine the module anyway, why not just
get it from the module and be done with it?
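
To make the comparison concrete, here is a small sketch of both lookup paths side by side. It is an assumption-laden illustration, not a proposal for the real API: the in-memory "demo_registry" module, its register function, and the self-registering codec module are all hypothetical.

```python
import importlib
import os
import sys
import tempfile
import types

# Hypothetical registry module, created in memory for the demonstration.
registry = types.ModuleType("demo_registry")
registry.codecs = {}


def register(name, encode, decode):
    registry.codecs[name] = (encode, decode)


registry.register = register
sys.modules["demo_registry"] = registry

# Hypothetical codec module that registers itself when imported.
CODEC_SOURCE = '''
import demo_registry

def encode(text):
    return text.encode("latin-1")

def decode(data):
    return data.decode("latin-1")

demo_registry.register("iso-8859-1", encode, decode)
'''

root = tempfile.mkdtemp()
with open(os.path.join(root, "iso-8859-1.py"), "w") as f:
    f.write(CODEC_SOURCE)
sys.path.insert(0, root)
importlib.invalidate_caches()


def getencoder_via_registry(encoding):
    # Step 1: work out which module to import so it can register itself.
    importlib.import_module(encoding)
    # Step 2: look the codec up in the registry.
    return registry.codecs[encoding][0]


def getencoder_via_module(encoding):
    # Single step: the module located in step 1 already holds the function.
    return importlib.import_module(encoding).encode


a = getencoder_via_registry("iso-8859-1")
b = getencoder_via_module("iso-8859-1")
print(a is b)
```

Both paths end up at the same function object; the registry version simply takes one extra hop to get there.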

Mark.