[I18n-sig] Codecs

M.-A. Lemburg mal@lemburg.com
Mon, 05 Jun 2000 19:38:38 +0200

Andy Robinson wrote:
> Replying to MAL slightly out of order:
> > Note that you can easily add your own wrappers of codecs.lookup()
> > which then give you an object instead of the tuple.
> >
> > The extensibility argument is a problem with the current
> > solution, but is there really such a great need for extra
> > codec APIs ? (Please remember that all codec writers would
> > have to implement these new APIs -- the more you put in
> > there the more difficult and less attractive it gets...)
> I'm proposing a place to put non-standard extensions.
> The whole point is that these are things which are useful
> for multi-byte codecs and non-European languages, but will
> certainly not exist for all codecs.  These could be exposed
> as functions within the relevant codec module, but it seems
> clean if the codecs module provides the lookup functionality,
> and the particular codec can provide new 'services' itself.

That's already possible via the stream writer/reader object.
The two extra functions encode/decode are really only there
to enhance performance of the builtin encoding machinery
(which only needs stateless converters).
You can easily add new methods to the stream writer and
reader objects. They also allow you to keep state -- which
a simple entry in a codec registry object would not.
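
A minimal sketch of what that could look like, written against the
present-day Python 3 codecs module. The class name CountingWriter and
its stats() method are hypothetical, chosen only to illustrate an extra
per-codec "service" that keeps state on the writer object:

```python
import codecs
import io


class CountingWriter(codecs.StreamWriter):
    """Hypothetical stream writer offering one non-standard service."""

    def __init__(self, stream, errors="strict"):
        super().__init__(stream, errors)
        self.chars_written = 0  # state lives on the writer instance

    def encode(self, input, errors="strict"):
        # Stateless conversion, delegated to the built-in UTF-8 encoder.
        return codecs.utf_8_encode(input, errors)

    def write(self, obj):
        # Update the per-stream state, then let the base class encode
        # the text and pass the bytes on to the underlying stream.
        self.chars_written += len(obj)
        super().write(obj)

    def stats(self):
        # The extra method; it is not part of the standard codec API.
        return {"chars": self.chars_written}


buf = io.BytesIO()
writer = CountingWriter(buf)
writer.write("h\xe9llo")
```

Callers that only know the standard interface keep using write();
callers that know about this particular codec can also call stats().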

Perhaps I'm missing something ?

> > Here are some:
> >
> > * The tuple entries have two different flavours: the first
> > two are readily usable encode/decode APIs, while the last
> > two point to factory functions which can be used to create
> > new objects.
> >
> > * Tuples are much easier to create and query at C level than
> > Python objects having a certain interface.
> >
> > * The tuples can easily be cached and this is what the codec
> > registry currently does to enhance performance. Object lookups
> > are slower than tuple entry lookups (OK, not so much of an argument,
> > because the conversion itself is likely to cause much more
> > overhead).
> >
> > * There is quite a lot of code in the dist which already uses
> > the tuple value (all codecs, the codec registry, sample apps,
> > etc.).
> >
> > * Who's going to write the code and produce the patches ?
> I did argue for this originally at least twice but got
> ignored by everyone. 

Could be that we were too busy with other things, e.g. 
the source code encoding debate ;-)
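
For reference, the wrapper idea quoted at the top of this message could
be sketched roughly as follows. The class CodecObject and its method
open_writer() are hypothetical names; the only real API used is
codecs.lookup(), whose result in present-day Python 3 still unpacks as
the four-entry tuple discussed above:

```python
import codecs
import io


class CodecObject:
    """Hypothetical object-style wrapper around the lookup tuple."""

    def __init__(self, encoding):
        # First two entries: stateless encode/decode functions.
        # Last two entries: factories for stream reader/writer objects.
        encode, decode, reader, writer = codecs.lookup(encoding)
        self.encode = encode
        self.decode = decode
        self.reader_factory = reader
        self.writer_factory = writer

    def open_writer(self, stream, errors="strict"):
        # Create a new stateful writer bound to the given stream.
        return self.writer_factory(stream, errors)


codec = CodecObject("latin-1")
data, consumed = codec.encode("abc")
```

The registry keeps handing out tuples; only code that wants an object
pays for building one.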

Marc-Andre Lemburg
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/