[I18n-sig] Codecs

Paul Prescod paul@prescod.net
Mon, 05 Jun 2000 10:21:11 -0500

"M.-A. Lemburg" wrote:
> > ...
> > Are there any good reasons to prefer getting a tuple back from codecs.lookup()?
> Here are some:
> * The tuple entries have two different flavours: the first
> two are readily usable encode/decode APIs, while the last
> two point to factory functions which can be used to create
> new objects.

Right, and with an object syntax you can deal with just the properties
you are interested in, rather than all four, all of the time.
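
As a rough sketch of what the tuple interface asks of the caller (the
entry order is my reading of the description above, and the variable
names are only illustrative):

```python
import codecs

# Tuple style: the caller has to know that index 0 is the encoder and
# index 1 the decoder, even when only one of them is needed.
# Assumed order: (encoder, decoder, stream reader factory, stream writer factory)
entry = codecs.lookup('utf-8')
encode = entry[0]
decode = entry[1]

# Each call returns a pair: (output, length of input consumed).
encoded, consumed = encode('caf\xe9')
decoded, _ = decode(encoded)
```

Nothing in the code says which slot means what; that knowledge lives in
the caller's head or in a comment.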

> * Tuples are much easier to create and query at C level than
> Python objects having a certain interface.

I don't see that as very important!

> * The tuples can easily be cached and this is what the codec
> registry currently does to enhance performance. Object lookups
> are slower than tuple entry lookups (ok, not so much an argument,
> because the conversion itself is likely to cause much more
> overhead).

I agree that this is not much of an argument. :)

> * There is quite a lot of code in the dist which already uses
> the tuple value (all codecs, the codec registry, sample apps,
> etc.).
> * Who's going to write the code and produce the patches ?

These two are important arguments but we need to decide what we want
before we start deciding whether it is doable.

> The extensibility argument is a problem with the current
> solution, but is there really such a great need for extra
> codec APIs ? 

I don't know yet. If we knew now, we'd add them now. :)

> (Please remember that all codec writers would
> have to implement these new APIs -- the more you put in
> there the more difficult and less attractive it gets...)

I think that Andy was thinking that codecs might be a useful place to
"hang" arbitrary encoding-related methods -- whether or not they are
standardized. Python is dynamically typed so we don't need to conform to
a restrictive interface definition.

Anyhow, more than the extensibility, returning structured objects is
just more Pythonic. I hate having to remember the position of tuple
return values.

> encoder = codecs.encoder('utf-8')
> # ditto for .decoder, .streamwriter, .streamreader

That might be an acceptable compromise on the syntactic issue... but...

It doesn't seem much more work to just make a version of "lookup" that
wraps tuples in objects. If we took this half-step then we could decide
to move to "full objects" in the future and break a lot less code.
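
A minimal sketch of that half-step, assuming the four-entry order
described earlier. CodecEntry and lookup_object are hypothetical names
made up for illustration; neither is part of the codecs module:

```python
import codecs


class CodecEntry:
    """Hypothetical wrapper that gives names to the four lookup() entries."""

    def __init__(self, entry):
        # Assumed order: (encoder, decoder, stream reader factory,
        # stream writer factory)
        self.encode, self.decode, self.streamreader, self.streamwriter = entry


def lookup_object(encoding):
    """Hypothetical object-returning variant; codecs.lookup() is untouched."""
    return CodecEntry(codecs.lookup(encoding))


codec = lookup_object('utf-8')
encoded, consumed = codec.encode('abc')   # no tuple positions to remember
```

Existing code that unpacks the tuple from codecs.lookup() keeps working,
and the registry can go on caching plain tuples underneath.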

 Paul Prescod  - ISOGEN Consulting Engineer speaking for himself
Simplicity does not precede complexity, but follows it. 
	- http://www.cs.yale.edu/~perlis-alan/quotes.html