[Python-3000] PEP 3138- String representation in Python 3000

Sun May 18 05:05:59 CEST 2008

Greg Ewing writes:

 > So it already needs some application-specific notion of
 > what constitutes a probable compression method built
 > into it, and if that list is to be extensible, it needs
 > an application-specific registry to manage it. Once
 > you've got that, the general codec registry doesn't
 > help you much.

Excuse me?  The codec-and-transform registry tells whether the codec
or transform is available in this Python; that's all it is supposed to
do.  Even if you do need an application-specific registry of
compressors, some Python-level registry is required to determine
whether a desired one is actually available and where it lives.
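
To make the point concrete, here is a minimal sketch of what "tells whether the codec or transform is available in this Python" can look like with the existing `codecs.lookup` call. The helper name `is_available` and the names in `wanted` are illustrative assumptions, not part of any proposal in this thread, and whether a given compression name resolves depends on the Python version and build.

```python
import codecs

def is_available(name):
    """Return True if this Python's codec registry can resolve `name`.

    Hypothetical helper: it only asks the registry, it knows nothing
    about which names an application considers "probable" compressors.
    """
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True

# An application-specific list of candidates (assumed names, for
# illustration only); the registry just filters it down to what exists.
wanted = ["zlib_codec", "bz2_codec", "no_such_compressor"]
available = [name for name in wanted if is_available(name)]
```

The application keeps its own list of what it wants; the registry only answers the question of what is actually there.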

True, this could be done through the usual module mechanisms, but that
won't require any less coding than using the usual codec mechanism.
And I find Nick's rationale for a flat namespace of strings quite
convincing given that it won't cost any more.

I also suspect that it may make sense to allow various "standard
deobfuscations" of codec names as in glibc (whose version of iconv
considers "utf8", "UTF-8", and "Utf_8" to be equivalent names for
"Unicode UTF-8" according to rules which canonicalize case and strip
punctuation), as well as aliasing.  (These aren't strong reasons for
using a flat string registry, but they come more or less for free if
we do use it.)
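
As a rough sketch of what such canonicalization plus aliasing could look like on top of a flat string registry: the function names `canonical_key` and `resolve` and the alias table are hypothetical, and the rule used here (fold case, drop everything that is not an ASCII letter or digit) only approximates the kind of rule described above; it is not claimed to match glibc's actual implementation.

```python
import re

# Hypothetical alias table, keyed by canonicalized spelling.
_ALIASES = {
    "utf8": "UTF-8",
    "latin1": "ISO-8859-1",
    "iso88591": "ISO-8859-1",
}

def canonical_key(name):
    """Fold case and strip punctuation (assumed rule, for illustration)."""
    return re.sub(r"[^0-9a-z]", "", name.lower())

def resolve(name):
    """Map any accepted spelling to one registered name, or raise LookupError."""
    key = canonical_key(name)
    try:
        return _ALIASES[key]
    except KeyError:
        raise LookupError("unknown codec name: %r" % (name,))

# All three spellings from the text land on the same entry.
same = {resolve(n) for n in ("utf8", "UTF-8", "Utf_8")}
```

With a flat namespace of strings, both steps are a single dictionary lookup after one normalization pass, which is the sense in which they come more or less for free.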