[Distutils] Re: [I18n-sig] Codecs for Big Five and GB 2312

M.-A. Lemburg mal@lemburg.com
Sat, 28 Oct 2000 15:59:40 +0200

"Martin v. Loewis" wrote:
> > If you are interested, the codec is available at:
> > http://pseudo.grad.sccs.chukyo-u.ac.jp/~kajiyama/python/iso_2022_7bit.py.gz
> I just had a look, and it seems like an interesting package. I'm
> slightly confused about the installation procedure, though.
> Installing into python2.0/encodings/{euc_jp,shift_jis,japanese}
> doesn't look right to me - add-on packages should be capable of
> installing into site-packages by default.
> I believe it would actually work if you just install without any
> arguments to setup.py. euc_jp would then end up in
> python2.0/site-packages. Later, when you do
>   u"Hello".encode("euc-jp")
> it looks for a codec. Here, encodings.__init__.search_function does
>     modname = encoding.replace('-', '_')
>     modname = aliases.aliases.get(modname,modname)
>     try:
>         mod = __import__(modname,globals(),locals(),'*')
>     except ImportError,why:
>         _cache[encoding] = None
>         return None
> First, encoding becomes euc_jp. With no registered aliases, it would
> then call __import__ with "euc_jp", which will find the codec in
> site-packages.

The "right" way to install new codec packages is by placing them
inside a package which then registers a new search function in the
codec registry.

Tamito's other codec package does this, AFAIR.

To be able to use the codecs, a Python script must then import
the codecs package (which then registers the search function).

Having to import the package has two benefits:
1. the need for another codec package is visible in the source code
2. registering the search function is delayed until the codec
   package is first used
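
As a rough sketch of what such a package might do: the code below is
written for a current Python 3 interpreter rather than 2.0, and the
codec name x_demo_reversed, the toy "reversing" codec and the package
name myCodecs are all made up for illustration. Only codecs.register,
codecs.lookup and codecs.CodecInfo are real APIs here.

```python
import codecs

# Hypothetical codec name, used only for this illustration.
CODEC_NAME = "x_demo_reversed"


def _encode(text, errors="strict"):
    # Toy encoding: the ASCII bytes of the text, in reverse order.
    return text.encode("ascii", errors)[::-1], len(text)


def _decode(data, errors="strict"):
    raw = bytes(data)
    return raw[::-1].decode("ascii", errors), len(raw)


def search_function(name):
    # The registry passes a lower-cased name; normalise hyphens as well,
    # since older Python 3 releases do not convert them to underscores.
    if name.replace("-", "_") == CODEC_NAME:
        return codecs.CodecInfo(name=CODEC_NAME, encode=_encode, decode=_decode)
    return None  # not ours, let the next search function try


# Before registration the codec is unknown to the registry.
try:
    codecs.lookup(CODEC_NAME)
    known_before_register = True
except LookupError:
    known_before_register = False

# In a real package this call would sit in __init__.py, so that
# "import myCodecs" (a hypothetical name) is what triggers it.
codecs.register(search_function)

encoded = "Hello".encode(CODEC_NAME)          # b"olleH"
decoded = encoded.decode("x-demo-reversed")   # "Hello"
```

A script that skips the import gets a LookupError on the first
encode/decode call, which is what makes the dependency visible.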

> In the long run, I'd hope that distutils provides a means to install
> additional codecs, e.g. via
> setup( ...
>       codecs = ['japanese']
>       ...)
> Then, distutils would collect all these strings, and importing codecs
> would roughly do
> for package in distutils.registered_codec_packages:
>   p=__import__(package,globals(),locals(),"*")
>   p.register()
> japanese/__init__.py would provide a register function which registers
> another search_function, which would load euc_jp and shift_jis on
> demand. That way, users could install additional codecs which are
> available to everybody on the system, without having to hack the
> Python library proper.

Hmm, not sure here: programs which rely on non-standard
codecs should have an explicit "import myCodecs" at the top
of the file.

Marc-Andre Lemburg
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/