[I18n-sig] codec aliases

Tamito KAJIYAMA kajiyama@grad.sccs.chukyo-u.ac.jp
Tue, 12 Dec 2000 11:01:15 +0900


M.-A. Lemburg wrote:
|
| > One problem is that the alias "japanese.jis-7" does not work
| > unless the corresponding original name "japanese.iso-2022-jp"
| > has been looked up at least once.  This is because the alias is
| > defined by means of getaliases() in japanese/iso_2022_jp.py, and
| > this module is not imported until the first time the original
| > name is looked up.  Is there a work-around for this problem?
| 
| The only "work-around" I know of (which doesn't involve some
| kind of boot code) is to define aliases via almost empty
| modules which redirect the search function to the correct
| codec, e.g.
| 
| codec_alias.py:
| ---------------
| from codec_alias_target import *

I'm not sure how your work-around works.  How is codec_alias.py
used?  Is that intended to be imported in site.py?

I also think that aliases cannot be defined merely by importing
a codec module, since the aliases are defined by means of
getaliases(), and this function is not invoked until the
original name corresponding to the aliases is first looked up.

I wonder if I need to put a call to codecs.register() somewhere
in the modularized codecs...
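
A minimal sketch of what such a codecs.register() call might look
like, assuming a hand-written alias table that is known up front, so
that no codec module has to be imported before its aliases work.  The
names ALIASES, normalize() and alias_search() are hypothetical, and
the standard "iso2022_jp" codec stands in for the real target:

```python
import codecs

# Hypothetical alias table; keys use the underscore spelling only.
ALIASES = {
    "japanese.jis_7": "iso2022_jp",
}


def normalize(name):
    """Lower-case a codec name and fold hyphens and spaces to underscores."""
    return name.lower().replace("-", "_").replace(" ", "_")


def alias_search(name):
    """Search function: resolve a known alias, or return None to decline."""
    target = ALIASES.get(normalize(name))
    if target is None:
        return None
    # Delegate to the codec registered under the original name.
    return codecs.lookup(target)


# Registering at import time makes the aliases available immediately.
codecs.register(alias_search)
```

Because the search function normalizes the name itself, both
"japanese.jis-7" and "japanese.jis_7" resolve through the single
table entry.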

| > The other problem is that hyphens and underscores are
| > significant in an alias, although they are not in an original
| > name.  A work-around is to define all combinations of hyphens
| > and underscores for an alias (e.g. defining both
| > "japanese.jis-7" and "japanese.jis_7"), but this does not seem
| > like a good idea to me.
| 
| Codec aliases returned by codec.getaliases() must always use 
| the underscore naming scheme.
| 
| The standard search function will convert hyphens to underscores
| *before* applying the alias mapping, so there's no need to worry
| about different combinations of hyphens and underscores in
| the alias names (unless I've overlooked something here).

Returning names with underscores from getaliases() does not seem
sufficient.  In encodings/__init__.py:

def search_function(encoding):
    ...
    # Cache the encoding and its aliases
    _cache[encoding] = entry
    try:
        codecaliases = mod.getaliases()
    except AttributeError:
        pass
    else:
        for alias in codecaliases:
            _cache[alias] = entry
    return entry

The names returned by mod.getaliases() are put into _cache as they
are, so equivalent names with hyphens will not be defined.
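
The difference can be illustrated with a small stand-alone model of
the cache.  This is only a sketch, not the actual code in
encodings/__init__.py; the AliasCache class and the normalize()
helper are hypothetical.  It contrasts storing alias names as they
are with normalizing them on both insertion and lookup:

```python
def normalize(name):
    """Lower-case a codec name and fold hyphens and spaces to underscores."""
    return name.lower().replace("-", "_").replace(" ", "_")


class AliasCache:
    """Toy stand-in for a codec cache keyed by encoding name."""

    def __init__(self, normalize_keys):
        self._normalize_keys = normalize_keys
        self._cache = {}

    def _key(self, name):
        # Either use the name as given or fold it to the underscore form.
        return normalize(name) if self._normalize_keys else name

    def add(self, entry, aliases):
        for alias in aliases:
            self._cache[self._key(alias)] = entry

    def get(self, name):
        return self._cache.get(self._key(name))


entry = object()  # placeholder for a codec registry entry

as_is = AliasCache(normalize_keys=False)
as_is.add(entry, ["japanese.jis_7"])
hyphen_miss = as_is.get("japanese.jis-7")      # not found: key differs

folded = AliasCache(normalize_keys=True)
folded.add(entry, ["japanese.jis_7"])
hyphen_hit = folded.get("japanese.jis-7")      # found via normalized key
```

With keys normalized on both sides, one alias per codec is enough
and no combinations of hyphens and underscores need to be listed.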

Regards,

-- 
KAJIYAMA, Tamito <kajiyama@grad.sccs.chukyo-u.ac.jp>