[Python-Dev] Adding Japanese Codecs to the distro

M.-A. Lemburg mal@lemburg.com
Wed, 22 Jan 2003 14:33:03 +0100


Hye-Shik Chang wrote:
> On Wed, Jan 22, 2003 at 01:06:47PM +0100, M.-A. Lemburg wrote:
> [snip]
> 
>>degas site-packages/japanese# du
>>337     ./c
>>1252    ./mappings
>>88      ./python
>>8       ./aliases
>>1790    .
>>
>>Hisao's Python codec is only 85kB in size.
>>
>>Now, if we took only the C version of Tamito's codec, we'd
>>end up with around 1790 - 1252 - 88 = 450 kB. Still a factor of
>>5...
>>
>>I wonder whether it wouldn't be possible to use the same tricks
>>Hisao used in his codec for a C version.
> 
> The trick must not be used in the C version.

Why not? Anything that can trim down the memory footprint
as well as the installation size is welcome :-)
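As a quick check of the size figures quoted above, the arithmetic can be
sketched as follows. The numbers are copied from the du listing and the
quoted 85 kB figure; the variable names are only illustrative.

```python
# Sizes in kB, taken from the du listing quoted above.
total = 1790        # whole "japanese" package
mappings = 1252     # ./mappings
python_impl = 88    # ./python
hisao_codec = 85    # size quoted for Hisao's pure-Python codec

# Keep only the C version: drop the mapping tables and the Python version.
c_only = total - mappings - python_impl

# How many times larger the C-only install is than Hisao's codec.
ratio = c_only / hisao_codec

print(c_only, round(ratio, 1))
```

This gives 450 kB and a ratio a little above 5, matching the "factor of 5"
estimate.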

> Because C codecs need to keep both the encoding and decoding maps
> as constants, so that the text segments are shared between processes
> and the data is loaded only once in the whole system.
> This matters especially for multiprocess daemons.

Indeed, that's why the Unicode database is also stored this
way.
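
The sharing argument can be illustrated from Python, under some loud
assumptions: constant tables in a C extension end up in a read-only,
file-backed mapping of the shared object, and a read-only mmap of a data
file behaves in a similar way (the pages can be shared between processes
through the OS page cache). The sketch below is an analogy only, not how
any of the codecs discussed here are implemented. The table layout
(little-endian 16-bit entries) and the helper names build_table,
open_table and lookup are hypothetical.

```python
import mmap
import os
import struct
import tempfile


def build_table(path, values):
    """Write a flat table of unsigned 16-bit values (hypothetical layout)."""
    with open(path, "wb") as f:
        f.write(struct.pack("<%dH" % len(values), *values))


def open_table(path):
    """Map the table read-only; the OS can share these pages across processes."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file. ACCESS_READ makes the mapping read-only.
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


def lookup(table, index):
    """Return the 16-bit entry stored at position index."""
    return struct.unpack_from("<H", table, index * 2)[0]


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # Three arbitrary hiragana code points as sample data.
        build_table(path, [0x3042, 0x3044, 0x3046])
        table = open_table(path)
        print(hex(lookup(table, 1)))
        table.close()
    finally:
        os.remove(path)
```

A table built at run time as a Python dict, by contrast, lives in each
process's private heap, so every worker of a multiprocess daemon pays for
its own copy.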

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/