Re: [Python-Dev] Adding Japanese Codecs to the distro

22 Jan 2003

Atsuo Ishimoto wrote:
...
On Wed, 22 Jan 2003 13:06:47 +0100
"M.-A. Lemburg" wrote:
...
Now, if we took only the C version of Tamito's codec, we'd
end up with around 1790 - 1252 - 88 = 450 kB. Still a factor of
5...
Please try
   strip ./c/_japanese_codecs.so
On my Linux box, this reduces the size of _japanese_codecs.so from
530 KB to 135 KB. I think this is a reasonable size because it contains
more tables than Hisao's version.
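
As a quick check of the figures quoted above, the arithmetic can be sketched in Python. The helper name reduction_percent and the variable names are illustrative only; the numbers are the ones given in this thread.

```python
def reduction_percent(before_kb, after_kb):
    """Return the size reduction as a percentage of the original size."""
    return 100.0 * (before_kb - after_kb) / before_kb

# Estimate from the earlier message: 1790 - 1252 - 88 (all in kB).
estimate_kb = 1790 - 1252 - 88

# Effect of running strip on _japanese_codecs.so: 530 KB down to 135 KB.
stripped_saving = reduction_percent(530, 135)

print(estimate_kb)
print(round(stripped_saving, 1))
```

So stripping the symbols removes roughly three quarters of the file.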
Ok, we're finally approaching a very reasonable size :-)

BTW, why is it that Hisao can use one table for all supported
encodings where Tamito uses 6 tables ?
...
...
Hisao's approach uses a single table which fits into 58kB Python
source code. Boil that down to a static C table and you'll end up
with something around 10-20kB for static C data. Hisao still
builds a dictionary using this data, but perhaps that step
could be avoided using the same techniques that Fredrik used
in boiling down the size of the unicodedata module (which holds
the Unicode Database).
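
To illustrate the kind of technique meant here, the following is a minimal sketch of a two-stage table split, which lets a sparse mapping be looked up directly from two small arrays without building a dictionary at import time. It is not Fredrik's or Hisao's actual code: the function names split_table and lookup, the block size, and the mapping data are all made up for the example.

```python
def split_table(table, shift):
    """Split a flat lookup table into two smaller tables.

    index1 holds one entry per block of 2**shift items and points at a
    block stored in index2; identical blocks are stored only once.
    """
    block_size = 1 << shift
    # Pad so the length is a multiple of the block size.
    padded = list(table) + [0] * (-len(table) % block_size)
    index1 = []   # block number for each chunk of the original table
    index2 = []   # concatenation of the unique blocks
    seen = {}     # block contents -> block number within index2
    for start in range(0, len(padded), block_size):
        block = tuple(padded[start:start + block_size])
        if block not in seen:
            seen[block] = len(index2) >> shift
            index2.extend(block)
        index1.append(seen[block])
    return index1, index2


def lookup(index1, index2, shift, code):
    """Return the value the original flat table held at position code."""
    mask = (1 << shift) - 1
    return index2[(index1[code >> shift] << shift) + (code & mask)]


# Hypothetical sparse mapping: only 0x3000..0x30FF map to non-zero values.
SHIFT = 8
table = [0] * 0x10000
for i in range(0x3000, 0x3100):
    table[i] = i - 0x3000 + 1

index1, index2 = split_table(table, SHIFT)
print(len(table), len(index1) + len(index2))
```

In C, index1 and index2 would become static const arrays, so the lookup needs no initialisation step; how well this compresses a real codec table depends on how clustered its entries are.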
Thank you for your advice. I will try it later, if you still think
JapaneseCodec is too large.
That would be great, thanks !

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/