[Python-Dev] Adding Japanese Codecs to the distro
M.-A. Lemburg
mal@lemburg.com
Wed, 22 Jan 2003 14:37:08 +0100
Atsuo Ishimoto wrote:
> On Wed, 22 Jan 2003 13:06:47 +0100
> "M.-A. Lemburg" <mal@lemburg.com> wrote:
>
>>Now, if we took only the C version of Tamito's codec, we'd
>>end up with around 1790 - 1252 - 88 = 450 kB. Still a factor of
>>5...
>>
>
> Please try
> strip ./c/_japanese_codecs.so
>
> On my Linux box, this reduces the size of _japanese_codecs.so from
> 530 KB to 135 KB. I think this is a reasonable size because it contains
> more tables than Hisao's version.
Ok, we're finally approaching a very reasonable size :-)
BTW, why is it that Hisao can use one table for all supported
encodings, whereas Tamito uses 6 tables?
>>Hisao's approach uses a single table which fits into 58kB Python
>>source code. Boil that down to a static C table and you'll end up
>>with something around 10-20kB for static C data. Hisao
>>still builds a dictionary using this data, but perhaps that step
>>could be avoided using the same techniques that Fredrik used
>>in boiling down the size of the unicodedata module (which holds
>>the Unicode Database).
>
> Thank you for your advice. I will try it later, if you still think
> JapaneseCodec is too large.
That would be great, thanks !
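
For reference, the table-splitting idea used to shrink the unicodedata
module can be sketched roughly as below. This is only an illustration in
Python, not Fredrik's actual code and not the data layout of either codec:
the names split_table and lookup, the block size (SHIFT = 7) and the
made-up mapping values are all assumptions. The flat table is cut into
fixed-size blocks, identical blocks are stored only once, and a small
index maps each block number to its stored copy.

```python
def split_table(table, shift):
    """Split a flat table into (index, blocks) using blocks of 2**shift entries."""
    size = 1 << shift
    if len(table) % size:
        raise ValueError("table length must be a multiple of the block size")
    index = []    # one entry per block of the original table
    blocks = []   # concatenation of the distinct blocks
    seen = {}     # block contents -> position among the distinct blocks
    for start in range(0, len(table), size):
        block = tuple(table[start:start + size])
        if block not in seen:
            seen[block] = len(seen)
            blocks.extend(block)
        index.append(seen[block])
    return index, blocks


def lookup(index, blocks, shift, code):
    """Return the value the flat table held at position `code`."""
    mask = (1 << shift) - 1
    return blocks[(index[code >> shift] << shift) + (code & mask)]


# Made-up sparse mapping for illustration only (NOT real JIS data):
# a single run of 96 code points is mapped, everything else is 0.
SHIFT = 7
table = [0] * 0x10000
for code in range(0x3040, 0x30A0):
    table[code] = 0x8000 + (code - 0x3040)

index, blocks = split_table(table, SHIFT)
print(len(table), "entries ->", len(index), "+", len(blocks))
```

In a C version, index and blocks would presumably become two static
arrays, so a lookup is two array accesses and no dictionary has to be
built at import time. How much this saves depends on how sparse and
repetitive the real mapping tables are.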
--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH