Atsuo Ishimoto wrote:
On Wed, 22 Jan 2003 13:06:47 +0100 "M.-A. Lemburg" wrote:
Now, if we took only the C version of Tamito's codec, we'd end up with around 1790 - 1252 - 88 = 450 kB. Still a factor of 5...
Please try strip ./c/_japanese_codecs.so
On my Linux box, this reduces the size of _japanese_codecs.so from 530 KB to 135 KB. I think this is a reasonable size because it contains more tables than Hisao's version.
Ok, we're finally approaching a very reasonable size :-) BTW, why is it that Hisao can use one table for all supported encodings where Tamito uses 6 tables ?
Hisao's approach uses a single table which fits into 58kB of Python source code. Boil that down to a static C table and you'll end up with something around 10-20kB of static C data. Hisao still builds a dictionary from this data, but perhaps that step could be avoided by using the same techniques that Fredrik used to boil down the size of the unicodedata module (which holds the Unicode Database).
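
To make the table-compression idea concrete, here is a minimal sketch of a two-level lookup table, in the spirit of the block-splitting approach used for unicodedata. It is not Fredrik's or Hisao's actual code: the names split_table and lookup, the block size, and the sample mapping values are all assumptions made for illustration.

```python
def split_table(table, shift):
    """Split a flat table into (index, blocks) using blocks of 2**shift entries.

    Identical blocks are stored only once, so a sparse table shrinks a lot.
    """
    size = 1 << shift
    if len(table) % size:
        raise ValueError("table length must be a multiple of the block size")
    index = []    # one entry per block of the original table
    blocks = []   # concatenation of the unique blocks
    seen = {}     # block contents -> position of that block in `blocks`
    for start in range(0, len(table), size):
        block = tuple(table[start:start + size])
        pos = seen.get(block)
        if pos is None:
            pos = len(blocks) >> shift
            seen[block] = pos
            blocks.extend(block)
        index.append(pos)
    return index, blocks


def lookup(index, blocks, shift, code):
    """Return the value the original flat table held at position `code`."""
    mask = (1 << shift) - 1
    return blocks[(index[code >> shift] << shift) + (code & mask)]


# Hypothetical sparse mapping: 64K entries, only a small range is populated.
flat = [0] * 0x10000
for code in range(0x3040, 0x30A0):
    flat[code] = code - 0x3040 + 1   # made-up values, not real codec data

SHIFT = 7  # block size of 128 entries; an arbitrary choice for this sketch
index, blocks = split_table(flat, SHIFT)
print(len(flat), "entries before;", len(index) + len(blocks), "after")
```

Both resulting lists can be emitted as static C arrays, and the lookup is two array accesses, so no dictionary has to be built at import time. Whether this reaches the 10-20kB range for a real codec depends on how the mapped code points cluster.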
Thank you for your advice. I will try it later, if you still think JapaneseCodec is too large.
That would be great, thanks!

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/