[Python-Dev] Re: [I18n-sig] Planned updates for cjkcodecs before 2.4a1
Hye-Shik Chang
perky at i18n.org
Wed Jun 16 07:16:52 EDT 2004
On Wed, Jun 16, 2004 at 11:33:59AM +0200, M.-A. Lemburg wrote:
> Hye-Shik Chang wrote:
[snip]
> >2. Merge two or three similar C codecs into one. We currently have
> > one C codec for each Python codec. My idea is to merge them
> > into several groups of similar codecs, which would save much of
> > the code that the .so binaries have in common:
> >
> > _codecs_jacodecs_1.so: euc-jp, shift-jis, iso-2022-jp,
> > iso-2022-jp-1, iso-2022-jp-ext
> > _codecs_jacodecs_2.so: euc-jisx0213, shift-jisx0213, iso-2022-jp-3,
> > euc-jis-2004, shift-jis-2004,
> > iso-2022-jp-2004
> > _codecs_jacodecs_3.so: iso-2022-jp-2
> > _codecs_kocodecs_1.so: euc-kr, johab, iso-2022-kr
> > _codecs_kocodecs_2.so: cp949
> > _codecs_zhcodecs_1.so: gb2312, gbk, gb18030, hz
> > _codecs_zhcodecs_2.so: big5, cp950
>
>
> +1, but why not put all Japanese codecs into one module and
> ditto for the Korean and Chinese ones?
>
> Note that today's OS linkers will only mmap those pieces
> of code into the process memory that are actually needed
> by the application, so even though the size of the modules
> increases, the application process memory footprint is
> likely not to increase.
Okay. But what about embedded or frozen environments, or codecs
compiled statically into python by uncommenting lines in Modules/Setup?
If somebody needs to support only the legacy Japanese encodings, he
will want to include the legacy mapping (70K) but not the JIS X 0213
mapping (85K) or the KS X 1001 and GB2312 mappings (200K, needed for
iso-2022-jp-2). He may also want to save space by simply deleting
files. In fact, I don't know how real Japanese developers use them;
this is just my guess. :)
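
To make the size argument concrete, here is a rough sketch of the
arithmetic, using only the approximate sizes quoted above. The table
names and the helper function are hypothetical labels chosen for
illustration, not names from cjkcodecs, and the figures cover the
mapping tables only, not codec code.

```python
# Approximate mapping-table sizes in KB, taken from the figures quoted
# in this message. The dictionary keys are illustrative labels only.
TABLE_KB = {
    "legacy_japanese": 70,   # euc-jp, shift-jis, iso-2022-jp, ...
    "jisx0213": 85,          # the JIS X 0213 based encodings
    "ksx1001_gb2312": 200,   # pulled in only for iso-2022-jp-2
}


def static_footprint_kb(tables):
    """Sum the mapping tables that a statically linked build would carry."""
    return sum(TABLE_KB[name] for name in tables)


# Split modules: a legacy-only build links just the legacy mapping.
split_build = static_footprint_kb(["legacy_japanese"])

# One merged Japanese module: every table comes along.
merged_build = static_footprint_kb(TABLE_KB)

print(split_build, merged_build, merged_build - split_build)
```

Under these assumptions, a legacy-only static build carries 70K of
mappings with split modules and 355K with a single merged module.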
[snip]
>
> If you don't believe this, compare the resident size of
> Python with and without unicodedata loaded. The difference
> on my machine is a measly 30kB, not the 250kB of the complete
> module.
I do believe this. This is also why I wrote cjkcodecs as C extensions
rather than in pure Python.
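
For anyone who wants to repeat the unicodedata comparison, a minimal
sketch follows. It assumes a Unix-like system, since the resource
module is not available on Windows, and it reads the peak resident
size, which is reported in kilobytes on Linux and in bytes on macOS.
The helper name is hypothetical. If unicodedata is already loaded or
built into the interpreter, the difference will be close to zero.

```python
import resource
import sys


def peak_rss():
    """Return this process's peak resident set size (platform units)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


# Record whether the module was loaded before we measured anything.
already_loaded = "unicodedata" in sys.modules

before = peak_rss()

import unicodedata

# Perform one lookup so that at least some table pages are touched.
unicodedata.category("\u3042")

after = peak_rss()
delta = after - before

print("already loaded:", already_loaded)
print("peak RSS growth after import and one lookup:", delta)
```

The number is only a rough indication: peak resident size never
decreases, and it is affected by anything else the interpreter
allocates between the two readings.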
Hye-Shik