[I18n-sig] Planned updates for cjkcodecs before 2.4a1

M.-A. Lemburg mal at egenix.com
Wed Jun 16 07:56:06 EDT 2004

Hye-Shik Chang wrote:
> On Wed, Jun 16, 2004 at 11:33:59AM +0200, M.-A. Lemburg wrote:
>>Hye-Shik Chang wrote:
> [snip]
>>>2. Merge two or three similar C codecs into one.  We currently
>>>  have one C codec for each Python codec.  My idea is to merge
>>>  them into a few groups of similar codecs, which would save much
>>>  of the code duplicated across the .so binaries:
>>>    _codecs_jacodecs_1.so: euc-jp, shift-jis, iso-2022-jp,
>>>                           iso-2022-jp-1, iso-2022-jp-ext
>>>    _codecs_jacodecs_2.so: euc-jisx0213, shift-jisx0213, iso-2022-jp-3,
>>>                           euc-jis-2004, shift-jis-2004,
>>>                           iso-2022-jp-2004
>>>    _codecs_jacodecs_3.so: iso-2022-jp-2
>>>    _codecs_kocodecs_1.so: euc-kr, johab, iso-2022-kr
>>>    _codecs_kocodecs_2.so: cp949
>>>    _codecs_zhcodecs_1.so: gb2312, gbk, gb18030, hz
>>>    _codecs_zhcodecs_2.so: big5, cp950
>>+1, but why not put all Japanese codecs into one module, and
>>ditto for the Korean and Chinese ones ?
>>Note that today's OS linkers will only mmap those pieces
>>of code into the process memory that are actually needed
>>by the application, so even though the size of the modules
>>increases, the application's process memory footprint is
>>unlikely to increase.
> Okay. But what about embedded or frozen environments, or codecs
> compiled statically into python by uncommenting them in Modules/Setup?

Same thing: the OS will only load those parts that are actually
needed into memory. The only downside with having e.g. all
modules statically linked into the python binary is the file
size. OTOH, using static linking improves performance.
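
As a rough illustration of the demand-paging behaviour described above,
here is a minimal sketch that uses Python's mmap module on an ordinary
file as a stand-in for a large codec module. The file contents, the page
count and the helper names are made up for the example, and it does not
measure resident memory; it only shows that a mapping can cover the
whole file while the program touches a single page of it.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE
N_PAGES = 64  # hypothetical size, standing in for a large codec module


def make_demo_file():
    """Write N_PAGES pages; page i is filled with the byte value i."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for i in range(N_PAGES):
            f.write(bytes([i % 256]) * PAGE)
    return path


def read_one_page(path, page_index):
    """Map the whole file read-only, but access only one page of it."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            start = page_index * PAGE
            # Slicing copies just this range; the other pages of the
            # mapping are never accessed by this program.
            return m[start:start + PAGE]


path = make_demo_file()
try:
    chunk = read_one_page(path, 3)
finally:
    os.remove(path)
```

How much of the mapping the kernel actually brings into memory (for
example through read-ahead) depends on the platform.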

> If somebody
> needs to support only the legacy Japanese encodings, he will want to
> include the legacy mapping (70K) but not the JIS X 0213 (85K) or
> KS X 1001 and GB2312 mappings (200K, for iso-2022-jp-2).  And he may
> want to save space by just deleting files.  In fact, I don't know
> how real Japanese developers work; I am just guessing. :)

Is this a common enough use case to warrant the added
complexity of having to find the right _[123] mapping for
the codec in question ?
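
To make the lookup cost concrete, here is a minimal sketch of the table
the numbered grouping would need. The groups are copied from the list
quoted above; the helper names and the merged module names (the numbered
name with its suffix stripped) are assumptions for illustration only,
not names anyone has proposed.

```python
# Grouping copied from the proposal quoted earlier in this thread.
PROPOSED_MODULES = {
    "_codecs_jacodecs_1": ["euc-jp", "shift-jis", "iso-2022-jp",
                           "iso-2022-jp-1", "iso-2022-jp-ext"],
    "_codecs_jacodecs_2": ["euc-jisx0213", "shift-jisx0213",
                           "iso-2022-jp-3", "euc-jis-2004",
                           "shift-jis-2004", "iso-2022-jp-2004"],
    "_codecs_jacodecs_3": ["iso-2022-jp-2"],
    "_codecs_kocodecs_1": ["euc-kr", "johab", "iso-2022-kr"],
    "_codecs_kocodecs_2": ["cp949"],
    "_codecs_zhcodecs_1": ["gb2312", "gbk", "gb18030", "hz"],
    "_codecs_zhcodecs_2": ["big5", "cp950"],
}

# Invert the grouping: every codec needs its own entry in the table.
CODEC_TO_MODULE = {
    codec: module
    for module, codecs in PROPOSED_MODULES.items()
    for codec in codecs
}


def module_for(encoding):
    """Return the numbered module for an encoding (hypothetical helper)."""
    key = encoding.lower().replace("_", "-")
    return CODEC_TO_MODULE[key]


def merged_module_for(encoding):
    """Same lookup with one module per language (assumed naming)."""
    # Drop the trailing "_1", "_2" or "_3" from the numbered module name.
    return module_for(encoding).rsplit("_", 1)[0]
```

With one module per language the per-codec table is no longer needed:
knowing the language of the codec is enough to pick the module.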

Marc-Andre Lemburg
