[Python-Dev] Planned updates for cjkcodecs before 2.4a1
Hye-Shik Chang
perky at i18n.org
Wed Jun 16 05:17:18 EDT 2004
I have planned a few updates to cjkcodecs before 2.4 alpha1
is out. If you have any opinions or objections, please tell me.
1. Update JIS X 0213 to its first amendment (a.k.a. JIS X 0213:2004).
This will introduce three new encodings: euc-jis-2004, shift_jis-2004
and iso-2022-jp-2004. They differ little from their respective
predecessors, but we may need to keep both versions because of
incompatibilities and the encoding name change. (This won't bloat
code size much; I expect around 3~5K.)
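To illustrate keeping both versions side by side, here is a minimal sketch that round-trips a sample string through each older codec and its 2004 counterpart. It assumes the codec names as they are registered in current CPython (underscore spellings such as euc_jis_2004), which are not necessarily the names used in this proposal; the PAIRS table and the round_trips helper are illustrative only.

```python
# Assumption: these are the codec names registered in current CPython,
# not the hyphenated spellings used in the proposal text.
PAIRS = [
    ("euc_jisx0213", "euc_jis_2004"),
    ("shift_jisx0213", "shift_jis_2004"),
    ("iso2022_jp_3", "iso2022_jp_2004"),
]


def round_trips(text, encoding):
    """Return True if text survives an encode/decode cycle unchanged."""
    return text.encode(encoding).decode(encoding) == text


# These characters are in JIS X 0208, so every codec above can encode them.
sample = "日本語"

# Check the older codec and its 2004 successor for each pair.
results = {name: round_trips(sample, name) for pair in PAIRS for name in pair}

for name, ok in results.items():
    print(name, "ok" if ok else "FAILED")
```

Text limited to the older character sets should behave the same under both versions; the incompatibilities mentioned above concern the code points changed by the amendment.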
2. Merge two or three similar C codecs into one. We currently have one C
codec for each Python codec. My idea is to merge them into several
groups of similar codecs, which will eliminate much of the code that
is duplicated across the .so binaries:
   _codecs_jacodecs_1.so: euc-jp, shift-jis, iso-2022-jp,
                          iso-2022-jp-1, iso-2022-jp-ext
   _codecs_jacodecs_2.so: euc-jisx0213, shift-jisx0213, iso-2022-jp-3,
                          euc-jis-2004, shift-jis-2004,
                          iso-2022-jp-2004
   _codecs_jacodecs_3.so: iso-2022-jp-2
   _codecs_kocodecs_1.so: euc-kr, johab, iso-2022-kr
   _codecs_kocodecs_2.so: cp949
   _codecs_zhcodecs_1.so: gb2312, gbk, gb18030, hz
   _codecs_zhcodecs_2.so: big5, cp950
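The grouping above can be written down as a small lookup table. This is only a sketch of the proposed layout, not of how cjkcodecs actually resolves codecs; PROPOSED_GROUPS, MODULE_FOR_CODEC and module_for are hypothetical names introduced for illustration.

```python
# Proposed grouping, transcribed from the list above (module -> codecs).
PROPOSED_GROUPS = {
    "_codecs_jacodecs_1": ["euc-jp", "shift-jis", "iso-2022-jp",
                           "iso-2022-jp-1", "iso-2022-jp-ext"],
    "_codecs_jacodecs_2": ["euc-jisx0213", "shift-jisx0213", "iso-2022-jp-3",
                           "euc-jis-2004", "shift-jis-2004",
                           "iso-2022-jp-2004"],
    "_codecs_jacodecs_3": ["iso-2022-jp-2"],
    "_codecs_kocodecs_1": ["euc-kr", "johab", "iso-2022-kr"],
    "_codecs_kocodecs_2": ["cp949"],
    "_codecs_zhcodecs_1": ["gb2312", "gbk", "gb18030", "hz"],
    "_codecs_zhcodecs_2": ["big5", "cp950"],
}

# Invert the table so that a codec name maps to the one module providing it.
MODULE_FOR_CODEC = {
    codec: module
    for module, codec_names in PROPOSED_GROUPS.items()
    for codec in codec_names
}


def module_for(codec_name):
    """Return the proposed shared module for a codec (hypothetical helper)."""
    return MODULE_FOR_CODEC[codec_name.lower()]


print(module_for("euc-kr"))
print(module_for("cp949"))
```

Under this layout the 22 codecs would be served by 7 binaries instead of one binary per codec.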
3. Split some of the mapping-data modules into a few group-based
modules. This will save memory and space for those who need only
legacy codecs (e.g. "euc-kr only").
   _codecs_mapdata_ko_KR ->
      _codecs_komapdata_1.so: KS X 1001
      _codecs_komapdata_2.so: cp949
   _codecs_mapdata_ja_JP ->
      _codecs_jamapdata_1.so: JIS X 0208, JIS X 0212
      _codecs_jamapdata_2.so: JIS X 0213:2000 and :2004
   _codecs_mapdata_zh_CN ->
      _codecs_zhmapdata_1.so: gb2312, gbk, gb18030
   _codecs_mapdata_zh_TW ->
      _codecs_zhmapdata_2.so: big5, cp950
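A rough sketch of why the split helps: given the character sets a program actually needs, only the matching map-data modules would have to be loaded. The table is transcribed from the list above; MAPDATA_SPLIT and modules_needed are hypothetical names, and nothing here reflects how the C modules would really load their tables.

```python
# Proposed split, transcribed from the list above:
# old module -> {new module: character sets it would hold}.
MAPDATA_SPLIT = {
    "_codecs_mapdata_ko_KR": {
        "_codecs_komapdata_1": ["KS X 1001"],
        "_codecs_komapdata_2": ["cp949"],
    },
    "_codecs_mapdata_ja_JP": {
        "_codecs_jamapdata_1": ["JIS X 0208", "JIS X 0212"],
        "_codecs_jamapdata_2": ["JIS X 0213:2000", "JIS X 0213:2004"],
    },
    "_codecs_mapdata_zh_CN": {
        "_codecs_zhmapdata_1": ["gb2312", "gbk", "gb18030"],
    },
    "_codecs_mapdata_zh_TW": {
        "_codecs_zhmapdata_2": ["big5", "cp950"],
    },
}


def modules_needed(charsets):
    """Return the sorted new modules holding any of the given charsets."""
    wanted = set(charsets)
    needed = set()
    for new_modules in MAPDATA_SPLIT.values():
        for module, provided in new_modules.items():
            if wanted & set(provided):
                needed.add(module)
    return sorted(needed)


# A user who needs only KS X 1001 would no longer pull in the cp949 tables.
print(modules_needed(["KS X 1001"]))
```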
If these sound acceptable to python-dev people, they will be
implemented as CJKCodecs 1.1 first and imported into Python later
(before 2.4a1).
Hye-Shik