[Patches] [ python-Patches-873597 ] The cjkcodecs integration
SourceForge.net
noreply at sourceforge.net
Sat Jan 17 09:47:11 EST 2004
Patches item #873597, was opened at 2004-01-09 16:55
Message generated for change (Comment added) made by perky
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=873597&group_id=5470
Category: Library (Lib)
Group: Python 2.4
>Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Hye-Shik Chang (perky)
Assigned to: Hye-Shik Chang (perky)
Summary: The cjkcodecs integration
Initial Comment:
(finally :)
CJKCodecs includes support for many East Asian legacy
encodings:
* Chinese (PRC): gb2312 gbk gb18030 hz
* Chinese (ROC): big5 cp950
* Japanese: cp932 shift-jis shift-jisx0213 euc-jp
euc-jisx0213 iso-2022-jp iso-2022-jp-1 iso-2022-jp-2
iso-2022-jp-3 iso-2022-jp-ext
* Korean: cp949 euc-kr johab iso-2022-kr
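For readers who want to try a few of the encodings listed above, here is a minimal round-trip sketch. It assumes a current Python 3 interpreter in which these codecs are available under their normalized names (for example `shift_jis` for shift-jis); the sample strings and the `round_trip` helper are illustrative only and are not part of the patch.

```python
import codecs

# Illustrative sample text per codec (assumption: not taken from the patch).
SAMPLES = {
    "gb2312": "中文", "gbk": "中文", "gb18030": "中文", "hz": "中文",
    "big5": "中文", "cp950": "中文",
    "cp932": "日本語", "shift_jis": "日本語", "euc_jp": "日本語",
    "iso2022_jp": "日本語",
    "cp949": "한국어", "euc_kr": "한국어", "johab": "한국어",
    "iso2022_kr": "한국어",
}


def round_trip(encoding, text):
    """Return True if text survives an encode/decode cycle in encoding."""
    codecs.lookup(encoding)  # raises LookupError if the codec is missing
    encoded = text.encode(encoding)
    return encoded.decode(encoding) == text


results = {name: round_trip(name, text) for name, text in SAMPLES.items()}

for name, ok in results.items():
    print(name, "ok" if ok else "FAILED")
```

A missing codec shows up as a `LookupError` from `codecs.lookup`, which makes it easy to tell an unavailable codec apart from a failed conversion.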
Integrating CJKCodecs into the main Python distribution will
make CJK users more comfortable with the default installation
package.
And it's not as big as you might guess. :)
It adds only about 2% to the source size:
% du -d0 -k python
37714 python
% du -d0 -k python+cjkcodecs
38504 python+cjkcodecs
And it adds only about 4% to the source line count:
% echo `find python.cjkcodecs -type f -exec cat {} \;|wc -l` "*100/" `find python -type f -exec cat {} \;|wc -l` "-100" | bc
4
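The arithmetic behind these percentages can be checked with a short sketch. It assumes `bc` runs with its default scale of 0, so the division truncates to an integer; the function name `growth_percent` is made up for illustration. Only the `du` figures are quoted in the message, so the line-count case is not reproduced here.

```python
def growth_percent(before, after):
    """Mimic the bc expression after*100/before-100 with integer division."""
    return after * 100 // before - 100


# Kilobyte figures from the du output above.
size_growth = growth_percent(37714, 38504)

# The same ratio without truncation, for comparison.
exact_growth = (38504 - 37714) / 37714 * 100

print(size_growth)             # 2
print(round(exact_growth, 2))  # 2.09
```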
----------------------------------------------------------------------
>Comment By: Hye-Shik Chang (perky)
Date: 2004-01-17 23:47
Message:
Logged In: YES
user_id=55188
Okay. Committed as:
Modified files:
Doc/lib/libcodecs.tex 1.27
Lib/email/test/test_email_codecs.py 1.5
Lib/encodings/aliases.py 1.21
Modules/Setup.dist 1.43
Lib/test/regrtest.py 1.151
setup.py 1.181
Added files:
Lib/encodings/big5.py
Lib/encodings/cp932.py
Lib/encodings/cp949.py
Lib/encodings/cp950.py
Lib/encodings/euc_jisx0213.py
Lib/encodings/euc_jp.py
Lib/encodings/euc_kr.py
Lib/encodings/gb18030.py
Lib/encodings/gb2312.py
Lib/encodings/gbk.py
Lib/encodings/iso2022_jp.py
Lib/encodings/iso2022_jp_1.py
Lib/encodings/iso2022_jp_2.py
Lib/encodings/iso2022_jp_3.py
Lib/encodings/iso2022_jp_ext.py
Lib/encodings/iso2022_kr.py
Lib/encodings/johab.py
Lib/encodings/shift_jis.py
Lib/encodings/shift_jisx0213.py
Lib/test/cjkencodings_test.py
Lib/test/test_codecencodings_cn.py
Lib/test/test_codecencodings_jp.py
Lib/test/test_codecencodings_kr.py
Lib/test/test_codecencodings_tw.py
Lib/test/test_codecmaps_cn.py
Lib/test/test_codecmaps_jp.py
Lib/test/test_codecmaps_kr.py
Lib/test/test_codecmaps_tw.py
Lib/test/test_multibytecodec.py
Lib/test/test_multibytecodec_support.py
Modules/cjkcodecs/README
Modules/cjkcodecs/_big5.c
Modules/cjkcodecs/_cp932.c
Modules/cjkcodecs/_cp949.c
Modules/cjkcodecs/_cp950.c
Modules/cjkcodecs/_euc_jisx0213.c
Modules/cjkcodecs/_euc_jp.c
Modules/cjkcodecs/_euc_kr.c
Modules/cjkcodecs/_gb18030.c
Modules/cjkcodecs/_gb2312.c
Modules/cjkcodecs/_gbk.c
Modules/cjkcodecs/_hz.c
Modules/cjkcodecs/_iso2022_jp.c
Modules/cjkcodecs/_iso2022_jp_1.c
Modules/cjkcodecs/_iso2022_jp_2.c
Modules/cjkcodecs/_iso2022_jp_3.c
Modules/cjkcodecs/_iso2022_jp_ext.c
Modules/cjkcodecs/_iso2022_kr.c
Modules/cjkcodecs/_johab.c
Modules/cjkcodecs/_shift_jis.c
Modules/cjkcodecs/_shift_jisx0213.c
Modules/cjkcodecs/alg_iso8859_1.h
Modules/cjkcodecs/alg_iso8859_7.h
Modules/cjkcodecs/alg_jisx0201.h
Modules/cjkcodecs/cjkcommon.h
Modules/cjkcodecs/codeccommon.h
Modules/cjkcodecs/codecentry.h
Modules/cjkcodecs/iso2022common.h
Modules/cjkcodecs/map_big5.h
Modules/cjkcodecs/map_cp932ext.h
Modules/cjkcodecs/map_cp949.h
Modules/cjkcodecs/map_cp949ext.h
Modules/cjkcodecs/map_cp950ext.h
Modules/cjkcodecs/map_gb18030ext.h
Modules/cjkcodecs/map_gb18030uni.h
Modules/cjkcodecs/map_gb2312.h
Modules/cjkcodecs/map_gbcommon.h
Modules/cjkcodecs/map_gbkext.h
Modules/cjkcodecs/map_jisx0208.h
Modules/cjkcodecs/map_jisx0212.h
Modules/cjkcodecs/map_jisx0213.h
Modules/cjkcodecs/map_jisx0213_pairs.h
Modules/cjkcodecs/map_jisxcommon.h
Modules/cjkcodecs/map_ksx1001.h
Modules/cjkcodecs/mapdata_ja_JP.c
Modules/cjkcodecs/mapdata_ko_KR.c
Modules/cjkcodecs/mapdata_zh_CN.c
Modules/cjkcodecs/mapdata_zh_TW.c
Modules/cjkcodecs/multibytecodec.c
Modules/cjkcodecs/multibytecodec.h
Modules/cjkcodecs/tweak_gbk.h
Thank you! :-)
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2004-01-10 06:24
Message:
Logged In: YES
user_id=21627
These changes look good to me, please apply them.
As for the regrtest modification, please change the tests to
provide a skip_expected setting, which is computed depending
on the presence of the test data - see test_normalization.py
for an example.
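The general idea of skipping a test when its data is absent can be sketched as follows. This is not the regrtest skip_expected mechanism or the code in test_normalization.py; it uses `unittest.skipUnless` from current Python instead, and the path `testdata/EXAMPLE-MAPPING.TXT` is a hypothetical placeholder, since the real data file names are not given in this thread.

```python
import os
import unittest

# Hypothetical location of the optional test data (an assumption).
MAPPING_FILE = os.path.join("testdata", "EXAMPLE-MAPPING.TXT")
HAVE_DATA = os.path.exists(MAPPING_FILE)


class MappingTest(unittest.TestCase):
    @unittest.skipUnless(HAVE_DATA, "mapping test data not present")
    def test_mapping_file_is_readable(self):
        # Only runs when the data file was found at import time.
        with open(MAPPING_FILE, "rb") as f:
            self.assertTrue(f.read())


def run():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(MappingTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run()
```

When the file is missing, the run reports one skipped test rather than a failure, which is the behaviour being asked for.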
It would be good if the header files containing large tables
would contain an indication of how these tables have been
created (e.g. what data sources have been used, and what
modifications have been applied after the tables were created
from the sources).
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2004-01-10 06:10
Message:
Logged In: YES
user_id=21627
Can you please make that server report the file type as
application/octet-stream?
----------------------------------------------------------------------
Comment By: Hye-Shik Chang (perky)
Date: 2004-01-09 17:00
Message:
Logged In: YES
user_id=55188
Hmm. SF seems not to accept big patches. (385KB)
I uploaded the patch to
http://people.freebsd.org/~perky/pythoncjkcodecs.diff.bz2
----------------------------------------------------------------------