[Python-Dev] Adding Japanese Codecs to the distro
Hye-Shik Chang
perky@fallin.lv
Thu, 16 Jan 2003 20:38:55 +0900
On Thu, Jan 16, 2003 at 11:05:55AM +0100, Martin v. Löwis wrote:
> "M.-A. Lemburg" <mal@lemburg.com> writes:
>
> > Thoughts ?
>
> I'm in favour of adding support for Japanese codecs, but I wonder
> whether we shouldn't incorporate the C version of the Japanese codecs
> package instead, despite its size.
And the most important merit that the C version has, but the pure Python
version doesn't, is that library text is shared between processes. Most
modern OSes can share it, and the C version is even smaller than the
Python version in the case of KoreanCodecs 2.1.x (in CVS).
Here's the process status on a FreeBSD 5.0/i386 system with Python 2.3a1
(of 2003-01-15):
USER    PID %CPU %MEM   VSZ   RSS  TT  STAT STARTED      TIME COMMAND
perky 56713  0.0  1.2  3740  3056  p3  S+    8:11PM   0:00.08 python
  : python without any codecs
perky 56739  6.3  5.7 15376 14728  p3  S+    8:17PM   0:04.02 python
  : python with python.cp949 codec
perky 56749  0.0  1.2  3884  3196  p3  S+    8:20PM   0:00.06 python
  : python with c.cp949 codec
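For anyone who wants to repeat a comparison like the ps figures above from
inside the interpreter, here is a rough sketch. It is not how the numbers in
this message were taken. It assumes a Unix platform (the `resource` module),
a current Python whose standard library ships a cp949 codec, and it reads
only the peak RSS, whose unit depends on the platform. The helper name
`peak_rss` is made up for this example.

```python
import codecs
import resource


def peak_rss():
    """Peak resident set size of this process (unit is platform dependent)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


before = peak_rss()

# Looking the codec up and using it once forces its tables to be loaded.
info = codecs.lookup("cp949")
sample = "\ud55c\uae00"  # two Hangul syllables
round_trip = sample.encode("cp949").decode("cp949")

after = peak_rss()

print("codec:", info.name)
print("peak RSS before:", before, "after:", after, "growth:", after - before)
```

The growth reported this way is only a lower-resolution hint: it shows the
peak rather than the current size, and it cannot tell shared pages from
private ones.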
alice(perky):/usr/pkg/lib/python2.3/site-packages/korean% size _koco.so
   text    data     bss     dec     hex filename
 122861    1844      32  124737   1e741 _koco.so
With the C codec, processes share 122861 bytes of text system-wide and
consume only 1844 bytes of data each, whereas with the pure Python codec
each process consumes about 12 megabytes. This must be taken very
seriously for the startup time of scripts that have "# encoding: euc-jp"
or some other CJK encoding.
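The arithmetic behind those figures can be checked with a short sketch. The
numbers are copied from the ps and size output in this message; the variable
and function names are only illustrative, and VSZ/RSS are taken to be in
kilobytes as ps normally reports them.

```python
# Figures copied from the ps output (VSZ and RSS, assumed to be in KB).
baseline = {"vsz": 3740, "rss": 3056}    # python without any codecs
pure = {"vsz": 15376, "rss": 14728}      # python with python.cp949 codec
c_codec = {"vsz": 3884, "rss": 3196}     # python with c.cp949 codec


def overhead(with_codec, base):
    """Extra memory attributable to loading the codec, per column."""
    return {key: with_codec[key] - base[key] for key in base}


pure_over = overhead(pure, baseline)
c_over = overhead(c_codec, baseline)

# Figures copied from the size output for _koco.so (bytes).
text, data, bss = 122861, 1844, 32
total = text + data + bss

print("pure Python codec overhead (KB):", pure_over)
print("C codec overhead (KB):", c_over)
print("_koco.so text + data + bss (bytes):", total)
```

The pure Python codec adds 11636 KB of VSZ and 11672 KB of RSS, which is
where the "about 12 megabytes" comes from. The C codec adds 144 KB and
140 KB; note that RSS also counts the shared text pages that are mapped
into the process, so it is larger than the 1844 bytes of private data.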
> I would also suggest that it might be more worthwhile to expose
> platform codecs, which would give us all CJK codecs on a number of
> major platforms, with a minimum increase in the size of the Python
> distribution, and with very good performance.
KoreanCodecs is tested on {Free,Net,Open}BSD, Linux, Solaris, HP-UX,
Windows{95,98,NT,2000,XP}, and Cygwin without any platform #ifdef's.
I'm sure that any CJK codec can be ported to any platform that Python
has been ported to.
Regards,
Hye-Shik =)