[I18n-sig] CJKCodecs 1.0b1 is released
Hye-Shik Chang
perky@i18n.org
Sun, 13 Jul 2003 04:33:35 +0900
On Sat, Jul 12, 2003 at 09:14:11PM +0200, M.-A. Lemburg wrote:
> Hye-Shik Chang wrote:
> >Also, I created utf-8 and utf-16 codecs for cjkcodecs just for fun.
> >I shipped them because they are somewhat faster than Python's equivalents.
>
> That's interesting. How did you achieve the speedups ? The
> Python codecs for these are already rather well optimized.
>
Ahh, sorry for the incorrect statement. After running some tests, I found
that Python's codecs are much faster than CJKCodecs's for the .encode() and
.decode() functions (2x to 4x). CJKCodecs's codecs were faster than
Python's only for StreamReaders/Writers (by a similar ratio).
> >(StreamReader benchmarks with a typical 10 KB Chinese text)
> >(all values are in iterations/sec)
> >
> >            Python  CJKCodecs
> >read(16)        14        187
> >read(256)      221       1645
> >read(512)      468       1990
> >readline       361        921
> >readlines      785       1193
> >
> >They are small and don't replace Python's codecs by default.
> >(They are distributed commented out in cjkcodecs/aliases.py.)
> >So I think they are not useless, considering their size.
>
> Ah, I think I know what's causing this: you are measuring
> Python function calls (.read() and readlines() for UTF-8/16
> are Python functions implemented in codecs.py) against
> C type methods.
Agreed.
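For anyone who wants to reproduce this kind of comparison, here is a
minimal sketch of how the stateless .decode() path and the StreamReader
path could be timed against each other. It is written for present-day
Python 3 and uses only the standard library's utf-8 codec; it does not
load the CJKCodecs versions. The sample text and the helper names
(SAMPLE, decode_stateless, decode_stream, iterations_per_second) are
made up for illustration and are not the script behind the numbers above.

```python
import codecs
import io
import timeit

# Hypothetical sample: about 10 KB of UTF-8 encoded Chinese text.
SAMPLE = ("中文文本测试。\n" * 500).encode("utf-8")


def decode_stateless(data=SAMPLE):
    """Decode the whole buffer with a single stateless call."""
    return codecs.decode(data, "utf-8")


def decode_stream(chunk_size, data=SAMPLE):
    """Decode through a StreamReader, calling read(chunk_size) until EOF."""
    reader = codecs.getreader("utf-8")(io.BytesIO(data))
    parts = []
    while True:
        chunk = reader.read(chunk_size)
        if not chunk:  # an empty string means end of stream
            break
        parts.append(chunk)
    return "".join(parts)


def iterations_per_second(func, *args, number=20):
    """Return how many calls of func(*args) complete per second."""
    elapsed = timeit.timeit(lambda: func(*args), number=number)
    return number / elapsed


if __name__ == "__main__":
    print("decode()   %10.1f" % iterations_per_second(decode_stateless))
    for size in (16, 256, 512):
        rate = iterations_per_second(decode_stream, size)
        print("read(%-4d) %10.1f" % (size, rate))
```

Small read sizes make the per-call overhead dominate, which is where a
reader implemented in Python and one implemented in C would be expected
to differ most.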
I'm considering removing utf-{8,16} from the 1.0 release and leaving
utf-7 only. :)
Regards,
Hye-Shik =)