[Python-Dev] Unicode charmap decoders slow

M.-A. Lemburg mal at egenix.com
Thu Oct 6 11:13:50 CEST 2005


Hye-Shik Chang wrote:
> On 10/6/05, M.-A. Lemburg <mal at egenix.com> wrote:
> 
>>Hye-Shik, could you please provide some timeit figures for
>>the fastmap encoding ?
>>

Thanks for the timings.

> (before applying Walter's patch, charmap decoder)
> 
> % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10';
> u=unicode(s, e)" "s.decode(e)"
> 100 loops, best of 3: 3.35 msec per loop
> 
> (applied the patch, improved charmap decoder)
> 
> % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10';
> u=unicode(s, e)" "s.decode(e)"
> 1000 loops, best of 3: 1.11 msec per loop
> 
> (the fastmap decoder)
> 
> % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10_fc';
> u=unicode(s, e)" "s.decode(e)"
> 1000 loops, best of 3: 1.04 msec per loop
> 
> (utf-8 decoder)
> 
> % ./python Lib/timeit.py -s "s='a'*53*1024; e='utf_8';
> u=unicode(s, e)" "s.decode(e)"
> 1000 loops, best of 3: 851 usec per loop
> 
> Walter's decoder and the fastmap decoder work in mostly the same way,
> so the performance difference is quite minor.  Perhaps the small
> difference comes from the wrapper function around each charmap codec;
> the fastmap codec provides functions usable as Codecs.{en,de}code
> directly.
> 
> (encoding, charmap codec)
> 
> % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10';
> u=unicode(s, e)" "u.encode(e)"
> 100 loops, best of 3: 3.51 msec per loop
> 
> (encoding, fastmap codec)
> 
> % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10_fc';
> u=unicode(s, e)" "u.encode(e)"
> 1000 loops, best of 3: 536 usec per loop
> 
> (encoding, utf-8 codec)
> 
> % ./python Lib/timeit.py -s "s='a'*53*1024; e='utf_8';
> u=unicode(s, e)" "u.encode(e)"
> 1000 loops, best of 3: 1.5 msec per loop

I wonder why the UTF-8 codec is slower than the fastmap
codec in this case.
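
To re-run this comparison on a current interpreter, something like the
sketch below could be used. It assumes Python 3 (so bytes.decode and
str.encode replace the Python 2 calls quoted above) and uses the timeit
module directly instead of Lib/timeit.py. The helper name time_codec and
the loop counts are made up for illustration, and the fastmap codec
'iso8859_10_fc' is left out because it is not part of the standard
library. Absolute numbers will differ from the 2005 figures.

```python
import timeit


def time_codec(encoding, payload=b"a" * 53 * 1024, number=50, repeat=3):
    """Return (decode_seconds, encode_seconds) per call for one codec.

    Hypothetical helper: takes the best of `repeat` runs, as timeit's
    command-line interface does.
    """
    text = payload.decode(encoding)
    decode_best = min(
        timeit.repeat(lambda: payload.decode(encoding),
                      number=number, repeat=repeat)
    )
    encode_best = min(
        timeit.repeat(lambda: text.encode(encoding),
                      number=number, repeat=repeat)
    )
    return decode_best / number, encode_best / number


# Same 53 KiB ASCII payload as in the quoted measurements.
results = {enc: time_codec(enc) for enc in ("iso8859_10", "utf_8")}

for enc, (dec_s, enc_s) in results.items():
    print("%-12s decode %8.1f usec   encode %8.1f usec"
          % (enc, dec_s * 1e6, enc_s * 1e6))
```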

> If the encoding optimization can be easily done in Walter's approach,
> the fastmap codec would be too expensive a way to reach that goal,
> because we would have to maintain not only fastmap but also charmap
> for backward compatibility.

Indeed. Let's go with a patched charmap codec then.
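
For context, a table-driven charmap codec can be sketched as below. This
is only an illustration under stated assumptions, not the patch discussed
in this thread: it assumes a current CPython, where the codecs module
exposes charmap_decode, charmap_encode and charmap_build, and it uses a
made-up 256-entry table (Latin-1 identity with byte 0xA4 remapped to the
euro sign). The decode and encode functions are the kind of thin Python
wrappers mentioned in the quoted text.

```python
import codecs

# Hypothetical decoding table: one character per byte value 0..255.
# Latin-1 identity, except byte 0xA4 maps to EURO SIGN (U+20AC).
_table = [chr(i) for i in range(256)]
_table[0xA4] = "\u20ac"
decoding_table = "".join(_table)

# Build the reverse (character -> byte) mapping once, at import time.
encoding_table = codecs.charmap_build(decoding_table)


def decode(data, errors="strict"):
    # charmap_decode returns (text, bytes_consumed); keep only the text.
    return codecs.charmap_decode(data, errors, decoding_table)[0]


def encode(text, errors="strict"):
    # charmap_encode returns (bytes, chars_consumed); keep only the bytes.
    return codecs.charmap_encode(text, errors, encoding_table)[0]


sample = b"price: 5 \xa4"
text = decode(sample)
roundtrip = encode(text)
print(repr(text), repr(roundtrip))
```

With a string as the decoding table, each input byte is decoded by a
direct index into the table rather than a dictionary lookup.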

-- 
Marc-Andre Lemburg
eGenix.com


