On 10/6/05, M.-A. Lemburg wrote:
> Hye-Shik, could you please provide some timeit figures for the fastmap encoding?

(before applying Walter's patch, charmap decoder)

% ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10'; u=unicode(s, e)" "s.decode(e)"
100 loops, best of 3: 3.35 msec per loop

(applied the patch, improved charmap decoder)

% ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10'; u=unicode(s, e)" "s.decode(e)"
1000 loops, best of 3: 1.11 msec per loop

(the fastmap decoder)

% ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10_fc'; u=unicode(s, e)" "s.decode(e)"
1000 loops, best of 3: 1.04 msec per loop

(utf-8 decoder)

% ./python Lib/timeit.py -s "s='a'*53*1024; e='utf_8'; u=unicode(s, e)" "s.decode(e)"
1000 loops, best of 3: 851 usec per loop

Walter's decoder and the fastmap decoder work in mostly the same way,
so the performance difference is quite minor. Perhaps the small
difference comes from the wrapper function that each charmap codec
goes through; the fastmap codec provides functions usable as
Codecs.{en,de}code directly.

(encoding, charmap codec)

% ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10'; u=unicode(s, e)" "u.encode(e)"
100 loops, best of 3: 3.51 msec per loop

(encoding, fastmap codec)

% ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10_fc'; u=unicode(s, e)" "u.encode(e)"
1000 loops, best of 3: 536 usec per loop

(encoding, utf-8 codec)

% ./python Lib/timeit.py -s "s='a'*53*1024; e='utf_8'; u=unicode(s, e)" "u.encode(e)"
1000 loops, best of 3: 1.5 msec per loop

If the encoding optimization can easily be done in Walter's approach,
the fastmap codec would be too expensive a way to reach the objective,
because we would have to maintain not only fastmap but also charmap for
backward compatibility.

Hye-Shik
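
For anyone who wants to repeat this kind of measurement from a script rather than from the command line, here is a minimal sketch using the timeit module. It is not the setup used for the figures above: it assumes a Python 3 interpreter (bytes/str instead of unicode()), it leaves out iso8859_10_fc because that codec exists only with the fastmap patch applied, and the helper names best_decode_time and best_encode_time are made up for illustration.

```python
import timeit

# 53 KiB of ASCII 'a', the same payload size as in the command-line runs.
DATA = b"a" * 53 * 1024


def best_decode_time(encoding, repeat=3, number=100):
    """Best per-call time (seconds) for DATA.decode(encoding)."""
    timer = timeit.Timer(lambda: DATA.decode(encoding))
    # repeat() returns one total per round; take the fastest round.
    return min(timer.repeat(repeat=repeat, number=number)) / number


def best_encode_time(encoding, repeat=3, number=100):
    """Best per-call time (seconds) for encoding the decoded text back."""
    text = DATA.decode(encoding)
    timer = timeit.Timer(lambda: text.encode(encoding))
    return min(timer.repeat(repeat=repeat, number=number)) / number


if __name__ == "__main__":
    # Only codecs that ship with a stock interpreter are listed here.
    for enc in ("iso8859_10", "utf_8"):
        dec = best_decode_time(enc) * 1e6
        enc_t = best_encode_time(enc) * 1e6
        print("%-12s decode %8.1f usec   encode %8.1f usec" % (enc, dec, enc_t))
```

Taking the minimum over the repeats mirrors the "best of 3" that Lib/timeit.py reports; absolute numbers will of course differ by machine and interpreter version.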