
The function in the module below, xlate.xlate, doesn't quite do what "".decode does. The main difference is that characters that don't exist in the mapping are always mapped to U+FFFD, instead of having the various error-handling behaviors available to "".decode.

It builds the fast decoding structure once per call, but when decoding 53KB of data that overhead is small enough to make it much faster than s.decode('mac-roman'): about 22 times faster in the timings below. For smaller buffers (I tried 53 characters), s.decode is about twice as fast (21us, versus 43us for xlate.xlate).

$ timeit.py -s "s='a'*53*1024; import xlate" "s.decode('mac-roman')"
100 loops, best of 3: 12.8 msec per loop
$ timeit.py -s "s='a'*53*1024; import xlate, encodings.mac_roman" \
    "xlate.xlate(s, encodings.mac_roman.decoding_map)"
1000 loops, best of 3: 573 usec per loop

Jeff
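
For readers who want to see the general idea described above, here is a minimal sketch. It is not the xlate module itself, only an illustration under stated assumptions: it is written for Python 3 (the post uses Python 2 byte strings), the names build_table, xlate_sketch, and toy_map are hypothetical, and the mapping is assumed to be a dict from byte value to Unicode code point. The table is rebuilt on every call, which is the per-call overhead the post mentions, and any byte missing from the mapping becomes U+FFFD.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def build_table(decoding_map):
    """Build a 256-entry lookup list from a {byte: code point} dict.

    Bytes that are absent from the dict, or mapped to None, get U+FFFD.
    """
    table = []
    for byte in range(256):
        code_point = decoding_map.get(byte)
        table.append(REPLACEMENT if code_point is None else chr(code_point))
    return table


def xlate_sketch(data, decoding_map):
    """Decode a bytes object through decoding_map (hypothetical helper)."""
    table = build_table(decoding_map)  # rebuilt on each call
    # Iterating over bytes yields ints in Python 3, so each byte
    # indexes directly into the table.
    return "".join([table[b] for b in data])


# Hypothetical toy mapping, not the real mac-roman table.
toy_map = {0x41: 0x0041, 0x80: 0x00C4}
result = xlate_sketch(b"A\x80\xff", toy_map)
```

Here 0x41 and 0x80 are looked up in the toy mapping, while 0xFF is not present and so comes back as U+FFFD. For a short input, building the 256-entry table dominates the cost; for a large buffer, it is amortized over many lookups.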