Unicode charmap decoders slow

"Martin v. Löwis" martin at v.loewis.de
Mon Oct 3 22:11:33 CEST 2005


Tony Nelson wrote:
> I had seen iconv.  Even if my system supports it and it is faster than 
> Python's charmap decoder, it might not be available on other systems.  
> Requiring something unusual in order to do a trivial LUT task isn't an 
> acceptable solution.  If I write a charmap decoder as an extension 
> module in Pyrex I can include it with the program.  I would prefer a 
> solution that doesn't even need that, preferably in pure Python.  Since 
> Python does all the hard work so fast it certainly could do it, and it 
> can almost do it with "".translate().

Well, did you try a pure-Python version yourself?

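# Python 2: precompute the Unicode character for each of the 256 byte values.
table = [chr(i).decode("mac-roman", "replace") for i in range(256)]

def decode_mac_roman(s):
    # Look up each byte of the str in the table, then join once at the end.
    result = [table[ord(c)] for c in s]
    return u"".join(result)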

How much faster than the standard codec is that?
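
One way to answer that is to time both paths on the same input. The following is only a rough sketch, and it is a Python 3 adaptation of the snippet above (an assumption, since the original is Python 2): bytes([i]) replaces chr(i), and iterating over a bytes object yields integers, so ord() is not needed. The names TABLE, compare, sample and the repeat count are made up for illustration.

```python
import timeit

# Lookup table: index is the byte value, entry is the decoded character.
TABLE = [bytes([i]).decode("mac-roman", "replace") for i in range(256)]

def decode_mac_roman(data):
    # In Python 3, iterating over bytes yields ints, so each byte indexes TABLE directly.
    return "".join([TABLE[b] for b in data])

def compare(data, repeat=20):
    # Time the table-driven decoder and the built-in codec on the same input.
    t_table = timeit.timeit(lambda: decode_mac_roman(data), number=repeat)
    t_codec = timeit.timeit(lambda: data.decode("mac-roman", "replace"), number=repeat)
    return t_table, t_codec

if __name__ == "__main__":
    sample = bytes(range(256)) * 400  # about 100 KB covering every byte value
    # Both decoders must agree before the timings mean anything.
    assert decode_mac_roman(sample) == sample.decode("mac-roman", "replace")
    t_table, t_codec = compare(sample)
    print("table lookup: %.4f s, built-in codec: %.4f s" % (t_table, t_codec))
```

The numbers will depend on the interpreter version and the input, so no result is claimed here; the point is only that the comparison takes a few lines to run.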

Regards,
Martin


