[Python-Dev] Unicode charmap decoders slow

"Martin v. Löwis" martin at v.loewis.de
Fri Oct 14 09:14:23 CEST 2005


Tony Nelson wrote:
> I have written my fastcharmap decoder and encoder.  It's not meant to be
> better than the patch and other changes to come in a future version of
> Python, but it does work now with the current codecs.

It's an interesting solution.

> To use, hook each codec to be sped up:
> 
>     import fastcharmap
>     help(fastcharmap)
>     fastcharmap.hook('name_of_codec')
>     u = unicode('some text', 'name_of_codec')
>     s = u.encode('name_of_codec')
> 
> No codecs were rewritten.  It took me a while to learn enough to do this
> (Pyrex, more Python, some Python C API), and there were some surprises.
> Hooking in is grosser than I would have liked.  I've only used it on Python
> 2.3 on FC3.

Indeed, and I would claim that you did not completely achieve your "no 
changes necessary" goal: you still have to install the hooks explicitly.
I also think overriding codecs.charmap_{encode,decode} is really ugly.

Even if this could be simplified by modifying the existing
codecs, I still don't think supporting changes to the encoding dict
is worthwhile. People will probably want to update the codecs in-place,
but I don't think we need to guarantee that such an approach
works independently of the Python version. People would be much better off
writing their own codecs if they think the distributed ones are
incorrect.
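
To illustrate what "writing their own codec" might look like, here is a
minimal sketch. It assumes a present-day Python 3 interpreter rather than
the 2.3 discussed in this thread, and the codec name, the table contents
and the helper names are all hypothetical, chosen only for the example.
It builds a Latin-1-like table with one changed byte and registers it
under a new name, leaving the distributed codecs untouched.

```python
import codecs

# Hypothetical codec name, used only for this illustration.
CODEC_NAME = "my_latin1_variant"

# Start from a Latin-1 style identity mapping (byte i -> code point i),
# then change a single entry: byte 0xA4 decodes to the euro sign.
_table = [chr(i) for i in range(256)]
_table[0xA4] = "\u20ac"
DECODING_TABLE = "".join(_table)

# Build the reverse mapping used for encoding from the decoding table.
ENCODING_TABLE = codecs.charmap_build(DECODING_TABLE)


def _encode(text, errors="strict"):
    # Returns a (bytes, characters_consumed) tuple.
    return codecs.charmap_encode(text, errors, ENCODING_TABLE)


def _decode(data, errors="strict"):
    # Returns a (str, bytes_consumed) tuple.
    return codecs.charmap_decode(data, errors, DECODING_TABLE)


def _search(name):
    # Codec search function: answer only for our own codec name.
    if name == CODEC_NAME:
        return codecs.CodecInfo(encode=_encode, decode=_decode, name=CODEC_NAME)
    return None


codecs.register(_search)
```

With this in place, "text".encode("my_latin1_variant") and
data.decode("my_latin1_variant") go through the new tables, and nothing
depends on the internals of the codecs shipped with Python.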

Regards,
Martin

