[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

Thu Feb 24 17:04:50 CET 2011

Marc-Andre Lemburg <mal at egenix.com> added the comment:

Steffen Daode Nurpmeso wrote:
> 
> Steffen Daode Nurpmeso <sdaoden at googlemail.com> added the comment:
> 
> I didn't actually invent this algorithm (but don't ask me where I got the idea from years ago); I just implemented the function you see.  The algorithm itself avoids some pitfalls with respect to combining numerics and significantly reduces the number of possible normalization cases:
> 
>         "ISO-8859-1", "ISO8859-1", "ISO_8859-1", "LATIN1"
>         (+ think of additional misspellings)
> all become
>         "iso 8859 1", "latin 1"
> in the end
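
For illustration, the normalization described in the quoted comment could be
sketched roughly as follows. This is not the function from the patch; the name
normalize_encoding_name and the regex-based tokenizing are assumptions, chosen
only so that the four spellings listed above collapse to the two forms shown.

```python
import re


def normalize_encoding_name(name):
    """Hypothetical sketch: lowercase the name, split it into runs of
    ASCII letters and runs of digits, and join the runs with single spaces.
    Any other character (hyphen, underscore, space, ...) acts as a separator.
    """
    tokens = re.findall(r"[a-z]+|[0-9]+", name.lower())
    return " ".join(tokens)


for spelling in ("ISO-8859-1", "ISO8859-1", "ISO_8859-1", "LATIN1"):
    print(spelling, "->", normalize_encoding_name(spelling))
```

Splitting at every letter/digit boundary is what makes "LATIN1" and "latin-1"
end up as the same string, so a lookup table only needs one entry per encoding.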

Please don't forget that the shortcuts in question are *optimizations*.

Programmers who don't use the encoding names triggering those
optimizations will still have a running program; it will just be
a bit slower, and that's perfectly fine.
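
A small sketch of that point: both spellings from the issue title resolve to
the same codec and give the same result, so only the timing can differ. The
helper time_decode is a hypothetical name introduced here, and the measured
numbers depend on the interpreter version and build, so none are quoted.

```python
import codecs
import timeit

data = b"x"

# Both spellings decode to the same text.
assert data.decode("latin1") == data.decode("latin-1") == "x"

# Both spellings are aliases that resolve to the same registered codec.
same_codec = codecs.lookup("latin1").name == codecs.lookup("latin-1").name


def time_decode(spelling, number=100_000):
    """Hypothetical helper: total seconds for `number` decode calls."""
    return timeit.timeit(lambda: data.decode(spelling), number=number)


if __name__ == "__main__":
    print("same codec:", same_codec)
    for spelling in ("latin1", "latin-1"):
        print(spelling, time_decode(spelling))
```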

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue11303>
_______________________________________