[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

Thu Feb 24 17:36:39 CET 2011

Alexander Belopolsky <belopolsky at users.sourceforge.net> added the comment:

On Thu, Feb 24, 2011 at 11:31 AM, Marc-Andre Lemburg
<report at bugs.python.org> wrote:
..
> I think rather than removing any hyphens, spaces, etc. the
> function should additionally:
>
>  * add hyphens whenever (they are missing and) there's a switch
>   from [a-z] to [0-9]
>

This will do the wrong thing for the "cs" family of aliases, which by
design contain no hyphens:

"""
The aliases that start with "cs" have been added for use with the
IANA-CHARSET-MIB as originally defined in RFC3808, and as currently
maintained by IANA at http://www.iana.org/assignments/ianacharset-mib.
Note that the ianacharset-mib needs to be kept in sync with this
registry.  These aliases that start with "cs" contain the standard
numbers along with suggestive names in order to facilitate applications
that want to display the names in user interfaces.  The "cs" stands
for character set and is provided for applications that need a lower
case first letter but want to use mixed case thereafter that cannot
contain any special characters, such as underbar ("_") and dash ("-").
"""
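
To make the objection concrete, here is a rough sketch of the proposed
rule applied to two IANA "cs" aliases. The helper name add_hyphens and
the regular expression are my own illustration of the suggestion quoted
above, not code from the codecs module or from any patch on this issue.

```python
import re

def add_hyphens(name):
    """Hypothetical helper: lower-case the name, then insert '-' at
    every switch from a letter [a-z] to a digit [0-9]."""
    return re.sub(r"(?<=[a-z])(?=[0-9])", "-", name.lower())

# "latin1" gets the hyphen the proposal wants; the "cs" aliases get
# hyphens they are defined never to contain.
for name in ("latin1", "csISOLatin1", "csPC8CodePage437"):
    print(name, "->", add_hyphens(name))
```

With this rule "latin1" becomes "latin-1" as intended, but "csISOLatin1"
and "csPC8CodePage437" also pick up hyphens, so they no longer match
the registered spellings.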

----------
title: b'x'.decode('latin1') is much slower than	b'x'.decode('latin-1') -> b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue11303>
_______________________________________