flaming vs accuracy [was Re: Performance of int/long in Python 3]
Chris Angelico
rosuav at gmail.com
Thu Mar 28 10:38:07 EDT 2013
On Fri, Mar 29, 2013 at 1:12 AM, jmfauth <wxjmfauth at gmail.com> wrote:
> This flexible string representation is so absurd that not only
> "it" does not know you can not write Western European Languages
> with latin-1, "it" penalizes you by just attempting to optimize
> latin-1. Shown in my multiple examples.
PEP 393 strings have two optimizations, or kind of three:
1a) ASCII-only strings
1b) Latin1-only strings
2) BMP-only strings
3) Everything else
Options 1a and 1b are almost identical. I'm not sure of the details,
but something flags the strings that fit inside seven bits. (Something
to do with optimizing encodings later?) Both are stored at a single
byte per character.
Option 2 is optimized to two bytes per character.
Option 3 is stored in UTF-32.
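You can see those widths for yourself with a quick sketch like the one
below. It assumes CPython 3.3 or later (other implementations need not
store strings this way), and bytes_per_char is just a helper name made
up for this example. It compares two lengths of the same repeated
character so that the fixed object header cancels out.

```python
import sys

def bytes_per_char(ch, n=1000):
    """Estimate per-character storage for a string made only of `ch`.

    Hypothetical helper for illustration: the size difference between
    a string of 2*n characters and one of n characters is n times the
    per-character width, so the fixed header drops out.
    """
    short = ch * n
    long = ch * (2 * n)
    return (sys.getsizeof(long) - sys.getsizeof(short)) // n

# One sample character from each of the four categories above.
samples = {
    "ASCII": "a",              # U+0061, fits in seven bits
    "Latin-1": "\xe9",         # U+00E9, fits in eight bits
    "BMP": "\u20ac",           # U+20AC, needs sixteen bits
    "non-BMP": "\U0001F600",   # U+1F600, outside the BMP
}

widths = {name: bytes_per_char(ch) for name, ch in samples.items()}
for name, width in widths.items():
    print(name, width)
```

On a CPython with PEP 393 this should report 1, 1, 2 and 4 bytes per
character respectively. Note that the width is chosen per string, by
its widest character, so one non-BMP character makes the whole string
four bytes per character.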
Once again, jmf, you are forgetting that option 2 is a safe and
bug-free optimization.
ChrisA