[Python-ideas] INSANE FLOAT PERFORMANCE!!!

Chris Angelico rosuav at gmail.com
Thu Oct 13 02:58:07 EDT 2016


On Thu, Oct 13, 2016 at 5:17 PM, Stephen J. Turnbull
<turnbull.stephen.fw at u.tsukuba.ac.jp> wrote:
> Chris Angelico writes:
>
>  > I'm not sure what you mean by "strcmp-able"; do you mean that the
>  > lexical ordering of two Unicode strings is guaranteed to be the same
>  > as the byte-wise ordering of their UTF-8 encodings?
>
> This is definitely not true for the Han characters.  In Japanese, the
> most commonly used lexical ordering is based on the pronunciation,
> meaning that there are few characters (perhaps none) in common use
> that have a unique place in lexical ordering (most individual
> characters have multiple pronunciations, and even many whole personal
> names do).

Yeah, and even just with Latin-1 characters, you have (a) non-ASCII
characters that should sort between ASCII characters, and (b)
characters that mean different things in different languages, and so
should be sorted differently depending on the language. So a
linguistically correct lexicographical ordering is impossible in a
generic string sort that doesn't know which language it's sorting.
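
A rough sketch of both points, in case it helps. The word lists and
the two key functions (german_key, swedish_key) are made up for
illustration and are heavily simplified; real code would more likely
reach for something like locale.strxfrm or an ICU binding. Non-ASCII
letters are written as escapes so the code points are unambiguous.

```python
# (a) Python 3 compares str objects code point by code point.
words = ["zebra", "\u00e9clair", "apple", "Z\u00fcrich", "\u00e4hnlich"]

by_code_point = sorted(words)
by_utf8_bytes = sorted(words, key=lambda s: s.encode("utf-8"))

# UTF-8 preserves code point order, so the two sorts agree, but neither
# is a dictionary order: U+00E4 and U+00E9 land after "z" (U+007A),
# and uppercase "Z" lands before lowercase "a".
print(by_code_point)
print(by_code_point == by_utf8_bytes)

# (b) The same letter sorts differently depending on the language.
# Hypothetical, simplified keys: German dictionary order treats
# a-umlaut like "a"; the Swedish alphabet puts it after "z".
GERMAN_FOLD = str.maketrans(
    {"\u00e4": "a", "\u00f6": "o", "\u00fc": "u", "\u00df": "ss"}
)
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyz\u00e5\u00e4\u00f6"


def german_key(s):
    # Fold umlauts onto their base letters before comparing.
    return s.lower().translate(GERMAN_FOLD)


def swedish_key(s):
    # Rank each letter by its position in the Swedish alphabet.
    # Only handles letters that appear in SWEDISH_ALPHABET.
    return [SWEDISH_ALPHABET.index(ch) for ch in s.lower()]


sample = ["zon", "\u00e4ra", "arm"]
german_order = sorted(sample, key=german_key)
swedish_order = sorted(sample, key=swedish_key)
print(german_order)
print(swedish_order)
```

The same three strings come out in two different orders, and neither
matches the plain code point sort, which is the whole problem with
trying to pick one "correct" ordering for arbitrary strings.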

ChrisA

