[Tutor] sort() method and non-ASCII
Cameron Simpson
cs at zip.com.au
Sun Feb 5 18:25:01 EST 2017
On 05Feb2017 16:31, boB Stepp <robertvstepp at gmail.com> wrote:
>On Sat, Feb 4, 2017 at 10:50 PM, Random832 <random832 at fastmail.com> wrote:
>> On Sat, Feb 4, 2017, at 22:52, boB Stepp wrote:
>>> Does the list sort() method (and other sort methods in Python) just go
>>> by the hex value assigned to each symbol to determine sort order in
>>> whichever Unicode encoding chart is being implemented?
>>
>> By default. You need key=locale.strxfrm to make it do anything more
>> sophisticated.
>>
>> I'm not sure what you mean by "whichever unicode encoding chart". Python
>> 3 strings are unicode-unicode, not UTF-8.
>
>As I said in my response to Steve just now: I was looking at
>http://unicode.org/charts/ Because they called them charts, so did I.
>I'm assuming that despite this organization into charts, each and
>every character in each chart has its own unique hexadecimal code to
>designate each character.
You might want to drop this term "hexadecimal"; they're just ordinals (plain
old numbers). Unicode ordinals are often _written_ in hexadecimal, though, for
compactness and because various character groupings are aligned on ranges based
on power-of-2 multiples. For example, ASCII has the upper case Latin alphabet
starting just above 64 (2^6) and the lower case just above 96 (2^6 + 2^5), so
"A" is 65 and "a" is 97. Those boundaries look rounder in base 16: 0x40 and 0x60.
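
If it helps to see this at the prompt, here is a small sketch using only the
builtins ord(), hex() and sorted(). The word list is made up for illustration;
it shows that the default sort simply compares those ordinals, which is why
the accented word lands after "zebra":

```python
# Code points are plain integers; hex is just one way of writing them.
for ch in "@A`a":
    print(ch, ord(ch), hex(ord(ch)))

# The ASCII letter blocks start just after these power-of-2 boundaries.
upper_boundary = 2**6          # 64, written 0x40
lower_boundary = 2**6 + 2**5   # 96, written 0x60

# Hypothetical sample data; "\u00e9" is a precomposed e-acute (ordinal 233).
words = ["zebra", "Apple", "\u00e9clair", "apple"]

# The default sort compares strings code point by code point,
# so every upper case ASCII letter sorts before every lower case one,
# and anything beyond ASCII sorts after both.
by_code_point = sorted(words)
print(by_code_point)
```

For a more human-friendly order you would pass key=locale.strxfrm as Random832
mentioned, after calling locale.setlocale(); the result then depends on which
locales are installed on your system.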
Cheers,
Cameron Simpson <cs at zip.com.au>