Alphabetical sorts

Ron Adam rrr at ronadam.com
Mon Oct 16 23:22:47 EDT 2006


Neil Cerutti wrote:
> On 2006-10-16, Ron Adam <rrr at ronadam.com> wrote:
>> I have several applications where I want to sort lists in
>> alphabetical order. Most examples of sorting usually sort on
>> the ord() order of the character set as an approximation.  But
>> that is not always what you want.
> 
> Check out strxfrm in the locale module.
> 
> >>> a = ["Neil", "Cerutti", "neil", "cerutti"]
> >>> a.sort()
> >>> a
> ['Cerutti', 'Neil', 'cerutti', 'neil']
> >>> import locale
> >>> locale.setlocale(locale.LC_ALL, '')
> 'English_United States.1252'
> >>> a.sort(key=locale.strxfrm)
> >>> a
> ['cerutti', 'Cerutti', 'neil', 'Neil']

Thanks, that helps.

The documentation for locale.strxfrm() certainly could be more complete.  And the
name isn't intuitive at all.  It also corresponds to the C function for
translating strings, which isn't the same thing.

For that matter locale.strcoll() isn't documented any better.
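
For what it's worth, here is a minimal sketch of how I understand the two
functions to relate: locale.strxfrm() turns a string into a sort key, while
locale.strcoll() compares two strings directly.  The helper names
sort_by_strxfrm and sort_by_strcoll are my own, and the sketch assumes a
current Python 3, where functools.cmp_to_key wraps a comparison function.
The resulting order depends on whatever locale the system provides, so no
output is shown.

```python
import functools
import locale


def sort_by_strxfrm(words):
    """Sort using strxfrm as a key function (each string transformed once)."""
    return sorted(words, key=locale.strxfrm)


def sort_by_strcoll(words):
    """Sort using strcoll as a pairwise comparison function."""
    return sorted(words, key=functools.cmp_to_key(locale.strcoll))


# Use the user's default collation rules if the platform supports them;
# otherwise stay with the current locale (an assumption of this sketch).
try:
    locale.setlocale(locale.LC_COLLATE, "")
except locale.Error:
    pass

names = ["Neil", "Cerutti", "neil", "cerutti"]
print(sort_by_strxfrm(names))
print(sort_by_strcoll(names))
```

Both helpers should give the same order under a given locale; the key form
is usually the one to prefer for long lists, since each string is
transformed only once.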



I see this is actually a very complex subject.  A little searching turned up the
following link on Wikipedia:

http://en.wikipedia.org/wiki/Alphabetical_order#Compound_words_and_special_characters

And from there a very informative report:

      http://www.unicode.org/unicode/reports/tr10/


It looks to me like this would be a good candidate for a configurable class,
preferably something in the string module where it could be found more easily.

Is there any way to change the behavior of strxfrm or strcoll?  For example, to
have caps before lowercase instead of after?
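
To make the question concrete, here is a rough sketch of the kind of
configurable ordering I have in mind.  It does not change strxfrm or strcoll
at all; it builds a plain key function instead.  The name make_alpha_key and
its caps_first parameter are hypothetical, and the approach only considers
simple upper/lower case differences, not locale rules or accented characters.

```python
def make_alpha_key(caps_first=True):
    """Return a key function for case-insensitive alphabetical sorting.

    Words are ordered by their lowercased form first.  Words that differ
    only in case are then ordered with capitals first (caps_first=True)
    or lowercase first (caps_first=False).
    """
    def key(word):
        # Uppercase ASCII letters sort before lowercase ones, so the raw
        # word breaks ties caps-first; swapping case reverses that.
        tie_breaker = word if caps_first else word.swapcase()
        return (word.lower(), tie_breaker)
    return key


names = ["Neil", "Cerutti", "neil", "cerutti"]
print(sorted(names, key=make_alpha_key(caps_first=True)))
# -> ['Cerutti', 'cerutti', 'Neil', 'neil']
print(sorted(names, key=make_alpha_key(caps_first=False)))
# -> ['cerutti', 'Cerutti', 'neil', 'Neil']
```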


Cheers,
    Ron


