Alphabetical sorts
Ron Adam
rrr at ronadam.com
Mon Oct 16 23:22:47 EDT 2006
Neil Cerutti wrote:
> On 2006-10-16, Ron Adam <rrr at ronadam.com> wrote:
>> I have several applications where I want to sort lists in
>> alphabetical order. Most examples of sorting usually sort on
>> the ord() order of the character set as an approximation. But
>> that is not always what you want.
>
> Check out strxfrm in the locale module.
>
>>>> a = ["Neil", "Cerutti", "neil", "cerutti"]
>>>> a.sort()
>>>> a
> ['Cerutti', 'Neil', 'cerutti', 'neil']
>>>> import locale
>>>> locale.setlocale(locale.LC_ALL, '')
> 'English_United States.1252'
>>>> a.sort(key=locale.strxfrm)
>>>> a
> ['cerutti', 'Cerutti', 'neil', 'Neil']
Thanks, that helps.
The documentation for locale.strxfrm() certainly could be more complete. And the
name isn't intuitive at all. It also corresponds to the C function for
translating strings, which isn't the same thing.
For that matter locale.strcoll() isn't documented any better.
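For what it's worth, here is a minimal sketch of how the two functions relate,
as far as I can tell. It assumes Python 3, where a comparison function has to
be wrapped with functools.cmp_to_key before sorted() will accept it. The word
list is the one from the quoted example; the resulting order depends on the
platform's locale, so no output is shown.

```python
import functools
import locale

# Use the user's default locale for collation; fall back to the "C"
# locale if the environment's locale is not available.
try:
    locale.setlocale(locale.LC_COLLATE, "")
except locale.Error:
    locale.setlocale(locale.LC_COLLATE, "C")

words = ["Neil", "Cerutti", "neil", "cerutti"]

# strxfrm() transforms each string once into a form whose ordinary
# comparison follows the locale's collation rules (a sort key).
by_xfrm = sorted(words, key=locale.strxfrm)

# strcoll() compares two strings directly (a cmp-style function), so it
# has to be wrapped with cmp_to_key() to be used with sorted().
by_coll = sorted(words, key=functools.cmp_to_key(locale.strcoll))

print(by_xfrm)
print(by_coll)
```

For long lists the strxfrm() form should be the cheaper one, since each string
is transformed once instead of being compared pairwise through the locale.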
I see this is actually a very complex subject. A little searching turned up the
following link on Wikipedia.
http://en.wikipedia.org/wiki/Alphabetical_order#Compound_words_and_special_characters
And from there a very informative report:
http://www.unicode.org/unicode/reports/tr10/
It looks to me like this would be a good candidate for a configurable class,
preferably something in the string module, where it could be found more easily.
Is there any way to change the behavior of strxfrm or strcoll? For example, to
have caps before lowercase instead of after?
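As far as I can tell the locale module has no switch for this; the order comes
from the platform's locale definition. Below is a rough sketch of a
configurable key that sidesteps locale entirely. It assumes plain case-folding
is a good enough stand-in for real collation, which it won't be for many
languages. The name alpha_key and the caps_first parameter are made up for
illustration.

```python
def alpha_key(caps_first=True):
    """Return a sort key function: alphabetical, with case as tie-breaker."""
    def key(s):
        # Primary: case-insensitive text. Secondary: the original string,
        # where uppercase sorts before lowercase by code point; swapcase()
        # reverses that for the lowercase-first variant.
        tie = s if caps_first else s.swapcase()
        return (s.casefold(), tie)
    return key

words = ["Neil", "Cerutti", "neil", "cerutti"]
caps_first = sorted(words, key=alpha_key(caps_first=True))
lower_first = sorted(words, key=alpha_key(caps_first=False))
print(caps_first)   # ['Cerutti', 'cerutti', 'Neil', 'neil']
print(lower_first)  # ['cerutti', 'Cerutti', 'neil', 'Neil']
```

The tuple key keeps the alphabetical comparison and the case preference
separate, so case only matters when two words are otherwise equal.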
Cheers,
Ron