Trouble sorting lists (unicode/locale related?)
Peter Otten
__peter__ at web.de
Sun Sep 21 08:10:16 EDT 2003
Erlend Fuglum wrote:
> Hi everyone,
>
> I'm having some trouble sorting lists. I suspect this might have
> something to do with locale settings and/or character
> encoding/unicode.
>
> Consider the following example, text containing Norwegian special
> characters æ, ø and å.
>
> >>> liste = ["ola", "erlend", "trygve", "Ærlige anders", "Lars",
> ...          "Øksemorderen", "Åsne", "Akrobatiske Anna", "leidulf"]
> >>> liste.sort()
> >>> liste
> ['Akrobatiske Anna', 'Lars', 'erlend', 'leidulf', 'ola', 'trygve',
> '\xc5sne', '\xc6rlige anders', '\xd8ksemorderen']
>
> There are a couple of issues for me here:
> * The sorting method apparently places strings starting with uppercase
> characters before strings starting with lowercase. I would like to
> treat them equally when sorting. OK, this could probably be fixed
> by hacking with .toupper() or something, but isn't it possible to
> achieve this in a more elegant way?
>
> * The Norwegian special characters are sorted in the wrong order.
> According to our alphabet the correct order is (...) x, y, z, æ, ø, å.
> Python does it this way: (...) x, y, z, å, æ, ø ?
>
> I would really appreciate any help and suggestions - I have been
> fiddling with this mess for quite some time now :-)
Try setting the appropriate locale first:

    import locale
    locale.setlocale(locale.LC_ALL, ("no", None))

Then, for a sort that follows the locale's collation rules (and so
should no longer separate uppercase from lowercase):

    liste.sort(locale.strcoll)

should do (disclaimer: all untested).
Peter
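
For readers on a current Python 3: list.sort() no longer accepts a
comparison function as a positional argument, so locale.strcoll has to be
wrapped with functools.cmp_to_key (or replaced by key=locale.strxfrm).
Below is a minimal sketch of the same idea under that assumption. The
helper name sort_with_locale and the candidate locale names are
illustrative guesses, not part of the original reply; which names are
actually installed depends on the system, so the sketch falls back to a
plain case-insensitive sort when none of them can be set.

```python
import functools
import locale

liste = ["ola", "erlend", "trygve", "Ærlige anders", "Lars",
         "Øksemorderen", "Åsne", "Akrobatiske Anna", "leidulf"]


def sort_with_locale(words, candidates=("nb_NO.UTF-8", "no_NO.UTF-8", "no")):
    """Return (sorted_copy, locale_used).

    Hypothetical helper: tries each candidate locale name in turn.
    locale_used is None if none of them could be set on this system.
    """
    saved = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        for name in candidates:
            try:
                locale.setlocale(locale.LC_COLLATE, name)
            except locale.Error:
                continue  # this locale is not installed, try the next one
            # cmp_to_key adapts the old-style comparison function strcoll
            # to the key= interface that sorted() expects in Python 3.
            return sorted(words, key=functools.cmp_to_key(locale.strcoll)), name
        # Fallback: ignores case, but orders by code point, so the
        # å, æ, ø ordering problem from the question remains.
        return sorted(words, key=str.lower), None
    finally:
        # setlocale changes process-wide state, so restore it afterwards.
        locale.setlocale(locale.LC_COLLATE, saved)


result, used = sort_with_locale(liste)
print(used, result)
```

If a Norwegian locale is found, the entries starting with Æ, Ø and Å
should end up last and in that order, but the exact result depends on the
platform's collation tables, so it is worth checking on the target system.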