case-insensitive and internationalized sort
Martin v. Löwis
martin at v.loewis.de
Thu Dec 19 18:26:06 EST 2002
"Kevin Altis" <altis at semi-retired.com> writes:
> That looks good, but I'm a bit slow. I'm not seeing how to create just a
> compare function so I can sort in place. This is the closest I could I do
> and it requires a standalone function and the temporary creation of another
> list:
Correct. Proper sorting of strings often requires that you transform
them, both for correctness and speed. As a result, sorting won't be
in-place, unless you make the transformation in-place:
def caseinsensitive_inplace(list):
for index, value in enumerate(list): # using 2.3's enumerate
list[index] = value.upper(),value
list.sort()
for index, (_, value) in enumerate(list): # using 2.3's enumerate
list[index] = value
> Is there a simpler way I'm just not seeing one that avoids a lot of
> extra upper() calls?
It appears you are not aware of what "a lot" is, here. This routine
uses n .upper calls, if len(list) is n. Doing the upper in the compare
function causes n*log(n) .upper calls on average, and, depending on
the Python version, n*n .upper calls in the worst case.
> The list creation overhead may be worse than extra upper
> calls. I didn't have any succes trying to incorporate locale.strcoll or
> locale.strxfrm. I kept getting an exception
>
> UnicodeError: ASCII encoding error: ordinal not in range(128)
Ah, so you *are* trying to compare Unicode strings? You should encode
them to the locale's encoding first. How to find out what the locale's
encoding is depends a lot on the operating system you are using (and,
to a lesser degree, on the Python version).
Regards,
Martin
More information about the Python-list
mailing list