Sorting a list of Unicode strings?
Steve Holden
steve at holdenweb.com
Sun Aug 19 20:45:02 EDT 2007
Alex Martelli wrote:
> oliver at obeattie.com <oliver at obeattie.com> wrote:
> ...
>>>> Maybe I'm missing something fundamental here, but if I have a list of
>>>> Unicode strings, and I want to sort these alphabetically, then it
>>>> places those that begin with unicode characters at the bottom.
> ...
>> Anyway, I know _why_ it does this, but I really do need it to sort
>> them correctly based on how humans would look at it.
>
> Depending on the nationality of those humans, you may need very
> different sorting criteria; indeed, in some countries, different sorting
> criteria apply to different use cases (such as sorting surnames versus
> sorting book titles, etc; sorry, I don't recall specific examples, but
> if you delve on sites about i18n issues you'll find some).
>
Just one example from my own experience. When sorting names in Scotland
(and technically in the rest of the UK too, in deference to Scotland,
though this is often ignored), names beginning with "Mc" have to be
sorted /as though/ they began with "Mac". Since the two prefixes are
indistinguishable phonetically, it would otherwise mean twice as much
work to look up one of those names.
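
For what it's worth, here is a minimal sketch of how that rule might be
expressed as a sort key in Python. The helper name scottish_name_key and
the sample names are my own inventions for illustration, and it assumes
a Python with str.casefold (3.3 or later). It handles only the "Mc"
prefix and does not attempt any general locale-aware collation.

```python
import re

# Hypothetical helper, not part of any library: matches a leading "mc"
# that is followed by at least one more word character.
_MC_PREFIX = re.compile(r"^mc(?=\w)")

def scottish_name_key(name):
    """Return a sort key that treats a leading "Mc" as though it were "Mac"."""
    folded = name.casefold()               # compare case-insensitively
    return _MC_PREFIX.sub("mac", folded)   # "mcadam" -> "macadam"

# Made-up sample data for illustration.
names = ["Mackenzie", "McAdam", "MacDonald", "McBride", "Mabon", "Madden"]

plain_names = sorted(names)                          # default code point order
sorted_names = sorted(names, key=scottish_name_key)  # "Mc" filed under "Mac"
```

With the default ordering all the "Mc" names land after "Madden"; with
the key function they are interleaved with the "Mac" names, so a reader
only has to look in one place.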
regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden