Aug. 10, 2010
10:01 p.m.
I am not 100% happy with this because I am sure people will keep discovering that the order in the file does not match the order suggested by their favorite sort program. I was also hoping to learn from this discussion what the state of the art in sorting Unicode words is. I believe this issue is addressed by some obscure parts of the Unicode standard, but I am not familiar with them.
Actually, it's not. Rather, Unicode acknowledges that collation depends on the locale (see http://unicode.org/reports/tr10/). Of course, it would be possible to follow the Default Unicode Collation Element Table (DUCET).

Regards,
Martin
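
To make the mismatch concrete, here is a minimal Python sketch comparing plain code-point order with a rough, locale-independent approximation of what a reader might expect. This is not the Unicode Collation Algorithm and does not use DUCET: `crude_collation_key` is a hypothetical helper that only strips combining marks and casefolds, which is a loose stand-in for primary-level comparison on Latin-script words. The word list is made up for illustration.

```python
import unicodedata


def crude_collation_key(word):
    """Hypothetical helper: a rough stand-in for a primary-level sort key.

    Assumption: dropping combining marks and casefolding is good enough for
    simple Latin-script words. It is NOT an implementation of UTS #10.
    """
    decomposed = unicodedata.normalize("NFD", word)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Tie-break on the original string so the result is deterministic.
    return (base.casefold(), word)


# Illustrative data only; \u00e9 is a precomposed "e with acute".
words = ["zebra", "Zulu", "\u00e9migr\u00e9", "apple", "Apple", "eagle"]

# Plain sorted() compares code points: all uppercase ASCII letters come
# before lowercase ones, and accented letters come after "z".
by_code_point = sorted(words)

# With the crude key, case and accents no longer dominate the order.
by_crude_key = sorted(words, key=crude_collation_key)

print(by_code_point)
print(by_crude_key)
```

Neither order is "the" correct one: as noted above, the expected result depends on the locale, and a real implementation would more likely rely on a collation library such as ICU than on a hand-written key like this.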