[Python-Dev] Python3 regret about deleting list.sort(cmp=...)
Glenn Linderman
v+python at g.nevcal.com
Sun Mar 13 04:52:24 CET 2011
On 3/12/2011 7:21 PM, Terry Reedy wrote:
> (Ok, I assumed that the 'word' field does not include any of
> !"#$%&'()*+. If that is not true, replace comma with space or even a
> control char such as '\a' which even precedes \t and \n.)
OK, I agree the above was your worst assumption, although you also need
to add "," to your list, because a comma inside the data allows data
puns too.
You also rewrote Guido's text from "shortstring" to "word" and assumed
it had certain content semantics, but since only integer is after the
",", rsplit would work to separate the fields even if shortstring
contains ",".
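A minimal sketch of that rsplit point, assuming a record laid out as
"shortstring,integer" (the sample record below is made up for
illustration):

```python
# Hypothetical record: the shortstring itself contains a comma.
record = "a,b,42"

# Split once, from the right: everything before the last comma is the
# shortstring, everything after it is the integer field.
shortstring, number_text = record.rsplit(",", 1)
number = int(number_text)

print(repr(shortstring), number)
```

This only works because the integer field can never contain the
delimiter; it says nothing about whether the joined string sorts
correctly.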
And the choice of delimiter really determines whether data puns can
exist. If and only if you know that there is a character that is lower
in sort order than any of the characters in the sort strings, can you
"cheat" and put a variable length string into a sort key field, by
terminating it with such a character. The safest such character is \0
(unless you are coding in C, where \0 terminates the string); failing
that, use \a as you now suggest, but only if you can be 100% sure it is
not found in the data. If you cannot guarantee that the data doesn't
contain the delimiter, data puns remain possible among variable length
strings, and the algorithms will sort wrong in pathological cases.
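To make the data pun concrete, here is a small sketch. The records and
the joined_key helper are invented for illustration, and the integer is
zero-padded to five digits purely so that it compares correctly as
text. Sorting the (string, integer) tuples directly serves as the
reference order.

```python
# Made-up sample data: "ab!" contains '!', which sorts below ','.
records = [("ab", 1), ("ab!", 2), ("a", 3)]

def joined_key(delim):
    """Build a key function that packs both fields into one string."""
    def key(rec):
        word, n = rec
        return "%s%s%05d" % (word, delim, n)
    return key

# Reference order: compare the fields separately, as tuples.
reference = sorted(records)

# Comma delimiter: '!' < ',' so "ab!,00002" sorts before "ab,00001".
comma_sorted = sorted(records, key=joined_key(","))

# NUL delimiter: lower than every character in this data, so a shorter
# string always sorts before any longer string it is a prefix of.
nul_sorted = sorted(records, key=joined_key("\0"))

print(reference)
print(comma_sorted)
print(nul_sorted)
```

With this data the comma-joined keys put "ab!" ahead of "ab", while the
NUL-joined keys agree with the tuple sort. The NUL version is only safe
here because none of the sample strings contains \0.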
I wouldn't have called you on this, except that it really is important
not to give people the idea that you can blithely use a variable length
string anywhere except at the tail of a multi-field sort string. In
general, you can't. I've long since lost track of the number of times
I've helped people understand the fix to programs that tried that.