python 3's adoption

Thu Jan 28 22:15:53 EST 2010

Steven D'Aprano wrote:
> On Thu, 28 Jan 2010 17:38:23 -0800, mdj wrote:
> 
>> On Jan 29, 9:47 am, Paul Boddie <p... at boddie.org.uk> wrote:
>>> On 27 Jan, 13:26, Xah Lee <xah... at gmail.com> wrote:
>>>
>>>
>>>
>>>> So, for practical reasons, i think a “key” parameter is fine. But
>>>> chopping off “cmp” is damaging. When your data structure is complex,
>>>> its order is not embedded in some “key”. Taking out “cmp” makes it
>>>> impossible to sort your data structure.
>>> What would annoy me if I used Python 3.x would be the apparent lack of
>>> the __cmp__ method for conveniently defining comparisons between
>>> instances of my own classes. Having to define all the rich comparison
>>> methods frequently isn't even as much fun as it sounds.
>> OT, but you can always define the other operators in terms of a cmp and
>> mix it in, restoring the original behaviour. Unfortunately it won't
>> restore the original performance until someone comes to their senses and
>> restores __cmp__
> 
> "Comes to their senses"?
> 
> There's nothing you can do with __cmp__ that you can't do better with 
> rich comparisons, and plenty that rich comparisons can do that __cmp__ is 
> utterly incapable of dealing with. __cmp__ is crippled since it can only 
> be used for defining classes where the operators < etc return flags. It 
> can't be used if you want to implement some other behaviour for the 
> operators. E.g. here's a silly example:
> 
> >>> class Silly(object):
> ...     def __init__(self):
> ...             self.link = None
> ...     def __gt__(self, other):
> ...             self.link = other
> ...
> >>> x = Silly()
> >>> x > Silly()
> >>> x.link
> <__main__.Silly object at 0xb7cda74c>
> 
> 
> More importantly, __cmp__ is only suitable for classes that implement 
> total ordering. If you have a data type that does not have total 
> ordering, for example sets, you can't implement it using __cmp__.
> 
> E.g.:
> 
> >>> s = set([1, 2, 3, 4])
> >>> t = set([3, 4, 5, 6])
> >>> s < t
> False
> >>> s > t
> False
> >>> s == t
> False
> >>> cmp(s, t)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: cannot compare sets using cmp()
> 
> 
> Sets have partial ordering, and __cmp__ is simply not up to the job of 
> dealing with it.
> 
> Having two mechanisms for implementing comparisons is unnecessary. It 
> adds complications to the language that we are better off without. The 
> only advantage of the obsolete __cmp__ is that lazy programmers only need 
> to write one method instead of six. This is an advantage, I accept that 
> (hey, I'm a lazy programmer too, that's why I use Python!) but it's not a 
> big advantage. If you really care about it you can create a mixin class, 
> a decorator, or a metaclass to simplify creation of the methods. For 
> example, a quick and dirty decorator:
> 
> 
> >>> def make_comparisons(cls):
> ...     # __cmp__ may return any negative or positive number, not just -1 or 1
> ...     cls.__gt__ = lambda self, other: self.__cmp__(other) > 0
> ...     cls.__ge__ = lambda self, other: self.__cmp__(other) >= 0
> ...     cls.__eq__ = lambda self, other: self.__cmp__(other) == 0
> ...     cls.__ne__ = lambda self, other: self.__cmp__(other) != 0
> ...     cls.__le__ = lambda self, other: self.__cmp__(other) <= 0
> ...     cls.__lt__ = lambda self, other: self.__cmp__(other) < 0
> ...     return cls
> ...
> >>> @make_comparisons
> ... class BiggerThanEverything(object):
> ...     def __cmp__(self, other):
> ...             return 1
> ...
> >>> x = BiggerThanEverything()
> >>> x > 1000
> True
> >>> x < 0
> False
> 
> 
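As an aside on the "six methods" complaint: Python 2.7 and 3.2 onwards ship functools.total_ordering, which fills in the missing rich comparisons from __eq__ plus one ordering method. A minimal sketch, assuming a made-up Version class (the class and its _key helper are purely illustrative, not anything from this thread):

```python
from functools import total_ordering


@total_ordering
class Version(object):
    """Hypothetical example class ordered by (major, minor)."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        # Illustrative helper: the tuple that defines the ordering.
        return (self.major, self.minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() < other._key()

    def __hash__(self):
        # Defining __eq__ disables the default hash in Python 3,
        # so restore one that agrees with equality.
        return hash(self._key())


# total_ordering derives __le__, __gt__ and __ge__ from the two methods above.
print(Version(1, 2) < Version(1, 10))
print(Version(2, 0) >= Version(1, 9))
```

Like __cmp__, this only makes sense for types with a total ordering.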
I am fully aware of the warnings about premature optimization, etc.,
but I cannot resist an appeal to efficiency if it finally kills off the
idea that "they took 'cmp()' away" is a bad thing.

Passing a cmp= argument to sort provides the interpreter with a function
that will be called each time any pair of items has to be compared. The
key= argument, however, specifies a transformation from [x0, x1, ...,
xN] to [(key(x0), x0), (key(x1), x1), ..., (key(xN), xN)] (which calls
the key function precisely once per sortable item).
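The difference in call counts can be seen with a small sketch. It assumes Python 3, where cmp= no longer exists, so functools.cmp_to_key stands in for the old cmp= argument; count_calls, by_length and compare_lengths are names invented for this illustration.

```python
from functools import cmp_to_key


def count_calls(func):
    """Wrap func so the number of calls is recorded in wrapper.calls."""
    def wrapper(*args):
        wrapper.calls += 1
        return func(*args)
    wrapper.calls = 0
    return wrapper


@count_calls
def by_length(word):
    # Key function: called once per item.
    return len(word)


@count_calls
def compare_lengths(a, b):
    # Old-style comparison: negative, zero or positive.
    return (len(a) > len(b)) - (len(a) < len(b))


words = ["pear", "fig", "banana", "kiwi", "apple", "plum", "cherry", "date"]

sorted_by_key = sorted(words, key=by_length)
sorted_by_cmp = sorted(words, key=cmp_to_key(compare_lengths))

print(sorted_by_key == sorted_by_cmp)   # both are stable sorts by length
print(by_length.calls, "key calls for", len(words), "items")
print(compare_lengths.calls, "comparison calls")
```

The key count equals the number of items; the comparison count depends on the input order and grows roughly as N log N for large unsorted lists.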

In a C routine like sort() [in CPython, anyway], calling out from C to
a Python function to make a low-level decision like "is A less than B?"
turns out to be disastrous for execution efficiency (unlike the built-in
default comparison, which can be called directly from C in CPython).

If your data structures have a few hundred items in them it isn't going
to make a huge difference. If they have a few million then it is already
starting to affect performance ;-)
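Anyone who wants to measure this on their own machine could try something like the following rough sketch. It assumes Python 3 and uses functools.cmp_to_key as a stand-in for the old cmp= argument; time_sorts and compare_descending are invented names, and the timings will vary with hardware and Python version, so no figures are claimed here.

```python
import random
import timeit
from functools import cmp_to_key
from operator import neg


def compare_descending(a, b):
    """Old-style comparison, largest first: negative, zero or positive."""
    return (b > a) - (b < a)


def time_sorts(n=100000, repeat=3):
    """Return best-of-repeat timings (seconds) for three ways to sort n floats."""
    data = [random.random() for _ in range(n)]
    candidates = {
        "default": lambda: sorted(data),                  # comparisons stay in C
        "key": lambda: sorted(data, key=neg),             # one key call per item
        "cmp": lambda: sorted(data, key=cmp_to_key(compare_descending)),
    }
    return {
        name: min(timeit.repeat(fn, number=1, repeat=repeat))
        for name, fn in candidates.items()
    }


if __name__ == "__main__":
    for name, seconds in time_sorts().items():
        print("%-8s %.4f s" % (name, seconds))
```

Raising n makes the gap between the key and cmp variants easier to see, since the cmp variant calls back into Python for every comparison.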

regards
 Steve
-- 
Steve Holden           +1 571 484 6266   +1 800 494 3119
PyCon is coming! Atlanta, Feb 2010  http://us.pycon.org/
Holden Web LLC                 http://www.holdenweb.com/
UPCOMING EVENTS:        http://holdenweb.eventbrite.com/