Python 3's adoption
Steve Holden
steve at holdenweb.com
Thu Jan 28 22:15:53 EST 2010
Steven D'Aprano wrote:
> On Thu, 28 Jan 2010 17:38:23 -0800, mdj wrote:
>
>> On Jan 29, 9:47 am, Paul Boddie <p... at boddie.org.uk> wrote:
>>> On 27 Jan, 13:26, Xah Lee <xah... at gmail.com> wrote:
>>>
>>>> So, for practical reasons, i think a “key” parameter is fine. But
>>>> chopping off “cmp” is damaging. When your data structure is complex,
>>>> its order is not embedded in some “key”. Taking out “cmp” makes it
>>>> impossible to sort your data structure.
>>> What would annoy me if I used Python 3.x would be the apparent lack of
>>> the __cmp__ method for conveniently defining comparisons between
>>> instances of my own classes. Having to define all the rich comparison
>>> methods frequently isn't even as much fun as it sounds.
>> OT, but you can always define the other operators in terms of a cmp and
>> mix it in, restoring the original behaviour. Unfortunately it won't
>> restore the original performance until someone comes to their senses and
>> restores __cmp__
>
> "Comes to their senses"?
>
> There's nothing you can do with __cmp__ that you can't do better with
> rich comparisons, and plenty that rich comparisons can do that __cmp__ is
> utterly incapable of dealing with. __cmp__ is crippled since it can only
> be used for defining classes where the operators < etc return flags. It
> can't be used if you want to implement some other behaviour for the
> operators. E.g. here's a silly example:
>
>>>> class Silly(object):
> ...     def __init__(self):
> ...         self.link = None
> ...     def __gt__(self, other):
> ...         self.link = other
> ...
>>>> x = Silly()
>>>> x > Silly()
>>>> x.link
> <__main__.Silly object at 0xb7cda74c>
>
>
> More importantly, __cmp__ is only suitable for classes that implement
> total ordering. If you have a data type that does not have total
> ordering, for example sets, you can't implement it using __cmp__.
>
> E.g.:
>
>>>> s = set([1, 2, 3, 4])
>>>> t = set([3, 4, 5, 6])
>>>> s < t
> False
>>>> s > t
> False
>>>> s == t
> False
>>>> cmp(s, t)
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> TypeError: cannot compare sets using cmp()
>
>
> Sets have partial ordering, and __cmp__ is simply not up to the job of
> dealing with it.
>
> Having two mechanisms for implementing comparisons is unnecessary. It
> adds complications to the language that we are better off without. The
> only advantage of the obsolete __cmp__ is that lazy programmers only need
> to write one method instead of six. This is an advantage, I accept that
> (hey, I'm a lazy programmer too, that's why I use Python!) but it's not a
> big advantage. If you really care about it you can create a mixin class,
> a decorator, or a metaclass to simplify creation of the methods. For
> example, a quick and dirty decorator:
>
>
>>>> def make_comparisons(cls):
> ...     cls.__gt__ = lambda self, other: self.__cmp__(other) > 0
> ...     cls.__ge__ = lambda self, other: self.__cmp__(other) >= 0
> ...     cls.__eq__ = lambda self, other: self.__cmp__(other) == 0
> ...     cls.__ne__ = lambda self, other: self.__cmp__(other) != 0
> ...     cls.__le__ = lambda self, other: self.__cmp__(other) <= 0
> ...     cls.__lt__ = lambda self, other: self.__cmp__(other) < 0
> ...     return cls
> ...
>>>> @make_comparisons
> ... class BiggerThanEverything(object):
> ...     def __cmp__(self, other):
> ...         return 1
> ...
>>>> x = BiggerThanEverything()
>>>> x > 1000
> True
>>>> x < 0
> False
>
>
While I am fully aware of the usual warnings about premature
optimization, I cannot resist an appeal to efficiency if it finally
kills off the idea that taking 'cmp()' away was a bad thing.
Passing a cmp= argument to sort() gives the interpreter a function that
will be called each time any pair of items has to be compared. The
key= argument, by contrast, conceptually transforms [x0, x1, ..., xN]
into [(key(x0), x0), (key(x1), x1), ..., (key(xN), xN)] before sorting,
so the key function is called precisely once per sortable item.
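To put rough numbers on that, here is a quick sketch (the sample
records and the call counters are invented purely for illustration,
and it is written for Python 2.x, since that is where the cmp=
argument still exists):

calls = {"key": 0, "cmp": 0}

def counting_key(record):
    # called exactly once per item: the sort "decorates" the list up front
    calls["key"] += 1
    return (record["last"], record["first"])

def counting_cmp(a, b):
    # called once per *comparison*: roughly n log n times on large lists
    calls["cmp"] += 1
    return cmp((a["last"], a["first"]), (b["last"], b["first"]))

people = [{"first": "Guido", "last": "van Rossum"},
          {"first": "Tim", "last": "Peters"},
          {"first": "Barry", "last": "Warsaw"}] * 100

sorted(people, key=counting_key)    # works in 2.x and 3.x
sorted(people, cmp=counting_cmp)    # 2.x only; the argument is gone in 3.x

print calls

The key function runs once per item; the cmp function runs once per
comparison, and every one of those is a full Python-level call.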
Calling out from a C routine like sort() [in CPython, anyway] to a
Python function to make a low-level decision like "is A less than B?"
turns out to be disastrous for execution efficiency (unlike the
built-in default comparison, which can be invoked directly from C).
If your data structures have a few hundred items in them it isn't going
to make a huge difference. If they have a few million then it is already
starting to affect performance ;-)
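And if your ordering really only exists as a comparison function, you
are not stranded in 3.x: you can wrap it in a small key class
yourself. A minimal sketch (adapt_cmp and by_last_digit are names I
have just made up for illustration, and the per-comparison Python call
overhead described above still applies):

def adapt_cmp(cmp_func):
    # wrap an old-style three-way cmp function so it can be passed as key=
    class K(object):
        __slots__ = ("obj",)
        def __init__(self, obj):
            self.obj = obj
        def __lt__(self, other):
            return cmp_func(self.obj, other.obj) < 0
        def __eq__(self, other):
            return cmp_func(self.obj, other.obj) == 0
    return K

def by_last_digit(a, b):
    # stand-in for whatever comparison your "complex" data structure needs
    return (a % 10) - (b % 10)

print(sorted([25, 13, 7, 42, 31], key=adapt_cmp(by_last_digit)))
# -> [31, 42, 13, 25, 7]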
regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
PyCon is coming! Atlanta, Feb 2010 http://us.pycon.org/
Holden Web LLC http://www.holdenweb.com/
UPCOMING EVENTS: http://holdenweb.eventbrite.com/