Hello there. I was surprised to find recently that the heapq module is still a pure Python implementation. A few years ago we wrote our own in C for use in EVE Online, and we usually do an "import blue.heapq as heapq". Having a pure Python implementation almost completely negates any benefit of using it in place of sort(), unless the comparison is really expensive. I'd be happy to donate an implementation if there is any interest.

I also recently came across the recommendation that one should use the "min" and "max" builtins instead of heapq or sort() when trying to find a single smallest element. For simple comparisons this is also not true, because currently "min" and "max" use the iterator protocol, whereas sort() (and blue.heapq.heapify) use direct list access, roughly halving the number of Python method calls:

    >>> s = """
    ... import random
    ... l = [random.random() for i in xrange(10000)]
    ... """
    >>> timeit.Timer("(l.sort(), l[-1])", s).timeit(1000)
    0.29406761513791935
    >>> timeit.Timer("max(l)", s).timeit(1000)
    0.4760155766743992
Now, this is easy enough to rectify by providing a list specialization for min() and max(). This would also make sure that "min" is truly the fastest way to find the minimum member of a list. Would there be interest in this? (I will be patching the CCP version of Python for it anyway.) We are using 2.5.3, but these changes should be directly applicable to 2.6 too.

K
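The proposed specialization lives in C, but its control flow can be sketched at the Python level. A minimal sketch, assuming a hypothetical wrapper named `fast_min` (not the actual patch; note that at pure-Python level index access is not faster than iteration, so this only illustrates the shape of the fast path, not its speed):

```python
def fast_min(seq):
    # Sketch of the proposed fast path: when the argument is a list,
    # walk it by direct index access instead of the iterator protocol.
    # (The real win only materializes in C, where an array fetch is
    # much cheaper than a PyIter_Next call per element.)
    if isinstance(seq, list):
        if not seq:
            raise ValueError("min() arg is an empty sequence")
        best = seq[0]
        for i in range(1, len(seq)):
            if seq[i] < best:
                best = seq[i]
        return best
    # Any other iterable falls back to the generic builtin.
    return min(seq)
```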
Kristján Valur Jónsson
> timeit.Timer("(l.sort(), l[-1])", s).timeit(1000)
> 0.29406761513791935
This is clearly wrong: l.sort() will sort the list in place when it is first invoked, and subsequent calls will therefore be very fast. Compare with:

    timeit.Timer("sorted(l)[-1]", s).timeit(1000)

which really does what you are looking for (sorting and then taking the max *on every iteration*), and is much slower. As for heapq, its critical parts are already implemented in C as of Python 2.5.2:
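The correction above can be checked directly. A minimal sketch (using Python 3 spelling, `range` rather than `xrange`; the iteration count is arbitrary):

```python
import timeit

setup = "import random; l = [random.random() for i in range(10000)]"

# sorted() copies and re-sorts on every iteration, so this measures
# the real cost of sorting -- unlike l.sort(), which is nearly a
# no-op on every call after the first.
t_sorted = timeit.timeit("sorted(l)[-1]", setup=setup, number=200)
t_max = timeit.timeit("max(l)", setup=setup, number=200)

# A full sort is O(n log n); a single max() scan is O(n), so the scan
# should win comfortably at this size.
print(t_sorted, t_max)
```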
    >>> import heapq, _heapq
    >>> heapq.heapify is _heapq.heapify
    True
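As a side note (my addition, not from the thread): the C-accelerated heapq is exactly what you want when only a few extremes are needed, since it avoids a full sort:

```python
import heapq
import random

data = [random.random() for _ in range(10000)]

# nsmallest() heapifies internally; for small k this avoids the
# O(n log n) cost of sorting the whole list.
smallest_three = heapq.nsmallest(3, data)

# Sanity check against the straightforward (slower) approach.
assert smallest_three == sorted(data)[:3]
```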
Ooops, you are right. Silly error on my part. Still, my comment about heapq stands, and [hack, slash]:

    0.39713821814841893 (old)
    0.35184029691278162 (hacked, for special list treatment)

So, there is a 12% performance boost to be had by specializing for lists. How about it?

K
-----Original Message-----
From: python-dev-bounces+kristjan=ccpgames.com@python.org [mailto:python-dev-bounces+kristjan=ccpgames.com@python.org] On Behalf Of Antoine Pitrou
Sent: Wednesday, October 22, 2008 14:06
To: python-dev@python.org
Subject: Re: [Python-Dev] heapq, min and max
> This is clearly wrong. l.sort() will sort the list in place when it is first invoked, and therefore will be very fast in subsequent calls.
Kristján Valur Jónsson writes:
> 0.39713821814841893 (old)
> 0.35184029691278162 (hacked, for special list treatment)
>
> So, there is a 12% performance boost to be had by specializing for lists. How about it?
It depends on the added code complexity. In any case, you should open an entry on the tracker and post your patch there. Regards Antoine.
Ok. And sorry, I missed your point about heapq now having a C implementation. This is indeed good; I was misled by the presence of heapq.py. However, our own heapify() implementation is some 10% faster on a 10000-element list of floats than the _heapq.heapify() version. I'll investigate the difference.

K
_______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python- dev/kristjan%40ccpgames.com
I have submitted a patch for min() and max(): http://bugs.python.org/issue4174
> It depends on the added code complexity. In any case, you should open an entry on the tracker and post your patch there.
Kristján Valur Jónsson wrote:
> Ok. And sorry, I missed your point about heapq now having a C implementation. This is indeed good; I was misled by the presence of heapq.py.
Duplicate Python/C code will probably become more common. Even if the Python version is not used for prototyping (which I believe it was for heapq), it serves to document the intention of the C code and to be a ready-to-go version for non-C implementations. And it can be used as a basis for modification by Pythoneers who want slightly different behavior.
It just occurred to me: even though l.sort() is sorting a presorted array, it must still be doing at least 10000-1 RichCompares, just like max. So how do we explain the large difference?
I think this should be taken off of python-dev until you have some quality measurements, know what's going on, and have an actionable idea.

Aside from list specialization versus a general iterator protocol, there is no fat in the min/max implementation. It loops, it compares, it returns.

If we wanted to go the distance and type-specialize, it is possible to use the same ideas that are used in Py2.6's builtin sum() to speed up compares for particular types. Personally, I think that would be a waste and would rather keep the code simple.

Also, a primary use case for min/max is with just two inputs. We don't want to slow that down in order to provide negligible improvements to min/max for long sequences.

Raymond
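The comparison-count question raised earlier in the thread can be probed empirically. A rough sketch (timings are machine-dependent, so no particular ratio is implied):

```python
import timeit

# A presorted list: l.sort() (timsort detecting one ascending run) and
# max(l) both perform n-1 comparisons, so any timing gap reflects
# per-element overhead (direct array access in sort vs the iterator
# protocol in max), not comparison count.
setup = "import random; l = sorted(random.random() for _ in range(10000))"

t_sort = timeit.timeit("l.sort()", setup=setup, number=500)
t_max = timeit.timeit("max(l)", setup=setup, number=500)
print(t_sort, t_max)
```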
Raymond Hettinger writes:

> If we wanted to go the distance and type-specialize, it is possible to use the same ideas that are used in Py2.6's builtin sum() to speed up compares for particular types.
Would it be useful to create a macro that packs some of the optimizations in http://bugs.python.org/issue3106, so that interested code can get the optimizations for free simply by using the macro?
On Wed, Oct 22, 2008 at 3:51 PM, Raymond Hettinger wrote:

> Aside from list specialization versus a general iterator protocol, there is no fat in the min/max implementation. It loops, it compares, it returns.
>
> Also, a primary use case for min/max is with just two inputs. We don't want to slow that down in order to provide negligible improvements to min/max for long sequences.
If min/max were specialized for tuples, it would speed up the common case where a handful of arguments are passed in. Since tuples are immutable, evil comparison functions can't permute the input, either.

I threw together a quick modification with tuple specialization; here are the results:

                           2.6-pristine   2.6-patched
    two integers           0.38 usec      0.243 usec   -36%
    5000 integers          284 usec       198 usec     -30%
    two datetime objects   0.455 usec     0.308 usec   -32%
    5000 datetime objects  600 usec       536 usec     -11%

Not bad :-) Raw commands and results are below (the build in /tmp/ is patched):

    Cashew:~/src-other/Python-2.6-pristine$ ./python.exe Lib/timeit.py 'max(2,8)'
    1000000 loops, best of 3: 0.38 usec per loop
    Cashew:/tmp/Python-2.6$ ./python.exe Lib/timeit.py 'max(2,8)'
    1000000 loops, best of 3: 0.243 usec per loop
    Cashew:~/src-other/Python-2.6-pristine$ ./python.exe Lib/timeit.py -s 'x=tuple(xrange(5000))' 'max(x)'
    1000 loops, best of 3: 284 usec per loop
    Cashew:/tmp/Python-2.6$ ./python.exe Lib/timeit.py -s 'x=tuple(xrange(5000))' 'max(x)'
    10000 loops, best of 3: 198 usec per loop
    Cashew:~/src-other/Python-2.6-pristine$ ./python.exe Lib/timeit.py -s 'import datetime' -s 'x = datetime.datetime.now()' -s 'y = datetime.datetime.now()' 'max(x,y)'
    1000000 loops, best of 3: 0.455 usec per loop
    Cashew:/tmp/Python-2.6$ ./python.exe Lib/timeit.py -s 'import datetime' -s 'x = datetime.datetime.now()' -s 'y = datetime.datetime.now()' 'max(x,y)'
    1000000 loops, best of 3: 0.308 usec per loop
    Cashew:~/src-other/Python-2.6-pristine$ ./python.exe Lib/timeit.py -s 'import datetime' -s 'x = tuple(datetime.datetime.now() for x in range(5000))' 'max(x)'
    1000 loops, best of 3: 600 usec per loop
    Cashew:/tmp/Python-2.6$ ./python.exe Lib/timeit.py -s 'import datetime' -s 'x = tuple(datetime.datetime.now() for x in range(5000))' 'max(x)'
    1000 loops, best of 3: 536 usec per loop

--
Daniel Stutzbach, Ph.D.
http://stutzbachenterprises.com/
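The tuple fast path can be mimicked in pure Python to show why tuples are an attractive target. The function name `max_tuple` is illustrative only; the actual patch modifies the builtin in C:

```python
def max_tuple(args):
    # Sketch of the tuple specialization: index the tuple directly
    # instead of going through the iterator protocol.  Because tuples
    # are immutable, an evil __lt__ cannot resize or permute the input
    # mid-scan, which keeps the fast path simple and safe.
    if not args:
        raise ValueError("max() arg is an empty sequence")
    best = args[0]
    for i in range(1, len(args)):
        if best < args[i]:
            best = args[i]
    return best
```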
participants (5)
- Antoine Pitrou
- Daniel Stutzbach
- Kristján Valur Jónsson
- Raymond Hettinger
- Terry Reedy