What about an EXPLICIT naming scheme for built-ins?
tim.peters at gmail.com
Sat Sep 4 23:14:46 CEST 2004
[Alex Martelli, to Carlos Ribeiro]
> If you need to loop on the first few smallest items of a long sequence,
> you'll find good solutions in the cookbook (and better ones in the 2nd
> printed edition I'm supposed to be coediting rather than surfing
> c.l.py...:-), but the sorted builtin is no use for that.
Note that Raymond Hettinger added efficient nlargest() and nsmallest()
functions to heapq for 2.4. Because heapq implements a min-heap, only
nlargest() is really "natural" for this module, but nsmallest() does
what it can. Neither works as a generator. There are elegant ways to
implement mergesort as a cascade of (O(N)!) recursive generators, but
they're horrendously slow in comparison.
> Again, you're assuming (implicitly) heapsort and ignoring the huge
> performance benefit of Peters' "natural mergesort" as implemented in
> the current sorted built-in (and list.sort method). In all the many
> cases in which the whole sorted sequence is wanted, natural
> mergesort shines, particularly in the common practical cases in
> which the input sequence already has some pieces ordered, while
> heapsort gets no prize at all.
It's actually the "adaptive" in "adaptive natural mergesort" that
supplies the real wins. The "natural" part does pay in some cases,
but not nearly as often as the "adaptive" part. There's an extensive
writeup of the algorithm in a Python source distribution's
Objects/listsort.txt. The adaptive part requires creating long runs
in order to, well, find something to adapt *to*. It's not suited to
generator-like behavior because of this (trying to do the least work
possible to deliver "the next" result would castrate its most
generally effective go-fast trick).
I'll note that no algorithm can know the smallest element before it's
seen all the elements, so no generator-like gimmick can deliver the
first element of the sorted result before materializing the entire
input and doing O(N) work on it. The recursive-generator mergesorts
mentioned above deliver the first result after doing exactly N-1
compares (which achieves the theoretical minimum). But the builtin
sort() can usually sort the whole list before that gimmick can deliver
its first result.
More information about the Python-list