list.sort(func) speed

mackstann mack at incise.org
Sun Aug 31 06:07:18 EDT 2003


On Sun, Aug 31, 2003 at 11:21:59AM +0200, Peter Otten wrote:
> > (The other issue (decorated sorting) is more relevent of course.)
> D'accord. Namely, basename() is not the bottleneck here.

Not sure exactly what the bottleneck is, but take this code:

print time.time()
songs = [ pair[::-1] for pair in
          enumerate(map(os.path.basename, self.songs)) ]
print time.time()
songs.sort()
print time.time()
self.songs = [ self.songs[index] for basename,index in songs ]
print time.time()

and the output is:

1062322841.2
1062322841.25
1062322841.25
1062322841.26

So it seems that the first list comp with [::-1] and basename and
whatnot is the bottleneck, although I'm not sure exactly which aspect of
it is the bottleneck.  The time between the first two time.time()
outputs is always either .05 or .06 seconds.  I added this little
function:

def f(fname):
  place = fname.rfind("/")
  if place != -1:
    return fname[place:]
  else:
    return fname

And changed the aforementioned line to this:

songs = [ pair[::-1] for pair in
          enumerate(map(f, self.songs)) ]

This reduced it to .04 seconds.  So I can shave about 10-20 milliseconds
using this, although I tend to avoid doing things like this, since it
involves more code for me to worry about, and I definitely like to take
advantage of Python's "batteries" that they say are included. :) (and
generally work well and are silly to duplicate)

So basename() is not the bottleneck, but accounts for perhaps 1/6th of
the time needed overall.  I just wonder if the other 5/6th could be
reduced further by doing something that I'm not thinking of.

-- 
m a c k s t a n n  mack @ incise.org  http://incise.org
The sooner all the animals are dead, the sooner we'll find their
money.
		-- Ed Bluestone, "The National Lampoon"





More information about the Python-list mailing list