Program very slow to finish

Tim Peters at
Tue Nov 6 00:14:45 CET 2001

[Roeland Rengelink]
> Thanks for your comprehensive answer. I hope you'll bear with me, for
> I'm looking for some clarification

I should have said "I expect you're just measuring your platform C's free()
performance", not "cache".  I agree that cache effects should have little
bearing on this (beyond very small dicts that actually fit in cache), and
will give some evidence from my Win98SE box below.

> ...
> Here are my numbers again, just for comparison (and I do think you're
> showing an increase in destruction time too, just not as bad)

Yes, but not nearly as bad.

> The algo:
> -  create dictionary d with N 7-char keys [all mapping to the same value]
> -  execute d=None
> All times in micro-seconds per item. For the code see end of this post.
> (Linux 2.2.14, 128M RAM, Cel 333 MHz)
> size:   10000, creation: 31.00, destruction:  1.49
> size:   20000, creation: 31.10, destruction:  1.57
> size:   50000, creation: 32.77, destruction:  1.76
> size:  100000, creation: 32.00, destruction:  1.92
> size:  200000, creation: 32.59, destruction:  2.38
> size:  500000, creation: 32.12, destruction:  4.35
> size: 1000000, creation: 32.25, destruction: 10.47
> ...
> ["head argument" about cache effects elided]
> Now, this may be a very naive argument. But, hey, I'm an astronomer, not
> a computer scientist.

No, it was a good first approximation, but measurement speaks louder than
argument <wink>.  I fiddled Python's dict dealloc routine like so:

	for (ep = mp->ma_table; fill > 0; ep++) {
		if (ep->me_key) {
			/* New code follows. */
			ep->me_key->ob_refcnt -= 1;
			if (ep->me_value)
				ep->me_value->ob_refcnt -= 1;

This does everything the DECREF macros did *except* to call the platform
free() on the now-dead string-object keys.  So all that memory leaks, and
creation times start to zoom as my box runs out of RAM.  "destruction" times
stay very steady, though (using time.clock() here, a much better timer on
Windows than time.time()):

size:   10000, destruction:  0.20 (usec/elem)
size:   20000, destruction:  0.20 (usec/elem)
size:   50000, destruction:  0.23 (usec/elem)
size:  100000, destruction:  0.22 (usec/elem)
size:  200000, destruction:  0.22 (usec/elem)
size:  400000, destruction:  0.23 (usec/elem)
size:  600000, destruction:  0.21 (usec/elem)
size:  800000, destruction:  0.22 (usec/elem)

I can't push it much beyond that, as "the usual" result of provking Win9X
into disk thrashing is a system crash (indeed, I endured 4 reboots before
giving up on letting this run thru a million).

So, on this box, all the nastiness is due to MS's free() implementation.

> ...
> You've convinced/reassured me that it can't be Python. I'm not convinced
> it's the cache. That basically leaves free(). I'm curious if anybody
> else has seen this behaviour. I hope some more people try this

You can be sure that everyone who suggested to exit the program via
os._exit() has seen similar behavior in anger <wink>.  Alas, every platform
acts differently -- even different flavors of Windows have radically
different malloc/free performance characteristics.  Win9X is usually the
worst of the lot, so I'm amazed that your flavor of Linux's performance made
Win9X look stellar in comparison.

I can't make time for it now, but "someone should" try this test using
Vladimir Marangozov's PyMalloc package (shipped with Python 2.1 but turned
off by default -- see the 2.1 NEWS file).

malloc-works-best-when-it-isn't-used<wink>-ly y'rs  - tim

More information about the Python-list mailing list