[Python-bugs-list] [ python-Bugs-477963 ] Bad gc performance

noreply@sourceforge.net noreply@sourceforge.net
Sun, 04 Nov 2001 14:16:25 -0800


Bugs item #477963, was opened at 2001-11-04 02:35
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=477963&group_id=5470

Category: Python Interpreter Core
Group: None
>Status: Closed
>Resolution: Wont Fix
Priority: 5
Submitted By: Roeland Rengelink (rengelink)
>Assigned to: Tim Peters (tim_one)
Summary: Bad gc performance

Initial Comment:
The attached program demonstrates bad (maybe O(n**2))
performance of the gc when collecting garbage.

The program first creates a dictionary d with N 7-character keys,
each referring to a unique integer (2N objects); the dictionary is
then set to None. The program times the creation of the dictionary
and the time it takes to execute d=None, both expressed in usec/item.
Results are the following (a rough sketch of the timing script is
included after the numbers):

(Python2.2b1, default conf, Linux 2.2.14, Suse 6.2)

~>python timing.py 
size:   10000, creation: 33.67 (usec/elem), destruction:  1.59 (usec/elem)
size:   20000, creation: 33.18 (usec/elem), destruction:  1.67 (usec/elem)
size:   50000, creation: 35.48 (usec/elem), destruction:  1.83 (usec/elem)
size:  100000, creation: 34.81 (usec/elem), destruction:  2.01 (usec/elem)
size:  200000, creation: 34.63 (usec/elem), destruction:  2.50 (usec/elem)
size:  400000, creation: 35.12 (usec/elem), destruction:  3.89 (usec/elem)
size:  600000, creation: 34.16 (usec/elem), destruction:  6.07 (usec/elem)
size:  800000, creation: 34.90 (usec/elem), destruction:  8.10 (usec/elem)
size: 1000000, creation: 34.53 (usec/elem), destruction: 11.14 (usec/elem)
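
For reference, a rough sketch of what a timing.py along these lines
would look like (the actual attachment is not reproduced here, so
names and details are guesses; Python 2-era syntax to match the
report):

    import time

    def measure(n):
        t0 = time.time()
        d = {}
        for i in xrange(n):
            d["%07d" % i] = i      # 7-character string key -> unique int
        t1 = time.time()
        d = None                   # drop the only reference; ~2N objects die
        t2 = time.time()
        # per-item times in microseconds
        return (t1 - t0) * 1e6 / n, (t2 - t1) * 1e6 / n

    for n in (10000, 20000, 50000, 100000, 200000,
              400000, 600000, 800000, 1000000):
        create, destroy = measure(n)
        print "size: %7d, creation: %5.2f (usec/elem), " \
              "destruction: %5.2f (usec/elem)" % (n, create, destroy)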

This test does not exceed main memory (128M) on my machine, and CPU
utilization remains at 95+% throughout. If the test is extended to
2000000 objects (which pushes my machine into virtual memory), gc
leads to an extended period of disk-thrashing (CPU at 1%), requiring
many minutes to clean up the garbage.

A similar problem was reported by Fred in

http://mail.python.org/pipermail/python-list/2001-November/070667.html

I'm not sure if this is a bug. My naive expectation is that cleanup
should be O(n), but then I don't really know what the gc is up to.

Let me know if you need more info.

Roeland

----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2001-11-04 14:16

Message:
Logged In: YES 
user_id=31435

My c.l.py response:

"""
[Roeland Rengelink]
> I'm pretty sure this is a problem with the garbage collector, I did
> some measurements showing bad (O(n**2)) performance when collecting
> large numbers of objects at once (as happens when your dicts are
> deleted at the end of your program). I filed a bug report (#477963)
> on it. See:
>
> <http://sourceforge.net/tracker/index.php?func=detail&aid=477963&group_id=5470&atid=105470>
>
> for more details

I expect you're just measuring your platform's cache 
performance.  Here's the output from running on a Win98SE 
box w/ 256MB RAM (I'll leave out the creation times because 
they aren't interesting; note too that time.time() on 
Windows has poor precision, so the numbers are mostly 
nonsense for the smallest dicts):

size:   10000, destruction:  0.00 (usec/elem)
size:   20000, destruction:  2.50 (usec/elem)
size:   50000, destruction:  1.20 (usec/elem)
size:  100000, destruction:  1.10 (usec/elem)
size:  200000, destruction:  1.10 (usec/elem)
size:  400000, destruction:  1.25 (usec/elem)
size:  600000, destruction:  1.37 (usec/elem)
size:  800000, destruction:  1.45 (usec/elem)
size: 1000000, destruction:  1.54 (usec/elem)

So buy more RAM <0.5 wink>.

This is the code that deallocates dicts:

	for (ep = mp->ma_table; fill > 0; ep++) {
		if (ep->me_key) {
			--fill;
			Py_DECREF(ep->me_key);
			Py_XDECREF(ep->me_value);
		}
	}

That is, it's a single scan over the key+value pairs, 
decrementing the refcounts on each.  Note that while a left-
to-right scan of the container is cache-friendly, dicts 
work hard to "randomize" the offsets at which keys are 
stored, so the memory pointed to *by* ep->me_key 
(containing your string objects) has no useful relation to 
the order in which you created your string objects:  you're 
going to get mountains of cache misses during deallocation.

It's a mystery why deallocation takes so much longer than 
allocation when things go bad (look at it:  the 
deallocation code is dead simple).  I tracked down one of 
those long ago in hideous detail, and it turned out to be 
due to the platform free() thrashing like mad trying to 
coalesce adjacent free'd areas.  I suspect every platform 
has its own tale of woe to relate here.
"""

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=477963&group_id=5470