[Python-bugs-list] [ python-Bugs-477963 ] Bad gc performance
noreply@sourceforge.net
Sun, 04 Nov 2001 14:16:25 -0800
Bugs item #477963, was opened at 2001-11-04 02:35
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=477963&group_id=5470
Category: Python Interpreter Core
Group: None
>Status: Closed
>Resolution: Wont Fix
Priority: 5
Submitted By: Roeland Rengelink (rengelink)
>Assigned to: Tim Peters (tim_one)
Summary: Bad gc performance
Initial Comment:
The attached program demonstrates bad (maybe O(n**2))
performance of the gc when collecting garbage.
The program first creates a dictionary d with N
7-character keys, each referring to a unique integer
(2N objects); the dictionary is then set to None. The
program times the creation of the dictionary (in
usec/item) and then the time it takes to execute
d = None, also expressed in usec/item. Results are the
following:
(Python2.2b1, default conf, Linux 2.2.14, Suse 6.2)
~>python timing.py
size: 10000, creation: 33.67 (usec/elem), destruction: 1.59 (usec/elem)
size: 20000, creation: 33.18 (usec/elem), destruction: 1.67 (usec/elem)
size: 50000, creation: 35.48 (usec/elem), destruction: 1.83 (usec/elem)
size: 100000, creation: 34.81 (usec/elem), destruction: 2.01 (usec/elem)
size: 200000, creation: 34.63 (usec/elem), destruction: 2.50 (usec/elem)
size: 400000, creation: 35.12 (usec/elem), destruction: 3.89 (usec/elem)
size: 600000, creation: 34.16 (usec/elem), destruction: 6.07 (usec/elem)
size: 800000, creation: 34.90 (usec/elem), destruction: 8.10 (usec/elem)
size: 1000000, creation: 34.53 (usec/elem), destruction: 11.14 (usec/elem)
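The timing.py script itself is an attachment to the bug and is not
reproduced in this message; a minimal sketch of an equivalent
benchmark (the bench helper and key format are my guesses, not the
original code) might look like:

```python
import time

def bench(n):
    # Build a dict mapping N distinct 7-character keys to unique
    # integers: 2N objects in total.
    t0 = time.time()
    d = {}
    for i in range(n):
        d["k%06d" % i] = i          # "k" + 6 digits = 7-char key
    t1 = time.time()
    d = None                        # drop the only reference; the 2N
                                    # objects become garbage here
    t2 = time.time()
    # Per-element creation and destruction cost in usec/elem.
    return (t1 - t0) * 1e6 / n, (t2 - t1) * 1e6 / n

for size in (10000, 20000, 50000):
    create, destroy = bench(size)
    print("size: %d, creation: %.2f (usec/elem), "
          "destruction: %.2f (usec/elem)" % (size, create, destroy))
```

The absolute numbers depend heavily on the platform, the Python
version, and (as discussed below) the cache and allocator behavior.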
This test does not exceed main memory (128M) on my
machine and CPU utilization remains 95+% throughout. If
the test is extended to 2000000 objects on my machine
(requires VM) gc leads to an extended period of
disk-thrashing (CPU at 1%) requiring many minutes to
clean up the garbage.
A similar problem was reported by Fred in
http://mail.python.org/pipermail/python-list/2001-November/070667.html
I'm not sure if this is a bug. My naive expectation is
that cleanup should be O(n), but then I don't really
know what the gc is up to.
Let me know if you need more info.
Roeland
----------------------------------------------------------------------
>Comment By: Tim Peters (tim_one)
Date: 2001-11-04 14:16
Message:
Logged In: YES
user_id=31435
My c.l.py response:
"""
[Roeland Rengelink]
> I'm pretty sure this is a problem with the garbage
> collector, I did some measurements showing bad
> (O(n**2)) performance when collecting large numbers
> of objects at once (as happens when your dicts are
> deleted at the end of your program). I filed a bug
> report (#477963) on it. See:
>
> <http://sourceforge.net/tracker/index.php?func=detail&aid=477963&group_id=5470&atid=105470>
>
> for more details
I expect you're just measuring your platform's cache
performance. Here's the output from running on a Win98SE
box w/ 256MB RAM (I'll leave out the creation times because
they aren't interesting; note too that time.time() on
Windows has poor precision, so the numbers are mostly
nonsense for the smallest dicts):
size: 10000, destruction: 0.00 (usec/elem)
size: 20000, destruction: 2.50 (usec/elem)
size: 50000, destruction: 1.20 (usec/elem)
size: 100000, destruction: 1.10 (usec/elem)
size: 200000, destruction: 1.10 (usec/elem)
size: 400000, destruction: 1.25 (usec/elem)
size: 600000, destruction: 1.37 (usec/elem)
size: 800000, destruction: 1.45 (usec/elem)
size: 1000000, destruction: 1.54 (usec/elem)
So buy more RAM <0.5 wink>.
This is the code that deallocates dicts:
    for (ep = mp->ma_table; fill > 0; ep++) {
        if (ep->me_key) {
            --fill;
            Py_DECREF(ep->me_key);
            Py_XDECREF(ep->me_value);
        }
    }
That is, it's a single scan over the key+value pairs,
decrementing the refcounts on each. Note that while a
left-to-right scan of the container is cache-friendly,
dicts work hard to "randomize" the offsets at which
keys are stored, so the memory pointed to *by*
ep->me_key (containing your string objects) has no
useful relation to the order in which you created your
string objects: you're going to get mountains of cache
misses during deallocation.
It's a mystery why deallocation takes so much longer
than allocation when things go bad (look at it: the
deallocation code is dead simple). I tracked down one
of those long ago in hideous detail, and it turned out
to be due to the platform free() thrashing like mad
trying to coalesce adjacent free'd areas. I suspect
every platform has its own tale of woe to relate here.
"""
----------------------------------------------------------------------