Following up on my own message...
qrunner isn't the only culprit. senddigests is worse, since it doesn't stop itself from running indefinitely.
There is a definite memory leak or inefficiency somewhere. Just the following fragment of code, edited from senddigests, is enough to send the memory usage sky-high:
    def main():
        for listname in Utils.list_names():
            mlist = MailList.MailList(listname, lock=0)
            del mlist
The "del mlist" doesn't help.
I've noticed that one pathological list has a 12MB config.db file. If loading config.db is inefficient by a factor of eleven, then that could explain the swelling to 135MB. With that list removed, the memory usage peak is just 89MB.
Is there a way to tell whether Python is deleting the MailList object when mlist gets reassigned, so I can find out if there is a leak each time a list is loaded? Otherwise it's just inefficient memory usage in proportion to the size of config.db.
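One way to answer that question is with the standard weakref module: a weak reference stays valid only while the object is alive, so it tells you whether `del` really freed the object. A minimal sketch (FakeList here is a hypothetical stand-in for MailList.MailList):

```python
import weakref

class FakeList:
    """Hypothetical stand-in for MailList.MailList."""
    pass

def freed_after_del():
    obj = FakeList()
    ref = weakref.ref(obj)      # a weak reference does not keep obj alive
    del obj                     # drop the only strong reference
    return ref() is None        # None means the object was really deleted

print(freed_after_del())        # True for this leak-free toy object
```

If the weak reference is still live after `del mlist`, something else is still holding a reference to the list.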
Thanks, Kevin
----- Forwarded message from sigma@pair.com -----
From: sigma@pair.com
To: mailman-developers@python.org
Subject: [Mailman-Developers] Huge qrunner process
Message-ID: <20001207115954.1364.qmail@smx.pair.com>
Date: Thu, 7 Dec 2000 06:59:54 -0500 (EST)
X-Mailer: ELM [version 2.4ME+ PL40 (25)]
Can anyone enlighten me about why the qrunner process might need a tremendous amount of memory? Running Mailman 2.0 on a FreeBSD 4.1.1-STABLE server with Python 2.0. There are about 1500 lists, but the qfiles directory only has 32 files in it. Nonetheless, each minute when qrunner runs, it looks like this in top:
59606 mailman -2 20 135M 74456K getblk 0:01 8.93% 3.52% python
135 MB? It seems excessive. Any insight would be appreciated :)
Thanks, Kevin
_______________________________________________
Mailman-Developers mailing list
Mailman-Developers@python.org
http://www.python.org/mailman/listinfo/mailman-developers
----- End of forwarded message from sigma@pair.com -----
On Thu, Dec 07, 2000 at 12:51:49PM -0500, sigma@pair.com wrote:
> There is a definite memory leak or inefficiency somewhere. Just the following fragment of code, edited from senddigests, is enough to send the memory usage sky-high:
>
>     def main():
>         for listname in Utils.list_names():
>             mlist = MailList.MailList(listname, lock=0)
>             del mlist
>
> The "del mlist" doesn't help.
>
> Is there a way to tell if Python is deleting the MailList object when mlist gets reassigned, so I can find out if there is a leak each time a list is loaded? Otherwise it's just inefficient memory usage in proportion to the size of config.db.
Python uses reference counting, so the mailing list object should go away as soon as all references to it go away. However:

    % python -i bin/withlist mailman-devel-test
    >>> import sys
    >>> sys.getrefcount(m)
    12

There are 12 references to the mailing list object. One is the argument passed to 'getrefcount', one is the local variable 'm', but the other 10 are unaccounted for. I think it's safe to say there's a reference cycle in there somewhere ;) The easiest way to fix this is probably to install Python 2.0 with the garbage collector. It's a new feature which tries to collect as much cyclic garbage as possible. If nothing else, it can help figure out where those cycles exist.
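A minimal illustration of both points -- a cycle keeps the refcount above zero even after `del`, and the cycle collector then reclaims the object (the names here are purely illustrative):

```python
import gc
import weakref

class Node:
    pass

n = Node()
n.me = n                           # self-referential cycle
ref = weakref.ref(n)
del n                              # refcounting alone cannot free this
alive_before = ref() is not None   # True: the cycle keeps it alive
gc.collect()                       # the cycle detector reclaims it
alive_after = ref() is not None    # False: now it's really gone
print(alive_before, alive_after)
```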
Barry? Would it be a good idea, in the meantime, to explicitly break the cycle in some way, say 'mlist._release()' or some such, document it as internal, and use it wisely in senddigests/qrunner? That would require finding the cycles, of course ;P
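A sketch of what such a hypothetical `_release()` might look like -- MailListLike and its `owner` back-reference are invented for illustration; the real MailList's cycles would first have to be found:

```python
import gc
import weakref

class MailListLike:
    """Hypothetical stand-in for MailList.MailList."""
    def __init__(self):
        self.owner = self          # imagine a back-reference forming a cycle

    def _release(self):
        # Explicitly break the known cycle so plain refcounting can free us.
        self.__dict__.clear()

gc.disable()                       # rely on reference counting only
m = MailListLike()
ref = weakref.ref(m)
m._release()
del m
freed = ref() is None              # True: breaking the cycle was enough
gc.enable()
print(freed)
```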
-- Thomas Wouters <thomas@xs4all.net>
Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
> Python uses reference counting, so the mailing list object should go away as soon as all references to it go away. However:
>
>     % python -i bin/withlist mailman-devel-test
>     >>> import sys
>     >>> sys.getrefcount(m)
>     12

I suspected as much :(

> There are 12 references to the mailing list object. One is the argument passed to 'getrefcount', one is the local variable 'm', but the other 10 are unaccounted for. I think it's safe to say there's a reference cycle in there somewhere ;) The easiest way to fix this is probably to install Python 2.0 with the garbage collector.

We are already running Python 2.0 on this machine :( I suppose I could litter the code with debugging statements and see how the reference count goes up.
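sys.getrefcount is usable for exactly that kind of spot check -- a toy example (note that the reported count always includes the temporary reference held by the call itself, which is why deltas are more useful than absolute numbers):

```python
import sys

class Thing:
    pass

t = Thing()
baseline = sys.getrefcount(t)      # includes getrefcount's own argument ref
holder = [t, t]                    # take two extra references
delta = sys.getrefcount(t) - baseline
print(delta)                       # 2
```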
I don't see any kind of destroy-no-matter-what function for objects.
Thanks, Kevin
On Fri, Dec 08, 2000 at 06:16:10AM -0500, sigma@pair.com wrote:
> I don't see any kind of destroy-no-matter-what function for objects.

There isn't, and there should be no reason for one: objects disappear when their references go away, and you shouldn't want to destroy one that is still being referenced -- it would lead to nasty crashes.
-- Thomas Wouters <thomas@xs4all.net>
Except in the case where you're certain that the object is really unreferenced, like the circular reference case.
Perhaps as a quick hack, I could rewrite qrunner and/or senddigests to launch a new script for each list in the loop. That would work around the memory problem.
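A rough sketch of that hack: spawn a fresh interpreter per list, so all of the child's memory is returned to the OS when it exits. The child command below is a placeholder; a real version would run the per-list digest work instead of just echoing the list name:

```python
import subprocess
import sys

def run_per_list(listnames):
    """Run a separate child interpreter for each list name."""
    results = []
    for name in listnames:
        proc = subprocess.run(
            [sys.executable, '-c', 'import sys; print(sys.argv[1])', name],
            capture_output=True, text=True, check=True)
        results.append(proc.stdout.strip())
    return results

print(run_per_list(['list-a', 'list-b']))
```

The obvious cost is one interpreter startup per list, which adds up with 1500 lists.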
Kevin
On Fri, Dec 08, 2000 at 07:06:10AM -0500, sigma@pair.com wrote:
> Except in the case where you're certain that the object is really unreferenced, like the circular reference case.

No, it *is* referenced. Whether it's referenced by itself, or by an object that it itself references, or how long the circle is, is unimportant. Just deallocating it could still lead to crashes.

> Perhaps as a quick hack, I could rewrite qrunner and/or senddigests to launch a new script for each list in the loop. That would work around the memory problem.

Probably, yes. It would require spawning a new Python interpreter each time, though :P
-- Thomas Wouters <thomas@xs4all.net>
One other thing to note about qrunner: it keeps a cache of MailList objects keyed by name (see _listcache). It does this to avoid the overhead of reinstantiating the MailList object each time it finds a message destined for a particular list.
You could try to redefine open_list() so that it doesn't cache the objects. I don't know how much that'll help, but it's worth a try.
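The two strategies might look like this -- a sketch only; the real qrunner code differs, and `factory` stands in for constructing a MailList:

```python
_listcache = {}

def open_list_cached(listname, factory):
    """Reuse one object per list for the whole run (the _listcache approach)."""
    mlist = _listcache.get(listname)
    if mlist is None:
        mlist = factory(listname)
        _listcache[listname] = mlist   # stays alive until the process exits
    return mlist

def open_list_uncached(listname, factory):
    """No cache: the object can be freed as soon as the caller drops it."""
    return factory(listname)

# Demonstrate the difference by counting factory calls.
calls = []
def factory(name):
    calls.append(name)
    return object()

open_list_cached('test', factory)
open_list_cached('test', factory)      # cache hit: factory not called again
open_list_uncached('test', factory)    # always constructs a fresh object
print(len(calls))                      # 2
```

The trade-off: the uncached version pays the config.db load cost on every message, but never pins 1500 lists' worth of objects in memory at once.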
-Barry
"TW" == Thomas Wouters <thomas@xs4all.net> writes:
TW> Barry ? Would it be a good idea, in the mean time, to
TW> explicitly break the cycle in some way, say 'mlist._release()'
TW> or some such, document it as internal, and use it wisely in
TW> senddigests/qrunner ? That would require finding the cycles,
TW> of course ;P
Which isn't something I want to spend a lot of time on. One of the beauties of ZODB is that it tracks usage of objects, moving them from memory out to disk storage when they're unreferenced (depending on various tuning parameters). I think using ZODB/ZEO here would help a lot, or at least make the problem manageable.
-Barry
participants (3)
- barry@digicool.com
- sigma@pair.com
- Thomas Wouters