Queue not releasing memory

Tim Peters tim_one at email.msn.com
Sun Sep 12 05:43:07 EDT 1999


[nathan at islanddata.com]
> After filling a Queue class instance with large amounts of data, the
> memory footprint of python grows dramatically.  I would expect, then,
> that this footprint would decrease in size when the queue is emptied,
> but it does not.

Python has no control over when your OS decides that released memory is
available to something else.  If you name your OS, perhaps someone will know
how to influence it reliably.  But before going down that path, does it
really matter?  Unused VM isn't costing you anything except an artificially
large high-water mark.  I wouldn't care unless it's consuming *so* much
phantom VM that it prevents other processes from starting.

> The memory use drops instantly when the empty queue is deleted, however.

That suggests your libc realloc doesn't release any memory to the OS, but
your libc free does.  So it goes.

> I've examined the Queue.py code and cannot see where this data may be
> cached, etc.

You're not mistaken; Queue.py is not holding on to it; it's the way your
libc and your OS happen to work.

> ...
> The memory taken up by this program does not decrease until q is
> deleted.  Can anyone tell me why this happens and how I can fix it
> *without* deleting the queue and creating a new one?

You can't fix it portably, since it's a platform libc/OS thing.  Under your
particular combo, it seems that you could return memory to the system, e.g.
whenever the Queue became empty, by subclassing Queue and overriding the
_get() method:

    # Get an item from the queue
    def _get(self):
        item = self.queue[0]
        del self.queue[0]
        # next two lines are new
        if not self.queue:
            self.queue = []   # exorcise the ghost of the old queue
        return item

By the way, look at that code carefully:  the default Queue implementation
wasn't meant to handle memory-busting amounts of data!  That "del
self.queue[0]" line physically shifts every remaining item down one slot
every time, so draining a big queue takes time quadratic in its size.  If
you're using this as a general-purpose Queue class, you're getting
fooled -- the attraction of Queue is its thread-safety, not its, umm,
queueness.  It works great for managing modest work queues in client/server
kinds of threaded apps.  For a queue that doesn't need thread safety, and
expects to grow to massive size, you're probably much better off coding a
traditional two-pointer circular buffer (or subclassing Queue and doing
that).
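
Here's a minimal sketch of the two-pointer circular buffer idea, for the
no-thread-safety case.  The class name RingQueue, its method names, and the
doubling growth policy are all made up for illustration; nothing here comes
from Queue.py.  Head and tail indices chase each other around a fixed list,
so removing an item never moves the items left behind.

```python
class RingQueue:
    """FIFO queue on a circular buffer (hypothetical sketch, not thread-safe)."""

    def __init__(self, capacity=8):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._buf = [None] * capacity
        self._head = 0   # index of the oldest item
        self._tail = 0   # index where the next item will be stored
        self._size = 0   # number of live items

    def __len__(self):
        return self._size

    def put(self, item):
        # Assumed growth policy: double the buffer when it is full.
        if self._size == len(self._buf):
            self._resize(2 * len(self._buf))
        self._buf[self._tail] = item
        self._tail = (self._tail + 1) % len(self._buf)
        self._size += 1

    def get(self):
        if self._size == 0:
            raise IndexError("get from an empty RingQueue")
        item = self._buf[self._head]
        self._buf[self._head] = None   # drop our reference to the item
        self._head = (self._head + 1) % len(self._buf)
        self._size -= 1
        return item

    def _resize(self, capacity):
        # Copy the live items, oldest first, into a fresh buffer.
        old = self._buf
        n = len(old)
        new = [None] * capacity
        for i in range(self._size):
            new[i] = old[(self._head + i) % n]
        self._buf = new
        self._head = 0
        self._tail = self._size
```

Each get() touches one slot no matter how many items are waiting.  The
buffer never shrinks in this sketch, so it does nothing about handing
memory back to the OS; it only removes the per-get shifting cost.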

we-need-more-classes-named-in-honor-of-the-alphabet-ly y'rs  - tim





