getting MemoryError with dicts; suspect memory fragmentation

Bryan bryanjugglercryptographer at yahoo.com
Fri Jun 4 12:06:09 EDT 2010


Emin.shopper wrote:
> dmtr wrote:
> > I'm still unconvinced that it is a memory fragmentation problem. It's
> > very rare.
>
> You could be right. I'm not an expert on python memory management. But
> if it isn't memory fragmentation, then why is it that I can create
> lists which use up 600 more MB but if I try to create a dict that uses
> a couple more MB it dies? My guess is that python dicts want a
> contiguous chunk of memory for their hash table. Is there a reason
> that you think memroy fragmentation isn't the problem?

Your logic makes some sense. You wrote that you can create a dict with
1300 items, but not 1400 items. If my reading of the Python source is
correct, the dict type decides it's overloaded when 2/3 full, and
enlarges by powers of two, so the 1366'th item will trigger allocation
of an array of 4096 PyDictEntry's.
http://svn.python.org/view/python/branches/release25-maint/Objects/dictnotes.txt?view=markup
http://svn.python.org/view/python/branches/release25-maint/Objects/dictobject.c?view=markup

On the other hand PyDictEntry is 12 bytes (on 32-bit Python), so the
memory chunk needed is just 48 K. It doesn't seem plausible that you
have have hundreds of megabytes available but can't allocate 48K in
one chunk.

Plus, unless I'm misreading the code, a Python list also uses one
contiguous chunk of memory to store all the item references. I'm
looking at PyList_New() and list_resize() in:
http://svn.python.org/view/python/branches/release25-maint/Objects/listobject.c?view=markup
and the memory allocators in:
http://svn.python.org/view/python/branches/release25-maint/Include/pymem.h?view=markup

> What else could it be?

Unfortunately several things, most of them hard to diagnose. I'd
suggest checking easy stuff first. Make sure 'dict' is still <type
'dict'>. If you can test again in the debugger in the error case, see
how large a set you can make, as the set implementation is similar to
dict except the hash table entries are one pointer shorter at 8 bytes.


--
--Bryan Olson



More information about the Python-list mailing list