[Python-Dev] Problems with the Python Memory Manager
Travis Oliphant
oliphant at ee.byu.edu
Thu Nov 17 11:00:10 CET 2005
>>
>> Bingo. Yes, definitely allocating new _types_ (an awful lot of
>> them...)
>> --- that's what the "array scalars" are: new types created in C.
>
>
> Do you really mean that someArray[1] will create a new type to represent
> the second element of someArray? I would guess that you create an
> instance of a type defined in your extension.
O.K., my bad. I can see that my recent description was confusing and
that I may have misunderstood the questions I was asked. It can get
confusing given the dynamic nature of Python.
The array scalars are new statically defined (in C) types (just like
regular Python integers and regular Python floats). The ndarray is
also a statically defined type. The ndarray holds raw memory
interpreted in a certain fashion (very similar to Python's array
module). Each ndarray has a particular data type, and for every data
type an array can hold there is a corresponding "array scalar" type.
All of these are statically defined types. We are only talking
about instances of these statically defined types.
When the result of a user operation with an ndarray is a scalar, an
instance of the appropriate "array scalar" type is created and passed
back to the user. Previously we were using PyObject_New in the
tp_alloc slot and PyObject_Del in the tp_free slot of the type object
structure to allocate and free the memory for these instances.
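Roughly, the old allocation path looked like the following sketch. This
is not the actual scipy core source; the names ExampleScalarObject,
ExampleScalar_alloc, and ExampleScalar_free are made up purely for
illustration, and the static PyTypeObject itself is omitted:

#include <Python.h>

typedef struct {
    PyObject_HEAD
    double value;               /* the scalar's data */
} ExampleScalarObject;

static PyObject *
ExampleScalar_alloc(PyTypeObject *type, Py_ssize_t nitems)  /* nitems is an int on 2.4-era headers */
{
    (void)nitems;               /* scalars carry no variable-length items */
    /* PyObject_New draws its memory from Python's small-object allocator */
    return (PyObject *)PyObject_New(ExampleScalarObject, type);
}

static void
ExampleScalar_free(void *obj)
{
    /* returns the block to the small-object allocator */
    PyObject_Del(obj);
}

/* These get wired into the statically defined type before PyType_Ready():
 *     ExampleScalar_Type.tp_alloc = ExampleScalar_alloc;
 *     ExampleScalar_Type.tp_free  = ExampleScalar_free;
 */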
In this particular application, the user ended up creating many, many
instances of these array scalars and then deleting them soon after.
Despite the fact that he was not retaining any references to these
scalars (PyObject_Del had been called on them), his application ground
to a halt after only several hundred iterations, consuming all of the
available system memory. To verify that indeed no references were
being kept, I did a detailed analysis of the output of sys.getobjects()
using a debug build of Python.
When I replaced PyObject_New (with malloc plus PyObject_Init) and
PyObject_Del (with free) for the "array scalar" types in scipy core,
the user's memory problems magically disappeared.
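The workaround amounts to something like the following sketch, using
the same made-up names as the sketch above (again, illustrative only,
not the actual scipy core code):

static PyObject *
ExampleScalar_alloc_raw(PyTypeObject *type, Py_ssize_t nitems)
{
    ExampleScalarObject *self;

    (void)nitems;
    /* take the memory straight from the C library instead of pymalloc */
    self = (ExampleScalarObject *)malloc(sizeof(ExampleScalarObject));
    if (self == NULL)
        return PyErr_NoMemory();
    /* PyObject_Init fills in ob_type and sets the reference count to 1 */
    return PyObject_Init((PyObject *)self, type);
}

static void
ExampleScalar_free_raw(void *obj)
{
    /* hand the block straight back to the C library */
    free(obj);
}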
I therefore assume that the problem lies in Python's memory manager.
Initially, I thought this was the old problem of Python not returning
memory to the system once it has grabbed it. But that should not have
been a problem here, because the code quickly frees most of the objects
it creates, so Python should have been able to re-use the memory.
So, I now believe that his code (plus the array scalar extension type)
was actually exposing a real bug in the memory manager itself. In
theory, the Python memory manager should have been able to re-use the
memory for the array scalar instances, because they are always the same
size. In practice, the memory was apparently not being re-used; instead,
new blocks were being allocated to handle the load.
His code is quite complicated, and it is difficult to replicate the
problem. I realize this is not much help for fixing the Python memory
manager, and I wish I could offer more. However, replacing
PyObject_New with malloc does solve the problem for us, and it may help
anybody else who runs into this situation in the future.
Best regards,
-Travis