GMPY: divm() memory leak revisited

mensanator at aol.com mensanator at aol.com
Fri Nov 4 19:52:38 EST 2005


Since the following discussion took place (unresolved),

<http://groups.google.com/group/comp.lang.python/browse_frm/thread/c3bd08ef3e4c478a/2b54deb522c9b9d9?lnk=st&q=divm()+memory+leak+group:comp.lang.python+author:mensanator&rnum=1&hl=en#2b54deb522c9b9d9>

I've kept it in the back of my mind as I've been
learning to use the base gmp library in c. Now I believe
I know what the problem is.

First, divm() is not a gmp function. It is a derived
function created in gmpy. It is derived from the
gmp invert() function (which I know from my testing
does not leak memory). So it's a gmpy specific bug.

Second, I learned from the gmp c library that temporary
mpz objects must be freed to prevent memory leak. Aha!
The gmpy source code is probably not freeing some temporary
mpz it created.

Third, we have the smoking gun:

  divm(a,b,m): returns x such that b*x==a modulo m,
               or else raises a ZeroDivisionError
               exception if no such value x exists
               (a, b and m must be mpz objects,
                or else get coerced to mpz)

Of course, to "coerce" means to create temporary variables
to pass to the gmp library. It would appear that these
temporary variables are not being freed. Now if I'm right,
then I can prove this by eliminating the need to coerce
the operands by passing mpz's to the divm() function.

#
# if the parameters are already mpz's...
#
z = gmpy.mpz(81287570543)
x = gmpy.mpz(8589934592)
y = gmpy.mpz(3486784401)
tot = 0
while True:
   n = input('How many more divm: ')
   if n<=0: break
   print '%d more...' % n,
#
#   ...then they won't need to be coerced
#
   for i in xrange(n): gmpy.divm(z,x,y)
   tot += n
   print '...total %d' % tot


With coercing, I get

C:\Python23\user\the_full_monty>python gmpytest.py
How many more divm: 10000000
10000000 more...Fatal Python error: mp_allocate failure
abnormal program termination
peak Commit Charge (K): 792556

Without needing to coerce, the test ran to completion with
flat memory usage.

Unfortunately, c is still somewhat greek to me, but even so,
the problem appears obvious.

static PyObject *
Pygmpy_divm(PyObject *self, PyObject *args)
{
    PympzObject *num, *den, *mod, *res;
    if(!PyArg_ParseTuple(args, "O&O&O&",
        Pympz_convert_arg, &num,
        Pympz_convert_arg, &den,
        Pympz_convert_arg, &mod))
    {
        return last_try("divm", 3, 3, args);
    }
    if(!(res = Pympz_new()))
        return NULL;
    if(mpz_invert(res->z, den->z, mod->z)) { /* inverse exists */
        mpz_mul(res->z, res->z, num->z);
        mpz_mod(res->z, res->z, mod->z);
        if(options.ZM_cb && mpz_sgn(res->z)==0) {
            PyObject* result;
            if(options.debug)
                fprintf(stderr, "calling %p from %s for %p %p %p %p\n",
                    options.ZM_cb, "divm", res, num, den, mod);
            result = PyObject_CallFunction(options.ZM_cb, "sOOOO",
"divm",
                                           res, num, den, mod);
            if(result != Py_None) {
                Py_DECREF((PyObject*)res);
                return result;
            }
        }
        return (PyObject*)res;
    } else {
        PyObject* result = 0;
        if(options.ZD_cb) {
            result = PyObject_CallFunction(options.ZD_cb,
                "sOOO", "divm", num, den, mod);
        } else {
            PyErr_SetString(PyExc_ZeroDivisionError, "not invertible");
        }
        Py_DECREF((PyObject*)res);
        return result;
    }
}

Note that 4 PympzObjects get created but only res gets passed to
Py_DECREF (which seems to be the method by which it's freed, not
100% sure about this). But I notice that other functions that
coerce variables call Py_DECREF on each of the coerced variables:

static PyObject *
Pygmpy_gcd(PyObject *self, PyObject *args)
{
    PympzObject *a, *b, *c;
    TWO_ARG_CONVERTED("gcd", Pympz_convert_arg,&a,&b);
    assert(Pympz_Check((PyObject*)a));
    assert(Pympz_Check((PyObject*)b));
    if(!(c = Pympz_new())) {
        Py_DECREF((PyObject*)a); Py_DECREF((PyObject*)b);
        return NULL;
    }
    mpz_gcd(c->z, a->z, b->z);
    Py_DECREF((PyObject*)a); Py_DECREF((PyObject*)b);
    return (PyObject*)c;
}

Here the PympzObject c is not freed because it is returned
as the result of the function, but the coerced variables a and b
are.

So the fix may be to simply add

Py_DECREF((PyObject*)num);
Py_DECREF((PyObject*)den);
Py_DECREF((PyObject*)mod);

to divm(). (Assuming it's that simple, I could have overlooked
something.)

Unfortunately, I don't have any means of testing this theory.

I did see the reference to

<http://www.vrplumber.com/programming/mstoolkit>

which I will be trying eventually, but in case I can't get that
to work I wanted to have this posted in case someone else wants
to take a crack at it.




More information about the Python-list mailing list