[Python-Dev] Memory size overflows
Armin Rigo
arigo@ulb.ac.be
Mon, 14 Oct 2002 00:23:34 +0200
Hello Tim,
Let me quote from stringobject.c the "heartbreaking code such as
string_repeat's":
    nbytes = size * sizeof(char);
    if (nbytes / sizeof(char) != (size_t)size ||
        nbytes + sizeof(PyStringObject) <= nbytes)
            <OverflowError>
    op = (PyStringObject *)
        PyObject_MALLOC(sizeof(PyStringObject) + nbytes);
    if (op == NULL)
        <MemoryError>
With the proper macro, the whole code above can be written as:
    nbytes = Py_MUL_ADD(size, sizeof(char), sizeof(PyStringObject));
    op = (PyStringObject *) PyObject_MALLOC(nbytes);
    if (op == NULL)
        <MemoryError>
which seems much cleaner to me.
For this reason I do not believe it is a good idea here to define a
macro a la Zope C:

    DO_MULADD_OR(nbytes, size, sizeof(char), sizeof(PyStringObject),
                 goto Overflow);

and this for two reasons: first, there is no real need for a special
"Overflow" case, as long as the returned 'nbytes' is guaranteed to be
un-malloc-able; second, no existing macro (e.g. _PyObject_VAR_SIZE)
can grow an extra "goto Overflow" case without breaking the C code
that already depends on it.
There is however a difference between the two code fragments above:
the former raises an OverflowError for "really too large values" and
a MemoryError for "reasonable values that are still too large for me
to allocate right now". Whether this is the correct behavior is
debatable. I would argue that raising MemoryError in both cases is
fine: after all, "123"*9999999999999999 could be perfectly fine in
another Python implementation with lazy strings. It is thus only a
memory limit that makes this fail in the current implementation.
Note that if we define MAX_SIZE to ((size_t)-1), the overflow check in
the present stringobject.c could be written more efficiently as:

    if (size > (MAX_SIZE - sizeof(PyStringObject)) / sizeof(char))
        <OverflowError>

where everything to the right of '>' reduces to a compile-time
constant. The Py_MUL_ADD macro can be implemented with this idea in
mind, which works best when the 2nd and 3rd arguments are constants.
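For concreteness, here is a minimal sketch of what such a macro could
look like (Py_MUL_ADD and MAX_SIZE are the names proposed here, not
existing CPython API); when 'mul' and 'add' are compile-time
constants, the division folds away entirely:

    #define MAX_SIZE ((size_t)-1)

    /* Sketch: evaluates to n*mul+add, or to MAX_SIZE on overflow.
       MAX_SIZE is a safe error value precisely because a request for
       MAX_SIZE bytes is guaranteed to fail inside malloc anyway.
       (Note that it evaluates its arguments more than once.) */
    #define Py_MUL_ADD(n, mul, add)                 \
        ((size_t)(n) > (MAX_SIZE - (add)) / (mul)   \
            ? MAX_SIZE                              \
            : (size_t)(n) * (mul) + (add))

The same macro then covers the multiply-only and add-only cases, with
the constant operands folded away by the compiler:

    nbytes = Py_MUL_ADD(size, sizeof(PyObject *), 0);      /* multiply only */
    nbytes = Py_MUL_ADD(size, 1, sizeof(PyStringObject));  /* add only */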
> note that anything/non_constant can be incredibly expensive!
Yes. Couldn't a better overflow-checking algorithm be designed for
these cases? For example, when multiplying the non-constants 'a' and
'b', we can be sure that there is no overflow if both 'a' and 'b' are
small enough, say each fitting in half the bits of a size_t (which is
the common case anyway), since the product of two such values cannot
exceed the largest size_t. I believe a single Py_MUL_ADD macro could
even be written that is as efficient as the above for constant
arguments, and otherwise falls back on this trick, as sketched below.
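A minimal sketch of that fallback, assuming nothing beyond standard C
(the helper name and the use of (size_t)-1 as the overflow marker are
illustrative only):

    #include <stddef.h>
    #include <limits.h>

    /* Overflow-checked a*b+c for non-constant operands.  If both a
       and b fit in half the bits of a size_t, their product cannot
       overflow, so the common case costs two comparisons; only the
       rare large-operand case pays for a division. */
    static size_t
    mul_add_checked(size_t a, size_t b, size_t c)
    {
        const size_t half = (size_t)1 << (sizeof(size_t) * CHAR_BIT / 2);
        if (a < half && b < half) {
            size_t prod = a * b;           /* cannot overflow */
            if (prod > (size_t)-1 - c)
                return (size_t)-1;         /* overflow marker */
            return prod + c;
        }
        if (b != 0 && a > ((size_t)-1 - c) / b)
            return (size_t)-1;             /* overflow marker */
        return a * b + c;
    }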
About using 64-bit computations everywhere (as suggested by Christian):
if the length of objects (e.g. lists) is still constrained to 31 bits,
then it's fine; otherwise, the problem is the same. Do all supported
64-bit platforms have a 32-bit 'int', and thus a 31-bit limit on object
lengths?
By the way, this makes me think of a good efficient overflow-detection
method for platforms with a "long long" type longer than size_t:
    unsigned long long bytes =
        ((unsigned long long)a) * ((unsigned long long)b) + c;
    if (bytes > ((size_t)-1))
        <overflow>;
This compiles to very good assembly code on x86.
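Wrapped up as a self-contained helper (again just a sketch, assuming
'unsigned long long' really is wider than size_t, as on 32-bit x86):

    /* One widening multiply-add, then a single comparison. */
    static size_t
    mul_add_wide(size_t a, size_t b, size_t c)
    {
        unsigned long long bytes =
            (unsigned long long)a * (unsigned long long)b + c;
        if (bytes > (size_t)-1)
            return (size_t)-1;    /* un-malloc-able overflow marker */
        return (size_t)bytes;
    }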
Mmmh, in summary I would recommend that we define a common
"multiply-and-add" macro, returning a size_t or OVERFLOW_VALUE in case
of overflow, implemented in several different ways on different
platforms as selected by #if's. The same macro can serve for just
"multiply" or just "add" by relying on the compiler's strength
reduction.
Armin