[Python-Dev] Memory size overflows

Tim Peters tim.one@comcast.net
Sat, 12 Oct 2002 20:06:37 -0400


[Armin Rigo]
> All around the C code there are potential problems with objects of very
> large sizes (http://www.python.org/sf/618623).  The problem is that to
> allocate a variable-sized object of type 't' with 'n' elements we
> compute 'n*t->tp_itemsize', which can overflow even if 'n' is a
> perfectly legal value.

I confess I always ignore these until one pops up in real life.  Checking
slows the code, and that causes me pain <0.5 wink>.

Note that the multiplication isn't the only problem:  we usually go on to
add in the size of the object header, and that can overflow too.  This leads
to heartbreaking code such as string_repeat's:

	nbytes = size * sizeof(char);
	if (nbytes / sizeof(char) != (size_t)size ||
	    nbytes + sizeof(PyStringObject) <= nbytes) {
		PyErr_SetString(PyExc_OverflowError,
			"repeated string is too long");
		return NULL;
	}

Making it worse, while malloc takes an unsigned arg, Python usually uses a
signed type to hold these things.  When I fixed the code above (it did pop
up in real life there), I made sure nbytes was declared size_t, so that the
compiler could do the division-by-constant with a simple shift (well, OK,
sizeof(char) is necessarily 1U, so the division is pointless there anyway;
in other contexts it's not, and signed/power_of_2 is more expensive than
unsigned/power_of_2 because most C compilers don't allow the former to
compute the natural floor; note that anything/non_constant can be incredibly
expensive!).

> If the truncated result is small, the subsequent malloc() succeeds, and
> we run into a segfault by trying to access more memory than reserved.
> The same problem exists at other places -- more or less everywhere we
> add or multiply something to a number that could be user-supplied.

Yup -- and that, in a nutshell, is why I usually ignore these <wink>.

Note that it's not enough even to ensure malloc gets a sensible argument:
on at least two flavors of Unix, malloc() returning a non-NULL pointer
doesn't guarantee you can access the memory without segfaulting.  In the
end, it's a game that can't be won.  It may be nicer to do a little better,
but there are real costs too.

> ...
> To fix this I suggest introducing a few new macros in pymem.h that
> compute things about sizes with overflow checking.  I can see a couple
> of approaches based on special values that mean "overflow":

If we're going to do this, I suggest a ubiquitous trick from Zope's C code:

    DO_SOMETHING_OR(RESULT_LVALUE, INPUT1, ..., ON_ERROR_BLOCK);

That is, the result goes into RESULT_LVALUE, unless there's an overflow; in
the latter case, ON_ERROR_BLOCK is executed.  This is usually something like
"return NULL;" or "goto Overflow;".  This has the nice property of pushing
the error-case code out of the normal flow.  A goto is especially nice,
because the label provides a handy place to set a debugger breakpoint
(always a practical problem when code hides in macro expansions).

> ...
> 3) we compute all sizes with signed integers (int or long),
> as is currently (erroneously?) done at many places.  Any
> negative integer is regarded as "overflow", but the
> multiplication macro returns the largest negative integer in
> case of overflow, so that as above no addition macro is
> needed for the simple cases.

There's really never been anything special about "the sign bit", and some
32-bit boxes can allocate 1UL<<31 bytes.  This becomes more of a problem as
Python gets applied to larger problems, and 2 gigabytes is no longer an
insane amount of address space.

> ...
> Also, approaches 2 and 3 require fixes to ensure that
> 'malloc(any-overflow-size)' always fail, for any of the several
> implementations of malloc found in the code.

pymalloc is already suitably paranoid, even to the extent that the debug
pymalloc makes sure adding 16 bytes for its padding doesn't overflow.

> Even with approach 1, I would not trust the platform malloc to
> correctly fail on malloc(-1) -- I guess it might "round up" the
> value to 0 before it proceeds...

malloc() can't get an argument of -1:  it takes a size_t argument, and
size_t is guaranteed to be unsigned.  What *does* happen is that some
platform mallocs don't do what the debug pymalloc does, meaning that they
don't check that adding in their own overhead doesn't overflow.  So, e.g.,
if they need a 4-byte header for bookkeeping, they'll add 4 to the argument
and not even notice if that "wraps around" to 0, 1, 2, or 3.  Disaster
ensues.  But it's not really Python's job to fix broken platform libraries.

lukewarmly y'rs  - tim