[Pythonmac-SIG] Re: [Python-Dev] Darwin's realloc(...) implementation never shrinks allocations

Tim Peters tim.peters at gmail.com
Mon Jan 3 06:13:22 CET 2005


[Bob Ippolito]
> Quite a few notable places in the Python sources expect realloc(...) to
> relinquish some memory if the requested size is smaller than the
> currently allocated size.

I don't know what "relinquish some memory" means.  If it means
something like "returns memory to the OS, so that the reported process
size shrinks", then no, nothing in Python ever assumes that.  That's
simply because "returns memory to the OS" and "process size" aren't
concepts in the C standard, and so nothing can be said about them in
general -- neither in theory nor in practice, because platforms
(OS+libc combos) vary so widely in behavior here.

As a pragmatic matter, I *expect* that a production-quality realloc()
implementation will at least be able to reuse released memory,
provided that the amount released is at least half the amount
originally malloc()'ed (and, e.g., reasonable buddy systems may not be
able to do better than that).

> This is definitely not true on Darwin, and possibly other platforms.  I
> have tested this on OpenBSD and Linux, and the implementations on these
> platforms do appear to relinquish memory,

As above, don't know what this means.

> but I didn't read the implementation.  I haven't been able to find any
> documentation that states that realloc should make this guarantee,

realloc() guarantees very little; it certainly doesn't guarantee
anything, e.g., about OS interactions or process sizes.

> but I figure Darwin does this as an "optimization" and because Darwin
> probably can't resize mmap'ed memory (at least it can't from Python,
> but this probably means it doesn't have this capability at all).
>
> It is possible to "fix" this for Darwin,

I don't understand what's "broken".  Small objects go thru Python's
own allocator, which has its own realloc policies and its own
peculiarities (chiefly that pymalloc never free()s any memory
allocated for small objects).

> because you can ask the default malloc zone how big a particular
> allocation is, and how big an allocation of a given size will actually
> be (see: <malloc/malloc.h>).
> The obvious place to put this would be PyObject_Realloc, because this
> is at least called by _PyString_Resize (which will fix
> <http://python.org/sf/1092502>).

The diagnosis in the bug report seems to leave it pointing at
socket.py's _fileobject.read(), although I suspect the real cause is
in socketmodule.c's sock_recv().  We've had other reports of various
problems when people pass absurdly large values to socket recv().  A
better fix here would probably amount to rewriting sock_recv() to
refuse to pass enormous numbers to the platform recv() (it appears
that many platform recv() implementations simply don't expect a recv()
argument to be much bigger than the native network buffer size, and
screw up when that's not so).
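
As a rough illustration of "refuse to pass enormous numbers to the
platform recv()", the C sketch below clamps the requested length before
calling recv().  The names clamp_recv_len and recv_bounded and the
64 KiB cap MAX_RECV_CHUNK are hypothetical choices for this example, not
anything taken from socketmodule.c; a sensible cap would depend on the
platform.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical cap on a single recv() request; 64 KiB is an assumption. */
#define MAX_RECV_CHUNK ((size_t)64 * 1024)

/* Limit a caller-supplied length to MAX_RECV_CHUNK. */
size_t clamp_recv_len(size_t requested)
{
    return requested > MAX_RECV_CHUNK ? MAX_RECV_CHUNK : requested;
}

/* Call recv() with the clamped length.  The return value is whatever
 * recv() returns, so a caller that wants more data has to loop, exactly
 * as it would after any short read. */
ssize_t recv_bounded(int fd, void *buf, size_t len)
{
    return recv(fd, buf, clamp_recv_len(len), 0);
}
```

With a cap like this, an absurdly large request never reaches the
platform recv(); the caller simply sees a short read and asks again.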

> Should I write up a patch that "fixes" this?  I guess the best thing to
> do would be to determine whether the fix should be used at runtime, by
> allocating a meg or so, resizing it to 1 byte, and see if the size of
> the allocation changes.  If the size of the allocation does change,
> then the system realloc can be trusted to do what Python expects it to
> do, otherwise realloc should be done "cleanly" by allocating a new
> block (returning the original on failure, because it's good enough and
> some places in Python seem to expect that shrink will never fail),

Yup, that assumption (that a non-growing realloc can't fail) is all
over the place.

> memcpy, free, return new block.
>
> I wrote up a small hack that does this realloc indirection to CVS
> trunk, and it doesn't seem to cause any measurable difference in
> pystone performance.
> 
> Note that all versions of Darwin that I've looked at (6.x, 7.x, and
> 8.0b1 corresponding to publicly available WWDC 2004 Tiger code) have
> this "issue", but it might go away by Mac OS X 10.4 or some later
> release.
> 
> This URL points to the sf bug and Darwin 7.7's realloc(...)
> implementation:
> http://bob.pythonmac.org/archives/2005/01/01/realloc-doesnt/

It would be good to rewrite sock_recv() more defensively in any case. 
Best I can tell, this implementation of realloc() is
standard-conforming but uniquely brain dead in its downsize behavior. 
I don't expect the latter will last (as you say on your page,
"probably plenty of other software" also makes the same pragmatic
assumptions about realloc downsize behavior), so I'm not keen to gunk
up Python to worm around it.
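
For reference, the fallback described in the quoted proposal (allocate a
new block, memcpy, free, return the new block, and hand back the
original if the allocation fails) might look roughly like the C sketch
below.  It is written under stated assumptions and is not the patch
discussed in this thread: realloc_shrink_copy is a hypothetical name,
and the caller is assumed to know the old size, where the proposal would
instead ask the Darwin malloc zone for it.

```c
#include <stdlib.h>
#include <string.h>

/* Shrink a block by copying it into a freshly allocated smaller one.
 * old_size is supplied by the caller (an assumption of this sketch). */
void *realloc_shrink_copy(void *old, size_t old_size, size_t new_size)
{
    void *fresh;

    /* Not a shrink this sketch handles: defer to the system realloc(). */
    if (old == NULL || new_size == 0 || new_size >= old_size)
        return realloc(old, new_size);

    fresh = malloc(new_size);
    if (fresh == NULL)
        return old;          /* original block is still large enough */

    memcpy(fresh, old, new_size);   /* keep the first new_size bytes */
    free(old);
    return fresh;
}
```

Returning the original block when malloc() fails preserves the
assumption noted above that a non-growing realloc cannot fail.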

