[Python-Dev] Memory Allocator Part 2: Did I get it right?

Wed Feb 16 05:26:18 CET 2005

[Tim Peters]
>> As I said before, I don't think we need to support this any more.
>> More, I think we should not -- the support code is excruciatingly
>> subtle, it wasted plenty of your time trying to keep it working, and
>> if we keep it in it's going to continue to waste time over the coming
>> years (for example, in the short term, it will waste my time reviewing
>> it).

[Evan Jones]
> I do not have nearly enough experience in the Python world to evaluate
> this decision. I've only been programming in Python for about two years
> now, and as I am sure you are aware, this is my first patch that I have
> submitted to Python. I don't really know my way around the Python
> internals, beyond writing basic extensions in C. Martin's opinion is
> clearly the opposite of yours.

?  This is all I recall Martin saying about this:

    http://mail.python.org/pipermail/python-dev/2005-January/051265.html

    I'm not certain it is acceptable to make this assumption. Why is it
    not possible to use the same approach that was previously used (i.e.
    leak the arenas array)?

Do you have something else in mind?  I'll talk with Martin about it if
he still wants to.  Martin, this miserable code must die!

> Basically, the debate seems to boil down to maintaining backwards
> compatibility at the cost of making the code in obmalloc.c harder to
> understand.

The "let it leak to avoid thread problems" cruft is arguably the
single most obscure bit of coding in Python's code base.  I created
it, so I get to say that <wink>.  Even 100 lines of comments aren't
enough to make it clear, as you've discovered.  I've lost track of how
many hours of my life have been pissed away explaining it, and its
consequences (like how come this or that memory-checking program
complains about the memory leak it causes), and the historical madness
that gave rise to it in the beginning.  I've had enough of it -- the
only purpose this part ever had was to protect against C code that
wasn't playing by the rules anyway.  BFD.  There are many ways to
provoke segfaults with C code that breaks the rules, and there's just
not anything that special about this way _except_ that I added
objectionable (even at the time) hacks to preserve this kind of broken
C code until authors had time to fix it.  Time's up.

> The particular case that is being supported could definitely be viewed
> as a "bug" in the code that is using obmalloc. It also likely is quite rare.
> However, until now it has been supported, so it is hard to judge exactly
> how much code would be affected.

People spent many hours searching for affected code when it first went
in, and only found a few examples then, in obscure extension modules. 
It's unlikely usage has grown.  The hack was put in for the dubious
benefit of the few examples that were found then.

> It would definitely be a minor barrier to moving to Python 2.5.

That's in part what python-dev is for.  Of course nobody here has code
that will break -- but the majority of high-use extension modules are
maintained by people who read this list, so that's not as empty as it
sounds.

It's also what alpha and beta releases are for.  Fear of change isn't
a good enough reason to maintain this code.

> Is there some sort of consensus that is possible on this issue?

Absolutely, provided it matches my view <0.5 wink>.  Rip it out, and
if alpha/beta testing suggests that's a disaster, _maybe_ put it back
in.

...

> It turns out that basically the only thing that would change would be
> removing the "volatile" specifiers from two of the global variables,
> plus it would remove about 100 lines of comments. :) The "work" was
> basically just hurting my brain trying to reason about the concurrency
> issues, not changing code.

And the brain of everyone else who ever bumps into this.  There's a
high probability that if this code actually doesn't work (can you
produce a formal proof of correctness for it?  I can't -- and I
tried), nothing can be done to repair it; and code this outrageously
delicate has a decent chance of being buggy no matter how many people
stare at it (overlooking that you + me isn't that many).  You also
mentioned before that removing the "volatile"s may have given a speed
boost, and that's believable.  I mentioned above the unending costs in
explanations, and nuisance gripes from memory-integrity tools about
the deliberate leaks.  There are many kinds of ongoing costs here, and
no _intended_ benefit anymore (it certainly wasn't my intent to cater
to buggy C code forever).

>> It was never legit to do #a without holding the GIL.  It was clear as
>> mud whether it was legit to do #b without holding the GIL.  If
>> PyMem_Del (etc) change to expand to "free" in a release build, then #b
>> can remain clear as mud without harming anyone.  Nobody should be
>> doing #a anymore.  If someone still is, "tough luck -- fix it, you've
>> had years of warning" is easy for me to live with at this stage.

> Hmm... The issue is that case #a may not be an easy problem to
> diagnose:

Many errors in C code are difficult to diagnose.  That's life.  Mixing
a PyObject call with a PyMem call is obvious now "by eyeball", so if
there is such code still out there, and it blows up, an experienced
eye has a good chance of spotting the error at once.

> Some implementations of free() will happily do nothing if
> they are passed a pointer they know nothing about. This would just
> result in a memory leak. Other implementations of free() can output a
> warning or crash in this case, which would make it trivial to locate.

I expect most implementations of free() would end up corrupting memory
state, leading to no symptoms or to disastrous symptoms, from 0 to a
googol cycles after the mistake was made.  Errors in using malloc/free
are often nightmares to debug.  We're not trying to make coding in C
pleasant here -- which is good, because that's unachievable <wink>.
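
As an illustration of the kind of mismatch being discussed, here is a
toy sketch in plain C.  It is not obmalloc code: the two "families",
tagged_alloc() and tagged_free() are hypothetical stand-ins for the
PyObject_* and PyMem_* calls, and the one-byte tag is an assumption
made purely so the mismatch can be detected instead of silently
corrupting the heap.

```c
#include <stdlib.h>

/* Hypothetical allocator families standing in for PyObject_* and PyMem_*. */
enum family { FAMILY_OBJECT = 'o', FAMILY_MEM = 'm' };

/* Header placed in front of every block; the union keeps the user
   pointer aligned for common types. */
typedef union {
    char family;
    double d;
    void *p;
} header;

/* Allocate n bytes and remember which family handed them out. */
static void *
tagged_alloc(enum family f, size_t n)
{
    header *h = malloc(sizeof(header) + n);
    if (h == NULL)
        return NULL;
    h->family = (char)f;
    return h + 1;               /* user memory starts after the header */
}

/* Release a block.  Returns 0 on success (or for NULL), and -1 without
   freeing anything if the block came from the other family. */
static int
tagged_free(enum family f, void *p)
{
    header *h;
    if (p == NULL)
        return 0;
    h = (header *)p - 1;
    if (h->family != (char)f)
        return -1;              /* mixed families: caller's bug */
    free(h);
    return 0;
}
```

With this sketch, getting a block from FAMILY_OBJECT and handing it to
tagged_free(FAMILY_MEM, ...) is reported as an error.  A plain free()
has no such tag to consult, which is why the outcome there is
unpredictable.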

>> I suppose the other consideration is that already-compiled extension
>> modules on non-Windows(*) systems will, if they're not recompiled,
>> continue to call PyObject_Free everywhere they had a
>> PyMem_Del/DEL/FREE call.

> Is it guaranteed that extension modules will be binary compatible with
> future Python releases? I didn't think this was the case.

Nope, that's not guaranteed.  There's a magic number
(PYTHON_API_VERSION) that changes whenever the Python C API undergoes
an incompatible change, and binary compatibility is guaranteed across
releases if that doesn't change.  The then-current value of
PYTHON_API_VERSION gets compiled into extensions, by virtue of the
module-initialization macro their initialization function has to call.
The guts of that function are in the Python core (Py_InitModule4()),
which raises this warning if the passed-in version doesn't match the
current version:

 "Python C API version mismatch for module %.100s:\
 This Python has API version %d, module %.100s has version %d.";

This is _just_ a warning, though.  Perhaps unfortunately for Python's
users, Guido learned long ago that most API mismatches don't actually
matter for his own code <wink>.  For example, the C API officially
changed when the signature of PyFrame_New() changed in 2001 -- but
almost no extension modules call that function.
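
A minimal sketch of such a check, assuming a made-up version number:
SKETCH_API_VERSION and sketch_check_api_version() are hypothetical
names, and this is not the actual body of Py_InitModule4().  It only
shows the idea of comparing the version compiled into the module with
the core's version and producing a warning rather than an error.

```c
#include <stdio.h>

/* Hypothetical value, for illustration only; the real constant lives
   in Python's headers and changes over time. */
#define SKETCH_API_VERSION 1012

/* Compare the version compiled into a module with the core's version.
   Returns 0 and leaves buf empty if they match; otherwise writes a
   warning message into buf and returns 1. */
static int
sketch_check_api_version(const char *name, int module_version,
                         char *buf, size_t buflen)
{
    if (buflen > 0)
        buf[0] = '\0';
    if (module_version == SKETCH_API_VERSION)
        return 0;
    snprintf(buf, buflen,
             "Python C API version mismatch for module %.100s: "
             "This Python has API version %d, module %.100s has version %d.",
             name, SKETCH_API_VERSION, name, module_version);
    return 1;   /* a warning only: the caller keeps initializing the module */
}
```

The point of returning a flag instead of aborting is that a mismatch
is advisory: the module still gets loaded, and whether anything breaks
depends on which parts of the API it actually uses.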

Similarly, if we change PyMem_Del (etc) to map to the system free(),
PYTHON_API_VERSION should be bumped for Python 2.5 -- but many people
will ignore the mismatch warning, and again it will probably make no
difference (if there's code still out there that calls PyMem_DEL (etc)
without holding the GIL, I don't know about it).