[pypy-dev] Looking for clues to consistent Seg Fault in PyPy 2.6.1

Jeff Doran jdoran at lexmachina.com
Sun Oct 11 23:26:16 CEST 2015


First, Armin, I want to thank you for taking the time to dig into this.
It's a wonderful intro for me to the PyPy dev list.

Next, while I'm admittedly a noob at the lower levels of PyPy, I'm curious
why this problem hasn't been encountered more often. It seems that each
_Element should be responsible for deallocating its own weakref and never
have that outsourced to any other _Element.

In any case, thanks again. I will await a new PyPy release to continue my
investigation of this new platform for our production use. Please note I
will be happy to test any proposed solutions as they appear, nightly or
otherwise.

Best

-  Jeff


On Sun, Oct 11, 2015 at 2:52 AM, Armin Rigo <arigo at tunes.org> wrote:

> Hi,
>
> On Sun, Oct 11, 2015 at 12:28 AM, Jeff Doran <jdoran at lexmachina.com> wrote:
> > I've run out of options trying to find a Seg Fault which happens when
> > running lxml under PyPy 2.6.1.  This problem only occurs under PyPy as
> > the rest of the code works fine under CPython 2.7.  I've been in contact
> > with the lxml dev team and they confirmed my problem, but could not
> > determine where the cause of the Seg Fault lies.
>
> After some debugging, it seems that the PyPy-specific code with
> weakrefs in "proxy.pxi" is to blame.  It seems to me that it would
> also have the same problem if it were compiled on CPython.  (I
> understand why it is there, and indeed it is necessary to do
> *something* different on PyPy.)
>
> The problem is that if you start with two C structures "xmlNode" which
> form a small tree:
>
>    XA:   xmlNode   with child XB
>    XB:   xmlNode
>
> You have two corresponding Python objects (actually cdef class
> _Element, but I think it's not important that they are Cython classes
> here):
>
>    EA:   _Element   with _c_code = XA
>    EB:   _Element   with _c_code = XB
>
> The back-pointers (from each xmlNode to its _Element) are set up
> differently on PyPy and on CPython.  On CPython first:
>
>    XA._private = (void *)EA
>    XB._private = (void *)EB
>
> It's a plain pointer which doesn't hold a reference.  The deallocation
> logic of _Element will reset the '_c_code._private' pointer back to
> NULL.
>
> On PyPy instead, there is an indirection: _private holds a reference
> to a weakref object.  The effect is mostly the same.  But the
> deallocation logic of _Element is subtly different as a result.  Let's
> dig:
>
> The deallocation logic of E is: we reset E._c_code._private to NULL,
> and then, if all X's in the tree have _private "set to NULL", we
> delete the whole tree.  The problem is that "set to NULL" is more
> subtle in the weakref version.  It really means "contains a weakref to
> a dead object".  But weakrefs can die *before* the deallocator for
> their target is called.  This is possible in both PyPy and CPython.
> So what occurs here:
>
> * we forget both EA and EB at the same time (for CPython, it can occur
> if they are in a cycle).
>
> * both weakrefs die
>
> * we call the deallocator of EA: it thinks the whole tree is dead
> because all weakrefs are dead, and frees it
>
> * we call the deallocator of EB: it still has _c_code pointing to XB,
> but that is garbage and crashes.
>
> That's the problem.  I don't have a fix right now :-)
>
>
> A bientôt,
>
> Armin.
>
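
The sequence described in the quoted message can be replayed in a small toy
model. The sketch below is plain Python, not lxml or Cython code: XmlNode,
FakeWeakref, Element, dealloc and run are hypothetical names invented for
illustration, and the "GC" is simulated by clearing the fake weakrefs by hand
so that the order of events is deterministic. The flag
dead_weakref_counts_as_null selects between the two readings of "set to NULL":
True treats a dead weakref as NULL (the weakref scheme described for PyPy),
False only counts an explicit reset by the element's own deallocator (the
plain-pointer scheme described for CPython).

```python
class XmlNode:
    """Toy stand-in for a C xmlNode (not lxml's real structure)."""
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.private = None   # models xmlNode._private
        self.freed = False    # models memory that has been released


class FakeWeakref:
    """Toy weakref that the simulated GC can clear before any deallocator runs."""
    def __init__(self, target):
        self._target = target

    def __call__(self):
        return self._target

    def clear(self):
        self._target = None


class Element:
    """Toy stand-in for _Element; _c_code follows the naming used in the message."""
    def __init__(self, c_code):
        self._c_code = c_code
        c_code.private = FakeWeakref(self)


def walk(node):
    """Yield the node and all of its descendants."""
    yield node
    for child in node.children:
        yield from walk(child)


def looks_unreferenced(node, dead_weakref_counts_as_null):
    if node.private is None:
        return True
    return dead_weakref_counts_as_null and node.private() is None


def dealloc(element, root, dead_weakref_counts_as_null):
    """Model of the deallocator: reset _private, then free the tree if unused."""
    node = element._c_code
    if node.freed:
        # In C this would be a read of freed memory, i.e. the segfault.
        raise RuntimeError("use after free: " + node.name)
    node.private = None
    if all(looks_unreferenced(n, dead_weakref_counts_as_null) for n in walk(root)):
        for n in walk(root):
            n.freed = True


def run(dead_weakref_counts_as_null):
    xb = XmlNode("XB")
    xa = XmlNode("XA", [xb])
    ea, eb = Element(xa), Element(xb)
    # EA and EB are forgotten at the same time: both weakrefs die first.
    xa.private.clear()
    xb.private.clear()
    try:
        dealloc(ea, xa, dead_weakref_counts_as_null)  # may free the whole tree
        dealloc(eb, xa, dead_weakref_counts_as_null)  # may touch freed XB
    except RuntimeError as exc:
        return str(exc)
    return "ok"
```

In this model run(True) returns "use after free: XB", matching the four steps
listed in the message, while run(False) returns "ok" because the tree is only
freed once the second deallocator has reset its own node. This illustrates
where the two schemes differ; it is not a proposed patch for lxml.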