2012/2/18 Stefan Behnel <stefan_ml@behnel.de>
> The weakref changes are really unfortunate as they appear in one of the most performance critical spots of lxml's API: on-the-fly proxy creation.
> I can understand why the original code won't work as is, but could you elaborate on why the weak references are needed? Maybe there is a faster way of doing this?

PyObject->ob_refcnt only counts the number of PyObject references to the object, not any references held by other parts of the pypy interpreter. For example, PyTuple_GetItem() often returns something with refcnt=1; two calls to

    PyObject *x = PyTuple_GetItem(tuple, 0); Py_DECREF(x);

will return different values for the x pointer.

But this model has issues with borrowed references. For example, this code is valid CPython, but will crash with cpyext:

    PyObject *exc = PyErr_NewException("error", PyExc_StandardError, NULL);
    PyDict_SetItemString(module_dict, "error", exc);
    Py_DECREF(exc);
    // exc is now a borrowed reference, but the following line crashes pypy:
    PyObject *another = PyErr_NewException("AnotherError", exc, NULL);
    PyDict_SetItemString(module_dict, "AnotherError", another);
    Py_DECREF(another);

In CPython, the code can continue using the created object: we don't own the reference, exc is now a borrowed reference, valid as long as the containing dict is valid. The refcount is 1 when the object is created, 2 once PyDict_SetItem stores it, and 1 again after the DECREF.

PyPy does it differently: a dictionary does not store PyObject* pointers, but "pypy objects" that have no reference counting and whose address can change during a gc collection. PyDict_SetItemString will not change exc->ob_refcnt, which remains 1, so Py_DECREF then frees the memory pointed to by exc.

There are mechanisms to keep the reference a bit longer; for example, PyTuple_GET_ITEM will return a "temporary" reference that will be released when the tuple loses its last cpyext reference.

Another way to say this is that with cpyext, a borrowed reference has to borrow from some other reference that you own. It can be a container, or in some cases the current "context", i.e. something that lasts for the duration of the current C call. Otherwise, weak references must be used instead.

-- Amaury Forgeot d'Arc
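
The difference between the two ownership schemes described above can be sketched with a toy model in plain C. This is only an illustration under stated assumptions: ToyObject, ToyModel, toy_store and the other names are hypothetical and are neither CPython's nor cpyext's implementation. Nothing is really freed, so the example can safely inspect an object after its count reaches zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a PyObject: a C-level reference count plus a
   flag recording whether the object would already have been released. */
typedef struct {
    int refcnt;
    bool alive;
} ToyObject;

/* Which ownership scheme the toy container follows (illustrative names). */
typedef enum { MODEL_CPYTHON, MODEL_CPYEXT } ToyModel;

/* A freshly created object is owned by its creator: refcnt == 1. */
static void toy_init(ToyObject *o) {
    o->refcnt = 1;
    o->alive = true;
}

/* Drop one C-level reference; at zero the real runtime would free the memory. */
static void toy_decref(ToyObject *o) {
    assert(o->refcnt > 0);
    if (--o->refcnt == 0) {
        o->alive = false;
    }
}

/* Store the object in a container.
   CPython-like model: the container holds a counted reference.
   cpyext-like model: the container keeps an interpreter-level object,
   so the C-level count is left untouched. */
static void toy_store(ToyModel model, ToyObject *o) {
    if (model == MODEL_CPYTHON) {
        o->refcnt++;
    }
}

/* Create, store, then drop our own reference. Returns whether the C pointer
   could still be used as a borrowed reference afterwards. */
static bool borrowed_ref_survives(ToyModel model) {
    ToyObject exc;
    toy_init(&exc);          /* refcnt == 1 */
    toy_store(model, &exc);  /* 2 in the CPython model, still 1 in the cpyext model */
    toy_decref(&exc);        /* 1 in the CPython model, 0 in the cpyext model */
    return exc.alive;
}
```

In this model borrowed_ref_survives(MODEL_CPYTHON) returns true and borrowed_ref_survives(MODEL_CPYEXT) returns false, which mirrors why the second PyErr_NewException call is safe in CPython but not with cpyext. Under the same assumptions, the safe pattern is to keep the owned reference and call the decref only after the last use of the pointer.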