[pypy-issue] [issue1122] [cpyext] PyVarObject builtins and definition of Py_SIZE()

Stefan Behnel tracker at bugs.pypy.org
Wed Apr 11 07:39:53 CEST 2012

Stefan Behnel <stefan_ml at behnel.de> added the comment:

> Duplicating the size field would work for immutable (or non-resizeable) objects,
> but not for lists or dicts: the storage is maintained outside the C structure.

Sure, sorry if I wasn't clear here.

> A possible fix is to use tp_itemsize to determine if the object has an ob_size:
> #define Py_SIZE(obj) \
>    (obj)->ob_type->tp_itemsize ? ((PyVarObject*)(obj))->ob_size :
> And then, turn tuple and strings into PyVarObjects.

Py_SIZE() doesn't do any error checking in CPython, but I think the above
would be a reasonable way to work around the fact that it (by design) works
for more objects in CPython than it can in PyPy. I don't think it matters
if it works for more objects in PyPy than in CPython. All that really
counts is that it works for exactly the objects it supports in CPython.

One difference: if Py_SIZE() is incorrectly used on an object that doesn't
support it, the above will silently return -1 with an exception set. Well,
that's probably no worse than undefined behaviour or returning an arbitrary
value that happens to reside on the heap, as CPython would. We should get
away with just ignoring that case. It's a macro...
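
To make the tp_itemsize check quoted above concrete, here is a minimal
sketch using stand-in structs rather than the real CPython or cpyext
headers. All demo_* names are hypothetical. The else branch of the quoted
macro is cut off, so the fallback below is only a placeholder chosen to
match the "-1 with an exception set" behaviour described here, not the
actual proposal.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the type and object headers; only the
   fields needed for the size lookup are modelled. */
typedef struct demo_type {
    ptrdiff_t tp_itemsize;      /* non-zero for variable-size types */
} demo_type;

typedef struct demo_object {
    demo_type *ob_type;
} demo_object;

typedef struct demo_var_object {
    demo_object ob_base;
    ptrdiff_t ob_size;          /* number of items */
} demo_var_object;

/* Stand-in for "an exception is set". */
static int demo_error_set = 0;

/* Assumed fallback for objects without an ob_size field:
   flag an error and return -1. */
static ptrdiff_t demo_size_fallback(demo_object *obj)
{
    (void)obj;
    demo_error_set = 1;
    return -1;
}

/* Read ob_size only if the type declares a non-zero tp_itemsize.
   Like any such macro, this evaluates obj more than once. */
#define DEMO_SIZE(obj) \
    (((demo_object *)(obj))->ob_type->tp_itemsize \
        ? ((demo_var_object *)(obj))->ob_size \
        : demo_size_fallback((demo_object *)(obj)))
```

With this shape, an object whose type has tp_itemsize == 0 never has a
non-existent ob_size field read; it goes through the fallback instead.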

The same reasoning applies to user provided types, IMHO - it's really the
user's fault if Py_SIZE() is used on something other than a PyVarObject.

Also, Cython doesn't use Py_SIZE() directly anymore when compiling in PyPy,
so it's not a problem on that front either.

So, the only cases I can see where Py_SIZE() really needs to differ in
cpyext are mutable builtin types like list (dict is not a PyVarObject). In

$ fgrep PyObject_VAR_HEAD Include/*.h
Include/bytearrayobject.h:    PyObject_VAR_HEAD
Include/bytesobject.h:    PyObject_VAR_HEAD
Include/frameobject.h:    PyObject_VAR_HEAD
Include/listobject.h:    PyObject_VAR_HEAD
Include/longintrepr.h:  PyObject_VAR_HEAD
Include/memoryobject.h:    PyObject_VAR_HEAD
Include/object.h:/* PyObject_VAR_HEAD defines the initial segment of all
Include/object.h:#define PyObject_VAR_HEAD      PyVarObject ob_base;
Include/object.h:    PyObject_VAR_HEAD
Include/tupleobject.h:    PyObject_VAR_HEAD

Py2.7 doesn't have memoryviews but adds PyStructSequence to that list.

Tuple and bytes objects can be fixed by making them PyVarObjects, but not
bytearray and listobject. They have their own Py*_GET_SIZE() macros,
however. We could require portable code to use those instead of Py_SIZE().
Not a big deal, IMHO.
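
As a rough illustration of why a duplicated size field breaks for
resizable objects, the sketch below models an interpreter-owned list and a
C-side view that copied the length when it was created. Everything here is
hypothetical (the demo_* names, and the assumption that a type-specific
accessor can ask the interpreter-side object); it is not how cpyext is
actually implemented.

```c
#include <stddef.h>

/* Hypothetical interpreter-side list: the real storage lives here,
   outside the C structure. */
typedef struct demo_interp_list {
    size_t length;
} demo_interp_list;

/* Hypothetical C-side view handed to extension code. */
typedef struct demo_c_list {
    ptrdiff_t ob_size;           /* copy taken when the view was created */
    demo_interp_list *backing;   /* the object that owns the items */
} demo_c_list;

static demo_c_list demo_make_view(demo_interp_list *lst)
{
    demo_c_list view = { (ptrdiff_t)lst->length, lst };
    return view;
}

/* The interpreter grows the list without knowing about the C view. */
static void demo_interp_append(demo_interp_list *lst)
{
    lst->length += 1;
}

/* Direct field read, in the style of Py_SIZE(): may be stale. */
static ptrdiff_t demo_cached_size(const demo_c_list *view)
{
    return view->ob_size;
}

/* Type-specific accessor, in the style of a Py*_GET_SIZE() macro:
   asks the owning object, so it stays correct after a resize. */
static ptrdiff_t demo_get_size(const demo_c_list *view)
{
    return (ptrdiff_t)view->backing->length;
}
```

After one demo_interp_append() on a two-item list, demo_cached_size()
still reports the old length while demo_get_size() reports the new one.
For tuples and bytes the length never changes, so the copy cannot go stale.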

Having frameobject not working is fine with me. Anyone using that directly
should expect portability problems.

The long object keeps its sign as the sign of its size, so that might be a
problem for some code. But that should be easy to fix by making it a
PyVarObject in cpyext and setting the size to 1 or -1 based on the sign of
its constant value.
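
A minimal sketch of that size rule, with a hypothetical helper name and a
plain C long standing in for the object's value. Mapping zero to 0 is an
addition here that follows CPython's convention of giving a zero long a
size of 0; the suggestion above only mentions 1 and -1.

```c
#include <stddef.h>

/* Hypothetical helper: derive a sign-carrying ob_size from the value,
   so code that only inspects the sign of the size keeps working. */
static ptrdiff_t demo_long_ob_size(long value)
{
    if (value < 0)
        return -1;
    if (value > 0)
        return 1;
    return 0;   /* assumption: zero gets size 0, as in CPython */
}
```

Code that tests for a negative size to detect a negative number would then
see the right answer, although the magnitude of the size would no longer
reflect the number of digits.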

Memoryviews are also constant after creation (IMHO), so they could become
PyVarObjects as well. The same applies to PyStructSequence, which is meant to be
mostly like a tuple.

So, I propose to keep the macro as it is now and to change the object
implementations instead.

PyPy bug tracker <tracker at bugs.pypy.org>
