[Python-3000] Draft PEP: Dropping PyObject_HEAD
Brett Cannon
brett at python.org
Sat Apr 28 01:30:50 CEST 2007
Second PEP today. Martin is on a roll! =)
On 4/27/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> I propose the PEP below for Py3k.
>
> Regards,
> Martin
>
> PEP: 3122
> Title: Dropping PyObject_HEAD
> Version: $Revision: 54998 $
> Last-Modified: $Date: 2007-04-27 10:31:58 +0200 (Fr, 27 Apr 2007) $
> Author: Martin v. Löwis <martin at v.loewis.de>
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 27-Apr-2007
> Python-Version: 3.0
> Post-History:
>
> Abstract
> ========
>
> Python currently relies on undefined C behavior, with its
> usage of PyObject_HEAD. This PEP proposes to change that
> into standard C.
>
> Rationale
> =========
>
> Standard C defines that an object must be accessed only through a
> pointer of its type, and that all other accesses are undefined
> behavior, with a few exceptions. In particular, the following
> code has undefined behavior::
>
>   struct FooObject{
>     PyObject_HEAD
>     int data;
>   };
>
>   PyObject *foo(struct FooObject*f){
>     return (PyObject*)f;
>   }
>
>   int bar(){
>     struct FooObject *f = malloc(sizeof(struct FooObject));
>     struct PyObject *o = foo(f);
>     f->ob_refcnt = 0;
>     o->ob_refcnt = 1;
>     return f->ob_refcnt;
>   }
>
> The problem here is that the storage is both accessed as
> if it were struct PyObject, and as struct FooObject.
>
> Historically, compilers did not cause any problems with this
Reads easier if you replace "cause" with "have".
> code. However, modern compiler use that clause as an
Probably want to pluralize "compiler".
Your use of "clause" really confused me until I realized what you were
talking about.
> optimization opportunity, finding that f->ob_refcnt and
> o->ob_refcnt cannot possibly refer to the same memory, and
> that therefore the function should return 0, without having
> to fetch the value of ob_refcnt at all in the return
> statement. For GCC, Python now uses -fno-strict-aliasing
> to work around that problem; with other compilers, it
> may just see undefined behavior. Even with GCC, using
> -fno-strict-aliasing may pessimize the generated code
> unnecessarily.
>
> Specification
> =============
>
> Standard C has one specific exception to its aliasing rules precisely
> designed to support the case of Python: a value of a struct type may
> also be accessed through a pointer to the first field. E.g. if a
> struct starts with an int, the struct\* may also be cast to an int\*,
> which allows writing int values into the first field.
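A minimal sketch of the first-field rule described above, using made-up stand-in types (`Base`, `Derived`, `set_refcnt` and `demo_first_member` are hypothetical names, not part of Python's headers). It only illustrates that a pointer to a struct may be converted to a pointer to its first member and written through:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the real Python headers. */
typedef struct {
    long refcnt;
} Base;

typedef struct {
    Base base;   /* first member: a Derived* may be converted to a Base* */
    int data;
} Derived;

/* Writes through a Base*; fine even when b points at the start of a Derived. */
static void set_refcnt(Base *b, long n) {
    b->refcnt = n;
}

static long demo_first_member(void) {
    Derived *d = malloc(sizeof *d);
    if (d == NULL) {
        return -1;
    }
    d->base.refcnt = 0;
    d->data = 42;

    /* A pointer to a struct, suitably converted, points to its first member. */
    assert((void *)d == (void *)&d->base);

    set_refcnt((Base *)d, 1);      /* access via the first-member type */
    long result = d->base.refcnt;  /* the write above is visible here */
    free(d);
    return result;
}
```

Because `Base` really is the declared type of the first member, the compiler has to assume that the store in `set_refcnt` can affect `d->base.refcnt`.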
>
> For Python, PyObject_HEAD and PyObject_VAR_HEAD will be dropped, and
> PyObject gets defined to contain all fields explicitly::
>
>   typedef struct _object{
>     _PyObject_HEAD_EXTRA
>     Py_ssize_t ob_refcnt;
>     struct _typeobject *ob_type;
>   }PyObject;
>
>   typedef struct {
>     PyObject ob_base;
>     Py_ssize_t ob_size;
>   } PyVarObject;
>
> Types defined as fixed-size structures will then include PyObject
> as their first field; variable-sized objects will include PyVarObject. E.g.::
>
>   typedef struct{
>     PyObject ob_base;
>     PyObject *start, *stop, *step;
>   } PySliceObject;
>
>   typedef struct{
>     PyVarObject ob_base;
>     PyObject **ob_item;
>     Py_ssize_t allocated;
>   } PyListObject;
>
> As a convention, the base field SHOULD be called ob_base. However,
> all accesses to ob_refcnt and ob_type MUST cast the object pointer
> to PyObject* (unless the pointer is already known to have that
> type), and SHOULD use the respective accessor macros. To simplify
> access to ob_type, a macro::
>
>   #define Py_Type(o) (((PyObject*)(o))->ob_type)
>
> is added.
>
An example of how this will change current code would be good. E.g.,
``o->ob_type->tp_name`` becomes ``Py_Type(o)->tp_name`` or
``o->ob_base.ob_type->tp_name``.
Otherwise I am all for cleaning up the codebase and thus support this PEP.
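To make the before/after concrete, here is a rough sketch under stated assumptions: the definitions are simplified stand-ins written for this example (no `_PyObject_HEAD_EXTRA`, a one-field type object, `ptrdiff_t` standing in for `Py_ssize_t`), and `FooObject`, `Foo_Type` and the two helper functions are hypothetical. It is not the real Python.h, only the layout the PEP describes:

```c
#include <stddef.h>

/* Simplified stand-ins for the PEP's definitions; NOT the real Python.h. */
typedef ptrdiff_t Py_ssize_t;

struct _typeobject {
    const char *tp_name;
};

typedef struct _object {
    Py_ssize_t ob_refcnt;
    struct _typeobject *ob_type;
} PyObject;

/* Accessor macro from the PEP, with the argument parenthesized. */
#define Py_Type(o) (((PyObject *)(o))->ob_type)

/* Hypothetical extension type laid out the way the PEP proposes. */
typedef struct {
    PyObject ob_base;
    int data;
} FooObject;

static struct _typeobject Foo_Type = { "foo" };

/* Under PyObject_HEAD this was f->ob_type->tp_name. With ob_base,
   ob_type is no longer a direct member of FooObject, so that
   spelling would stop compiling. */

/* Preferred spelling: go through the accessor macro. */
static const char *type_name_via_macro(FooObject *f) {
    return Py_Type(f)->tp_name;
}

/* Equivalent spelling through the embedded base struct. */
static const char *type_name_via_field(FooObject *f) {
    return f->ob_base.ob_type->tp_name;
}
```

Note that `ob_base` is an embedded struct rather than a pointer, so the field spelling uses `.` after `ob_base` and `->` only for the type pointer.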
-Brett