[Python-Dev] ANSI strict aliasing and Python

Tim Peters tim.one@comcast.net
Mon, 21 Jul 2003 23:26:31 -0400


[martin@v.loewis.de]
> That might be. What they consider wide-spread is the assumption that
> you can access the same memory with different incompatible struct
> types. At least the Linux kernel is known to miscompile because of this
> assumption.

When I first googled on this, I found a lot of confused hits.  It appears
that Perl and Parrot need -fno-strict-aliasing for correct operation.

> Whether most of the programs that are incorrect in this respect also
> give an opportunity for the optimizer to generate bad code, I don't
> know. Reportedly, the typical problem is not a bad write order, but a
> failure to reload a value that the compiler "knows" not to be changed.

That makes sense (alas).  We cast PyObject* to and from everything all over
the place, but the only common members are the essentially read-only (after
initialization) pointer to the type object, and the refcount field, which
latter is almost always accessed via macros that cast to PyObject* first.
The other fields can't be accessed at all except via a correctly typed
pointer to the (conceptual) subtype, while the common fields are usually
accessed via a PyObject*.  Since the access paths to a given member don't
normally use both ways, the compiler normally may as well assume the two
kinds of pointers point to non-overlapping stuff.  Perhaps the ob_size
member of var objects is more vulnerable.
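
A minimal sketch of the "failure to reload" hazard described above.  Base and
Derived are hypothetical stand-ins, not Python's real structs; they only mimic
two distinct struct types that begin with the same member.

```c
#include <assert.h>

/* Hypothetical stand-ins: two unrelated struct types that happen to start
   with the same member, much as PyObject_HEAD arranges in each object. */
typedef struct { long ob_refcnt; } Base;
typedef struct { long ob_refcnt; long payload; } Derived;

/* Under strict aliasing the compiler may assume b and d never overlap,
   so it is allowed to return the 1 it just stored without reloading. */
long store_then_read(Base *b, Derived *d)
{
    b->ob_refcnt = 1;
    d->ob_refcnt = 2;
    return b->ob_refcnt;
}

/* Well-defined use: the two pointers refer to distinct objects. */
void demo_distinct(void)
{
    Base b = { 0 };
    Derived d = { 0, 0 };
    assert(store_then_read(&b, &d) == 1);
    assert(d.ob_refcnt == 2);
}
```

If a caller instead passed the same memory through both parameters, the
standard would give the program no defined behavior, and an optimizer that
skipped the reload would be within its rights.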

> ...
> Yes. Indeed, gcc 3.3 offers a type attribute "may_alias", which causes
> a type to be treated like char* for aliasing purposes.

Sounds like we should add that to PyObject*.  Maybe <wink>.
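
For illustration only, here is roughly what such an annotation could look like.
This is a GCC extension rather than ISO C, and AliasHeader and Derived are
hypothetical names, not anything in Python's headers.

```c
#include <assert.h>

/* GCC-specific: accesses through AliasHeader are treated like accesses
   through char for aliasing purposes.  Hypothetical type, not PyObject. */
typedef struct { long ob_refcnt; } __attribute__((__may_alias__)) AliasHeader;
typedef struct { long ob_refcnt; long payload; } Derived;

/* Because h may alias anything, the compiler has to reload h->ob_refcnt
   after the store through d. */
long store_then_read_may_alias(AliasHeader *h, Derived *d)
{
    h->ob_refcnt = 1;
    d->ob_refcnt = 2;
    return h->ob_refcnt;
}

/* Both parameters deliberately refer to the same object. */
void demo_overlap(void)
{
    Derived d = { 0, 0 };
    assert(store_then_read_may_alias((AliasHeader *)&d, &d) == 2);
}
```

The result relies on the compiler honoring the attribute; a compiler that
does not know may_alias gives no such guarantee.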

> I still think that the non-standards-compliance for Python should be
> removed in 2.4. If we ever get a bad optimization because of aliasing
> problems, it will be very time consuming to find the real cause of the
> problem. So the problem is best avoided to begin with.

I'm not fatally opposed to standard code <wink>.  It's unclear whether this
can be done for 2.4, though, or needs to wait for Python 3.  Not *all*
accesses to ob_refcnt and ob_type go through macros, and if those members get
stuffed behind another access component, some non-core extension modules are
going to stop compiling.  In Zope's C extensions, ob_refcnt is referenced
directly 16 times, and ob_type 174 times.  Many of those are dereferencing
PyObject* variables, but many are dereferencing conceptual subtypes (like
BTree*, Sized*, and Wrapper*).  For example,

typedef struct {
  PyObject_HEAD
  PyObject *obj;
  PyObject *container;
} Wrapper;

...

  Wrapper *self;

...
      assert(self->ob_refcnt == 1);
      self->ob_type=Wrappertype;

Of course creating work is hard to sell.
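
A sketch of what "stuffed behind another access component" might look like,
under assumed names.  Object, TypeObj, ob_base, OBJ_REFCNT and OBJ_TYPE are
hypothetical here and not Python's actual definitions.  The common header
becomes a real first member, so converting a Wrapper* to an Object* is
well-defined, but direct uses such as self->ob_refcnt no longer compile.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the interpreter's types. */
typedef struct TypeObj { const char *name; } TypeObj;

typedef struct {
    long ob_refcnt;
    TypeObj *ob_type;
} Object;

typedef struct {
    Object ob_base;      /* common header is now a named first member */
    Object *obj;
    Object *container;
} Wrapper;

/* Hypothetical accessor macros: a pointer to a struct, suitably
   converted, points to its first member, so these casts are valid. */
#define OBJ_REFCNT(o) (((Object *)(o))->ob_refcnt)
#define OBJ_TYPE(o)   (((Object *)(o))->ob_type)

void wrapper_init(Wrapper *self, TypeObj *type)
{
    self->ob_base.ob_refcnt = 1;
    self->ob_base.ob_type = type;
    self->obj = NULL;
    self->container = NULL;
}

/* The earlier fragment, rewritten: the old spelling self->ob_refcnt
   would be a compile error with this layout. */
void wrapper_retype(Wrapper *self, TypeObj *type)
{
    assert(self->ob_base.ob_refcnt == 1);
    self->ob_base.ob_type = type;
}
```

Extension code that already goes through accessor macros would only need
a recompile; code that names the members directly would need editing.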