[Python-Dev] Alignment assumptions
Thu, 28 Feb 2002 14:42:35 -0500
[Jack, skip to the end please]
[David Abrahams, on
    Include/objimpl.h:275: double dummy; /* force worst-case alignment */
]
> As I read the code, it affects all types (doesn't this header begin every
> object, regardless of its GC flags?)
Nope, only objects that go through _PyObject_GC_Malloc(). It could be a
nightmare if, e.g., every string and int object consumed another (at least)
> and I think that's a very unhappy circumstance for your numeric
> community. Remember, the type that raised the alarm here was just a
> long double.
The *Python* numeric community is far more likely to embed a float than a
long double, and in any case seems unlikely to build a container type
mixing long double with PyObject* members (i.e., one that ought to
participate in cyclic gc).
I expect we have a blind spot towards long double in general since Python
doesn't expose or use such a thing, all the developers run on platforms
where (as far as they know <wink>) it's the same as a double, and "long
double" was introduced after K&R (so some old-timers likely aren't even
aware C89 introduced it).
But I'll change the code here to use long double instead -- it's harmless,
as it doesn't make a lick of difference on any platform that matters <0.7
>> Only the objimpl.h trick might benefit from maximal alignment.
> I'm not actually after maximal alignment; I look for a minimally-
> sized/aligned type whose alignment is a multiple of the target
> type's alignment. In any case, I was just using the assumption that
> double was maximally aligned since I was linking with Python code
> and the EDG front-end was too slow to handle the metaprogram -- I
> figured that if the assumption was good enough for Python
Well, nobody has complained yet, but the core never needs alignment stricter
than double, and -- as above -- an extension type that both did and needed to
participate in GC is unlikely.
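For readers who want the quoted search spelled out, here is a minimal sketch of picking the smallest builtin type whose alignment is a multiple of a target alignment. It is a runtime C11 approximation, not the C++ metaprogram described in the quoted text; the `candidate` struct, the `CANDIDATE` macro, the candidate list and `pick_aligner` are all hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* One builtin type considered as an "aligner" (hypothetical helper type). */
typedef struct {
    const char *name;
    size_t size;
    size_t align;
} candidate;

#define CANDIDATE(T) { #T, sizeof(T), _Alignof(T) }

/* Assumption: only standard builtin arithmetic types are candidates. */
static const candidate candidates[] = {
    CANDIDATE(char), CANDIDATE(short), CANDIDATE(int), CANDIDATE(long),
    CANDIDATE(float), CANDIDATE(double), CANDIDATE(long double),
};

/* Return the smallest candidate whose alignment is a multiple of
   target_align, or NULL if no candidate qualifies. */
const candidate *pick_aligner(size_t target_align)
{
    const candidate *best = NULL;
    size_t i;

    assert(target_align != 0);
    for (i = 0; i < sizeof candidates / sizeof candidates[0]; i++) {
        const candidate *c = &candidates[i];
        if (c->align % target_align != 0)
            continue;               /* alignment is not a multiple */
        if (best == NULL || c->size < best->size)
            best = c;               /* keep the smallest match so far */
    }
    return best;
}
```

Calling `pick_aligner(_Alignof(long double))` returns an entry whose alignment suits a long double member. A request for 128-byte alignment should find nothing in this list on typical platforms, which is why restricting the search to builtin types sidesteps exotic cases.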
> and my clients were depending on it anyway, it was good enough for
> my code (not!).
One of the secrets to Python's success is that we tell unreasonable users to
go away and bother the C++ committee instead.
[128-byte alignment needed for KSR's _subpage type]
> I was aware that this was a theoretical possibility, but not that it
> was a practical one. What's KSR?
Kendall Square Research, my (and Tani's, Tamah's and Steve Breit's) employer
before Dragon. The address space was carved into 128-byte "subpages", and
the hardware supported Python-style (non-owned non-reentrant) locks directly
on a per-subpage basis (Python's lock.acquire() and lock.release() were one
machine instruction each!). Subpages were also the unit for cache coherency
across processors. So use of _subpage in our system code, and in
speed-obsessed app code, was ubiquitous. I guess the main thing KSR proved
was that you can't stay in business designing custom hardware to execute
Python's semantics directly <wink>.
> Seriously, though, I think it would be reasonable to stick to aligning
> the standard builtin types, in which case you can do the test without
> calling malloc, FWIW.
I checked this in:
long double dummy; /* force worst-case alignment */
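To illustrate what a dummy member like this buys, here is a minimal C11 sketch. It assumes a header that is a union of the bookkeeping fields and the dummy, with the object's own data stored immediately after it; `gc_head_sketch`, `payload_sketch` and the field names are hypothetical stand-ins, not the actual definitions in objimpl.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a GC header; not CPython's real layout. */
typedef union gc_head_sketch {
    struct {
        union gc_head_sketch *next;
        union gc_head_sketch *prev;
        int refs;
    } gc;
    long double dummy; /* force worst-case alignment */
} gc_head_sketch;

/* Hypothetical extension object that embeds a long double. */
typedef struct {
    long double value;
    void *link;
} payload_sketch;

/* Returns 1 if a payload placed directly after an aligned header is
   itself correctly aligned: the header must be at least as strictly
   aligned as the payload, and its size must be a multiple of the
   payload's alignment. */
int payload_after_header_is_aligned(void)
{
    return _Alignof(gc_head_sketch) >= _Alignof(payload_sketch)
        && sizeof(gc_head_sketch) % _Alignof(payload_sketch) == 0;
}
```

Because the union contains a long double, its alignment is at least that of long double and its size is padded to a multiple of that alignment, so `payload_after_header_is_aligned()` should return 1 on the platforms this sketch has in mind. With `double dummy` instead, the same check could fail wherever long double is more strictly aligned than double.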
> The malloc 8-byte align argument doesn't apply, since this struct is
> used in an array.
I was composing email while asleep <wink>. Gotcha.
> This was added by Jack Jansen ages ago -- I think he did measure a
> speedup on an old Mac compiler, or he wouldn't have added it, and I
> bet there was a #define USE_CACHE_ALIGNED in his config.h then.
> But that's all history; I agree it should be deleted.
Jack, do you still want this?
fighting-code-rot-ly y'rs - tim