Michał Górny wrote:
On Mon, 2021-02-22 at 19:27 +0000, Jessica Clarke wrote:
Example: 16-bit m68k no, it's a 32bit platform with extra alignment requirements. Actually, fewer. Most architectures have alignof(x) == sizeof(x) for all the primitive types, but m68k is more relaxed and caps alignof(x) at 2. This means that assert((p & sizeof(long)) == 0) is too strict, and should instead just be assert((p & alignof(long)) == 0), which is always correct, rather than relying on implementation-defined alignment requirements. In autoconf there's AC_CHECK_ALIGNOF just as there is AC_CHECK_SIZEOF, the result of which should be used for the alignment check instead. That's the portable way to do it (and would allow the removal of the #ifdef). I agree, except that -- as I mentioned elsewhere -- the #ifdef was added
because the x86 optimization hack is actually slower on m68k. I suspect that if more benchmarking was made, it might turn out that #ifdef should actually disable it on more platforms.
I think it's more complicated than that. The code in question didn't just have a bogus assert, it actually relied on ALIGNOF_SIZE_T == SIZEOF_SIZE_T in order to work, but without much good reason other than being written in a poor style. I suspect that the slowdown was seen because the strictness of the `if` on m68k meant the optimised version wasn't used that often but was still sitting there using up space, and time to evaluate the branch, plus potentially the various consequences of additional register pressure, and that the performance hit goes away once the algorithm is fixed to be more general (in such a way that other architectures shouldn't see any performance hits, and possibly even a slight improvement). I've done this in https://github.com/python/cpython/pull/24624.
Plus, even if m68k is slightly slower, who cares? It still works, better to have working clean code than hacky slightly-faster code.