Barry Scott wrote:
On 22 Feb 2021, at 12:40, Michał Górny email@example.com wrote: I'm talking about 16-bit memory alignment which causes SIGBUS if it's not respected on m68k. I don't understand why you consider this to be a problem. After all, x86 has stronger (32-bit) alignment requirements, so m68k is actually less likely to break. On x86 you can make unaligned access to memory.
Alignment is a nice to have for performance. But on m68k you MUST align or you get a SIGBUS. Barry
That is not the problem. Many architectures (SPARC, PA-RISC, IA-64, ...) do not natively support unaligned accesses. The problem is in fact that m68k's alignment requirements are *weaker* than x86 *and they are exposed to C*. On x86, the compiler still aligns ints to 4 byte boundaries, but on m68k they are only aligned to 2 byte boundaries. This means that assert(p % sizeof(int) == 0) does not work. The code in question is #ifdef'ed out for three reasons:
1. The assert is overly strict. This is trivially fixed by changing the SIZEOF_SIZE_T to ALIGNOF_SIZE_T.
2. The `if` uses (and code within the `if` relies on it using) SIZEOF_SIZE_T to check the alignment.
3. The code is potentially slower on m68k than the simple byte-by-byte loop, though I don't think anyone's going to complain about slight performance regressions on m68k if it comes from cleaning up the code, and I imagine the supposed performance hit came from not fixing 2 properly (i.e. there was a bunch of extra code that wasn't able to be used that often due to overly-strict requirements in the algorithm).
I have filed https://github.com/python/cpython/pull/24624 to fix all these, though the first smaller commit is the only one strictly required for correctness on m68k (and any other architecture that chooses/has chosen to make the same ill-advised choices in its ABI), whilst the second larger one makes minor changes to the algorithm (that should not affect performance on any supported architecture other than m68k), and the many copies of it, in order to also cope with ALIGNOF_SIZE_T < SIZEOF_SIZE_T. This thus improves m68k support whilst removing m68k-specific hacks and making the code less reliant on implementation-defined behaviour, i.e. is how portability patches are _meant_ to be.