
On 2020-02-27 17:14, Serge Guelton wrote:
On Thu, Feb 27, 2020 at 10:51:39AM -0500, Charalampos Stratakis wrote:
Hello folks,
I recently observed a failure on the s390x fedora rawhide buildbot, on the clang builds, when clang got updated to version 10: https://bugs.python.org/issue39689
The call: struct.unpack('>?', b'\xf0') means to unpack a "native bool", i.e. native size and alignment. Internally, this does:
static PyObject * nu_bool(const char *p, const formatdef *f) { _Bool x; memcpy((char *)&x, p, sizeof x); return PyBool_FromLong(x != 0); }
i.e., copies "sizeof x" (1 byte) of memory to a temporary buffer x, and then treats that as _Bool.
While I don't have access to the C standard, I believe it says that assignment of a true value to _Bool can coerce to a unique "true" value. It seems that if a char doesn't have the exact bit pattern for true or false, casting to _Bool is undefined behavior. Is that correct?
Clang 10 on s390x seems to take advantage of this: it probably only looks at the last bit(s) so a _Bool with a bit pattern of 0xf0 turns out false. But the tests assume that 0xf0 should unpack to True.
I don't think it's specific to Clang 9, or the s390x arch. Have a look to
clang indeed just checks for the lowest bit. Is it correct? I think so. _Bool can only holds two value, 0 and 1, [0] which is different from an int whose value is true or false whether its different or equal to 0. GCC and Clang agree on that:
So yeah, according to that rule, the value set in `p` wasn't from a _Bool if it has the 0xf0 value. So you're re-interepreting memory between two different types type-punning, and that's UB.
Quick and obvious fix:
static PyObject * nu_bool(const char *p, const formatdef *f) { char x; memcpy((char *)&x, p, sizeof x); return PyBool_FromLong(x != 0); }
(This assumes size of _Bool is the same as size of char, which I guess is also UB? But I guess we can add a build-time assertion for that, and say we don't support platforms where that's not the case.) So thanks! I'm left with a question for CPython's struct experts, which is better kept to the bug tracker: https://bugs.python.org/issue39689#msg362815