[New-bugs-announce] [issue28055] pyhash's siphash24 assumes alignment of the data pointer
report at bugs.python.org
Fri Sep 9 19:29:02 EDT 2016
New submission from Matthias Klose:
pyhash's siphash24 assumes that the data pointer is aligned: it casts a void pointer (src) to a uint64_t pointer, which raises the required alignment from 1 to 4 bytes. That's invalid code; siphash24 can't assume that the pointer to the data to hash is 4-byte aligned.
Seen as a bus error when trying to run an ARM32 binary on an AArch64 kernel:
./python -c 'import datetime; print(hash(datetime.datetime(2015, 1, 1)))'
The datetime type is defined as:
#define _PyTZINFO_HEAD \
    Py_hash_t hashcode; \
    char hastzinfo; /* boolean flag */
    unsigned char data[_PyDateTime_DATE_DATASIZE];
The data member is used to calculate the hash of the object; because it is not 4-byte aligned, you get the bus error. Inserting three fill bytes to make the data member 4-byte aligned solves the issue, but that introduces an ABI change which makes the new datetime ABI incompatible, and we don't know about the alignment of objects outside the standard library.
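To illustrate the layout problem, here is a rough sketch using stand-in structs. The names fake_date and fake_date_padded are hypothetical, the data size of 4 is a placeholder for _PyDateTime_DATE_DATASIZE, and the sketch assumes the usual object header (a reference count and a type pointer, both pointer-sized) precedes the fields quoted above. It is not the real CPython definition.

```c
#include <stddef.h>

/* Hypothetical stand-in for the date object layout; assumes a header of
 * two pointer-sized fields followed by the fields from _PyTZINFO_HEAD. */
struct fake_date {
    ptrdiff_t refcnt;        /* stand-in for the reference count */
    void *type;              /* stand-in for the type pointer */
    ptrdiff_t hashcode;      /* stand-in for Py_hash_t */
    char hastzinfo;          /* boolean flag */
    unsigned char data[4];   /* placeholder size; starts right after the flag */
};

/* Same layout with three fill bytes, as discussed above. This changes
 * the offset of data, which is exactly the ABI change to avoid. */
struct fake_date_padded {
    ptrdiff_t refcnt;
    void *type;
    ptrdiff_t hashcode;
    char hastzinfo;
    unsigned char pad[3];    /* fill bytes */
    unsigned char data[4];
};

/* Offset of data within each struct, modulo 4. */
static size_t data_misalignment(void)
{
    return offsetof(struct fake_date, data) % 4;
}

static size_t padded_data_misalignment(void)
{
    return offsetof(struct fake_date_padded, data) % 4;
}
```

Under these assumptions the unpadded data member sits one byte past a 4-byte boundary, because a char array has no alignment requirement of its own and follows the single-byte flag directly.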
The solution is to use memcpy instead of the cast to uint64_t, for now limited to the little-endian ARM targets, but I don't see why memcpy cannot always be used on little-endian targets instead of the cast.
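A minimal sketch of the memcpy approach, for illustration only. The helper name load_u64_unaligned is hypothetical and this is not the code from the attached pyhash.diff; it only shows the general technique of copying the bytes into a properly aligned local instead of dereferencing a cast pointer.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read 8 bytes from an address with no alignment
 * guarantee. memcpy works byte-wise, so src may be aligned to 1 byte.
 * The result is in host byte order, same as the cast it replaces. */
static uint64_t load_u64_unaligned(const void *src)
{
    uint64_t v;
    memcpy(&v, src, sizeof v);
    return v;
}

/* What the problematic pattern looks like, kept as a comment because it
 * is undefined behaviour when src is not suitably aligned:
 *
 *     uint64_t v = *(const uint64_t *)src;
 */
```

Compilers typically turn a fixed-size memcpy like this into a plain load on targets that allow unaligned access, so the cost on such targets should be small, though that depends on the compiler and optimization level.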
components: Interpreter Core
title: pyhash's siphash24 assumes alignment of the data pointer
versions: Python 3.5, Python 3.6
Added file: http://bugs.python.org/file44514/pyhash.diff
Python tracker <report at bugs.python.org>