[issue14621] Hash function is not randomized properly

STINNER Victor report at bugs.python.org
Fri Apr 20 01:26:32 CEST 2012


STINNER Victor <victor.stinner at gmail.com> added the comment:

I don't understand this issue: can you write a short script to test a collision? 

"E.g this strings collide for every prefix ending on 0xcd"

Do you mean that prefix & 0xff == 0xcd?

"0x27fd5a18, 0x26fe78fa"

Is it a byte string or a Unicode string? b'\x27\xfd\x5a\x18' and b'\x26\xfe\x78\xfa'?

--

Using the PYTHONHASHSEED environment variable, it's easy to find two seed values that generate the same _Py_HashSecret. Just one example:

PYTHONHASHSEED=3035016679:
* _Py_HashSecret = {0xcd5192eff3fd4d58, 0x3926b1431b200720}
PYTHONHASHSEED=4108758503:
* _Py_HashSecret = {0xcd5192eff3fd4d58, 0x3926b1431b200720}
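
A minimal sketch of why these two seeds can give the same secret. It assumes that the PYTHONHASHSEED value is expanded into the secret bytes by a 32-bit linear congruential generator that keeps only bits 16-23 of each state. The function name and the constants below are written from memory of Python/random.c and should be checked against the actual source. Under that assumption, only the low 24 bits of the seed influence the output:

```python
def lcg_urandom(seed, size):
    """Expand a 32-bit seed into `size` bytes (assumed LCG, see note above)."""
    x = seed & 0xFFFFFFFF
    out = bytearray()
    for _ in range(size):
        # 32-bit linear congruential step
        x = (x * 214013 + 2531011) & 0xFFFFFFFF
        # only bits 16-23 of the state are emitted
        out.append((x >> 16) & 0xFF)
    return bytes(out)


SEED_A = 3035016679
SEED_B = 4108758503

# The two seeds differ only in bits above bit 23.
print(hex(SEED_A ^ SEED_B))

# 16 bytes = two 64-bit words, as in the secret shown above.
secret_a = lcg_urandom(SEED_A, 16)
secret_b = lcg_urandom(SEED_B, 16)
print(secret_a == secret_b)
```

If the assumption holds, any two seeds that agree on their low 24 bits produce the same secret, so there are at most 2**24 distinct secrets reachable through PYTHONHASHSEED.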

--

I wrote find_hash_collision.py to try to compute a collision, but the program fails with:
---
Fail to generate a new seed!
# seeds = 65298
---
So it fails to generate a new random seed after testing 65298 different seeds. I ran the script with a function that only generates seeds producing a prefix "ending with 0xCD".

See the attached program: by default it generates a random seed. Uncomment "seed = generate_seed_0xCD()" if the prefix must end with the 0xCD byte.

----------
Added file: http://bugs.python.org/file25281/find_hash_collision.py

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue14621>
_______________________________________

