[Python-Dev] Hash collision security issue (now public)

Steven D'Aprano steve at pearwood.info
Fri Jan 6 02:52:45 CET 2012


Benjamin Peterson wrote:
> 2012/1/5 Nick Coghlan <ncoghlan at gmail.com>:
>> On Fri, Jan 6, 2012 at 10:07 AM, Steven D'Aprano <steve at pearwood.info> wrote:
>>> Surely the way to verify the behaviour is to run this from the shell:
>>>
>>> python -c 'print(hash("abcde"))'
>>>
>>> twice, and see that the calls return different values. (Or have I
>>> misunderstood the way the fix is going to work?)
>>>
>>> In any case, I wouldn't want to rely on the presence of a flag in the sys
>>> module to verify the behaviour, I'd want to see for myself that hash
>>> collisions are no longer predictable.
>> More directly, you can just check that the hash of the empty string is non-zero.
>>
>> So -1 for a flag in the sys module - "hash('') != 0" should serve as a
>> sufficient check whether or not process-level string hash
>> randomisation is in effect.
> 
> What exactly is the disadvantage of a sys attribute? That would seem
> preferable to an obscure incantation like that.

There's nothing obscure about directly testing the hash. That's about as far 
from obscure as it is possible to get: you are directly testing the presence 
of a feature by testing the feature.
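
As a rough sketch of what "testing the feature" could look like from inside
Python rather than from the shell: start two fresh interpreters and compare
the hash each one reports for the same string. The helper names and the
extra_env parameter are made up for illustration. PYTHONHASHSEED is the
environment variable that later CPython releases use to control the seed; it
is not something this thread defines.

```python
import os
import subprocess
import sys


def hash_in_fresh_interpreter(text, extra_env=None):
    """Return hash(text) as computed by a newly started interpreter.

    extra_env is a hypothetical convenience: a dict of environment
    variables to overlay on the current environment.
    """
    env = dict(os.environ)
    if extra_env:
        env.update(extra_env)
    code = "print(hash({!r}))".format(text)
    out = subprocess.check_output([sys.executable, "-c", code], env=env)
    return int(out.decode().strip())


def hashes_look_randomised(text="abcde", extra_env=None):
    """True if two separate interpreter runs disagree about hash(text)."""
    first = hash_in_fresh_interpreter(text, extra_env)
    second = hash_in_fresh_interpreter(text, extra_env)
    return first != second


if __name__ == "__main__":
    print(hashes_look_randomised())
```

If the surrounding environment already pins the seed (for example
PYTHONHASHSEED=0), the two runs agree and the check reports False, which is
the answer you want in that case.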

Relying on a flag to tell you whether hashes are randomised adds
complexity: now you need to care about whether hashes are randomised AND know
that there is a flag you can look up and what it is called.

And since the flag won't exist in all versions of Python, or even in all 
builds of a particular Python version, it isn't a matter of just testing the 
flag, but of doing the try...except or hasattr() dance to check whether it 
exists first.
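
To make that dance concrete, here is a minimal sketch. It assumes the flag is
spelled sys.flags.hash_randomization, which is the name later CPython
releases settled on; nothing in this thread fixes the name, and the function
name is invented for the example.

```python
import sys


def flag_says_randomised():
    """Return True or False if the flag exists, None if it is absent.

    Assumes the attribute is sys.flags.hash_randomization; older versions
    and some builds will not have it, hence the getattr() fallbacks.
    """
    flags = getattr(sys, "flags", None)
    value = getattr(flags, "hash_randomization", None)
    if value is None:
        return None  # flag unknown to this interpreter
    return bool(value)


if __name__ == "__main__":
    print(flag_says_randomised())
```

Note the three-way result: callers have to handle "flag missing" as well as
on and off, which is exactly the extra bookkeeping described above.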

At some point, presuming that there is no speed penalty, the behaviour will
surely become not just enabled by default but mandatory. Python has never
promised that hashes must be predictable or consistent from one run to the
next, so apart from backwards compatibility concerns for old versions, there
is no reason to keep it optional. I'd argue in favour of making it mandatory
for 3.3. Why do we need a flag for something that is going to be always on?



-- 
Steven

