[Python-Dev] Hash collision security issue (now public)

martin at v.loewis.de martin at v.loewis.de
Sun Jan 1 04:36:37 CET 2012


> (Well, technically, you could use trees or some other O log n data
> structure as a fallback once you have too many collisions, for some value
> of "too many".  Seems a bit wasteful for the purpose, though.)

I don't think that would be wasteful. You wouldn't just use the tree for
the case of too many collisions, but for any collision. You might special-case
the case of a single key, i.e. start using the tree only if there is a
collision.

The issue is not the effort, but the need to support ordering if you want
to use trees. So you might restrict this to dicts that have only str keys
(which in practice should never have any collision, unless it's a deliberate
setup).

I'd use the tagged-pointer trick to determine whether a key is an object
pointer (tag 0) or an AVL tree (tag 1). So in the common case of interned
strings, the comparison for pointer equality (which is the normal case
if the keys are interned) will succeed quickly; if pointer comparison fails,
check the tag bit.

Regards,
Martin






More information about the Python-Dev mailing list