[Python-Dev] Counting collisions for the win

Donald Stufft donald.stufft at gmail.com
Fri Jan 20 20:51:16 CET 2012

On Friday, January 20, 2012 at 2:36 PM, Tres Seaver wrote:

> On 01/20/2012 02:04 PM, Donald Stufft wrote:
> > Even if a MemoryError is raised I believe that is still a 
> > fundamental change in the documented contract of the dictionary API.
> > 
> How so? Dictionary inserts can *already* raise that error.
Because it would be raising it for a fundamentally different reason: "You have plenty of memory, but we decided to add an arbitrary limit that has nothing to do with memory, and we'll pretend you are out of memory anyway."
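
To make the objection concrete, here is a toy sketch of what a collision limit looks like from the caller's side. It is not the proposed patch: `CountingTable` and `TooManyCollisions` are hypothetical names, the table uses simple chaining and never resizes, and the real proposal would count probes inside CPython's dict instead.

```python
class TooManyCollisions(Exception):
    """Hypothetical error; the thread discusses reusing MemoryError instead."""


class CountingTable:
    """Toy chained hash table that refuses an insert once a bucket is full."""

    def __init__(self, limit=1000, nbuckets=8):
        self.limit = limit
        self.buckets = [[] for _ in range(nbuckets)]

    def __setitem__(self, key, value):
        bucket = self.buckets[hash(key) % len(self.buckets)]
        for i, (existing, _) in enumerate(bucket):
            if existing == key:
                bucket[i] = (key, value)  # overwrite, no new collision
                return
        if len(bucket) >= self.limit:
            # Plenty of memory is available; the limit is purely a policy.
            raise TooManyCollisions("collision limit reached")
        bucket.append((key, value))

    def __getitem__(self, key):
        for existing, value in self.buckets[hash(key) % len(self.buckets)]:
            if existing == key:
                return value
        raise KeyError(key)


# Small ints hash to themselves, so multiples of 8 all land in bucket 0.
table = CountingTable(limit=3)
for i in range(3):
    table[8 * i] = i
try:
    table[24] = 3
except TooManyCollisions:
    rejected = True
else:
    rejected = False
```

The fourth insert is rejected even though nothing is wrong with the key or with available memory, which is the change in contract I am objecting to.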
> > I don't believe there is a way to fix this without breaking someone's 
> > application. The major difference I see between the two solutions is
> > that counting will break the applications of people who are otherwise 
> > following the documented API contract of dictionaries,
> > 
> Do you have a case in mind where legitimate user data (not crafted as
> part of a DoS attack) would trip the 1000-collision limit? How likely is
> it that such cases exist in already-deployed applications, compared to
> the known breakage in existing applications due to hash randomization?

I don't, but there has never been a limit on how many collisions a dictionary can have, so this would be a fundamental change in the documented (and undocumented) abilities of a dictionary. Dictionary key order, by contrast, has never been guaranteed and is documented as something not to rely on. It already changes depending on whether you are using 32-bit, 64-bit, Jython, PyPy, etc., and, as someone else pointed out, it could change with any number of possible improvements to dict. The counting solution violates the existing contract in order to serve people who are themselves violating the contract. Even given their violation, the approach that I +1'd still avoids breaking existing applications by default.
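
As a rough illustration of the "no limit" point, the sketch below uses a hypothetical `CollidingKey` class whose instances all share one hash value. A plain dict accepts more than 1000 of them; inserts and lookups get slower, but nothing in the dictionary API rejects them.

```python
class CollidingKey:
    """Hypothetical key type whose instances all share one hash value."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return 42  # every instance collides on purpose

    def __eq__(self, other):
        return isinstance(other, CollidingKey) and self.name == other.name


# 1200 keys, all colliding: legal under the current contract, just slow.
d = {CollidingKey(i): i for i in range(1200)}
```

Whether any real application does this with legitimate data is a separate question; the point is only that the existing contract permits it.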
> > and randomization will break the applications of people who are violating 
> > the documented API contract of dictionaries.
> > 
> > Personally I feel that the lesser of two evils is to reward those who
> > followed the documentation, and not reward those who didn't.
> > 
> Except that I think your set is purely hypothetical, while the second set
> is *lots* of deployed applications.

Which is why I believe that randomization should be off by default in the bugfix releases, but easily enabled (flag, env var, whatever). That allows people to upgrade to a bugfix release without breaking their application, and if this vulnerability affects them, they can enable it.
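
A minimal sketch of the opt-in behaviour I have in mind, assuming an environment variable as the switch. The variable name, `pick_hash_seed`, and `seeded_hash` are all made up for illustration, and the hash function is a toy, not CPython's real string hash.

```python
import os
import random

ENV_VAR = "HYPOTHETICAL_HASH_RANDOMIZATION"  # made-up name for illustration


def pick_hash_seed(environ=os.environ):
    """Return 0 (no randomization) unless the user explicitly opted in."""
    if environ.get(ENV_VAR) == "1":
        # "or 1" keeps an opted-in seed from ever being the 0 "off" value.
        return random.SystemRandom().getrandbits(32) or 1
    return 0


def seeded_hash(text, seed):
    """Toy 32-bit string hash mixed with a seed; illustration only."""
    h = seed
    for ch in text:
        h = ((h * 1000003) ^ ord(ch)) & 0xFFFFFFFF
    return h


default_seed = pick_hash_seed({})                # behaviour unchanged on upgrade
opted_in_seed = pick_hash_seed({ENV_VAR: "1"})   # user chose the fix
```

With the switch unset the seed is 0, so existing applications see the same hashes as before the upgrade; with it set, the same string hashes differently from run to run.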

I think collision counting is at best a band-aid and not a proper fix. It stems from a desire not to break existing applications in a bugfix release, and that is better solved by implementing the real fix and letting people control whether it is enabled (only in the bugfix releases; on 3.3+ it should always be forced on).
> Tres.
