<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#330033">
On 1/19/2012 8:54 PM, Carl Meyer wrote:
<blockquote cite="mid:4F18F37A.4040200@oddbird.net" type="cite">
<pre wrap="">-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi Victor,
On 01/19/2012 05:48 PM, Victor Stinner wrote:
[snip]
</pre>
<blockquote type="cite">
<pre wrap="">Using a randomized hash may
also break (indirectly) real applications because the application
output is also somehow "randomized". For example, in the Django test
suite, the HTML output is different at each run. Web browsers may
render the web page differently, or crash, or ... I don't think that
Django would like to sort attributes of each HTML tag, just because we
wanted to fix a vulnerability.
</pre>
</blockquote>
<pre wrap="">
I'm a Django core developer, and if it is true that our test-suite has a
dictionary-ordering dependency that is expressed via HTML attribute
ordering, I consider that a bug and would like to fix it. I'd be
grateful for, not resentful of, a change in CPython that revealed the
bug and prompted us to fix it. (I presume that it is true, as it sounds
like you experienced it directly; I don't have time to play around at
the moment, but I'm surprised we haven't seen bug reports about it from
users of 64-bit Pythons long ago). I can't speak for the core team, but
I doubt there would be much disagreement on this point: ideally Django
would run equally well on any implementation of Python, and as far as I
know none of the alternative implementations guarantee hash or
dict-ordering compatibility with CPython.
I don't have the expertise to speak otherwise to the alternatives for
fixing the collisions vulnerability, but I don't believe it's accurate
to presume that Django would not want to fix a dict-ordering dependency,
and use that as a justification for one approach over another.
Carl
</pre>
</blockquote>
<br>
It might be a good idea to have a way to seed the hash with a chosen
value, so that tests can be run under different dict orderings. Tests
developed on one Python implementation could then be made immune to
the different orderings found on other implementations. Randomizing
the hash, however, doesn't solve the problem for long-running
applications, and it makes performance non-deterministic from one run
to the next even with exactly the same data. A different (random) seed
could sporadically cause collisions with data that usually performs
well, and there would be little explanation for the slowdown and
little way to reproduce the problem in order to report or understand it.
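<br>
<br>
To make the idea concrete, here is a rough sketch, not a proposal for
the actual implementation. The hash below is a toy FNV-1a-style
function, not CPython's string hash, and the names seeded_hash,
iteration_order, render_tag, render_tag_sorted and max_bucket_load are
all made up for illustration. It only shows that a fixed seed gives a
reproducible ordering (and a reproducible collision pattern), while
sorting the keys removes the dependency altogether.
<pre wrap="">
```python
from collections import Counter


def seeded_hash(text, seed):
    """Toy FNV-1a-style hash mixed with a seed (illustrative only)."""
    h = (2166136261 ^ seed) % 2**32
    for byte in text.encode("utf-8"):
        h = ((h ^ byte) * 16777619) % 2**32
    return h


def iteration_order(keys, seed, buckets=8):
    """Order keys by bucket index, roughly as a simple hash table would."""
    return sorted(keys, key=lambda k: (seeded_hash(k, seed) % buckets, k))


def render_tag(attrs, seed):
    """Render attributes in hash-dependent order (the fragile approach)."""
    order = iteration_order(list(attrs), seed)
    return " ".join('{}="{}"'.format(k, attrs[k]) for k in order)


def render_tag_sorted(attrs):
    """Render attributes in sorted order, independent of any hash seed."""
    return " ".join('{}="{}"'.format(k, attrs[k]) for k in sorted(attrs))


def max_bucket_load(keys, seed, buckets=8):
    """Size of the fullest bucket: a crude measure of collisions for a seed."""
    loads = Counter(seeded_hash(k, seed) % buckets for k in keys)
    return max(loads.values())


if __name__ == "__main__":
    attrs = {"id": "x", "class": "y", "href": "z"}
    for seed in (1, 2, 3):
        # Same seed, same output on every run; the output may differ by seed.
        print(seed, render_tag(attrs, seed), max_bucket_load(list(attrs), seed))
    print(render_tag_sorted(attrs))
```
</pre>
With something like this, a test suite could be run once per seed to
flush out ordering dependencies, and a slow run could be reported
together with the seed that produced it.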
</body>
</html>