[Python-porting] Control of hash randomization

Aaron Meurer asmeurer at gmail.com
Mon May 28 00:49:31 CEST 2012


On Sun, May 27, 2012 at 2:24 AM,  <martin at v.loewis.de> wrote:
> But there are standard procedures to deal with that very phenomenon:
> use a proper equality function.
>
> People have written tests for years that somehow relied on the order
> of keys in a dictionary (an issue in particular for doctest). If you
> find a failed assertion, and it involves an equality test, verify that
> the comparison uses "normalized" representations of the value. If not,
> add the normalization to all related test cases.
>
> Regards,
> Martin

I appreciate what you're trying to say here, but you need to
understand that in SymPy, the result of a test is a mathematical
expression (in the most literal sense).  A "normalization" function in
this case would be a simplification function.  Expression
simplification is very difficult.  Expression normalization is even
more difficult.  Technically speaking, for a large enough class of
functions, it's impossible.  I can point you to papers that
demonstrate why expression simplification/normalization is so
difficult if you are interested.

In SymPy, we have many tests like

    assert solve(3*x + 5 + 2**(-5*x + 3), x) in [
        [-((25*log(2) - 3*LambertW(-10240*2**(Rational(1, 3))*log(2)/3))/(15*log(2)))],
        [Rational(-5, 3) + LambertW(log(2**(-10240*2**(Rational(1, 3))/3)))/(5*log(2))],
        [-Rational(5, 3) + LambertW(-10240*2**Rational(1, 3)*log(2)/3)/(5*log(2))],
        [(-25*log(2) + 3*LambertW(-10240*2**(Rational(1, 3))*log(2)/3))/(15*log(2))],
        [-((25*log(2) - 3*LambertW(-10240*2**(Rational(1, 3))*log(2)/3)))/(15*log(2))],
        [-(25*log(2) - 3*LambertW(log(2**(-10240*2**Rational(1, 3)/3))))/(15*log(2))],
        [(25*log(2) - 3*LambertW(log(2**(-10240*2**Rational(1, 3)/3))))/(-15*log(2))]
        ]

This was not the test writer trying to come up with every possible
form of the expression.  At some point in time, solve() returned each
one of those answers, but some subtle change in the code somewhere
resulted in it returning a different one.  Actually, in this
particular case, we probably could normalize by expanding, but that's
a simple case.  Here's a more advanced one (heurisch() performs
symbolic integration):

assert heurisch(sin(x)*cos(x), x) in [sin(x)**2 / 2, -cos(x)**2 / 2]

This is a classic example from calculus of a function that produces a
different result depending on how you integrate it.  These two
expressions are not mathematically equal (they differ by the constant
1/2).  That's because heurisch() is only guaranteed to produce *an*
antiderivative, and any two antiderivatives can differ by a constant.
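The constant offset is easy to verify numerically, without SymPy at
all.  A minimal sketch in plain Python (the function names F1/F2 are
made up for illustration):

```python
import math

# heurisch(sin(x)*cos(x), x) can legitimately return either of these
# antiderivatives, depending on which heuristic fires first.
def F1(x):
    return math.sin(x) ** 2 / 2

def F2(x):
    return -math.cos(x) ** 2 / 2

# Both differentiate back to sin(x)*cos(x), but they are not equal:
# sin(x)**2/2 - (-cos(x)**2/2) = (sin(x)**2 + cos(x)**2)/2 = 1/2.
for x in (0.0, 0.7, 1.5, 3.1):
    assert abs((F1(x) - F2(x)) - 0.5) < 1e-12
```

So a test runner that compares the two results directly will flag a
"failure" even though both answers are correct.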

Even if simplify() were reliable, writing "assert simplify(a - b) ==
0" instead of "assert a == b" would make our tests much slower, and
would add the possibility that any test failure anywhere is the result
of a bug in simplify().

What we really need to do is change our internal ordering to not use
hashes, and to avoid iterating through a dictionary when the result
could be different (but still mathematically the same).  At least hash
randomization will point us to the places where this is happening.
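One way to make such comparisons order-insensitive, sketched in plain
Python (this is not SymPy's actual code; the helper name and the
string expressions are made up for illustration), is to compare
collections by a sorted canonical key rather than by iteration order:

```python
# Sketch: compare two collections of results without depending on
# dict/set iteration order, which varies under hash randomization.
# The string expressions below are stand-ins for real SymPy objects.

def canonical(exprs):
    # Sort by the printed form so the comparison is independent of
    # the order in which a dict or set happened to yield its elements.
    return sorted(str(e) for e in exprs)

# Two result sets containing the same answers in different orders:
a = {"x - 1", "x + 1"}
b = {"x + 1", "x - 1"}

assert canonical(a) == canonical(b)
```

This only sidesteps the ordering problem, of course; it does nothing
for results that are mathematically equal but printed differently.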

On Sun, May 27, 2012 at 8:11 AM, Chris Jerdonek
<chris.jerdonek at gmail.com> wrote:
> On Sun, May 27, 2012 at 1:07 AM, Aaron Meurer <asmeurer at gmail.com> wrote:
>> On Sun, May 27, 2012 at 12:22 AM,  <martin at v.loewis.de> wrote:
>>> Also: if a test fails due to hash randomization, it should normally
>>> be possible to find the root cause by just reviewing the code (long
>>> enough). It may not be possible to reproduce the failure, but it
>>> should be obvious if a certain piece of code would fail under hash
>>> randomization.
>>>
>> Ha!  Well, that's easy enough to say, but if all you have to work with
>> is an assertion that failed, and a very large code base, it might not
>> be so straightforward.  Furthermore, such situations are very often
>> not obvious (or else the author probably would not have written them
>> in the first place).
>
> Sorry if this is obvious, but another suggestion is to include more
> information in the assertion error message.  It's not clear from the
> discussion so far if this is being done.  That way you can learn more
> about the state of the code during the assertion failure.
>
> I also spot checked a few spots in SymPy and see this being done frequently--
>
>    def test_ratint():
>        assert ratint(S(0), x) == 0
>
> Options include using the extended form of the assert statement, or
> the msg parameter of unittest's various assertion methods.
>
> --Chris

Yes, this is basically how we do our tests: "assert
result_of_some_computation == the_correct_answer".  The solution in
this case is for us to use py.test instead of our home-grown test
runner; py.test provides this output for us.

Aaron Meurer


More information about the Python-porting mailing list