[Python-ideas] PEP 485: A Function for testing approximate equality

Zero Piraeus schesis at gmail.com
Wed Jan 28 01:38:04 CET 2015



On Tue, Jan 27, 2015 at 01:31:25PM -0800, Chris Barker wrote:
> I'm still looking for a case where a user would likely pass the same
> value into the function in a different order -- wouldn't s/he pick an
> order (maybe arbitrarily) and use that?

Sometimes you *can't* pick an order (as in my dict example), and
sometimes your data picks its own order (for example, using all_close()
or similar to check the consistency of experimental results).
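
For concreteness, here's a rough sketch of what such an all_close()
might look like (the symmetric is_close_to() stand-in is mine, just
for illustration -- it's not the PEP's definition):

    import itertools

    def is_close_to(a, b, tol):
        # stand-in pairwise test: relative to the larger magnitude
        return abs(a - b) <= tol * max(abs(a), abs(b))

    def all_close(iterable, tol):
        # True only when every pair of values passes the pairwise test
        return all(is_close_to(a, b, tol)
                   for a, b in itertools.combinations(iterable, 2))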

On Tue, Jan 27, 2015 at 02:00:05PM -0800, Chris Barker wrote:
> On Tue, Jan 27, 2015 at 1:52 PM, Paul Moore <p.f.moore at gmail.com> wrote:
> 
> > The order of dictionary iteration is arbitrary. Thanks to hash
> > randomization, the order will differ each time the program is run.
> 
> duh, of course!
> 
> But the question remains -- legitimate use case, or pathological example?
> 
> This kind of points to an is_close_to() and an are_close() function
> (better than a flag, yes?), but that really does seem like overkill.

Note that my example was an any_close() function, not an all_close()
one. I doubt anyone's seriously going to suggest adding *three* new
functions, so anyone who wants something like that will by necessity
end up rolling their own.

I admit I threw in the unpredictable iteration order of dictionaries to
make the result as surprising as I could. There was a point to that ...

Some behaviour in Python is surprising, often (as with dict iteration)
necessarily so. However, the more surprising behaviour there is in the
language, the more likely it is that two instances of that behaviour
will interact in ways that are *especially* confusing and hard to debug.

Still, if you're not convinced by my dictionary shenanigans, here's
something more straightforward:

    import itertools

    def any_close(iterable, tol):
        # True if any two values in the iterable compare as close
        pairs = itertools.combinations(iterable, 2)
        return any(is_close_to(a, b, tol) for a, b in pairs)

Notice that

    results = [2.34, 5.68, 9.99, 5.67, 0.01]
    ac1 = any_close(results, 1/568)
    results.sort()
    ac2 = any_close(results, 1/568)

will result in ac1 and ac2 being different, assuming the underlying
test is asymmetric: sorting swaps the order in which 5.67 and 5.68
are paired, and their difference is within 1/568 relative to 5.68
but not relative to 5.67. A workflow along the lines of:

User 1:
- generate data
- check sanity [all_close(), any_close(), etc]
- normalize [sort]
- send to User 2 [or persistent storage]

User 2:
- receive from User 1 [or persistent storage]
- check sanity [as above]
- process data

... is pretty reasonable IMO, and would be adversely affected by the
behaviour described above.

You might fix that by having any_close() work on a sorted copy of the
iterable, or by normalizing before sanity-checking, but that's not
necessarily obvious until you get bitten, and it's not that hard to
imagine scenarios where neither of those fixes is practical.
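
For the curious, here's a runnable sketch of the effect. The
is_close_to() below is a stand-in of my own: an asymmetric test
measured relative to its second argument, which is one of the
definitions that's been floated in this thread (not necessarily what
the PEP will end up with):

    import itertools

    def is_close_to(actual, expected, tol):
        # stand-in: asymmetric, measured relative to "expected"
        return abs(actual - expected) <= tol * abs(expected)

    def any_close(iterable, tol):
        pairs = itertools.combinations(iterable, 2)
        return any(is_close_to(a, b, tol) for a, b in pairs)

    results = [2.34, 5.68, 9.99, 5.67, 0.01]
    print(any_close(results, 1/568))  # False: pair is tested as (5.68, 5.67)
    results.sort()
    print(any_close(results, 1/568))  # True: pair is now (5.67, 5.68)

    # the "sorted copy" fix mentioned above: compare in a canonical
    # order, so the caller's ordering can't change the answer
    def any_close_sorted(iterable, tol):
        return any_close(sorted(iterable), tol)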

 -[]z.

-- 
Zero Piraeus: post scriptum
http://etiol.net/pubkey.asc

