[Python-ideas] Dedicated overloadable boolean operators

Oscar Benjamin oscar.j.benjamin at gmail.com
Thu Nov 26 06:56:36 EST 2015


On 26 November 2015 at 06:42, Nathaniel Smith <njs at pobox.com> wrote:
>
> - the worst code expansion created by lack of overloading isn't
>   a == 1 and b == 2
> becoming
>   (a == 1) & (b == 2)
> but rather
>   0 < complex expression < 1
> becoming
>   tmp = complex expression
>   (0 < tmp) & (tmp < 1)
> That is, the implicit 'and' inside chained comparisons is the biggest
> pain point. And this issue is orthogonal to the proposals that involve
> adding new operators, b/c they only help with explicit 'and'/'or', not
> implicit 'and'. Which is why I was sounding you out about making
> chained comparisons eagerly evaluated at the bar at pycon this year
> ;-). I have the suspicion that the short-circuiting semantics of
> chained comparisons are more surprising and confusing than they are
> useful and we should just make them eagerly evaluated, and I know you
> have the opposite intuition, so I think the next step here would be to
> collect some data (run a survey, scan some code, ...?) to figure out
> which of us is right :-).

Regardless of which is more useful, it would be a very subtle backwards
compatibility break. I imagine that code that relies on the
short-circuit here is rare. I'm confident that it is much rarer than
numpy-ish code that works around this in the way you showed above or
just by evaluating the complex expression twice as in
     (0 < complex expression) & (complex expression < 1)
which is what I normally write when the expression isn't too long.
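To make the compatibility concern concrete, here is a minimal sketch of what the current short-circuit means in practice. The helper `traced` and the names `a`, `b`, `c` are made up for illustration. The eager variant is only one guess at what "eagerly evaluated" could look like, not a specification of the proposal.

```python
def traced(name, value, log):
    """Return a zero-argument function that records its call in `log`."""
    def thunk():
        log.append(name)
        return value
    return thunk

log = []
a, b, c = traced("a", 5, log), traced("b", 0, log), traced("c", 10, log)

# Current semantics: a() < b() is 5 < 0, which is False, so c() is never called.
chained = a() < b() < c()
chained_log = list(log)

log.clear()

# Hypothetical eager semantics: every operand is evaluated before comparing.
x, y, z = a(), b(), c()
eager = (x < y) & (y < z)
eager_log = list(log)

print(chained, chained_log)
print(eager, eager_log)
```

The result is the same in both cases. Only the set of operands that get evaluated differs, so code would break only if evaluating the last operand has side effects or can raise.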

But there's guaranteed to be some breakage and a good migration path
to mitigate that is unclear. You could add a __future__ import and a
warning mode to detect when short-circuiting happens in chained
comparisons. Unfortunately the naive implementation of the warning
mode would simply trigger on 50% of chained comparisons (every time
the left-hand relation is False).
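A rough sketch of such a naive warning mode, written as an ordinary function rather than an interpreter change. The name `chained_lt`, the choice of `FutureWarning`, and the use of a callable for the delayed right operand are all assumptions made for illustration.

```python
import warnings

def chained_lt(a, b, get_c):
    """Emulate `a < b < c`, where get_c() produces c only if it is needed."""
    first = a < b
    if not first:
        # Naive detection: warn every time the right operand is skipped,
        # whether or not skipping it could change anything observable.
        warnings.warn("chained comparison short-circuited",
                      FutureWarning, stacklevel=2)
        return first
    return b < get_c()

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    results = [chained_lt(a, b, lambda: 3) for a, b in [(1, 2), (2, 1), (5, 5)]]

n_warnings = len(caught)
print(results, n_warnings)
```

Two of the three calls warn even though the skipped operand is a harmless constant, which is the false-positive problem described above.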

I would definitely use chained comparisons for numpy arrays if it were
possible but at the same time I don't think the status quo on this is
that bad. It's a bit of a gotcha for new numpy users to learn but
numpy is good at giving the appropriate error messages:

    >>> from numpy import array
    >>> 1 < array([1, 2, 3]) < 2
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
    >>> 1 < array([1, 2, 3]) and 3
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

If you google that error message there are lots of Stack Overflow
posts etc. that explain it in more detail.
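The mechanism behind that error can be reproduced without numpy. The toy class below is a hypothetical stand-in, not numpy's implementation. It overloads the comparisons element-wise and refuses to be converted to a bool, which is what the implicit 'and' in a chained comparison tries to do.

```python
class Elementwise:
    """Toy container with element-wise comparisons (illustration only)."""

    def __init__(self, values):
        self.values = list(values)

    def __lt__(self, other):
        return Elementwise(v < other for v in self.values)

    def __gt__(self, other):
        return Elementwise(v > other for v in self.values)

    def __and__(self, other):
        return Elementwise(x and y for x, y in zip(self.values, other.values))

    def __bool__(self):
        raise ValueError("truth value of a multi-element container is ambiguous")

arr = Elementwise([1, 2, 3])

error = None
try:
    # Evaluated as (0 < arr) and (arr < 3); 'and' calls bool() on the
    # first result, which raises.
    0 < arr < 3
except ValueError as exc:
    error = str(exc)

# The explicit workaround avoids bool() entirely.
workaround = ((0 < arr) & (arr < 3)).values
print(error)
print(workaround)
```

Because `and` cannot be overloaded, the class never gets a chance to combine the two comparison results itself. The only hook it has is `__bool__`.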

It's also worth noting, since we're comparing with Matlab, that Matlab
actually has both short-circuit logical && and element-wise logical &,
equivalent to Python's and/& respectively. And Matlab doesn't have
chained comparisons, or rather they don't do anything nearly as useful
as Python's chained comparisons: in Matlab, -2 < -1 < 0 is
evaluated as:
    -2 < -1 < 0
    (-2 < -1) < 0
    1 < 0
    0   (i.e. False)
which is basically useless and doesn't even give a decent error
message like numpy does.
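The same left-to-right reading can be reproduced in Python by adding parentheses, since True compares as 1. This is only an emulation of the evaluation order shown above, not Matlab code.

```python
# Python's chained comparison: (-2 < -1) and (-1 < 0)
chained = -2 < -1 < 0

# Left-to-right evaluation: (-2 < -1) is True, and True < 0 is 1 < 0
left_to_right = (-2 < -1) < 0

print(chained, left_to_right)
```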

--
Oscar

