[Python-ideas] Dedicated overloadable boolean operators

Guido van Rossum guido at python.org
Thu Nov 26 19:47:15 EST 2015


On Wed, Nov 25, 2015 at 10:42 PM, Nathaniel Smith <njs at pobox.com> wrote:

> On Nov 24, 2015 11:53 AM, "Guido van Rossum" <guido at python.org> wrote:
> >
> > On Tue, Nov 24, 2015 at 11:37 AM, Nathaniel Smith <njs at pobox.com> wrote:
> > > On Nov 24, 2015 10:21 AM, "Guido van Rossum" <guido at python.org> wrote:
> > >>
> > >> To everyone claiming that you can't overload and/or because they are
> > >> shortcut operators, please re-read PEP 335. It provides a clean
> > >> solution -- it was rejected because it adds an extra byte code to all
> > >> code using those operators (the majority of which don't need it).
> > >
> > > The semantic objection that I raised -- short circuiting means that you
> > > can't correctly overload 'True and numpy_array', because unlike all other
> > > binops the overload must be defined on the left hand argument -- does apply
> > > to PEP 335 AFAICT. This problem is IMHO serious enough that even if PEP 335
> > > were accepted today I'm not entirely sure that numpy would actually
> > > implement the overloads due to the headaches it would cause for teaching and
> > > code review -- we'd have to have some debate about it at least.
> >
> > OK, that's useful feedback. Is NumPy interested in coming up with an
> > alternative that works, or are you fine with the status quo?
>
> We'd certainly love it if there were a better alternative, but --
> speaking just for myself here -- I've hesitated to wade in because I
> don't have any brilliant ideas to contribute :-). The right-hand-side
> overload problem seems like an inevitable consequence of
> short-circuiting, and we certainly aren't going to switch 'and'/'or'
> to become eagerly evaluating. OTOH none of the alternative proposals
> mooted so far have struck me as very compelling or pythonic, if only
> because all the proposed spellings are ugly, but, who knows, sometimes
> something awesome appears deep in these threads.
>
> Two thoughts on places where it might be easier to make some progress...
>
> - the 'not' overloading proposed in PEP 335 doesn't seem to create any
> horrible problems - it'd be a minor thing, but maybe it's worth
> pulling out as a standalone change?
>

This seems pretty harmless, and mostly orthogonal to the rest -- except
that overloadable 'not' is not very attractive or useful by itself if we
decide not to address the others.


> - the worst code expansion created by lack of overloading isn't
>   a == 1 and b == 2
> becoming
>   (a == 1) & (b == 2)
> but rather
>   0 < complex expression < 1
> becoming
>   tmp = complex expression
>   (0 < tmp) & (tmp < 1)
> That is, the implicit 'and' inside chained comparisons is the biggest
> pain point. And this issue is orthogonal to the proposals that involve
> adding new operators, b/c they only help with explicit 'and'/'or', not
> implicit 'and'. Which is why I was sounding you out about making
> chained comparisons eagerly evaluated at the bar at pycon this year
> ;-). I have the suspicion that the short-circuiting semantics of
> chained comparisons are more surprising and confusing than they are
> useful and we should just make them eagerly evaluated, and I know you
> have the opposite intuition, so I think the next step here would be to
> collect some data (run a survey, scan some code, ...?) to figure out
> which of us is right :-).
>
> The latter idea in particular has been on my todo list for at least 6
> months and still has not bubbled up near the top, so if anyone is
> interested in pushing it forward then please feel free :-).


What to do with chained comparisons is a very good question, but changing
them to be non-short-circuiting would cause a world of
backwards-incompatible pain. For example, I've definitely written code where I
carefully arranged the comparisons so that the most expensive one comes
last (in hopes of sometimes avoiding the time spent on it if the outcome is
already determined). This is particularly easy with chained ==, but you can
sometimes also change a < b < c into c > b > a, if a happens to be the
expensive one.
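
A minimal sketch of that short-circuiting, in plain Python. The functions
`cheap` and `expensive` are hypothetical stand-ins that just record when
they are called; the point is only that the chained form skips the last
operand once the outcome is known, while the eager `&` rewrite does not.

```python
calls = []

def cheap():
    calls.append("cheap")
    return 5

def expensive():
    calls.append("expensive")
    return 10

def chained():
    # Short-circuits: 0 > cheap() is False, so expensive() is never called.
    return 0 > cheap() > expensive()

def eager():
    # The eager rewrite evaluates every operand before combining with &.
    a = cheap()
    b = expensive()
    return (0 > a) & (a > b)

calls.clear()
chained_result = chained()
chained_calls = list(calls)

calls.clear()
eager_result = eager()
eager_calls = list(calls)
```

Both forms give the same answer here, but only the chained one avoids the
call to `expensive`.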

However, note that PEP 335 *does* address the problem of chained
comparisons head-on -- in a < b < c, if a < b returns a numpy array (or
some other special object that's not falsey), it will then proceed to
compute b < c and combine the two using the overloading of the default
'and'; because the first result is not a simple bool, the problem you
described with `True and numpy_array` does not apply.
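
To make that concrete, here is a rough emulation of the two-phase idea.
Everything in it is an assumption for illustration: today's Python never
calls these hooks, `Elementwise` is a toy stand-in for a numpy array,
`emulated_and` is a hypothetical helper, the sentinel and method names follow
my reading of PEP 335, and the reflected phase-two hooks are left out
entirely.

```python
NeedOtherOperand = object()  # stand-in sentinel for this sketch

class Elementwise:
    """Toy array-like whose 'and' is element-wise."""

    def __init__(self, values):
        self.values = list(values)

    def __lt__(self, other):
        return Elementwise(v < other for v in self.values)

    def __gt__(self, other):
        return Elementwise(v > other for v in self.values)

    def __and1__(self):
        # Phase 1: the left operand alone cannot decide the result.
        return NeedOtherOperand

    def __and2__(self, other):
        # Phase 2: combine element-wise once the right operand is known.
        return Elementwise(a and b for a, b in zip(self.values, other.values))

def emulated_and(left, right_thunk):
    """Emulate `left and right`; the right operand is passed as a thunk."""
    phase1 = getattr(type(left), "__and1__", None)
    if phase1 is None:
        # Plain objects keep the ordinary short-circuit behaviour.
        return left and right_thunk()
    result = phase1(left)
    if result is not NeedOtherOperand:
        return result
    return type(left).__and2__(left, right_thunk())

b = Elementwise([-1, 0.5, 2])
# Emulates 0 < b < 1: the first comparison already yields an Elementwise,
# so the second comparison is computed and the two are combined.
chained = emulated_and(0 < b, lambda: b < 1)
plain = emulated_and(False, lambda: b < 1)
```

With a plain `False` on the left the emulation still short-circuits and the
right operand never gets a say, which is the case Nathaniel was worried about.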

So, maybe you and the numpy community can ponder PEP 335 some more?
Honestly, from the general language design POV (i.e., mine :-), PEP 335
feels more acceptable than introducing new non-short-circuit and/or
operators. How common would the `True and numpy_array` problem really be? I
suppose any real occurrences would not use the literal True; `True and x`
is just a wordy way to spell x, and doing this element-wise would just
return the array x unchanged. (Or would it cast the elements to bool? That
still feels like a unary operator to me that deserves a more direct
spelling.) I don't see a use case for a literal left operand in symbolic
algebra or SQL either. But let's assume we have some scalar expression that
evaluates to a bool. Even then, `x and numpy_array` feels like a clumsy way
to spell `numpy_array if x else <an array of the same shape filled with
False>`. But I suppose you've thought about this more than I have. :-)
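
A small sketch of the two spellings as they behave in today's Python, using a
plain list as a stand-in for an array (an assumption made so the example has
no dependencies; `gate_with_and` and `gate_explicit` are hypothetical names):

```python
def gate_with_and(flag, mask):
    # Today's semantics: returns mask when flag is truthy, else flag itself.
    return flag and mask

def gate_explicit(flag, mask):
    # The explicit spelling: always returns something shaped like mask.
    return mask if flag else [False] * len(mask)

mask = [True, False, True]

and_true = gate_with_and(True, mask)
and_false = gate_with_and(False, mask)
explicit_true = gate_explicit(True, mask)
explicit_false = gate_explicit(False, mask)
```

The two agree when the flag is true; when it is false, the `and` form hands
back the bare `False` rather than an all-False container of the same length.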

-- 
--Guido van Rossum (python.org/~guido)