[Python-ideas] Break the dominance of boolean values in boolean context

Mike Graham mikegraham at gmail.com
Mon Sep 12 22:44:22 CEST 2011


On Mon, Sep 12, 2011 at 4:20 PM, Lukas Lueg <lukas.lueg at googlemail.com> wrote:
> Hi,
>
> the proposal is to reduce the number of occurrences where Python
> automatically and inevitably uses pure boolean objects (bool()),
> instead of relying on rich object behaviour. My goal is to give
> objects finer control over how they want to be interpreted in a
> boolean context. This reaches down as far as implementing the boolean
> operations as an object-protocol.
>
> In Python, no kind of object is in any way special. Instead of relying
> on what an object actually IS, we rely on what an object can DO. This
> approach of anti-discrimination is the source of great confidence
> about the language and its aversion to black magic. The bool()-type
> however is a primus inter pares: In numerous locations, Python
> strongly favors this type of object either explicitly by casting to it
> or implicitly through the compiled code.
>
>
> Here are three examples (protocol-, object- and code-based) where
> Python grinds a boolean context down to pure boolean values:
>
> In protocol behaviour: The documentation about how to implement
> __contains__ vaguely states that the function "should return true if
> item is in self, false otherwise". In fact we may return any object
> that can be interpreted as a boolean; this is intended behaviour.
> However, the implementation of COMPARE_OP in Python/ceval.c inevitably
> casts the result of "x in y" to Py_True or Py_False, throwing away our
> object and leaving us with this bool only.
> This kills any chance of producing useful objects in __contains__,
> even objects that may have a meaning outside the pure boolean context.
> For example:
>
> x = mycontainer(['foo', 'bar', 'bar', 'foo', 'foo'])
> y = 'foo' in x
> print y == True
>>> True
> print y
>>> mycontainer('foo', 'foo', 'foo')
>
> This has been discussed on python-ideas in mid-2010
> (http://mail.python.org/pipermail/python-ideas/2010-July/007733.html)
> without distinct outcome.
>
>
> In object behaviour: Rich Comparison allows numeric- or set-like
> comparison in any way we like, returning any kind of object. The
> documentation clearly states that the returned object is only(!)
> interpreted as a boolean when used in a boolean context (__contains__
> falls short here).
> For example:
>
> x = set((1,2,3,4,5))
> y = set((2,3,4))
> z = x > y
> print z == True
>>> True
> print z
>>> set([1,5]) # equivalent to x - y
>
> z = x < y
> print z == True
>>> False
> print z
>>> set([]) # equivalent to y - x
>
> print 3 > 2
>>> 1
> print (3 > 2) == True
>>> True
>
>
> In code behaviour: Why is it that we can override the behaviour of
> arithmetic operations like "+, -, *, /", but not the boolean
> operations like "and, or, xor, not" ? Why do we force any value being
> generated in a boolean context like "if x == 1 or y == 1" to actually
> be of boolean type? The result of "(x/y) == 1" may be any kind of
> object coming from __eq__. However the "or"-operator here grinds them
> both down to being just True or False. This is due to the fact that
> the generated bytecode evaluates one expression at a time and does not
> keep track of the objects it got while doing so. The pure boolean
> behaviour arises from the use of JUMP_IF_FALSE/TRUE after each
> comparison.
> Instead of having boolean operations being a part of the language, we
> could implement them as an extension to the Rich Comparison
> Protocol, giving rise to functions like object.__bor__,
> object.__bnot__ or object.__band__:
>
> The expression "if x == 1 or y == 1" would then become equivalent to
>
> tmp1 = x.__eq__(1)
> if tmp1:
>  return tmp1
> tmp2 = y.__eq__(1)
> if tmp2:
>  return tmp2
> return tmp1.__bor__(tmp2)
>
> Likewise, the expression "if x == 1 and y == 1" would become
>
> tmp1 = x.__eq__(1)
> if not tmp1:
>  return tmp1
> tmp2 = y.__eq__(1)
> if not tmp2:
>  return tmp2
> return tmp1.__band__(tmp2)
>
> The object-example from above now tells us how boolean behaviour and
> arithmetic behaviour go hand in hand: "(setA > setB) or (setA < setB)"
> is True because "set([1,5]).__bor__(set([]))" is the same as
> "set([1,5]) + set([])" and equivalent to True in a boolean context.
> Likewise, "(setA > setB) and (setA < setB)" is False because
> "set([1,5]).__band__(set([]))" is just "set([])". It follows that
> "(setA > setB) == (setA - setB) == (setA & setB)". We can't do this
> with boolean operations being part of the language only.
>
>
> Summing it all up, I really think that we should break the dominance of
> bool() and take a look at how we can implement boolean contexts
> without relying on boolean values all the time.
>
> None of this can be implemented without breaking at least the CPython
> API. For example, the behaviour of __contains__ can't be changed in
> the proposed way without changing the signature of "int
> PySequence_Contains()" to "PyObject* PySequence_Contains()".
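
For reference, a small sketch of what "and", "or" and "not" do with
arbitrary objects today. It uses Python 3 syntax (so __bool__ rather
than __nonzero__), and the Flag class is made up purely for
illustration; it is not part of the proposal.

```python
class Flag:
    """Hypothetical object with a payload and an explicit truth value."""

    def __init__(self, payload, truth):
        self.payload = payload
        self.truth = truth

    def __bool__(self):
        # Consulted whenever the object is tested for truth.
        return self.truth


a = Flag('a', False)
b = Flag('b', True)

or_result = a or b     # a is falsy, so the object b is the result
and_result = a and b   # a is falsy, so the object a is the result
not_result = not b     # "not" always produces a real bool

print(or_result.payload, and_result.payload, not_result)
```

As far as I can tell, the operands of "and" and "or" are only tested
for truth; there is no hook comparable to __add__ that would let an
object decide what the combined result should be.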


Your basic idea seems like a good one to me at an abstract level --
methods like __contains__ have no good design reason to typecheck.
However, I don't think any of the concrete changes here serve to make
Python a nicer or saner language. There is no reason why set should be
changed in either of the ways you discuss, and there is no excuse I can
think of to implement something that works like mycontainer.
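
To make the __contains__ point concrete, here is a minimal sketch of
the asymmetry as I understand it (Python 3 syntax; the Matches class
is hypothetical and only loosely modelled on mycontainer):

```python
class Matches:
    """Hypothetical container whose methods return lists, not bools."""

    def __init__(self, items):
        self.items = list(items)

    def __contains__(self, item):
        # Returns a (possibly empty) list of the matching items.
        return [x for x in self.items if x == item]

    def __gt__(self, other):
        # Rich comparison: whatever is returned here is handed back as-is.
        return [x for x in self.items if x not in other.items]


x = Matches(['foo', 'bar', 'bar', 'foo', 'foo'])
y = Matches(['bar'])

in_result = 'foo' in x                # the interpreter coerces this to bool
call_result = x.__contains__('foo')   # a direct call keeps the list
gt_result = x > y                     # the list comes back unchanged

print(in_result, call_result, gt_result)
```

Calling the method directly shows that the list is only discarded by
the "in" operator itself, not by anything in the class.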

Mike


