Break the dominance of boolean values in boolean context
Hi,

the proposal is to reduce the number of occurrences where Python automatically and inevitably uses pure boolean objects (bool()) instead of relying on rich object behaviour. My goal is to give objects finer control over how they want to be interpreted in a boolean context. This reaches down as far as implementing the boolean operations as an object protocol.

In Python, no kind of object is in any way special. Instead of relying on what an object actually IS, we rely on what an object can DO. This approach of anti-discrimination is the source of great confidence about the language and its aversion to black magic. The bool() type, however, is a primus inter pares: in numerous locations, Python strongly favors this type of object, either explicitly by casting to it or implicitly through the compiled code.

Here are three examples (protocol-, object- and code-based) where Python grinds a boolean context down to pure boolean values:

In protocol behaviour:
The documentation about how to implement __contains__ vaguely states that the function "should return true, false otherwise". In fact we may return any object that can be interpreted as a boolean; this is intended behaviour. However, the implementation of COMPARE_OP in Python/ceval.c inevitably casts the result of "x in y" to Py_True or Py_False, throwing away our object and leaving us with this bool only. This kills any chance of producing useful objects in __contains__, even objects that may have a meaning outside the pure boolean context. For example (the behaviour being asked for, not what Python does today):

    x = mycontainer(['foo', 'bar', 'bar', 'foo', 'foo'])
    y = 'foo' in x
    print y == True    # True
    print y            # mycontainer('foo', 'foo', 'foo')
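The coercion described above can be observed from plain Python. The following is only a minimal sketch in Python 3 syntax: TaggedContainer is a hypothetical stand-in for mycontainer, and returning a list of matches is an assumption made for illustration, not something the proposal prescribes.

```python
class TaggedContainer:
    """Hypothetical container whose __contains__ returns the matching items."""

    def __init__(self, items):
        self.items = list(items)

    def __contains__(self, needle):
        # Return a rich object (a list of matches) instead of a plain bool.
        return [item for item in self.items if item == needle]


box = TaggedContainer(["foo", "bar", "bar", "foo", "foo"])

direct = box.__contains__("foo")  # calling the method directly keeps the list
via_operator = "foo" in box       # the `in` operator reduces it to a bool

print(direct)
print(via_operator)
```

Calling the method directly hands back the list, while the operator only ever yields True or False, which is the loss of information the proposal complains about.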
This has been discussed on python-ideas in mid-2010 (http://mail.python.org/pipermail/python-ideas/2010-July/007733.html) without a distinct outcome.

In object behaviour:
Rich Comparison allows numeric- or set-like comparison in any way we like, returning any kind of object. The documentation clearly states that the returned object is only(!) interpreted as a boolean when used in a boolean context (__contains__ falls short here). For example (showing what a set type could choose to return; the built-in set answers these comparisons with plain booleans):

    x = set((1,2,3,4,5))
    y = set((2,3,4))
    z = x > y
    print z == True          # True
    print z                  # set([1,5]), equivalent to x - y
    z = x < y
    print z == True          # False
    print z                  # set([]), equivalent to y - x
    print 3 > 2              # 1
    print (3 > 2) == True    # True
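That rich comparison already lets an object survive the comparison can be checked with a small subclass. This is a sketch in Python 3 syntax under stated assumptions: DiffSet is a hypothetical name, and mapping ">" and "<" to set differences simply mirrors the example above rather than any existing library behaviour.

```python
class DiffSet(set):
    """Hypothetical set subclass whose ordering comparisons return differences."""

    def __gt__(self, other):
        # Elements of self that are missing from other (self - other).
        return set(self) - set(other)

    def __lt__(self, other):
        # Elements of other that are missing from self (other - self).
        return set(other) - set(self)


x = DiffSet({1, 2, 3, 4, 5})
y = DiffSet({2, 3, 4})

greater = x > y  # the returned set is kept as-is, no cast to bool
less = x < y

print(greater, bool(greater))
print(less, bool(less))
```

The objects are only reduced to a truth value once they reach a boolean context such as "if" or bool(), which is the behaviour __contains__ lacks.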
In code behaviour:
Why is it that we can override the behaviour of arithmetic operations like "+, -, *, /", but not the boolean operations like "and, or, not"? Why do we force any value being generated in a boolean context like "if x == 1 or y == 1" to actually be of boolean type? The results of "x == 1" and "y == 1" may be any kind of object coming from __eq__. However, the "or" operator here grinds them both down to being just True or False. This is due to the fact that the generated bytecode evaluates one expression at a time and does not keep track of the objects it got while doing so. The pure boolean behaviour arises from the use of JUMP_IF_FALSE/TRUE after each comparison.

Instead of having boolean operations be a part of the language, we could implement them as an extension to the Rich Comparison Protocol, giving rise to functions like object.__bor__, object.__bnot__ or object.__band__. The expression "if x == 1 or y == 1" would then become equivalent to

    tmp1 = x.__eq__(1)
    if tmp1:
        return tmp1
    tmp2 = y.__eq__(1)
    if tmp2:
        return tmp2
    return tmp1.__bor__(tmp2)

Likewise, the expression "if x == 1 and y == 1" would become

    tmp1 = x.__eq__(1)
    if not tmp1:
        return tmp1
    tmp2 = y.__eq__(1)
    if not tmp2:
        return tmp2
    return tmp1.__band__(tmp2)

The object example from above now tells us how boolean behaviour and arithmetic behaviour go hand in hand: "(setA > setB) or (setA < setB)" is True because "set([1,5]).__bor__(set([]))" is the same as "set([1,5]) + set([])" and equivalent to True in a boolean context. Likewise, "(setA > setB) and (setA < setB)" is False because "set([1,5]).__band__(set([]))" is just "set([])". It follows that "(setA > setB) == (setA - setB) == (setA & setB)". We can't do this with boolean operations being part of the language only.
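The two expansions above can be emulated with ordinary functions to see how the hooks would be reached. This is only a sketch in Python 3 syntax: __bor__ and __band__ are the proposed names and do not exist in Python, while proposed_or, proposed_and and Result are hypothetical helpers. Passing the right operand as a callable is an assumption used to keep short-circuit evaluation, and falling back to the right operand when no hook is defined is likewise an assumption.

```python
def proposed_or(left, right_thunk):
    """Emulate the proposed `left or right`; right_thunk delays evaluation."""
    if left:
        return left
    right = right_thunk()
    if right:
        return right
    hook = getattr(type(left), "__bor__", None)
    # Assumption: without a hook, keep today's result (the right operand).
    return hook(left, right) if hook is not None else right


def proposed_and(left, right_thunk):
    """Emulate the proposed `left and right` in the same way."""
    if not left:
        return left
    right = right_thunk()
    if not right:
        return right
    hook = getattr(type(left), "__band__", None)
    return hook(left, right) if hook is not None else right


class Result(set):
    """Hypothetical comparison result that defines the proposed hooks."""

    def __bor__(self, other):
        return Result(set(self) | set(other))

    def __band__(self, other):
        return Result(set(self) & set(other))


gt = Result({1, 5})  # stands in for setA > setB
lt = Result()        # stands in for setA < setB

either = proposed_or(gt, lambda: lt)
both = proposed_and(gt, lambda: lt)
overlap = proposed_and(Result({1, 5}), lambda: Result({5, 6}))

print(either, both, overlap)
```

Following the expansions literally, __bor__ is only reached when both operands are falsy and __band__ only when both are truthy; in every other case one of the operands is returned unchanged.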
Summing it all up, I really think that we should break the dominance of bool() and take a look at how we can implement boolean contexts without relying on boolean values all the time.

None of this can be implemented without breaking at least the CPython API. For example, the behaviour of __contains__ can't be changed in the proposed way without changing the signature of "int PySequence_Contains()" to "PyObject* PySequence_Contains()".