RE: [Python-Dev] bool does not want to be subclassed?

[Joshua Marshall]
I don't think I'm convinced; the same argument could be used for integers (if it doesn't make sense to create a sort of boolean which isn't in the set { true, false }, then it doesn't make sense to create
a sort of integer which isn't in the set { ..., -2, -1, 0, 1, 2, ... }). And maybe it doesn't, but this isn't the only reason for subclassing. Another reason for subclassing is to create items which can act like existing objects, but which have some additional behavior.
[Martin]
And indeed, for int, it is possible to have subclasses which have new instances whose values are in the set {..., -2, -1, 0, 1, 2, ...}. Indeed, it is possible to have multiple instances of int *itself* whose value is, say, 1000:
500+500 is 500+500 False
The same is not true for bool: There are only two *instances* of the type, not just two equivalence classes of equal values:
(4>5) is (3>9) True
So bool guarantees: a) there are only two distinct values, and b) there are only two different objects representing these values. It is property b) which prohibits subclassing.
Ah I misread Guido's comment--he's talking about the actual objects, not what they represent. So a different question... Can it be relied upon that two expressions which both evaluate to False both return the same object? That is, is it incorrect for a Python interpreter not to do this? I find this in the Python Reference Manual: "for immutable types, operations that compute new values may actually return a reference to any existing object with the same type and value" (note the "may"). Are bool and NoneType the only types for which this is reliably the case?

[Joshua Marshall]
... So a different question... Can it be relied upon that two expressions which both evaluate to False both return the same object?
Yes.
That is, is it incorrect for a Python interpreter not to do this?
Yes.
I find this in the Python Reference Manual: "for immutable types, operations that compute new values may actually return a reference to any existing object with the same type and value" (note the "may"). Are bool and NoneType the only types for which this is reliably the case?
Type objects are also singletons (e.g., type(True) returns the same object as type(False); ditto type("True") and type("False") and type("xyz"); etc).

Type objects are also singletons (e.g., type(True) returns the same object as type(False); ditto type("True") and type("False") and type("xyz"); etc).
But types are not immutable, so their semantics are different anyway. The point of the remark was to emphasize that immutable types have some leniency in their semantics regarding object identity. For mutable types the semantics regarding object identity are defined explicitly by each type or operation (some operations always return a new object, others always return the same one). --Guido van Rossum (home page: http://www.python.org/~guido/)

Type objects are also singletons (e.g., type(True) returns the same object as type(False); ditto type("True") and type("False") and type("xyz"); etc).
[Guido]
But types are not immutable, so their semantics are different anyway. The point of the remark was to emphasize that immutable types have some leniency in their semantics regarding object identity. For mutable types the semantics regarding object identity are defined explicitly by each type or operation (some operations always return a new object, others always return the same one).
The latter is forbidden by the language reference manual, though. Here's Joshua's original quote, but with more context: Types affect almost all aspects of object behavior. Even the importance of object identity is affected in some sense: for immutable types, operations that compute new values may actually return a reference to any existing object with the same type and value, while for mutable objects this is not allowed. I never thought of type objects as being mutable -- they're compared ("==") by identity, are usable as dict keys, act like immutable objects according to the text above, etc. Do you really think of them as being mutable? If so, why <wink>? E.g., after "a = 1; b = 1", a and b may or may not refer to the same object with the value one, depending on the implementation, but after "c = []; d = []", c and d are guaranteed to refer to two different, unique, newly created empty lists. (Note that "c = d = []" assigns the same object to both c and d.) If the text in the first quoted paragraph isn't correct, then this second paragraph doesn't explain anything about mutable objects in general by way of an example that happens to pick on empty-list literals, rather it's making a promise only about empty-list literals, but in a misleading way (reading for all the world as if it *were* trying to explain something about mutable objects in general by way of an example that happens to pick on empty-list literals). I expect the easiest way out is for you to decide that type objects are immutable after all -- even if there are obscure ways to mutate them!

Tim Peters wrote:
I expect the easiest way out is for you to decide that type objects are immutable after all -- even if there are obscure ways to mutate them!
It's not at all obscure:
class A(object):pass ... type(A).__name__ 'type' A.foo = 1 A.foo 1
Look, Ma, I'm mutating a type! No, son, you are just modifying it. The properties you quote (comparison checks for identity, usable as a dictionary key) don't imply at all that an object is immutable. Regards, Martin

"Martin v. L�wis" wrote:
Tim Peters wrote:
I expect the easiest way out is for you to decide that type objects are immutable after all -- even if there are obscure ways to mutate them!
It's not at all obscure:
class A(object):pass ... type(A).__name__ 'type' A.foo = 1 A.foo 1
Look, Ma, I'm mutating a type! No, son, you are just modifying it.
Isn't this just modifying A.__dict__, where __getattribute__ looks into to see whether something is defined? Or is that the same thing? Gerrit. -- Weather in Twenthe, Netherlands 14/02 15:25 UTC: 8.0°C wind 2.2 m/s SW (57 m above NAP) -- Asperger's Syndrome - a personal approach: http://people.nl.linux.org/~gerrit/english/

Gerrit wrote:
Isn't this just modifying A.__dict__, where __getattribute__ looks into to see whether something is defined? Or is that the same thing?
Correct. However, to the user, this should be an implementation detail, and is nearly indistinguishable from
A.__name__="B" A <class '__main__.B'>
Regards, Martin

"Martin v. Löwis" <martin@v.loewis.de> wrote in message news:402E2280.5070404@v.loewis.de...
Tim Peters wrote:
I expect the easiest way out is for you to decide that type objects are immutable after all -- even if there are obscure ways to mutate them!
It's not at all obscure:
class A(object):pass ... type(A).__name__ 'type' A.foo = 1 A.foo 1
Look, Ma, I'm mutating a type! No, son, you are just modifying it.
The properties you quote (comparison checks for identity, usable as a dictionary key) don't imply at all that an object is immutable.
Leaving aside mutability, there does seems to be something distinctively 'n-gleton'-like about types in that the identity of a type object is (a surrogate for) its value. For instance, after def f1(): pass def f2(): pass f1 and f2 will compare unequal, simply because '==' does not check functions for attribute equality, including code-object, identity. Nonetheless, f1 and f2 are interchangible in any calculation not dependent on identity. On the otherhand, after class c1(object): pass class c2(object): pass c1 and c2 *are* different and produce instances of different types when called. This sometimes surprises people, leading to misplaced clp bug posts, when the duplication is unintentional, as with slightly different imports resulting in two modules derived from the same disk file. Terry J. Reedy

[Tim Peters]
Type objects are also singletons (e.g., type(True) returns the same object as type(False); ditto type("True") and type("False") and type("xyz"); etc).
[Guido]
But types are not immutable, so their semantics are different anyway.
I never thought of type objects as being mutable [...]
I'm not sure of the meaning of "immutability of a type". Looking around, I found out this other message from Guido (quoted in its entirety in case the context changes the wanted meaning of words): ----------------------------------------------------------------------> From: Guido van Rossum <guido@python.org> Date: Thu, 18 Oct 2001 21:00:28 GMT Subject: Re: hashval and Numpy To: python-list@python.org Newsgroups: comp.lang.python
Michael Hudson <mwh@python.net> wrote:
|>>> a = Numeric.array([1,2]) |>>> hash(a) | 56 |>>> d = {a:1} # a will be in the 56%8-th slot (in 2.2, anyway) |>>> a[0] = 2 |>>> hash(a) | 57 |>>> d[a] # looks in 57%8-th slot - which is empty | KeyError |>>> b = Numeric.array([1,2]) |>>> hash(b) | 56 |>>> d[b] # looks in 56%8-th slot - but what's there is not __eq__ to b! | KeyError
| Moral: don't use mutable objects as dictionary keys.
Hm. The Numeric.array object should not have defined __hash__ or tp_hash. It's wrong if a mutable object defines a hash function, for exactly this reason. ----------------------------------------------------------------------< However, all types derive from object, and object has `__hash__'. Consequently, I would be tempted to think that under the new system, everything deriving from object is immutable by construction, unless tricks are used against it, like maybe, an intermediate class overriding `__hash__' with some function with the specific goal of raising an exception. Is that so? What means a "mutable type" then?

Hm. The Numeric.array object should not have defined __hash__ or tp_hash. It's wrong if a mutable object defines a hash function, for exactly this reason.
I think Guido was somewhat too terse here. Mutable objects should not define hash functions that change depending if their state changes - a hash must be constant for the life of the object.
However, all types derive from object, and object has `__hash__'. Consequently, I would be tempted to think that under the new system, everything deriving from object is immutable by construction, unless tricks are used against it
You are also misinterpreting "defines a hash function". Any object is hashable, but, unless specifically defined, the hash function uses the identity of the object (in CPython, its address) to compute the hash; this won't change if the state changes. So you should read Guido's statement also as "it's wrong if a mutable object *specifically* defines a hash function".
What means a "mutable type" then?
A mutable type is one whose instances have state that can change over the life of the object. Of these, there are two categories of objects: - values: objects whose identity is mostly irrelevant, and who compare equal if certain aspects of their state are equal. array.array is an example of a value type. - identity objects: objects whose identity matters in addition to their state; different objects will never compare equal. Identity objects should always be hashable, and their hash should be based on the identity. Mutability does not matter wrt. to hashing and dictionary keys. Values fall into two further categories: - immutable values: they should define a hash that takes those parts of the state into account that is also considered for comparison; equal objects should hash equal. - mutable values: an attempt to hash them should raise an exception, as one would require the following, contradicting, properties of a hash on such objects: - equal objects should hash equal (requires to take state into account) - the hash of an object should never change (requuires not to take the state into account). Dictionaries and array.array instances are mutable values, so they should have no hash. Types are identity objects; they can hash based on their identity despite being mutable. Regards, Martin

[François Pinard]
However, all types derive from object, and object has `__hash__'. Consequently, I would be tempted to think that under the new system, everything deriving from object is immutable by construction, unless tricks are used against it.
That object has __hash__ now is considered to be a bug: "New style classes and __hash__" http://www.python.org/sf/660098 [Martin v. Löwis]
You are also misinterpreting "defines a hash function". Any object is hashable,
That's not so; e.g.,
hash([]) Traceback (most recent call last): File "<stdin>", line 1, in ? TypeError: list objects are unhashable
or
class C: ... def __cmp__(a, b): return 0 hash(C()) Traceback (most recent call last): File "<stdin>", line 1, in ? TypeError: unhashable instance
That changing the last to use a new-style class:
class C(object): ... def __cmp__(a, b): return 0 ... hash(C()) 6973584
doesn't complain is one of the bugs in the report referenced above. The docs for __hash__ still spell out the *intent*: If a class does not define a __cmp__() method it should not define a __hash__() operation either; if it defines __cmp__() or __eq__() but not __hash__(), its instances will not be usable as dictionary keys. If a class defines mutable objects and implements a __cmp__() or __eq__() method, it should not implement __hash__(), since the dictionary implementation requires that a key's hash value is immutable (if the object's hash value changes, it will be in the wrong hash bucket) Old-style class instances work that way, and to the extent new-style class instances don't in 2.3, the new-style class implementation is in error.
but, unless specifically defined, the hash function uses the identity of the object (in CPython, its address) to compute the hash; this won't change if the state changes.
So you should read Guido's statement also as "it's wrong if a mutable object *specifically* defines a hash function".
What means a "mutable type" then?
It's gotten more <wink> confused. When types and classes were distinct beasts, types were immutable and classes were mutable, and instances of types or classes could be mutable or immutable (depending on the type or class they were instances of). The word "type" has lost its crispness in that respect.
A mutable type is one whose instances have state that can change over the life of the object.
I accept that, but it's not very useful on its own. Whether a type compares instances via value or via identity, and whether a type's instances can be used as dict keys, and whether a type's instances can be *sanely* used as dict keys, and whether a type's instances are hashable, can't be predicted just from knowing that the type's instances are mutable or immutable -- but people think they can be <wink>.

[Tim Peters]
The docs for __hash__ still spell out the *intent*: [...]
What means a "mutable type" then? It's gotten more <wink> confused. [...]
I wondered, earlier today, if we are not documenting this all upside down. (Just wondering, please: I hope nobody will get upset! :-) Python cannot really enforce immutability in the general case, despite it does enforce it for some built-in types. Immutability is a fuzzy concept which represents an *intent*, as you say, and a lot of effort is spent around it in the documentation, trying to remove the fuzziness.
If a class does not define a __cmp__() method it should not define a __hash__() operation either; if it defines __cmp__() or __eq__() but not __hash__(), its instances will not be usable as dictionary keys. If a class defines mutable objects and implements a __cmp__() or __eq__() method, it should not implement __hash__(), since the dictionary implementation requires that a key's hash value is immutable (if the object's hash value changes, it will be in the wrong hash bucket)
Maybe the documentation should also state that `__hash__()' ought to be constant for the full duration of the life of an object. (Is this true? Martin almost says this in his reply. Is there a meaning or reason for a varying hash value for a given object?) Moreover, this might yield an interesting optimisation if not done already: an instance might compute its hash only once, either at creation time, or probably better, lazily until first needed -- but I'm getting away of what I would like to say. Things might be clearer if Python was first and clearly documenting the required properties of `hash()', and then sorting types into hashable (those having such a function) and non-hashable (those not having it). An only then, introducing immutability and mutability as useful user concepts that could be implemented by making objects hashable or not. Then, the documentation could stick to "hashable" and "not hashable", which are very precisely defined, and carefully and systematically avoid "immutable" and "mutable", which are more fuzzy. I have the impression that the net result would be that things would be clearer: tuples are hashable, lists are not; dictionaries require hashable keys. Objects having `__cmp__' (or `__eq__') loose their default hashability. Etc. -- François Pinard http://www.iro.umontreal.ca/~pinard

François Pinard:
tuples are hashable, lists are not;
Actually that would be "tuples are hashable if all their elements are hashable". It's still very clear, though! I like this idea. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+

[Martin von Löwis]
A mutable type is one whose instances have state that can change over the life of the object. [...]
Interesting presentation. The following detail is worth stressing:
equal objects should hash equal.
This should be stressed as a formal requirement for a correctly behaved hashing function. The other formal requirement being that a hash function should always compute the same value for the lifespan of an object, and that value should be a non-negative integer (not a long one?). A _desirable_ property, but not a requirement, is that two unequal objects yield different hashes. So, instead of explaining how we should cleverly use `__hash__' to describe immutability, we might describe what `__hash__' ought to be, and then merely notice that hashable objects could be useful to represent immutable objects (whether user objects or builtin objects). -- François Pinard http://www.iro.umontreal.ca/~pinard

At 03:15 PM 2/16/04 -0500, François Pinard wrote:
So, instead of explaining how we should cleverly use `__hash__' to describe immutability, we might describe what `__hash__' ought to be, and then merely notice that hashable objects could be useful to represent immutable objects (whether user objects or builtin objects).
Actually, it might be even better to start with equality. Hashing is only meaningful for objects that can be tested for equality. That is: * By default, objects are equal only to themselves. But you can change this by overriding __cmp__ or __eq__. * Hashing is a technique used to group objects according to possible equality. So, if you want to create a hashable object that can be "equal to" other objects, it must return the same hash value as the objects it compares equal to. * Hash values must remain constant over the lifetime of an object. Thus, if an object may compare equal to a different set of objects over its lifetime, it must *not* be hashable. * Therefore, if an object is to be hashable, any attributes of the object that are used in calculating the hash (or performing comparisons) may not be changed once the its __hash__ method has been called. Usually, this is implemented by making an object completely unchangeable once created (e.g. tuples), but some user classes may implement this by "freezing" an object's contents once it is hashed (e.g. kjbuckets.kjSet).

[Phillip J. Eby]
[...] it might be even better to start with equality. Hashing is only meaningful for objects that can be tested for equality. [...]
Clean and clear. -- François Pinard http://www.iro.umontreal.ca/~pinard

Phillip J. Eby wrote:
Actually, it might be even better to start with equality. Hashing is only meaningful for objects that can be tested for equality.
However, in Python, all objects can be tested for equality. So hashing is meaningful for all objects? It is not: it is only meaningful for objects which compare equal to the same other objects over their lifespan. Regards, Martin

At 08:55 AM 2/17/04 +0100, Martin v. Löwis wrote:
Phillip J. Eby wrote:
Actually, it might be even better to start with equality. Hashing is only meaningful for objects that can be tested for equality.
However, in Python, all objects can be tested for equality. So hashing is meaningful for all objects?
It is not: it is only meaningful for objects which compare equal to the same other objects over their lifespan.
That's not valid logic; "X is only meaningful for Y" means that X implies Y, not the other way around. The fact that Y is a tautology doesn't imply that X is true, in fact if Y is a tautology then it only proves that "X implies Y" is true, because everything implies Y. However, I see your point that it's therefore silly to talk about things that imply a tautology. :) Anyway, if you read the rest of my post, you'd see that I explained the "lifespan" issue with a bit more precision than you have stated above. For example, I pointed out that it isn't necessary for an object to compare equal to the same other objects over its lifespan, only over the lifespan following its first __hash__ invocation.

Martin:
Of these, there are two categories of objects: - values: - identity objects:
Values fall into two further categories: - immutable values: - mutable values:
All these rules can be boiled down to: * Objects which compare equal should hash equal. * The notion of equality should not change over the lifetime of the object. There's no need to mention mutability at all. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+

Martin:
Of these, there are two categories of objects: - values: - identity objects:
Values fall into two further categories: - immutable values: - mutable values:
[Greg]
All these rules can be boiled down to:
* Objects which compare equal should hash equal. * The notion of equality should not change over the lifetime of the object.
There's no need to mention mutability at all.
Actually, think the very definition of a mutable object may be one whose equality can change over its lifetime. As you pointed out, we should talk about (im)mutable objects, not types, because (e.g.) tuples can contain lists. --Guido van Rossum (home page: http://www.python.org/~guido/)

Greg Ewing wrote:
All these rules can be boiled down to:
* Objects which compare equal should hash equal. * The notion of equality should not change over the lifetime of the object.
There's no need to mention mutability at all.
I find "notion of equality" a fuzzy term: How do I determine whether it changes? The real requirement is that the hash of an object must not change. E.g. the the state of an object may change even if it contributes to the "notion of equality" if the state of all other objects changes simultaneously in the same way. Regards, Martin

So a different question... Can it be relied upon that two expressions which both evaluate to False both return the same object? That is, is it incorrect for a Python interpreter not to do this? I find this in the Python Reference Manual: "for immutable types, operations that compute new values may actually return a reference to any existing object with the same type and value" (note the "may"). Are bool and NoneType the only types for which this is reliably the case?
Among immutable built-in types, these plus Ellipsis are the only ones. The language doesn't guarantee such a thing for numbers other than bool, strings, unicode or tuples. --Guido van Rossum (home page: http://www.python.org/~guido/)
participants (9)
-
"Martin v. Löwis"
-
François Pinard
-
Gerrit
-
Greg Ewing
-
Guido van Rossum
-
Joshua Marshall
-
Phillip J. Eby
-
Terry Reedy
-
Tim Peters