[Python-Dev] Fighting the theoretical randomness of "is" on immutables

Terry Jan Reedy tjreedy at udel.edu
Tue May 7 00:23:02 CEST 2013


On 5/6/2013 10:20 AM, Nick Coghlan wrote:
> On Mon, May 6, 2013 at 11:26 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
>> Le Mon, 6 May 2013 23:18:54 +1000,
>> Nick Coghlan <ncoghlan at gmail.com> a écrit :
>>> We're not going to change the language design because people don't
>>> understand the difference between "is" and "=="

For sure. The definition "The operators is and is not test for object 
identity: x is y is true if and only if x and y are the same object. x 
is not y yields the inverse truth value. [4]" is clear enough as far as 
it goes. But perhaps it should be said that whether or not x and y *are* 
the same object, in a particular situation, may depend on the 
implementation. The footnote [4] "Due to automatic garbage-collection, 
free lists, and the dynamic nature of descriptors, you may notice 
seemingly unusual behaviour in certain uses of the is operator, like 
those involving comparisons between instance methods, or constants." 
tells only part of the story, and the less common part at that.

>>> and then wrongly blame PyPy for breaking their code.

The language definition intentionally leaves 'isness' 
implementation-defined for number and string operations in order to 
allow but not require optimizations. Preserving isness when mixing 
numbers and strings with mutable collections is a different issue.
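
A minimal sketch of that distinction, using only builtins (the names a, 
b, values_equal and same_object are illustrative): equality of the two 
results is guaranteed, while their identity is left to the implementation.

```python
# Two ints built at run time from the same text.
a = int("1001")
b = int("1001")

# Guaranteed by the language: the values compare equal.
values_equal = (a == b)

# Not guaranteed either way: an implementation may or may not
# hand back one shared object for both calls.
same_object = (a is b)

print(values_equal, same_object)
```

Code that needs the first answer should ask with ==; the second answer 
is only a fact about the implementation in use.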

>> Well, if I'm doing:
>>
>>    mylist = [x]
>>
>> and ``mylist[0] is x`` returns False, then I pretty much consider the
>> Python implementation to be broken, not my code :-)

If x were constrained to be an int, the comparison would not make much 
sense, but part of the essential nature of lists is that x could be 
literally any object. So unless False were a documented possibility, I 
might be inclined to agree with you, based on CPython precedent.
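
The expectation can be sketched with an object that has no value apart 
from its identity. The names x and mylist follow Antoine's example; 
using object() and the name item_is_x are assumptions made here for 
illustration.

```python
# A plain object() compares equal only to itself.
x = object()
mylist = [x]

# The list hands back the very object that was put in.
item_is_x = mylist[0] is x

print(item_is_x)
```

For such an object there is no 'equal but distinct' substitute that an 
implementation could return instead, which is why False would look like 
a bug here.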

The situation *is* different with type-limited arrays.
 >>> from array import array
 >>> x = 1001
 >>> myray = array('i', [x])
 >>> myray[0] is x
False

I think the possibility of False is implicit in "an object type which 
can compactly represent an array of basic values". The later phrase "the 
type of objects stored in them is constrained" is incorrectly worded 
because arrays store constrained *values*, not *objects* or even object 
references as lists do.
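
A small sketch of what 'constrained values' means in practice, reusing 
x and myray from the session above (value_preserved and accepts_objects 
are illustrative names): the value survives the round trip, and 
anything that is not a suitable value is refused outright.

```python
from array import array

x = 1001
myray = array('i', [x])

# The stored value round-trips, whatever the identity of the result.
value_preserved = (myray[0] == x)

# Only values that fit the typecode are accepted; an arbitrary
# object cannot be stored at all.
try:
    array('i', [object()])
    accepts_objects = True
except TypeError:
    accepts_objects = False

print(value_preserved, accepts_objects)
```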

> Yeah, that's a rather good point - I briefly forgot that the trigger
> here was PyPy's specialised single type containers.

Does implicitly replacing or implementing a list with something that is 
internally more like CPython arrays than CPython lists (as I understand 
what PyPy is doing) violate the language spec? I re-read the doc and I 
am not sure.

Sequences are sequences of 'items'. For example: "s[i]   ith item of s, 
origin 0". 'Items' are not defined, but pragmatically, they can be 
defined either by value or by identity. Containment is defined in terms 
of equality, which itself can be defined in terms of either value or 
identity. For strings and ranges, the 'items' are values, not objects. 
They are also values for bytes, even though identity is recovered when 
objects for all possible byte values are pre-cached, as in CPython.
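
Both readings of containment can be seen in a short sketch. It assumes 
the behaviour that current documentation gives for list membership, 
roughly any(x is e or x == e for e in y); the variable names are 
illustrative.

```python
nan = float("nan")

# Value-based: 1.0 and 1 are distinct objects of different types,
# but they compare equal, so the float is 'in' the list of ints.
found_by_value = 1.0 in [1]

# Identity-based: nan does not compare equal to itself, yet the
# same object is still found in the list.
nan_equals_itself = (nan == nan)
found_by_identity = nan in [nan]

# For str, indexing produces a value (a length-1 str); the string
# does not store separate item objects.
s = "spam"
item = s[0]

print(found_by_value, nan_equals_itself, found_by_identity, item)
```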

'Item' is necessarily left vague for mutable sequences, since bytearrays 
also store values. The fact that Antoine's example 'works' for 
bytearrays is an artifact of the caching, not a language-mandated 
necessity.

 >>> b = bytearray()
 >>> b.append(98)
 >>> b[0] is 98
True
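
A sketch of the part that is guaranteed, with illustrative names 
item_equal and accepts_256. It compares with == rather than is; note 
that recent CPython versions also emit a SyntaxWarning for is used with 
a literal, as in the session above.

```python
b = bytearray()
b.append(98)

# Guaranteed: the item compares equal to the int that was appended.
item_equal = (b[0] == 98)

# Items are limited to range(256), which is what makes it feasible
# for an implementation to pre-cache an object for every possible item.
try:
    b.append(256)
    accepts_256 = True
except ValueError:
    accepts_256 = False

print(item_equal, accepts_256)
```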

The definition for lists does not narrow 'item' either. "Lists are 
mutable sequences, typically used to store collections of homogeneous 
items (where the precise degree of similarity will vary by 
application)." Antoine's opinion would be more supportable if 'item' 
were replaced by 'object'.

Guido's notion of 'homogeneous' could be interpreted as supporting 
specialized 'lists'. On the other hand, I think explicit import, as with 
the array module and numarray package, is a better idea. This is 
especially true if an implementation intends to be a drop-in replacement 
for CPython. It seems to me that Armin's pain comes from trying to be 
both different and compatible at the same time.

--
Terry Jan Reedy



