[Python-Dev] Fighting the theoretical randomness of "is" on immutables

Terry Jan Reedy tjreedy at udel.edu
Tue May 7 04:50:55 CEST 2013


On 5/6/2013 6:34 PM, Antoine Pitrou wrote:
> On Mon, 06 May 2013 18:23:02 -0400
> Terry Jan Reedy <tjreedy at udel.edu> wrote:
>>
>> 'Item' is necessarily left vague for mutable sequences as bytearrays
>> also store values. The fact that Antoine's example 'works' for
>> bytearrays is an artifact of the caching, not a language-mandated
>> necessity.
>
> No, it isn't.

Yes it is. Look again at the array example.
 >>> from array import array
 >>> x = 1001
 >>> myray = array('i', [x])
 >>> myray[0] is x
False

Change 1001 to a cached int value such as 98 and the result is True 
instead of False. For the equivalent bytearray example

 >>> b = bytearray()
 >>> b.append(98)
 >>> b[0] is 98
True

the result is always True *because*, and only because, all byte values 
are (now) cached. I believe the test for that is marked as CPython-specific.
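
The two transcripts above can be combined into one script. This is only a
sketch of CPython's behavior, not something the language guarantees. The
names big, small, and the three *_keeps_* flags are made up for the
illustration, and the ints are built with int() at run time so that the
compiler cannot share constants.

```python
from array import array

# Build the ints at run time so compile-time constant sharing plays no part.
big = int("1001")    # outside CPython's small-int cache
small = int("98")    # inside CPython's small-int cache (-5 through 256)

# array('i') stores C ints, so indexing has to build an int object again.
array_keeps_big = array('i', [big])[0] is big
array_keeps_small = array('i', [small])[0] is small

# bytearray stores raw bytes; every possible byte value is a cached int.
b = bytearray()
b.append(small)
bytearray_keeps_small = b[0] is small

print(array_keeps_big, array_keeps_small, bytearray_keeps_small)
# On CPython this prints: False True True
```

Another implementation could print something else without breaking any
rule I can find in the docs; only the == comparisons are certain to hold.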

 > You are mixing up values and references.

No, I am not. My whole post was about being careful not to confuse the 
two. I noted, however, that the Python *docs* use 'item' to mean either 
or both. If you do not like the *doc* being unclear, clarify it.

> A bytearray or an array.array may indeed store values, but a list
> stores references to objects.

I said exactly that in reference to CPython. As far as I know, the same 
is true of lists in every other implementation up until PyPy decided to 
optimize that away. What I also said is that I cannot read the *current* 
doc as guaranteeing that characterization. The reason is that the 
members of sequences, mutable sequences, and lists are all described as 
'items'. In the first two cases, 'item' means 'value or object 
reference'. I see nothing in the doc to force a reader to change or 
particularize the meaning of 'item' in the third case. If I missed 
something *in the specification*, please point it out to me.
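
For contrast, here is a sketch of what "a list stores references" looks
like from the outside. Record is a hypothetical stand-in for any mutable
user-defined class, and the other names are likewise invented for the
example. The last line shows the case under dispute: an immutable int
whose identity CPython happens to preserve through a list.

```python
class Record:
    """Hypothetical stand-in for any mutable, user-defined object."""

    def __init__(self, name):
        self.name = name


rec = Record("original")
items = [rec]

# Fetching the item hands back the very object that was stored.
fetched = items[0]
same_object = fetched is rec

# A mutation made through the fetched item is visible through rec.
fetched.name = "changed"
seen_through_original = rec.name

# The disputed case: an immutable value outside CPython's int cache.
big = int("1001")
int_identity_kept = [big][0] is big  # True on CPython; see text above
```

The first two results are what existing code depends on. Whether the
third must be True everywhere is exactly what the docs leave open.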

> I'm pretty sure that not respecting identity of objects stored in
> general-purpose containers would break a *lot* of code out there.

Me too. Hence I suggested that if lists, etc., are intended to respect 
identity, with 'is' as currently defined, in any implementation, then 
the docs should say so and end the discussion. I would be happy to 
commit an approved patch, but I am not in a position to decide the 
substantive content. Hence, I tried to provide a neutral analysis that 
avoided confusing the CPython implementation with the Python specification.

In my final paragraph, however, I did suggest that PyPy respect 
precedent, to avoid breaking existing code and expectations, and call 
their mutable sequences something other than 'list'.

--
Terry Jan Reedy



