[Python-Dev] Fighting the theoretical randomness of "is" on immutables
Terry Jan Reedy
tjreedy at udel.edu
Tue May 7 04:50:55 CEST 2013
On 5/6/2013 6:34 PM, Antoine Pitrou wrote:
> On Mon, 06 May 2013 18:23:02 -0400
> Terry Jan Reedy <tjreedy at udel.edu> wrote:
>>
>> 'Item' is necessarily left vague for mutable sequences as bytearrays
>> also store values. The fact that Antoine's example 'works' for
>> bytearrays is an artifact of the caching, not a language-mandated
>> necessity.
>
> No, it isn't.
Yes it is. Look again at the array example.
>>> from array import array
>>> x = 1001
>>> myray = array('i', [x])
>>> myray[0] is x
False
Change 1001 to a cached int value such as 98 and the result is True
instead of False. For the equivalent bytearray example
>>> b = bytearray()
>>> b.append(98)
>>> b[0] is 98
True
the result is always True *because*, and only because, all byte value
are (now) cached. I believe the test for that is marked as CPython-specific.
> You are mixing up values and references.
No I am not. My whole post was about being careful to not to confuse the
two. I noted, however, that the Python *docs* use 'item' to mean either
or both. If you do not like the *doc* being unclear, clarify it.
> A bytearray or a array.array may indeed store values, but a list stores references to
> objects.
I said exactly that in reference to CPython. As far as I know, the same
is true of lists in every other implementation up until Pypy decided to
optimize that away. What I also said is that I cannot read the *current*
doc as guaranteeing that characterization. The reason is that the
members of sequences, mutable sequences, and lists are all described as
'items'. In the first two cases, 'item' means 'value or object
reference'. I see nothing in the doc to force a reader to change or
particularized the meaning of 'item' in the third case. If I missed
something *in the specification*, please point it out to me.
> I'm pretty sure that not respecting identity of objects stored in
> general-purpose containers would break a *lot* of code out there.
Me too. Hence I suggested that if lists, etc, are intended to respect
identity, with 'is' as currently defined, in any implementation, then
the docs should say so and end the discussion. I would be happy to
commit an approved patch, but I am not in a position to decide the
substantive content. Hence, I tried to provide a neutral analysis that
avoided confusing the CPython implementation with the Python specification.
In my final paragraph, however, I did suggest that Pypy respect
precedent, to avoid breaking existing code and expectations, and call
their mutable sequences something other than 'list'.
--
Terry Jan Reedy
More information about the Python-Dev
mailing list