Re: [Python-Dev] Fighting the theoretical randomness of "is" on immutables

May 7, 2013

      On Mon, May 6, 2013 at 7:50 PM, Terry Jan Reedy <tjreedy@udel.edu> wrote:
...
On 5/6/2013 6:34 PM, Antoine Pitrou wrote:
...
On Mon, 06 May 2013 18:23:02 -0400
Terry Jan Reedy <tjreedy@udel.edu> wrote:
...
'Item' is necessarily left vague for mutable sequences as bytearrays
also store values. The fact that Antoine's example 'works' for
bytearrays is an artifact of the caching, not a language-mandated
necessity.
No, it isn't.
Yes it is. Look again at the array example.
>>> from array import array
>>> x = 1001
>>> myray = array('i', [x])
>>> myray[0] is x
False
Change 1001 to a cached int value such as 98 and the result is True
instead of False. For the equivalent bytearray example
>>> b = bytearray()
>>> b.append(98)
>>> b[0] is 98
True
the result is always True *because*, and only because, all byte values are
(now) cached. I believe the test for that is marked as CPython-specific.
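
The distinction drawn here between containers that store values and containers that store references can be sketched as follows. This is only an illustration, assuming CPython-like behaviour; the helper name `stores_same_object` is hypothetical and does not come from the thread. Identity results for ints are implementation details, so the sketch records them without relying on them.

```python
from array import array


def stores_same_object(container_factory, value):
    """Return True if reading item 0 gives back the identical object put in."""
    container = container_factory([value])
    return container[0] is value


# A list holds references, so an arbitrary object comes back as itself.
marker = object()
list_keeps_identity = stores_same_object(list, marker)

# array('i', ...) stores raw C ints; reading an item builds an int object,
# so only equality, not identity, follows from the storage model.
big = 1001
array_item_equal = array('i', [big])[0] == big
array_keeps_identity = stores_same_object(lambda items: array('i', items), big)

# bytearray likewise stores raw byte values; equality is the reliable check.
small = 98
byte_item_equal = bytearray([small])[0] == small
```

Whether `array_keeps_identity` comes out True or False depends on whether the implementation caches that particular int, which is exactly the artifact under discussion.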
...
You are mixing up values and references.
No, I am not. My whole post was about being careful not to confuse the
two. I noted, however, that the Python *docs* use 'item' to mean either or
both. If you do not like the *doc* being unclear, clarify it.
A bytearray or an array.array may indeed store values, but a list stores
references to objects.
I said exactly that in reference to CPython. As far as I know, the same is
true of lists in every other implementation up until Pypy decided to
optimize that away. What I also said is that I cannot read the *current*
doc as guaranteeing that characterization. The reason is that the members
of sequences, mutable sequences, and lists are all described as 'items'. In
the first two cases, 'item' means 'value or object reference'. I see
nothing in the doc to force a reader to change or particularize the
meaning of 'item' in the third case. If I missed something *in the
specification*, please point it out to me.
I'm pretty sure that not respecting identity of objects stored in
general-purpose containers would break a *lot* of code out there.
Me too. Hence I suggested that if lists, etc, are intended to respect
identity, with 'is' as currently defined, in any implementation, then the
docs should say so and end the discussion. I would be happy to commit an
approved patch, but I am not in a position to decide the substantive
content. Hence, I tried to provide a neutral analysis that avoided
confusing the CPython implementation with the Python specification.
In my final paragraph, however, I did suggest that Pypy respect precedent,
to avoid breaking existing code and expectations, and call their mutable
sequences something other than 'list'.
Wouldn't the entire point of such things existing in pypy be that the
implementation is irrelevant to the user and used behind the scenes
automatically in the common case when a container is determined to fit the
special constraint?

I personally do not think we should guarantee that "mylist[0] = x; assert x
is mylist[0]" succeeds when x is an immutable type other than None.  Unless
something immutable is intended to be a singleton that does not define
equality (like None, or sentinel values commonly tested using "is", such as
arbitrary object() instances), it needs to be up to the language VM to
determine when to copy or not in most situations.
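
A minimal sketch of the split described above: identity tests are kept for singletons and sentinels, and equality is used for ordinary immutable values. The names `_MISSING`, `first_or_default`, and `same_value` are hypothetical, chosen only for illustration.

```python
# Hypothetical sentinel: a unique object that is only ever compared with "is".
_MISSING = object()


def first_or_default(items, default=_MISSING):
    """Return items[0], or the default if items is empty and one was given."""
    if items:
        return items[0]
    if default is _MISSING:  # identity test is appropriate for a sentinel
        raise ValueError("empty sequence and no default given")
    return default


def same_value(stored, expected):
    """Compare immutable values by equality, never by identity."""
    return stored == expected
```

With this style, code such as `same_value(mylist[0], 1001)` keeps working whether or not the VM hands back the identical int object.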

You already gave the example of the interned small integers in CPython.
String constants and names used in code are also interned in today's
CPython implementation.  This doesn't tend to trip any real code up.
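
As a small illustration of string interning, the sketch below uses `sys.intern` to request it explicitly rather than relying on what the compiler happens to intern. The variable names are arbitrary, and whether `left is right` holds before interning is left unasserted because it is an implementation detail.

```python
import sys

# Two equal strings built at run time, so they are not compile-time constants.
left = "".join(["spam", "_", "eggs"])
right = "".join(["spam", "_eggs"])

# Equality holds regardless of how the implementation shares objects.
values_equal = left == right

# sys.intern returns the canonical copy, so equal strings map to one object.
interned_same = sys.intern(left) is sys.intern(right)
```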

-gps

Gregory P. Smith