why () is () and [] is [] work in other way?

Mon Apr 23 21:08:23 EDT 2012

On Mon, Apr 23, 2012 at 8:03 PM, Tim Delaney
<timothy.c.delaney at gmail.com> wrote:
> On 24 April 2012 09:08, Devin Jeanpierre <jeanpierreda at gmail.com> wrote:
>>
>> On Mon, Apr 23, 2012 at 6:26 PM, Tim Delaney
>> <timothy.c.delaney at gmail.com> wrote:
>> > And doing that would make zero sense, because it directly contradicts
>> > the
>> > whole *point* of "is". The point of "is" is to tell you whether or not
>> > two
>> > references are to the same object. This is a *useful* property.
>>
>> It's useful for mutable objects, yes. How is it useful for immutable
>> things? They behave identically everywhere where you don't directly
>> ask the question of "a is b" or "id(a) == id(b)".
>
>
> Not always. NaNs are an exception - they don't even compare equal to
> themselves. And hence a very good reason why "is" and == are separate
> operations.

You can't use "is" with NaNs any more than with any other number.
Sometimes it does what you want, sometimes not, depending on whether
Python interns NaNs. The only real way to check if a value is NaN is
to use math.isnan. The differing behavior of == and "is" is _not_
reliable -- sometimes a is b will return False even though both a and
b are NaN.
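
A minimal sketch of the point above, assuming CPython floats; the names
a and b are only for illustration. The result of "a is b" is left
uncommented on purpose, since it depends on whether the two NaNs happen
to be the same object:

```python
import math

a = float('nan')
b = float('nan')   # a second NaN, created separately

print(a == a)      # False: a NaN never compares equal, even to itself
print(a is a)      # True: a reference is always identical to itself
print(a is b)      # depends on whether both names share one object

# math.isnan does not care about identity at all.
print(math.isnan(a) and math.isnan(b))   # True
```

Only the last check gives the same answer regardless of how the NaNs
were produced.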

There are systems under which "is" _would_ be reliable for this. I'm
not sure whether I'd want to argue for it, but I'm most of the way
anyway. You would have to either intern all NaNs, or else change the
definition of "is".

Also, when I say "two things behave identically everywhere", I mean
that we have two objects, `a` and `b`, such that for any function f(),
your program would have exactly the same effects whether it called
f(a) or f(b) in the same place.

(This is addressed in the following page:
http://home.pipeline.com/~hbaker1/ObjectIdentity.html )

There is no way to observe any difference between immutable values
that are equal except via id() and the is operator. They are equal,
and they will never stop being equal. Presumably equality is designed
sanely so that they have the same structure, so iteration etc. all
have the same results. There is no way to distinguish them based on
any of their properties, except to resort to the is operator and id()
function.

This is not true of mutable values. For example, consider the following program:

    a = []
    b = c = []
    c.append(3)
    print(b)

a and c cannot be completely identical, because if we replaced
"c.append(3)" with "a.append(3)", the print at the bottom would have a
different outcome. No such difference can ever occur with immutable
objects (modulo some technical details if you really want to get
picky, and also id and is).

So, to address NaNs: if you have a, and a is a NaN, and so is b --
what program would behave differently if you swapped a and b? How do
you tell them apart?

My assertion from before is that they behave identically -- they
behave the same way -- everywhere where you don't directly ask the
question of "a is b" or "id(a) == id(b)".

>>
>> Therefore it shouldn't make any difference if we make equality
>> _always_ imply "is", since that was always on the table as an
>> implementation detail.
>
>
> See above. Identity does not necessarily imply equality.

You have the implication backwards. I was saying that, for immutable
objects, perhaps equality should imply identity. I was furthermore
justifying this by saying that it wouldn't break anything that wasn't
already broken. That is, the only code that would break is code that
already forgot that Python can intern immutable objects.

>>
>> I think the real breakage is that now "a is b" is not equivalent to
>> "id(a) == id(b)" (unless you try to hack that as well.)
>
>
> The operations are indeed equivalent if the lifetimes of both objects cover
> the entire computation. Since you are not rebinding names here, there is no
> way that the operations could not be equivalent (unless multithreading is
> involved and the name bindings change underneath you, but I think we can
> discount that for the purposes of this discussion).
>
> There is no breakage here - just (I'm suspecting willful) lack of
> understanding.

It's you that doesn't understand. Perhaps it is willful lack of
understanding? ;)

I was discussing what happens if you change "is" such that "a is b" is
true in all the situations where it is true today, plus the situations
where a and b are immutable and equal.

The email you originally sent was in response to one in which I
proposed changing the way "is" works, and you had a complaint against
that change. In
response, I said that I don't think the complaint is a good one, but I
suggested one other complaint. Under the change that I suggested,
id(a) == id(b) would no longer be equivalent to a is b, because `a is
b` can be true even for objects in different places in memory.
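
As a rough sketch of the changed semantics, written as an ordinary
function rather than a real change to the operator: proposed_is and
IMMUTABLE_TYPES are hypothetical names, and both the list of types and
the same-type requirement are assumptions made for illustration.

```python
# Hypothetical: which types count as immutable is an assumption.
IMMUTABLE_TYPES = (int, float, complex, str, bytes, tuple, frozenset)

def proposed_is(a, b):
    """Sketch of the changed "is": identity, or equal immutable values."""
    if a is b:
        return True
    # Assumption: require the same exact type so 1 and 1.0 stay distinct.
    return type(a) is type(b) and type(a) in IMMUTABLE_TYPES and a == b

a = (1, 2)
b = tuple([1, 2])   # equal to a, but built separately at run time

print(proposed_is(a, b))     # True: equal immutable values
print(id(a) == id(b))        # False in CPython: two separate objects
print(proposed_is([], []))   # False: lists are mutable
```

This is the case where the two questions come apart: the sketch answers
True while the id() comparison answers False. Two distinct NaNs would
still come out False here, because they do not compare equal.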

I think, in that context, a lot of what you say probably doesn't make
sense the way you meant it to, but I'll address it anyway.

> I should now conclude that you have no interest in actually discussing this
> but just want to take an obstinate position and never consider moving from
> it.

There are two situations. In one, I'm a troll, and your statement
doesn't do anything because trolls only feed on that sort of thing. In
the other, I'm being genuine, and you're not being nice.

I have an interest in discussing this, but you apparently don't see it
that way. What's happening from my perspective is that we're talking
past each other and not understanding what the other is saying. I've
tried to communicate better in this message as a response to that.

> For (most*) immutable objects, if there is only ever a single canonical
> instance with that value, == and "is" are effectively the same operation
> (what you seem to want). Python's types are implemented this way - they do
> an "is" check before moving onto the more expensive equality check (except
> where overridden*). That's the benefit. But as you will see if you go back
> and read my message, there is a tradeoff in memory and/or performance
> involved in the canonicalising. Somewhere between no canonicalising and
> canonicalising everything (inclusive) is a sweet spot in terms of this
> tradeoff, and it varies depending on other decisions that have been made for
> the language and (of course) the actual program being run.

I'm not talking about canonicalizing the way you were discussing it. I
was attempting to show that such canonicalization is not necessary.

You said that such canonicalization would be necessary because
interning isn't possible once you change `is`. I disagreed, and
presented a way of using a dict that behaves like interning, but
without using `is` or identity comparisons.

I'll be more explicit and specifically define such a function:

cache = {}

def intern(x):
    return cache.setdefault(x, x)

Do we agree that A) this function works without using identity
comparisons, and B) this function performs the task of interning?
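
To make the question concrete, here is a small usage sketch of that
function. The names first and second are only illustrative, and "is"
appears solely to inspect the result, not inside intern itself:

```python
cache = {}

def intern(x):
    # Same definition as above: the dict lookup relies on hash() and ==,
    # and returns the first equal object that was ever stored.
    return cache.setdefault(x, x)

first = intern((1, 2))
second = intern(tuple([1, 2]))   # equal value, built separately

print(first is second)   # True: both calls return the stored object
print(len(cache))        # 1
```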

> * NaNs again.

As for NaNs, to intern those you'd need to explicitly check for them,
since they interfere with the dict storage mechanics. So the above
intern function would not work. This would, however:

import math

cache = {}
cachednan = float('nan')

def intern(x):
    try:
        if math.isnan(x):
            return cachednan
    except TypeError:
        pass
    return cache.setdefault(x, x)

I hope, again, that I've demonstrated that we don't need to
canonicalize everything just to implement interning.
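
A short sketch comparing the two versions on NaNs; plain_intern,
plain_cache, nan1 and nan2 are hypothetical names introduced here for
the comparison:

```python
import math

plain_cache = {}

def plain_intern(x):
    # The first version: no special handling for NaN.
    return plain_cache.setdefault(x, x)

cache = {}
cachednan = float('nan')

def intern(x):
    # The NaN-aware version from above.
    try:
        if math.isnan(x):
            return cachednan
    except TypeError:
        pass   # not a number, fall through to the dict
    return cache.setdefault(x, x)

nan1 = float('nan')
nan2 = float('nan')   # a separately created NaN

plain_intern(nan1)
plain_intern(nan2)
print(len(plain_cache))               # 2: each NaN got its own entry
print(intern(nan1) is intern(nan2))   # True: both map to cachednan
print(intern("abc"))                  # abc
```

The plain version stores each NaN separately because a NaN never
compares equal to an existing key, which is why the explicit
math.isnan check is needed.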

-- Devin