Finding the instance reference of an object

Thu Oct 16 23:19:28 EDT 2008

On Oct 16, 2008, at 7:30 PM, Steven D'Aprano wrote:

>> However, 'bob' here really is a variable.  It's a variable whose value
>> (at the moment) is a reference to some object.
>
> Traditionally, a "variable" is a named memory location.

Agreed.

> The main objection I have to using "variable" to describe Python
> name/value bindings is that it has connotations that will confuse
> programmers who are familiar with C-like languages. For example:
>
> def inc(x):
>    x += 1
>
> n = 1
> inc(n)
> assert n == 2
>
> Why doesn't that work? This is completely mysterious to anyone
> expecting C-like variables.

Hmm... I'm not following you.  That wouldn't work in C, either.  'x'  
in 'inc' is a local variable; its value is just a copy of whatever  
value you pass in.  You can increment it all you want, and it won't  
affect the original variable (if indeed it was a variable that the  
value came from; it could be a literal or an expression or who knows  
what else).
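
A minimal sketch of that point, reusing `inc` and `n` from the quoted example. The `return` and the `result` name are added here purely for illustration, and the comment describes the observable behaviour rather than any particular implementation:

```python
def inc(x):
    # 'x' is a local variable; += rebinds it and leaves the caller's 'n' alone
    x += 1
    return x  # added for illustration; the quoted version returns nothing

n = 1
result = inc(n)
print(n, result)
```

Changing the quoted assertion to `assert n == 1` makes it pass, which is the same outcome an equivalent C function taking an `int` would give.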

> At this point people will often start confusing the issue by claiming
> that "all Python variables are pointers", which is an *implementation
> detail* in CPython but not in other implementations, like PyPy or Jython.

I'm not claiming that -- and I'm trying to clarify, rather than  
confuse the issue.  (Of course if it turns out that my understanding  
of Python is incorrect, then I'm hoping to uncover and correct that,  
too.)

> Or people will imagine that Python makes a copy of the variable when you
> call a function. That's not true, and in fact Python explicitly promises
> never to copy a value unless you explicitly tell it to

Now that IS mysterious.  Doesn't calling a function add a frame to a  
stack?  And doesn't that necessitate copying in values for the  
variables in that stack frame (such as 'x' above)?  Of course we're  
now delving into internal implementation details... but it sure  
behaves as though this is exactly what it's doing (and is the same  
thing every other language does, AFAIK).
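
One way to reconcile the two views, as a hedged sketch: what gets copied into the new frame appears to be the reference, not the object it refers to. The names `passthrough`, `data`, and `same_object` are hypothetical and chosen only for this illustration:

```python
def passthrough(x):
    # 'x' is a new local variable, but it refers to the caller's object
    return x

data = [1, 2, 3]
same_object = passthrough(data) is data  # identity check, not equality
print(same_object)
```

So both statements can hold at once: the variable's value (a reference) is copied, while the list itself never is.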

> but it seems to explain the above, at least until the programmer starts
> *assuming* call-by-value behaviour and discovers this:
>
> def inc(alist):
>    alist += [1]  # or alist.append(1) if you prefer
>    return alist

It's still call-by-value behavior.  The value in this case is a list  
reference.  Using .append, or the += operator, modifies the list  
referred to by that list reference.  Compare that to:

  def inc(alist):
      alist = alist + [1]
      return alist

where you are not modifying the list passed in, but instead creating a  
new list, and storing a reference to that in local variable 'alist'.
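
The two versions can be put side by side in a small sketch. The function names `inc_in_place` and `inc_rebind`, and the variable names, are made up for this comparison:

```python
def inc_in_place(alist):
    alist += [1]          # mutates the list the caller passed in
    return alist

def inc_rebind(alist):
    alist = alist + [1]   # builds a new list and rebinds only the local name
    return alist

original = [0]
a = inc_in_place(original)       # original is modified
after_in_place = list(original)  # snapshot of original at this point
b = inc_rebind(original)         # original is left as it was
print(after_in_place, original, b)
```

In both cases the parameter received a copy of the reference; the only difference is whether the function mutates the referenced list or points its local variable at a new one.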

The semantics here appear to be exactly the same as Java or REALbasic
or any other modern language: variables are variables, and parameters
are local variables passed by value, and it just so happens that
some values may be references to data on the heap.

> Are functions call by value or call by reference???
>
> (Answer: neither. They are call by name.)

I have no idea what that means.  They're call by value as far as I can  
tell.  (Even if the value may happen to be a reference.)

Side question, for my own education: *does* Python have a "ByRef"  
parameter mode?

> I myself often talk about variables as shorthand. But it's a bad habit,
> because it is misleading to anyone who thinks they know how variables
> behave, so when I catch myself doing it I fix it and talk about name
> bindings.

Perhaps you have a funny idea of what people think about how variables  
behave.  I suspect that making up new terminology for perfectly  
ordinary things (like Python variables) makes them more mysterious,  
not less.

> Of course, you're entitled to define "variable" any way you like, and
> then insist that Python variables don't behave like variables in other
> languages. Personally, I don't think that's helpful to anyone.

No, but if we define them in the standard way, and point out that  
Python variables behave exactly like variables in other languages,  
then that IS helpful.

>> Well, they are variables.  I'm not quite grasping the difficulty here...
>> unless perhaps you were (at first) thinking of the variables as holding
>> the object values, rather than the object references.
>
> But that surely is what almost everyone will think, almost all the time.
> Consider:
>
> x = 5
> y = x + 3
>
> I'm pretty sure that nearly everyone will read it as "assign 5 to x, then
> add 3 to x and assign the result to y" instead of:
>
> "assign a reference to the object 5 to x, then dereference x to get the
> object 5, add it to the object 3 giving the object 8, and assign a
> reference to that result to y".

True.  I have no reason to believe that, in the case of a number, the  
value isn't the number itself.  (Except for occasional claims that  
"everything in Python is an object," but if that's literally true,  
what are the observable implications?)

> Of course that's what's really happening under the hood, and you can't
> *properly* understand how Python behaves without understanding that. But
> I'm pretty sure few people think that way naturally, especially noobs.

In this sense I'm still a noob -- until a couple weeks ago, I hadn't  
touched Python in over a decade.  So I sure appreciate this  
refresher.  If numbers really are wrapped in objects, that's  
surprising to me, and I'd like to learn about any cases where you can  
actually observe this.  (It's not apparent from the behavior of the +=  
operator, for example... if they are objects, I would guess they are  
immutable.)
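
A few places where this seems to be observable, as a hedged sketch. The variable names are arbitrary, and the comments describe language-level behaviour rather than how any interpreter stores integers:

```python
x = 1000
y = x                  # both names now refer to the same int object
shared_before = x is y
x += 1                 # builds a new int and rebinds x; y is untouched
shared_after = x is y

is_object = isinstance(5, object)  # ints are instances of 'object'
bits = (5).bit_length()            # and they have methods, like other objects
print(shared_before, shared_after, x, y, is_object, bits)
```

Because `+=` on an int rebinds the name instead of changing the object, the guess that numbers are immutable objects fits what this shows.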

But it's not at all surprising with lists and dicts and objects --  
every modern language passes around references to those, rather than  
the data themselves, because the data could be huge and is often  
changing size all the time.  Storing the values in a variable would  
just be silly.

> References are essentially like pointers, and learning pointers is
> notoriously difficult for people.

Hmm... I bet you're over 30.  :)  So am I, for that matter, so I can  
remember when people had to learn "pointers" and found it difficult.   
But nowadays, the yoots are raised on Java, or are self-taught on  
something like REALbasic or .NET (or Python).  There aren't any  
pointers, but only references, and the semantics are the same in all  
those languages.  Pointer difficulty is something that harkens back to  
C/C++, and they're just not teaching that anymore, except to the EE  
majors.

So, if the semantics are all the same, I think it's helpful to use the  
standard terminology.

> Python does a magnificent job of making references easy, but it does so
> by almost always hiding the fact that it uses references under the hood.
> That's why talk about variables is so seductive and dangerous: Python's
> behaviour is *usually* identical to the behaviour most newbies expect from
> a language with "variables".

You could be right, when it comes to numeric values -- if these are  
immutable objects, then I can safely get by thinking of them as pure  
values rather than references (which is what they are in RB, for  
example).  Strings are another such case: since they are immutable, you
can safely treat them as values, but it's comforting to know that you're
not incurring the penalty of copying a huge data buffer every time you
pass one to a function or assign it to another variable.

But with mutable objects, it is ordinary and expected that what you  
have is a reference to the object, and you can tell this quite simply  
by mutating the object in any way.  Every modern language I know works  
the same way, and I'd wager that the ones I don't know (e.g. Ruby)  
also work that way.  Python's a beautiful language, but I'm afraid  
it's nothing special in this particular regard.
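
The mutation test described above can be sketched in a few lines. The names `a`, `b`, and `c` are arbitrary, and `list(a)` is shown only as one way to ask for an explicit copy:

```python
a = [1, 2]
b = a          # assignment copies the reference, not the list
b.append(3)    # the change is visible through both names

c = list(a)    # an explicit (shallow) copy is a separate object
c.append(4)    # so this change is not visible through 'a'
print(a, b, c)
```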

Best,
- Joe