Modifying Class Object

Michael Sparks sparks.m at gmail.com
Sat Feb 13 16:59:42 CET 2010


Hi Alf,


On Feb 12, 8:22 pm, "Alf P. Steinbach" <al... at start.no> wrote:
> Thanks for the effort at non-flaming discussion, it *is*
> appreciated.

I would appreciate it if you tried to be non-flaming yourself,
since you can see I am not flaming you.

I was seeking to educate you on a simple matter which you seem
to be misunderstanding in python. This will be my second and
last attempt to do so since you chose to be inflammatory in your
response. (In case you do not understand what I find inflammatory,
I will discuss that at the end)

Please note below I may use CAPS occasionally. Whilst usually
taken as shouting, I mean them as BOLD. Please be aware of this :-)

My reason for choosing to reply is simply that I recently had a
similar discussion with a colleague who was likewise stuck talking
about implementation aspects (call by reference/value rather than
call by object/sharing).

> > Before I start, note we're talking about semantics, not
> > implementation. That distinction is very important.
>
> Yes.
[ inflammatory comment snipped]

Good - common ground - a good starting point.

Now, if I define a language, this has 3 main parts:
   * Syntax
   * Semantics
   * Implementation

The syntax of python is simply and shortly defined in a machine
readable format, and is therefore not interesting to discuss
here.

The implementation of python is similarly well defined. There are
multiple such implementations, one of which is CPython.

> However, all those references to implementation aspects,
> persisting
[ inflammatory comment snipped]

In theory when talking about a language, we do not need to talk
about implementation details. However when using a language,
implementation details do matter as well.

That's why people talk about implementation aspects. When talking
about how the language is defined, you need to talk about how the
language is defined. It's not defined in terms of Java pointers or
references. It's defined in terms of objects and names. (Or objects
and labels)

The exception to this is with pure functional languages. In a pure
functional language I do not care about implementation details,
since they are outside the scope of the language.

It is worth noting that python and functional languages share a
common ethos - though with different conclusions - that optimising
for the programmer's expression of the problem rather than for the
machine *matters*.

If you miss this, you will be stuck misunderstanding python,
pretty much forever. If you (the reader, not necessarily Alf)
understand this, good. If you don't, you need to re-read this
and really understand it.

(please bear in mind when I say "YOU" in that paragraph, I
mean "whomever is reading this", not anyone specific)

Let's get down to brass tacks.

In python, the LANGUAGE, there are no pointers or references,
much like in SML, haskell and friends there are no pointers or
references.  I'm using SML here as an example, because it is
conceptually close to python in terms of some aspects of
evaluation and how it deals with names and values. (There
are many differences as well, but we're dealing with calling,
names & values, in which they are close enough)

Taking an example from SML:

structure Stk =
struct
  exception EmptyStack;
  datatype 'x stack = EmptyStack | push of ('x * 'x stack);
  fun pop(push(x,y)) = y | pop EmptyStack = raise EmptyStack;
  fun top(push(x,y)) = x | top EmptyStack = raise EmptyStack;
end;

This may be used, say from the interactive prompt, as follows:

   val x = EmptyStack;  (* 1 *)
   val 'x = x;          (* 2 *)
   val y = push(5,'x);  (* 3 *)
   val z = push(4,y);   (* 4 *)
   top z;               (* 5 *)

Now, in this example, in SML, there are only names and values.
Unlike python, all values are immutable and, theoretically, there
is no sequencing of statements.

Now line 1 takes the EmptyStack value, and the name x is bound
to it. Line 2 takes that same EmptyStack value, and the name 'x
is also bound to it.

There are no pointers in SML, just names and values.

Like python, SML has aliases. 'x for example is an alias for x.
However that is just a symbolic name for the object itself.

When line 3 is evaluated, the value push(5, 'x) is bound to the
name y. Note - push(5, 'x) is in itself a value in SML, because
it has been defined as such in the Datatype definition in the
structure definition Stk.

When we reach line 5, the function top is called with the value
that z is bound to. Not a reference. Not a pointer. The actual
value. In this case z's value is push(4,push(5,EmptyStack)).

Thus the SML runtime evaluates the expression
   top push(4,push(5,EmptyStack))
And returns the value 4.
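
For readers more at home in python, a rough analogue of the session
above can be sketched with nested tuples. The names EMPTY, push, top
and pop are purely illustrative choices of mine (not any library's
API), and tuples only approximate an SML datatype:

```python
# Hypothetical analogue of the SML Stk structure, using immutable tuples.
EMPTY = ()

def push(item, stack):
    # Build a NEW value; the stack passed in is never modified.
    return (item, stack)

def top(stack):
    if stack == EMPTY:
        raise IndexError("empty stack")
    return stack[0]

def pop(stack):
    if stack == EMPTY:
        raise IndexError("empty stack")
    return stack[1]

x = EMPTY          # 1
x_ = x             # 2: a second name for the same value
y = push(5, x_)    # 3
z = push(4, y)     # 4
result = top(z)    # 5: result is 4
```

As in the SML session, every step just binds a name to a value; z is
the value (4, (5, ())).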

In python, I don't have such richness of ability to define values,
but I have these types to play with:

   * numbers - 1.0, 1, 1+2j, 0x11, 0777
   * strings - "hello", "world"
   * lists - ["hello", "world" ]
   * tuples - ("hello", "world")
   * dicts - {"hello" : "world" }

All of these types are defined in python as python objects. How
they are implemented is irrelevant. What is relevant is that
lists and dicts are mutable. The rest are not. (This mutability,
along with sequencing and iteration make python different from
the pure subset of SML.)

However, nonetheless when I type this:

def hello(world):
   print world

hello("world")

When I call "hello", with the argument "world", the language
is DEFINED, that *just like SML* I am passing in that specific
object. Not a copy of a value (call by value). Not a reference
to the value. Not a pointer to a value. But that specific actual
object.
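
A minimal sketch of what that means in practice. The names received,
greeting and shout are made up for illustration, and a list is used
instead of a string so that mutation can be shown too:

```python
received = []

def hello(world):
    # Remember exactly which object arrived under the local name "world".
    received.append(world)

greeting = ["hello", "world"]
hello(greeting)

# The function saw the very same object, not a copy of it.
same_object = received[0] is greeting

def shout(words):
    # Mutating the object we were handed is visible to the caller.
    words.append("!")

shout(greeting)
# greeting is now ["hello", "world", "!"]
```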

This isn't willy nilly or on a whim, that's a conscious choice.

[ If this makes no sense, consider that python's history comes from
  a CS department like SML's does, rather than from an engineering
  department (a la C/C++/Java). ]


Now you might turn around and say (since you have in the thread)
"but in reality you're not _really_ passing in that object,
you're passing in a reference". If so, you're wrong. If the
language defines it as passing in THAT object, then it defines it
as passing in THAT object.

But you can come back, as you have done, over and over and claim
that it's a pointer or a reference, and may even say things like
"well, according to this OTHER language's definition of this word,
I mean this".

That's nice, but ultimately broken.

Using another language's way of defining something is wrong.

In python it is *defined* as passing in the object. Not a reference.
Not a pointer. The object itself.

If it is a pointer or reference it has a value, and refers to a
value. Essentially 2 values in 1. This is part of the accepted
definition of pointer or reference. "A value that allows you to
find another value".

By claiming that in the below:

    >>> stack = ()
    >>> stack_ = stack

That stack is a reference to () and that stack_ is also a
reference to () _in python the language_, then I must be able
to get at the value of stack itself, and the value of stack_
itself.

You claim to be able to do this thus:
    >>> id(stack)
    3078742060L
    >>> id(stack_)
    3078742060L

On the surface of things, this sounds like a good claim, but it
isn't. If it was a reference, its type would be a reference.
It isn't:

    >>> type(id(stack))
    <type 'long'>
    >>> type(id(stack_))
    <type 'long'>

That value is an integer. Additionally, if these "id" values were
references, then I would be able to take those values and _inside
the language_ retrieve the value they refer to.

ie I would be able to type something like this to get this sort of
behaviour:
    >>> refstack = id(stack)
    >>> refstack_ = id(stack_)
    >>> STACK = !refstack
    >>> STACK_ = !refstack_
    >>> STACK
    ()
    >>> STACK_
    ()

This means one of two things:
   * Either "stack" by itself is a reference to (), but we
     cannot get at the reference itself, just the value it
     refers to.
   * OR "stack" is the preferred way of dereferencing the
     reference associated with stack, and to get at the actual
     reference we do id(stack).

The problem is that the first option is incomplete in terms of
being a reference, since we can only get at the r-value. The
second is incomplete because we have no way of going from an
l-value to an r-value.
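
A small sketch of what the language does give you here. As far as I
am aware, the core language has no operator that turns the integer
from id() back into an object; what it offers is the identity test
"is". (numbers.Integral is used only so the check covers both int and
long.)

```python
import numbers

stack = ()
stack_ = stack

ident = id(stack)

# id() hands back an ordinary integer, not a special "reference" type.
is_plain_integer = isinstance(ident, numbers.Integral)

# Both names give the same id, because both label the same object.
same_identity = id(stack) == id(stack_)

# The language-level way to ask "is this the same object?"
same_object = stack is stack_
```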

This means that *AS FAR AS THE LANGUAGE* is concerned whether
the implementation looks or acts like it's implemented as call
by reference or whatever, it is simpler and more correct to say
that as far as the *LANGUAGE* is concerned, you pass in the
object, which will be bound with a local name inside the function
or method.

Thus it is more correct to say that python the language only has
aliases.
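
To make the "aliases" (or labels) view concrete, here is a short
sketch with made-up names. It separates two things that often get
confused: mutating the object that a name labels, and moving the
label to a different object.

```python
stack = [1, 2]
stack_ = stack            # a second label on the same list

stack_.append(3)          # mutate the object: visible via both names
seen_via_stack = list(stack)      # [1, 2, 3]

stack_ = ["other"]        # rebind the label: the original list is untouched

def rebind(items):
    # Assignment only moves the LOCAL label "items";
    # the caller's object and the caller's names are unaffected.
    items = ["replaced"]
    return items

returned = rebind(stack)
# stack is still [1, 2, 3]
```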

By comparison, it's possible to demonstrate that Java does have
references, because it is possible to box and unbox values to/from
reference values:

    http://java.sun.com/docs/books/jls/third_edition/html/conversions.html#190697

Thus whilst Java's semantics may be similar, python's differ in
that Java, the language, has references and python, the language,
has aliases.

Now let's move to the implementation aspects.

Python as a language is implemented in many languages. One of these
is C. There are compilers to C (pypy), C++ (shedskin), for the JVM
(Jython) and .net (Ironpython).

There is also an executable operational semantics for python,
which can be found here:

http://gideon.smdng.nl/2009/01/an-executable-operational-semantics-for-python/

This set of operational semantics is written in Haskell.

Haskell is a strictly pure, lazily evaluated language. It
therefore has no pointers or references, just values and names.
The implementation therefore cannot be in terms of references
and pointers. Therefore to say "in reality the implementation
will be passing a reference or pointer" is invalid. There is
after all at least one implementation that does not rely on
such machine oriented language details.

What you find, in terms of modelling the semantics, is that the
implementation seeks to model those of the python virtual
machine, and that does talk about values, objects, references
and the heap. However, in terms of modelling how the language is
used, this is not exposed at the language level.

Indeed the only test it uses with regard to references involves
the creation of a reference type _using_ python:

   class Reference(object) :
       pass

Creating a reference:
   a = Reference()

Setting the value the reference refers to:
   a.value = False

Passing the reference to a function, and changing the value the
reference refers to:
   def bar(ref):
       ref.value = True

   bar(a)

Whilst you might say "But that's wrong", note that .value on
the right hand side is used for unboxing the value - it's
syntactic sugar for a method call on the ref object.
Specifically, it's the result of doing
ref.__getattribute__("value").

Likewise, ref.value on the left hand side of an equals is
also syntactic sugar for boxing the value - specifically it's
syntactic sugar for ref.__setattr__("value", <whatever>)
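
Putting those pieces together as one runnable sketch. The class and
function names follow the fragments above; the explicit method calls
at the end are the spellings available on a plain object, shown only
to illustrate the "syntactic sugar" point:

```python
class Reference(object):
    pass

def bar(ref):
    # "Boxing": store a new value inside the object we were handed.
    ref.value = True

a = Reference()
a.value = False
bar(a)
after_call = a.value            # True: bar changed the boxed value

# The same operations spelled out as explicit method calls.
a.__setattr__("value", 42)
explicit_read = a.__getattribute__("value")

# The builtins getattr/setattr do the same job.
builtin_read = getattr(a, "value")
```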

That's the closest you can come to Java references in python -
since otherwise you have no way to box or unbox any values.
If you want to claim Python has java references with no ability
to box or unbox any values, then maybe that's vaguely OK, but
it's still _wrong_. You can claim it's LIKE java references, but
since python doesn't have references, only aliases, you'd be
wrong to claim they're the same as java references.

And from a language spec - semantics, not implementation
viewpoint - you'd also still be wrong because you can implement
python without calling the values references.

Specifically, the object model defined in src/Objects.lhs and
src/ObjectTheory.lhs in the above reference operational semantics
specification of python uses a functional heap to contain the
objects. New objects are created there thus:

rewrite (State ( heap ) envs stack
               ( FwApp (Prim "object" "__new__") (a_c:as) )) =
        (state ( heap <+> [a |-> ObjValue (new [__class__ |-> a_c]) NoneValue] )
               envs stack ( BwResult a ))
  where a = freeAddr heap

In this definition, which is compilable and executable (since it's
haskell), there are no references, pointers or memory allocation.
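
For those who don't read Haskell, the flavour of that rule can be
loosely imitated in python. This is only my own sketch: free_addr and
new_object are hypothetical names, the heap is modelled as a plain
dict that is never modified in place, and none of it is taken from
the semantics' actual source.

```python
def free_addr(heap):
    # Pick an address that is not yet used in this heap.
    return max(heap) + 1 if heap else 0

def new_object(heap, class_addr):
    # Return a NEW heap containing one extra object, plus its address.
    # The heap passed in is left untouched, as in a functional language.
    addr = free_addr(heap)
    extended = dict(heap)
    extended[addr] = {"__class__": class_addr}
    return extended, addr

heap0 = {}
heap1, first = new_object(heap0, "object")
heap2, second = new_object(heap1, "object")
# heap0 is still empty; heap1 has one object; heap2 has two.
```

Each step produces a new heap value from the old one; nothing is
allocated or updated in place at the level the model describes.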

Thus for this:

    def hello(world):
        print world

    hello("world")

saying that I *must* be passing in "world" by reference, since
the implementation requires this, is false. *An* implementation
may do that, and indeed it's the most obvious way of
*implementing* it in many languages, but to say the language
specifies it that way is incorrect.

After all, in the Haskell implementation for example, the
object itself is just passed in.

 ~~ interlude ~~

Now, if python DID add references, this would be pretty awesome
because it's the most obvious way of adding in software transactional
memory, since this usage of X.value is remarkably close to the way
I deal with values in my implementation of software transactional
memory for python which you'll find here:

   * http://www.kamaelia.org/STM

And since python doesn't already have references, you wouldn't be
changing existing semantics.

This won't happen for a fair while at least for many reasons, not
least the fact that there's a moratorium on language changes at
present.

 ~~ end interlude ~~

Now if we return to CPython, we find that CPython is the most
commonly used implementation of python. As a result much python
code is written to work well with that implementation, rather
than with any other.

The CPython implementation has some limitations:
   * It is written in C
   * It cannot pass in actual objects (like haskell), only
     primitive values.
   * Therefore it has to pass in pointers to values to the parts
     of the subsystem that do the evaluation.
   * It has to track memory, and performs garbage collection.
   * Since it tracks garbage, it uses the parlance of garbage
     collection - specifically including references and reference
     counts. A reference count is however just a count of "who
     knows about this object and is keeping this alive".
   * All memory allocation lives in the heap, and as a result
     python objects sit there, and the implementation slings
     around pointers to these things.
   * Logically speaking (and at the language level), python
     objects may contain other objects. Specifically, a list
     may contain another object (or even itself). Both of
     these types of relationship cause the reference counts
     to increase.
   * The program running tracks the objects using names, and
     when an object stops being findable by another object
     (eg no longer in a list, and no longer labelled by someone),
     this is a trigger which the garbage collector can use to
     decide to delete the object.
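
The counting described above can be observed from inside CPython with
sys.getrefcount. This is a CPython-specific sketch: the absolute
numbers are an implementation detail (the call itself temporarily
adds to the count), so only the differences are meaningful. Thing is
just a placeholder class.

```python
import sys

class Thing(object):
    pass

thing = Thing()
before = sys.getrefcount(thing)

container = [thing]              # the list now also "knows about" the object
inside_list = sys.getrefcount(thing)

label = thing                    # and so does a second name
with_label = sys.getrefcount(thing)

del label                        # remove the extra name
container.pop()                  # and take the object out of the list
after = sys.getrefcount(thing)
# inside_list is before + 1, with_label is before + 2, after equals before
```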

Given CPython is written in C, it becomes natural for CPython -
the implementation - to therefore start talking about objects,
pointers, references, reference counts and so on. Not only that,
but since no programmer really likes memory leaks etc, knowing
what the implementation is doing is *at times* useful.

Since the language spec is tied up with a specific implementation,
some of these details leak into the language of the document.

So yes, as a pragmatic python programmer, I do care about reference
counts. I do care about weakrefs which don't increase reference
counts. I do care about when values are created and deleted.

However, I only do so because of the *implementation* not because
of the specification.
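
As an illustration of the weakref point, here is a sketch using the
standard weakref module. When exactly the object goes away is
implementation behaviour: CPython frees it as soon as the last strong
label is removed, and the gc.collect() call is there for
implementations that do not. Thing is again a placeholder class.

```python
import gc
import weakref

class Thing(object):
    pass

thing = Thing()
watcher = weakref.ref(thing)     # watches the object without keeping it alive

alive_before = watcher() is thing

del thing                        # remove the only label on the object
gc.collect()                     # not needed on CPython; harmless elsewhere

alive_after = watcher() is not None
```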

When it comes to talking about the language, saying "it's pointers"
over and over or saying "it's references" over and over or even
saying "it's references/pointers as defined by java" over and over
does NOT make it correct.

If you move forward and accept that python THE LANGUAGE uses names
and objects, and names are more like labels stuck onto objects
than names of boxes that objects can sit inside, you'll move
forward somewhat.

If you don't, that's fine, but stating your version of "truth" as
correct, when told many times you are incorrect is either ignorance,
arrogance, pigheadedness or stupidity.

I have chosen to believe it's ignorance, and am assuming good faith.

Now, I did say I'd pick out the inflammatory parts of how you've
spoken to me, and I will just leave them as that. You will
hopefully see how they're inflammatory and modify your behaviour
in future.

Please note though, I will pick out one point. I was not using
Wikipedia as a means of argument by authority. I was using it as
a starting point for a discussion which I thought many could use
as a valid starting point - since, whilst it is not perfect and
can be highly inaccurate in places, it does form one of the few
places where people generally eventually reach some form of
consensus.

The fact that you've previously edited it, fine. Good. That's very
civic minded of you and I thank you for it.

The idea that bringing in an acceptable definition from
some source that you've effectively stated you agree with as
valid (because you've edited it) somehow invalidates the meat
of my points I find a little bizarre. I'll put that down to
cultural differences.

Anyway, here's the points I find inflammatory, and rather than
respond, I'll leave them as is.

I'll also not be responding to any further mails containing
inflammatory comments. You can think this stupid or annoying,
but quite frankly, I don't see the point.

Hopefully you do not personally find anything I have said above
inflammatory; if you do, I apologise in advance. I do hope
that my mail has moved your understanding of python calling
conventions on. I primarily do this because I hope that you
do not pass on your misunderstandings as fact, when in fact
they are not - because doing so would do others a disservice.

Regards,


Michael.

Inflammatory comments  (from my personal viewpoint)
On Feb 12, 8:22 pm, "Alf P. Steinbach" <al... at start.no> wrote:
> It would seem to readers that posters here do not grasp and are
> unable to grasp that distinction.

The posters here do understand that distinction.

> However, all those references to implementation aspects,
> persisting in the face of corrections,

Stating something is a correction does not make that correction
true. If I state it's call by sharing, and that happens to be
the definition, then correcting me and calling it call by value
or call by reference does not make that correction true.

If you are told something repeatedly by many sources, perhaps
consider you may be wrong.

>  have just, IMHO, been consistent attempts at misrepresentation.

This can be taken as posters choosing to misrepresent your words,
and is equivalent to the question "When did you stop beating your
wife?", and equally offensive.

> I'm sorry to disappoint, but no. It was a random Google hit to find
> a first-year text that perhaps posters here could understand.

You have been talking to many people who have been around python
for some time, including those who've worked on python's internals.
Assuming that they need "educating" as CS101 students is offensive.

> Not insinuiting anything about heir mental prowess or lack
> thereof,

No. It does - whether you like it or not. Communication is about
2 (or more) parties communicating. The sender of a message should
care more about how their message is received than how it is sent.
In my case, my goal of sending you a message was to educate.

Your goal APPEARS to be defending a position you
hold strongly, to the extent of ignoring the effect on others. If
you think "That's their problem", then I can just correct you,
and then ignore you. If that's a misrepresentation of your goals,
then maybe telling you is helpful. I don't know.

I do know though I've made that sort of mistake in the past (not to
anyone here, but I have done), and realised my mistake and been
grown up enough to recognise my mistake and apologise.

I feel you have made that mistake here. Whether you apologise
is your call. I would.

> but just something they might understand given a repeatedly
> demonstrated inability to understand such basics, or even to
> understand the concept of reference to a definition of a term.

No, they do not agree with you. This is not the same as not
understanding, or lacking the mental ability to understand
"basic terms".

If I said you lacked the ability to understand such basics
of human interaction, you'd probably be rather offended.
If, at the same time, I demonstrated a complete inability to
understand such basics myself, you would find such remarks
inflammatory.

> Please do consider that my usage of the term "pointer" has
> always, in this thread, including the first article I posted
> in this thread, been with explicit reference to the Java
> language spec's meaning.

I understand this. What you seem to misunderstand is that others
also understand this. They also understand that Pointer as viewed
from the Java Language Spec *IS* a different concept.

There may be similarities, but the fact that you can unbox and
rebox a Java Reference (as per reference above), where you *cannot*
in python because, well, you can't, IS a fundamentally different
concept.

Conceptually you pass in the object in python. Not a reference.
Java has to say you pass in a reference because Java has the
concept of references which may box and unbox values.

> You may argue that those who wrote that language spec are
> confused about things or whatever, but that's rather stretching
> it, wouldn't you say?

No it isn't; there are others who claim exactly the same thing
about the Java Language Spec. In python, the language spec was
written by those who wrote CPython, therefore their terms are
clouded by concrete details. If they weren't, they'd use the
term aliases more often.


> So, this must be the most silly "argument by authority"
> ever offered.

This is inflammatory because you are deriding my premise as "silly"
because you have chosen to not ask yourself *why* I chose to use
that as a starting point, and then chose to ignore all my points
and arguments, deriding it as "silly".

That's rather inflammatory when someone is seeking to educate you
in the hope of making it such that you understand better and those
you come in contact with learn the correct semantics, rather than
what you think the semantics are, despite being told otherwise.

> I find that amusing and would have included a smiley except that
> I think that that would incur further flames.

Please note I have not flamed you. I have ignored your inflammatory
remarks except to note how I find them inflammatory towards myself
and others.

The assumption that I would flame you I do actually find relatively
offensive. I've stayed out of many flame wars for many years, and
generally find that seeking to understand why the opposing side in
an argument is saying what they do can actually often either resolve
differences or lead to an agreement to disagree.

Likewise I have tried to stay away from making suggestions as to
how you should change your behaviour since that could be considered
inflammatory by some.

> It's no criticism of you; I do appreciate your effort, thanks.

This comes across as dismissive, given you ignored the entire body
of my mail, and decided to deride it as "silly".

> And I reiterate that the meaning of "pointer" I have used is
> the meaning in the Java language spec.

This assumes I have not read the thread. I have. I read the
statement above numerous times. I also however have read the
Java Language Spec (you are not the only person to have done
this of course), and do NOT agree with you.

Indeed I can see use cases where *adding* references to python,
the language, could be useful. However, since they're not there,
I can't use them.

> And that that's about semantics, not about implementation,
> and not about C pointers or Pascal pointers or whatever else
> that some would rather wish it was.

Unfortunately, you have picked a language whose semantics for
references include the concept of being able to box and unbox
object references. You cannot do that in python. Boxing and
unboxing references is just another term for referencing and
dereferencing values. Again, something you can't do in python.

Again, by ignoring the meat of my argument and reiterating your
point, you come across as wilfully ignoring logical rational
arguments put your way in as non-inflammatory a way as possible,
decrying all those who disagree with you as somehow stupid,
misrepresentational or ignorant.

By doing so, your mail as a whole is rather inflammatory, but
I've decided to only send one mail in reply to you and then
leave it at that.

> In the following that I snipped you had good discussion except
> for (as I could see) a fallacy-based denial of references
> existing in Python.

No, it's not denial, it's an understanding of programming
languages born of having used a wide variety of languages
from Assembler (Z80, 6502, 6800), BASIC, pascal, C, C++, Java,
COBOL, Occam, Fortran, Perl, Python, Ruby, E, SML, Haskell,
Prolog, Lisp, and several others I've probably forgotten I've
used, over well over a quarter of a century of usage. I've
used languages which are reference only, object only, value
only, etc.

My conclusion that python is call by object sharing is based
on 8 years experience of using python, and changing my views
from early on viewing it as pointers/references to understanding
that the semantics are much closer to those of an impure
functional language where you pass complex objects *themselves*
(from a _semantic_ perspective).

To claim that I am in "denial", somehow implies that I am either
inexperienced, stupid, or unwilling to call a hammer a hammer or
a screwdriver a screwdriver.

I'm not. I'm simply unwilling to call a hammer a screwdriver or
call a screw a nail.

> I don't think there's any point in discussing that further,
> because it's evidently a religious matter where rational
> arguments

This is inflammatory, since you dismiss my arguments now as
religious rather than rational. I find that moderately disgusting
and offensive.

> --  and I've tried a few

You have not appeared to try to understand where my argument
is coming from sadly, despite me doing the same for yours. You
have dismissed my disagreement as religious, rather than seeking
to understand the rational argument you have been presented,
and have not argued against it.

I however am seeking to be precise, by choosing NOT to use terms
like reference, because they are incorrect.

> --  don't reach, in spite of the extreme triviality of the
> subject matter.

Far from being trivial, understanding that you cannot unbox
"the value" of a python object is rather fundamental to the
understanding of python. Understanding that you are passing
around (logically speaking) the actual object explains for
example why this style of coding confuses people when they
start with python:

    >>> class MyObject(object):
    ...    def __init__(self, store=[]):
    ...        self.store = store
    ...    def append(self, X):
    ...        self.store.append(X)
    ...    def pop(self):
    ...        return self.store.pop()
    ...
    >>> X = MyObject()
    >>> Y = MyObject()
    >>> X.append(10)
    >>> Y.pop()
    10

The misunderstanding that people have is that they
assume that this:
    ...    def __init__(self, store=[]):

Results in a new empty object being created every time, whereas
actually it's created at the point the def statement is run,
specifically here:
    >>> MyObject.__init__.im_func.func_defaults
    ([],)

The fact I can get *an* object id thus:
    >>> id(MyObject.__init__.im_func.func_defaults[0])
    3078510124L
    >>> id(X.store)
    3078510124L
    >>> id(Y.store)
    3078510124L

Is pretty irrelevant. It's not a reference - it's an opaque
value that happens to be useful when debugging for telling
whether I've got the same object or not. Otherwise the objects
themselves would have no default name, and no way of telling
if THIS list is different from THAT list.

However, understanding that the [] object is created when
the def statement is evaluated and then reused by all
subsequent calls, rather than being created each time the
function is called, explains much more clearly WHY the code
works the way it does.
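
A side-by-side sketch of the behaviour and of the usual workaround.
SharedStore and FreshStore are made-up names, and using None as the
default is a common idiom rather than anything required by the
language:

```python
class SharedStore(object):
    def __init__(self, store=[]):
        # The [] was created once, when this def statement ran.
        self.store = store

class FreshStore(object):
    def __init__(self, store=None):
        # Create a new list at call time if the caller supplied none.
        self.store = [] if store is None else store

a, b = SharedStore(), SharedStore()
a.store.append(10)
shared = a.store is b.store      # True: one list, two labels
leaked = b.store                 # [10]

c, d = FreshStore(), FreshStore()
c.store.append(10)
fresh = c.store is d.store       # False: each instance has its own list
```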

Yes, you can make a similar argument in terms of references,
and under the hood that's what happens with pointers to an
empty list object stored on the heap. However, in terms of
semantics, it's an actual python object, not a reference.

After all python is a HUMAN oriented language, not a machine
oriented language. A reference says "this thing over here".
An object is a thing. The latter is more HUMAN, the former
is machine oriented.

> Thanks for the effort at non-flaming discussion,
> it *is* appreciated.

It does not appear that way, since you have chosen to assume
that I don't understand the thrust of your arguments, rather
than disagree with them. You have chosen to assume that my
arguments are wrong and used or made for religious reasons
rather than rational ones - such as the desire to be precise.

I sincerely hope that my reply does not offend or inflame you,
since that is not the intent. I do hope it educates you and
puts into context the responses you have gained from others.

After all, one simply shouting in a corner saying "YOU'RE
ALL WRONG, WRONG, WRONG. I'M RIGHT RIGHT RIGHT", when one
does not understand what one is talking about, does not
tend to engender warm fluffy feelings or sentiments of
authority towards such an individual. Be it me, you, or
anyone else.

At the moment, you appear to me to be engaging in such a
behaviour. Now you don't know me from Jack and probably don't
care about my viewpoint, but I would really appreciate it
if you would try not to be inflammatory in your response
to this. (Since you do appear to also have a need to have
the last word)

Hoping this was useful on some level,


Michael.


