Unification of Methods and Functions

Sun May 9 19:18:53 EDT 2004

On 8 May 2004 07:07:09 -0700, moughanj at tcd.ie (James Moughan) wrote:

>David MacQuigg <dmq at gain.com> wrote in message news:<4a9o90pbu122npgf4m2hrgg04g2j0ic6ka at 4ax.com>...
>> On 7 May 2004 06:31:51 -0700, moughanj at tcd.ie (James Moughan) wrote:
<snip>
>> Also, if you are calling
>> a function that has an instance variable ( .length ) and no instance
>> has been set by a prior binding, you would need to set __self__
>> manually.
>> __self__ = foo; print FooLen()
>
>???!!!??? 
>
>This is what I was talking about in my first post, global variables
>which change depending on where you are in the code... as I understand
>what you're saying, __self__ will have to be set, then reset when a
>method is called from within a method and then exits.  And __self__
>could presumably be changed halfway through a method, too. I'm sorry,
>I don't see this as being more explicit or simpler.

The setting of __self__ happens automatically, just like the setting
of the first argument in a call from an instance.  The user doesn't
have to worry about it.  In fact, I can't think of a circumstance
where the user would need to explicitly set __self__.  Maybe some
diagnostic code, in which case having available a system variable like
__self__ is a plus.  You can, without any loss of functionality in a
normal program, never mention __self__ in an introductory course.  The
user doesn't need to know what it is called.

My preference is to give it a name and highlight it with double
underscores.  To me that makes the discussion more concrete and
explicit, and builds on concepts already understood.  Don't forget,
the students already understand global variables at this point in the
course.  The "magic" of setting a particular global variable to an
instance is about the same as the magic of inserting that instance as
a first argument in a function call.  The problem in either syntax is
not the magic of setting 'self' or '__self__'.

<snip>
>> >A method in a class in Python is just like a global function; for a
>> >global function to operate on an object, it must take it as an
>> >argument. The prototype syntax would appear to break the above
>> >example.
>> 
>> Global functions have no instance variables, so there is no need for a
>> special first argument.  A Python method requires a special first
>> argument (even if it is not used).  
>
>But the first argument isn't terribly 'special'; it tells the method
>what it's working on, just like any other argument.  Its only
>'special' characteristic is that there's some syntactic sugar to
>convert foo.getLength() into Foo.getLength(foo).

The specialness of the first argument isn't much, I agree, but it is
enough to make the calling sequence different from a normal function
or a static method.  It is these differences that the new syntax gets
rid of, thereby enabling the unification of all methods and functions,
and simplifying the presentation of OOP.  Methods in the new syntax
are identical to functions (which the students already understand),
except for the presence of instance variables.

Instance variables are the one fundamental difference between
functions and methods, and one that we wish to focus our entire
attention on in the presentation.  Any new and unnecessary syntactic
clutter is a distraction, particularly if the new syntax is used in
some cases (normal methods) but not others (static methods).
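
A minimal sketch of the calling-sequence difference in today's Python
(Python 3 syntax assumed).  The class Foo and the names getLength,
describe and module_describe are illustrative only; they are not taken
from Animals.py.

```python
class Foo:
    def __init__(self, length):
        self.length = length

    def getLength(self):
        # Normal method: the instance arrives as the first argument.
        return self.length

    @staticmethod
    def describe():
        # Static method: the wrapper says "don't expect a first argument".
        return "Foo objects have a length"


def module_describe():
    # Plain module-level function: same calling sequence as describe().
    return "Foo objects have a length"


foo = Foo(3)
bound_result = foo.getLength()         # instance inserted automatically
explicit_result = Foo.getLength(foo)   # the same call written out in full
static_result = Foo.describe()         # no instance involved

print(bound_result, explicit_result, static_result)
```

The first two calls are equivalent; only the static method and the
plain function are called the same way with no instance at all.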

>> >> The difference in the proposed syntax is that it doesn't need the
>> >> staticmethod wrapper to tell the interpreter -- don't expect a special
>> >> first argument.  In the new syntax all functions/methods will have the
>> >> same calling sequence.
>> >
>> >If a method doesn't operate on the data from an object then as a rule
>> >it should be global.  There are exceptions, but they generally don't
>> >occur in Python so much as in a 'true OO' language like Java.
>> 
>> The placement of a function at the module level or in a class should
>> be determined by the nature of the function, not any syntax problems.
>> If the function has characteristics unique to a class, it ought to be
>> included with that class.  The Mammal.show() function, for example,
>> provides a display of characteristics unique to mammals, so we put it
>> in class Mammal.  We could have written a general-purpose Inventory()
>> function to recursively walk an arbitrary class hierarchy and print
>> the number of instances of each class.  That general function would be
>> best placed at the global level, outside of any one class.
>> 
>
>Mammal.show() shows characteristics to do with Mammals, *but not
>specifically Mammal*.  There really is a difference between a class
>and its subclasses.

The Mammal.show() function *is* specific to Mammal.  I think what you
are saying is that calling Mammal.show() results in a display of
characteristics of both Mammal and its ancestor Animal.  That is a
requirement of the problem we are solving, not a result of bad
programming.  We want to see *all* the characteristics of Mammal,
including those it inherited from Animal.

Leave out the call to Animal.show() if you don't want to also see the
ancestor's data.
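
A minimal sketch of that chaining, assuming a two-class version of the
example in Python 3 syntax.  The attribute names and the text labels
are illustrative, and show() returns its lines here (instead of
printing them directly) only so the result is easy to check.

```python
class Animal:
    numAnimals = 0

    @staticmethod
    def show():
        # Display data that belongs to Animal itself.
        return ["Animals: %d" % Animal.numAnimals]


class Mammal(Animal):
    numMammals = 0

    @staticmethod
    def show():
        # Start with the ancestor's display; drop this call to
        # show only the data specific to Mammal.
        lines = Animal.show()
        lines.append("Mammals: %d" % Mammal.numMammals)
        return lines


mammal_lines = Mammal.show()
print("\n".join(mammal_lines))
```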

>The general-purpose inventory solution would be a better solution.  It
>doesn't require repetition, it's hard (impossible?) to break and it's
>generic, allowing it to be used beyond this single class hierarchy.
>
>If the inventory function would be best placed outside a class, why do
>you think it's a good idea to put something with exactly the same
>functionality inside your classes?

The proposed Inventory() function is a general function that *would*
be appropriate outside a class.  The existing class-specific functions
like Mammal.show() are unique to each class.  I tried to make that
clear in a short example by giving each data item a different text
label.  I've now added some unique data to the example just so we can
get past this stumbling block.  A real program would have a multi-line
display for each class, and there would be *no way* you could come up
with some general function to produce that display for any class.
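
For comparison, a rough sketch of what the general function could look
like, in Python 3 syntax.  It assumes (my assumption, not how
Animals.py is written) that every class keeps its instance count under
the same attribute name, count.  It covers only the generic counting
part, not the class-specific displays discussed above.

```python
class Animal:
    count = 0

    def __init__(self):
        # Bump the counter of every class in this instance's ancestry
        # that defines its own 'count' attribute.
        for klass in type(self).__mro__:
            if "count" in vars(klass):
                klass.count += 1


class Mammal(Animal):
    count = 0


class Cat(Mammal):
    count = 0


def inventory(cls, depth=0):
    """Recursively walk the hierarchy below cls, one line per class."""
    lines = ["%s%s: %d" % ("  " * depth, cls.__name__, cls.count)]
    for sub in cls.__subclasses__():
        lines.extend(inventory(sub, depth + 1))
    return lines


Cat()
Cat()
Mammal()
report = inventory(Animal)
print("\n".join(report))
```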

<snip>
>> Learning Python, 2nd ed. by Mark Lutz and David Ascher is generally
>> considered the best introductory text on Python.  96 pages on OOP.
>> 
>
>Books are always kind of strange, because a book must have a certain
>number of pages and cover a certain range of content at a certain
>technical level.  For the level and range of the ORA Learning books,
>that is going to mean a bit of padding for a simple language like
>Python.  If I see Learning Python in a bookshop then I'll take a look,
>though.
>
>Regardless, I stand by what I said before - students generally will
>not read 70 pages on a single topic, especially when it's a relatively
>minor part of the course.

Learning Python, 2nd ed. would be appropriate for a one-semester
course.  My problem is that I have only a fraction of a semester in a
circuit-design course.  So I don't cover OOP at all.  I would include
OOP if I could do it with four more hours.  Currently OOP in Python is
a little too much to cover in that time.  I don't think it is a problem
with Lutz's book.
He covers what he needs to, and at an appropriate pace.

>> >Learning to program is about 5% how to do something, and 95% when and
>> >why you should do it.  You seem to be focusing almost exclusively on
>> >how, which I suspect is why we're all so upset :) you get that way
>> >when you have to fix the code which eventually results.
>> 
>> The OOP presentations I've seen that focus as much as 50% on *why*
>> generally leave me bored and frustrated.  I feel like screaming --
>> Stop talking about car parts and show me some nice code examples.  If
>> it's useful, I'm motivated.  Good style is a separate issue, also best
>> taught with good examples (and some bad for contrast).
>> 
>
>I'm not talking about car parts.  I'm talking about explaining
>modularity, complexity, side-effects, classes as data structures etc.

These are concepts that design engineers understand very well.  I
wouldn't spend any time teaching them about modularity, but I would
point out how different program structures facilitate modular design,
and how syntax can sometimes restrict your ability to modularize as
you see fit.  Case in point: The need for static methods to put the
show() functions where we want them.

>(It's hilarious to see what happens when people get taught by car-part
>style metaphors; they take them completely literally.  I've seen
>someone writing the classic vending machine example write a 'Can'
>class, subclass it to get 'CokeCan', 'PepsiCan'... and then create ten
>of each to represent the machines' stock.  That was after three years
>of university, too...)

Oh ... don't get me started on academia. :>)

>> >OK: "The whole idea of having these structures in any program is
>> >wrong."
>> >
>> >Firstly, the program uses a class hierarchy as a data structure.  That
>> >isn't what class hierarchies are designed for, and not how they should
>> >be used IMO. But it's what any bright student will pick up from the
>> >example.
>> 
>> The classes contain both data and functions.  The data is specific to
>> each class.  I even show an example of where the two-class first
>> example forced us to put some data at an inappropriate level, but with
>> a four class hierarchy, we can put each data item right where it
>> belongs.
>> 
>
>The data is not specific to the class.  It's specific to the class and
>its subclasses.  Subclasses should be dependent on the superclass,
>and generally not the other way around.

What data are we talking about?  numMammals is specific to Mammal.
genus is specific to Feline, but *inherited* by instances of a
subclass like Cat.

>> Nothing in the Bovine class can affect anything in a Cat.  Feline and
>> Bovine are independent branches below Mammal.  Adding a Mouse class
>> anywhere other than in the chain Cat - Feline - Mammal - Animal cannot
>> affect Cat.  Could you give a specific example?
>> 
>
>Say someone adds a mouse class but doesn't call the constructor for
>Mammal.  The data produced by mammal and therefore cat is now
>incorrect, as instances of mouse are not included in your count.  In a
>real example, anything might be hanging on that variable - so e.g.
>someone adds some mouse instances and the program crashes with an
>array index out of bounds (or whatever the Pythonic equivalent is :) )
>, or maybe we just get bad user output.  This type of behaviour is
>damn-near impossible to debug in a complex program, because you didn't
>change anything which could have caused it.  It's caused by what you
>didn't do.

These are normal programming errors that can occur in any program, no
matter how well structured.  I don't see how the specific structure of
Animals.py encourages these errors.
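
To make the scenario concrete, here is a small sketch of the error
being described, in Python 3 syntax.  The class layout and counter
names are my guess at a simplified hierarchy, not a copy of
Animals.py.

```python
class Animal:
    numAnimals = 0

    def __init__(self):
        Animal.numAnimals += 1


class Mammal(Animal):
    numMammals = 0

    def __init__(self):
        Animal.__init__(self)
        Mammal.numMammals += 1


class Cat(Mammal):
    numCats = 0

    def __init__(self):
        Mammal.__init__(self)      # keeps the ancestor counts in step
        Cat.numCats += 1


class Mouse(Mammal):
    numMice = 0

    def __init__(self):
        # The mistake under discussion: Mammal.__init__(self) is
        # never called, so the ancestor counts miss this instance.
        Mouse.numMice += 1


Cat()
Mouse()
print(Animal.numAnimals, Mammal.numMammals, Cat.numCats, Mouse.numMice)
```

After one Cat and one Mouse, numMammals and numAnimals each count only
the Cat, because Mouse never reported itself up the chain.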

>> I'm not sure what you mean by "side effects" here.  The show()
>> function at each level is completely independent of the show()
>> function at another level.
>
>But the inventory data isn't independent.  It's affected by classes
>somewhere else in the hierarchy.  Worse, it's done implicitly.

The "inventory data" actually consists of independent pieces of data
from each class.  (numCats is a piece of inventory data from the Cat
class.)  I'm sorry, I just can't follow this.

>> Chaining them together results in a
>> sequence of calls, and a sequence of outputs that is exactly what we
>> want.  The nice thing about separating the total "show" functionality
>> into parts specific to each class is that when we add a class in the
>> middle, as I did with Feline, inserted between Mammal and Cat, it is
>> real easy to change the Cat class to accommodate the insertion.
>> 
>> Python has a 'super' function to facilitate this kind of chaining.
>> Michele Simionato's 'prototype.py' module makes 'super' even easier to
>> use. Instead of having Cat.show() call Mammal.show() I can now just
>> say super.show() and it will automatically call the show() function
>> from whatever class is the current parent.  Then when I add a Feline
>> class between Mammal and Cat, I don't even need to change the
>> internals of Cat.
>
>That's fine - providing you're not using a class hierarchy to store
>data.  It's not the act of calling a method in a super-class which is
>a bad idea, it's the way you are making *the numbers outputted* from
>cat dependent on actions taken *or not taken* in another class
>*completely outside cat's scope*.

Seems like this is the way it has to be if you want to increment the
counts for Cat and all its ancestors whenever you create a new
instance of Cat.  Again, I'm not understanding the problem you are
seeing.  You seem to be saying there should be only methods, not data,
stored in each class.

>> >> What I'm looking for is not clever re-structuring, but just a
>> >> straightforward translation, and some comments along the way -- oh
>> >> yes, that is a little awkward having to use a staticmethod here.  Wow,
>> >> you mean staticmethods aren't fundamentally necessary, just a bandaid
>> >> to make up for Python's deficiencies?  That was my reaction when I
>> >> first saw Prothon.
>> >
>> >Static methods are more like a band-aid to make up for the
>> >deficiencies of OOP.  Python isn't a pure OO language, and doesn't
>> >suffer the need for them badly.
>> 
>> In one syntax we need special "static methods" to handle calls where a
>> specific instance is not available, or not appropriate.  In another
>> syntax we can do the same thing with one universal function form.

To try and get to the bottom of this, I re-wrote the Animals.py
example, following what I think are your recommendations on moving the
static methods to module-level functions.  I did not move the data out
of the classes, because that makes no sense to me at all.

Take a look at http://ece.arizona.edu/~edatools/Python/Exercises/ and
let me know if Animals_2b.py is what you had in mind.  If not, can you
edit it to show me what you mean?

-- Dave