Attack a sacred Python Cow

Mon Jul 28 18:50:52 EDT 2008

Derek Martin wrote:
> On Sun, Jul 27, 2008 at 09:39:26PM +0200, Bruno Desthuilliers wrote:
>>> As for the latter part of #3, self (or some other variable) is
>>> required in the parameter list of object methods,
>> It's actually the parameter list of the *function* that is used as the 
>> implementation of a method. Not quite the same thing. 
> 
> The idea that Python behaves this way is new to me.  For example, the
> tutorials make no mention of it:
> 
>   http://docs.python.org/tut/node11.html#SECTION0011300000000000000000
> 
> The Python reference manual has very little to say about classes,
> indeed.  If it's discussed there, it's buried somewhere I could not
> easily find it.

Yep, most of the docs didn't really keep up with Python's evolution,
alas. But it's still documented - I've learned all this from the docs.

You'll find more details starting here:

http://www.python.org/doc/newstyle/

and a few more things in the language reference part of the docs:

http://docs.python.org/ref/descriptors.html
http://docs.python.org/ref/descriptor-invocation.html

>> consistency mandates that the target object of the method is part of
>> the parameter list of the *function*, since that's how you make
>> objects availables to a function.
> 
> Fair enough, but I submit that this distinction is abstruse,

The distinction between class interface (the method call) and class 
implementation (the function called by the method) ?

> and
> poorly documented,

This is certainly true. Patches to the doc are welcome.

> and also generally not something the average
> application developer should want to or have to care about... 

I don't know what an "average application developer" is, but as an
application developer myself, I feel I have to care about the
implementation of my programs, just like I feel I have to care about
knowing enough about the languages I use to use them properly.

> it's of
> interest primarily to computer scientists and language enthusiasts.
> The language should prefer to hide such details from the people using
> it.

There I beg to disagree. Transparently exposing most of its object
model is a design choice, and is for a great part responsible for
Python's expressive power and flexibility. And all this is really part
of the language - I mean, it's a public API, not an implementation detail.
FWIW, I'm certainly not what you'd call a "computer scientist" (I left 
school at 16 and have absolutely no formal education in CS).

Anyway: "the language" (IOW: the language's designer) made a different 
choice, and I'm very grateful he did.

>>> however when the method is *called*, it is omitted.
>> Certainly not. 
> 
> Seems not so certain to me...  We disagree, even after your careful
> explanation.

You are of course (and fortunately) entitled to disagree !-)

>  See below.
> 
>> You need to lookup the corresponding attribute *on a given object*
>> to get the method. Whether you write
>>
>>   some_object.some_method()
>>
>> or
>>
>>   some_function(some_object)
>>
>> you still need to explicitely mention some_object.
> 
> But these two constructs are conceptually DIFFERENT,

Why so ?

> whether or not
> their implementation is the same or similar.  The first says that
> some_method is defined within the name space of some_object.

The first says that you're sending the message "some_method" to
some_object. Or, to express it in Python terminology, that you're
looking up the name "some_method" on some_object, and trying to call the
object returned by the attribute lookup mechanism, whatever that object
is (function, method, class or any other callable).

Now saying that it implies that "some_method is defined within the name 
space of some_object" depends on the definitions of 'defined', 'within' 
and 'namespace' (more on this below).
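To make the "lookup, then call" point concrete, here's a minimal sketch (written for Python 3; Thing, Greeter and the attribute names are made up for the example). The same obj.name() syntax works whatever kind of callable the lookup returns:

```python
class Greeter(object):
    """Hypothetical callable object, used only for illustration."""
    def __call__(self):
        return "called a Greeter instance"


def plain_function():
    return "called a plain function"


class Thing(object):
    def a_method(self):
        return "called a method"


some_object = Thing()
# Attributes stored on the instance itself: a function, a class and a
# callable instance. None of them is a "method" in the usual sense.
some_object.a_function = plain_function
some_object.a_class = dict
some_object.a_callable = Greeter()

results = [
    some_object.a_method(),    # lookup yields a bound method
    some_object.a_function(),  # lookup yields a plain function (no self passed)
    some_object.a_class(),     # lookup yields a class; calling it instantiates it
    some_object.a_callable(),  # lookup yields an object defining __call__
]
```

In all four cases the caller only knows that some_object answers to a given name; what sits behind that name is up to the object.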

>  The
> second says that some_object is a parameter of some_function...    

Yes. It also says that some_function knows enough about some_object to
accept it as a parameter (or at least that the developer who passed
some_object to some_function thought / expected it would be the case).

You know, the dotted notation ("obj.attrib") is, well, just a notation.
It's by no means necessary to OO. You could have a perfectly valid object
system where the base notation is "some_message(some_object)" instead of
being "some_object.some_message" - and FWIW, in Common Lisp - which BTW
has one of the richest object systems around -, the syntax for a method
call is the same as the syntax for a function call, IOW
"(func_or_method_name object arg1 arg2 argN)".

> Namespace != parameter!!!!!!!!!

Function parameters are part of the namespace of the function body.
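A tiny sketch of that, assuming Python 3 (the function and parameter names are made up): locals() gives the function's local namespace, and the parameters are already bound in it when the body starts running.

```python
def some_function(some_object, extra=None):
    # locals() returns the local namespace of this function body;
    # at this point it contains nothing but the parameters.
    return sorted(locals())


names = some_function(42)
```

Here names is the list of the two parameter names, sorted alphabetically.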

Please don't get me wrong : I'm not saying your point is moot, just 
suggesting another possible way to look at the whole thing.

> To many people previously familiar with OO programming in other
> languages (not just Java or C++), but not intimately familiar with
> Python's implementation details,

It's actually not an implementation detail - it's part of the language spec.

> the first also implies that
> some_method is inherently part of some_object,

There again, I disagree. To me, it implies that some_object understands 
the 'some_method' message. Which is not the same thing. Ok, here's a 
possible implementation:

# foo.py

def foo(obj):
     return obj.__class__.__name__

# bar.py
from foo import foo

class Meta(type):
     def __new__(meta, name, bases, attribs):
         cls = type.__new__(meta, name, bases, attribs)
         old_getattr = getattr(cls, '__getattr__', None)

         def _getattr(self, attrname):
             if attrname == 'some_method':
                 return lambda self=self: foo(self)
             elif callable(old_getattr):
                 return old_getattr(self, attrname)
             else:
                 raise AttributeError("blah blah")

         cls.__getattr__ = _getattr
         return cls

# baaz.py
import bar

class Quux(object):
     __metaclass__ = bar.Meta

class Baaz(object):
     def __init__(self):
         self._nix = Quux()
     def __getattr__(self, name):
         return getattr(self._nix, name)

# main.py
import baaz
some_object = baaz.Baaz()

Is 'some_method' "inherently part of" some_object here ? There isn't 
even an object named 'some_method' anywhere in the above code...

(and no, don't tell me, I know: it's a very convoluted way to do a 
simple thing - but that's not that far from things you could find in 
real-life library code for not-so-simple things).
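For anyone who wants to try it, here's the same code collapsed into a single module, as a sketch for Python 3. The only real change is the metaclass=Meta keyword, which is the Python 3 spelling of the __metaclass__ attribute; the result and missing names at the end are just there for checking.

```python
def foo(obj):
    return obj.__class__.__name__


class Meta(type):
    def __new__(meta, name, bases, attribs):
        cls = type.__new__(meta, name, bases, attribs)
        old_getattr = getattr(cls, '__getattr__', None)

        def _getattr(self, attrname):
            # Only reached when normal attribute lookup has failed.
            if attrname == 'some_method':
                return lambda self=self: foo(self)
            elif callable(old_getattr):
                return old_getattr(self, attrname)
            else:
                raise AttributeError(attrname)

        cls.__getattr__ = _getattr
        return cls


class Quux(object, metaclass=Meta):
    pass


class Baaz(object):
    def __init__(self):
        self._nix = Quux()

    def __getattr__(self, name):
        # Delegate anything Baaz doesn't know about to the Quux instance.
        return getattr(self._nix, name)


some_object = Baaz()
result = some_object.some_method()       # the message is understood...
missing = ('some_method' not in vars(some_object)
           and 'some_method' not in Baaz.__dict__
           and 'some_method' not in Quux.__dict__)  # ...yet stored nowhere
```

The call ends up in foo(), which receives the Quux instance that Baaz delegates to, so result is the string 'Quux'.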

> in which case
> explicitly providing a parameter to pass in the object naturally seems
> kind of crazy.  The method can and should have implicit knowledge of
> what object it has been made a part.

The method does. Not the function.

Here's a possible (and incomplete) Python implementation of the method type:

class Method(object):
     def __init__(self, func, instance, cls):
         self.im_func = func
         self.im_self = instance
         self.im_class = cls
     def __call__(self, *args, **kw):
         # test against None: a bound instance may well be falsy
         if self.im_self is not None:
             args = (self.im_self, ) + args
             return self.im_func(*args, **kw)
         # unbound: the caller must pass the instance explicitly
         elif args and isinstance(args[0], self.im_class):
             return self.im_func(*args, **kw)
         else:
             raise TypeError("blah blah")
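To see where such a Method object would come from, here's a rough sketch for Python 3. The Function class is a hypothetical stand-in for the builtin function type (the real one is written in C and does more than this); its __get__ is what the attribute lookup calls when it finds the object in a class __dict__.

```python
class Method(object):
    def __init__(self, func, instance, cls):
        self.im_func = func
        self.im_self = instance
        self.im_class = cls

    def __call__(self, *args, **kw):
        if self.im_self is not None:
            # Bound: inject the instance as first argument.
            return self.im_func(self.im_self, *args, **kw)
        if args and isinstance(args[0], self.im_class):
            # Unbound: the caller passed the instance explicitly.
            return self.im_func(*args, **kw)
        raise TypeError("first argument must be an instance of %s"
                        % self.im_class.__name__)


class Function(object):
    """Hypothetical stand-in for the builtin function type."""
    def __init__(self, func):
        self.func = func

    def __call__(self, *args, **kw):
        return self.func(*args, **kw)

    def __get__(self, instance, cls):
        # instance is None when the lookup happens on the class itself.
        return Method(self.func, instance, cls)


class Foo(object):
    def __init__(self, name):
        self.name = name

    @Function
    def bar(self, greeting):
        return "%s, %s" % (greeting, self.name)


foo = Foo("foo")
bound = foo.bar("hello")         # lookup on the instance
unbound = Foo.bar(foo, "hello")  # lookup on the class
```

Both calls end up running the very same function with the very same arguments; the only difference is who put foo in the argument list.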

> Part of the point of using
> objects is that they do have special knowledge of themselves... 

s/do/seem to/

> they
> (generally) manipulate data that's part of the object.  Conceptually,
> the idea that an object's methods can be defined outside of the scope
> of the object,

s/object/class/

> and need to be told what object they are part
> of/operating on is somewhat nonsensical...

That's still how other OOPLs work, you know. But they hide the whole
damn thing and close the box, while Python exposes it all. And I can
tell you from experience that it's a sound idea - this gives you full
control over your objects' behaviour.

wrt/ functions being defined outside classes then used as part of the
implementation of a class, I fail to see where the problem is - but I
surely see how it can help avoid quite a lot of boilerplate when
wisely used.
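A minimal sketch of the kind of boilerplate I mean, for Python 3 (describe, Person and City are names invented for the example): one plain function serves as the implementation of __repr__ for two unrelated classes, without any common base class.

```python
def describe(self):
    # A plain module-level function; it only assumes that whatever it
    # receives has a `name` attribute.
    return "<%s %s>" % (type(self).__name__, self.name)


class Person(object):
    def __init__(self, name):
        self.name = name

    __repr__ = describe        # bound inside the class statement


class City(object):
    def __init__(self, name):
        self.name = name


City.__repr__ = describe       # bound after the class statement

person_repr = repr(Person("Alice"))
city_repr = repr(City("Paris"))
```

Whether the name is bound inside or after the class statement, the function is used the same way.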

>>> Thus when an object method is called, it must be called with one fewer
>>> arguments than those which are defined.   This can be confusing,
>>> especially to new programmers.
>> This is confusing as long as you insist on saying that what you 
>> "def"ined is a method - which is not the case.
> 
> I can see now the distinction, but please pardon my prior ignorance,
> since the documentation says it IS the case, as I pointed out earlier.

Yep. Part of the problem is that OO terminology doesn't have any clear,
unambiguous definition - so terms like 'method' can be used with
somewhat different meanings. Most of Python's docs use the term 'method'
for functions defined within class statements - and FWIW, that's usually
what I do too.

> Furthermore, as you described, defining the function within the scope
> of a class binds a name to the function and then makes it a method of
> the class.   Once that happens, *the function has become a method*.

The function doesn't "become" a method - its __get__ method returns a
method object, which itself wraps the instance and the function (cf above
snippet). What gets stored in the class __dict__ is really the function:

 >>> class Foo(object):
...     def bar(self):
...         print "bar(%s)" % self
...
 >>> Foo.__dict__['bar']
<function bar at 0xb7ccaf7c>
 >>>

Whether you bind the name within or outside of the class statement 
doesn't change anything.
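The same check can be scripted. This is a sketch for Python 3, where the method attributes are spelled __func__ and __self__ instead of im_func and im_self; baz is a function I made up and bound outside the class statement for comparison.

```python
import types


class Foo(object):
    def bar(self):
        return "bar(%s)" % type(self).__name__


def baz(self):
    return "baz(%s)" % type(self).__name__


Foo.baz = baz                      # bound outside the class statement

foo = Foo()
raw_bar = Foo.__dict__['bar']
raw_baz = Foo.__dict__['baz']

# What's stored in the class __dict__ is a plain function in both cases.
both_plain = (isinstance(raw_bar, types.FunctionType)
              and isinstance(raw_baz, types.FunctionType)
              and raw_baz is baz)

# Calling __get__ by hand does what the attribute lookup does for you.
method = raw_bar.__get__(foo, Foo)
wraps_function = method.__func__ is raw_bar
wraps_instance = method.__self__ is foo
outputs = (method(), foo.bar(), foo.baz())
```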

> To be perfectly honest, the idea that an object method can be defined
> outside the scope of an object

I assume you meant "outside the class statement's body" ?

> (i.e. where the code has no reason to
> have any knowledge of the object)

Any code "using" an object needs to have at least some knowledge of
this object, you know. Or do you mean that one should not send messages
to any other object than self ? This seems like a pretty severe
restriction to me - in fact, I fail to see how one could write any code
that way !-)

> seems kind of gross to me...   another
> Python wart.  

Nope. A *great* strength.

> One which could occasionally be useful I suppose,

More than "occasionally". Lots of frameworks use that (usually in
conjunction with metaclasses) to inject attributes (usually functions)
into your objects. Have a look at how most Python ORMs work.
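Here's a very rough sketch of the idea, for Python 3. Field, ModelMeta, Model and as_dict are all hypothetical names, not taken from any real ORM, and a real framework does far more (field inheritance, validation, queries...).

```python
class Field(object):
    """Hypothetical column declaration, for illustration only."""
    def __init__(self, default=None):
        self.default = default


def as_dict(self):
    # Plain function defined outside any class; it ends up usable as a
    # "method" of every class created by ModelMeta below.
    return dict((name, getattr(self, name)) for name in self._fields)


class ModelMeta(type):
    def __new__(meta, name, bases, attribs):
        # Collect the declared fields, then replace each declaration
        # with its default value.
        fields = sorted(k for k, v in attribs.items() if isinstance(v, Field))
        for fname in fields:
            attribs[fname] = attribs[fname].default
        cls = type.__new__(meta, name, bases, attribs)
        cls._fields = fields
        cls.as_dict = as_dict      # inject the function into the class
        return cls


class Model(object, metaclass=ModelMeta):
    pass


class User(Model):
    login = Field(default="anonymous")
    age = Field(default=0)


user = User()
user.login = "bob"
row = user.as_dict()
```

User never defines as_dict nor _fields itself; both are injected when the class statement is executed.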

> but a
> wart nonetheless. 

Your opinion. But then you probably won't like Python. May I suggest Ruby
instead - it has a much more canonical object model ?-)

Err, no, wait - while dynamically adding attributes / methods to objects
/ classes is possible but not that common in Python (outside frameworks
and ORMs at least), it's close to a national sport in Ruby. Nope, you
won't like Ruby either...

>  This seems inherently not object-oriented at all,

Because things happen outside a class statement ? Remember, it's
*object* oriented, not class oriented. Classes are not part of the base
definitions of OO, and just aren't necessary to OO (have a look at Self,
Io, or Javascript).

As far as I'm concerned, "object oriented" is defined by

1/ an object has an identity, a state and a behaviour
2/ objects communicate by sending messages to each other

And that's all for the OO theory - everything else is (more or less) 
language-specific. As you can see, there's no mention of "class" here, 
and not even of "method". All you have is identity, state, behaviour and 
messages - IOW, high level concepts that can be (and are indeed)
implemented in many different ways.

> for reasons I've already stated.  It also strikes me as a feature
> designed to encourage bad programming practices.

For which definition of "bad" ?

Your views on what OO is are IMHO very restricted - I'd say, restricted 
to what the C++/Java/UML guys defined as "being OO".

Anyway: you'd be surprised by the self (no pun) discipline of most
Python programmers. Python lets you do all kinds of crazy things, but
almost no one seems to go overboard.

FWIW, if you find the idea of a "method" defined outside the class
statement shocking, what about rebinding the class of an object at
runtime ? You may not know it, but the class of an object is just
another attribute, and nothing prevents you from rebinding it to
another (layout-compatible) class whenever you want !-)
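A minimal sketch, for Python 3 (the two classes are made up for the example). Note that this only works between classes with compatible instance layouts, which is the case for plain classes like these:

```python
class Caterpillar(object):
    def move(self):
        return "crawls"


class Butterfly(object):
    def move(self):
        return "flies"


critter = Caterpillar()
before = critter.move()

# Rebind the class of an existing instance; its state (__dict__) is
# untouched, only the behaviour changes.
critter.__class__ = Butterfly
after = critter.move()
```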

> Even discounting that, if Python had a keyword which referenced the
> object of which a given peice of code was a part, e.g. self, then a
> function written to be an object method could use this keyword *even
> if it is defined outside of the scope of a class*.  The self keyword,
> once the function was bound to an object, would automatically refer to
> the correct object.   If the function were called outside of the
> context of an object, then referencing self would result in an
> exception.

This could probably be implemented, but it would introduce additional 
complexity. As I already said somewhere in this thread, as far as I'm 
concerned, as long as it doesn't break any existing code and doesn't
impose restrictions on what is actually possible, I wouldn't care that 
much - but I think it would be mostly a waste of time (IMHO etc).

> You'll probably argue that this takes away your ability to define a
> function and subsequently use it both as a stand-alone function and
> also as a method.

I could. FWIW, I've almost never had a need for such a construction yet, 
and I don't remember having seen such a thing anywhere.

But anyway, to avoid breaking code, the modification would still have to 
take into account functions using an explicit self (or cls) in the
function's signature. I'm afraid this would end up making a mess of 
something that was originally simple.

>  I'm OK with that -- while it might occasionally
> be useful, I think if you feel the need to do this, it probably means
> your program design is wrong/bad.  More than likely what you really
> needed was to define a class that had the function as a method, and
> another class (or several) that inherits from the first.

Not designing things the CanonicalUMLJavaMainstreamWay(tm) doesn't mean 
the design is wrong. Also, there are some problems that just can't be 
solved that way - or are overly (and uselessly) tedious to solve that way.

Talking about design, you may not have noticed yet, but quite a lot of
the OO design patterns are mostly workarounds for the lack of flexibility
in Java and C++ (hint: how would you implement the decorator pattern in
Python ?). And while we're at it, the GoF (IMHO one of the best books on
OO design) loudly insists on composition/delegation often being a far
better design than inheritance (and FWIW delegation is what Python does,
with method calls being delegated to functions).

> 
>> The point is that you don't get access to the object "within itself". 
>> You get access to an object *within a function*.
> 
> Thus methods are not really methods at all,

Please show me where you get access to the object "within itself" in any
other OO language. Methods (in the usual sense of the term) are
*not* "within" the instances. And they access an instance through a
reference to it, a reference that gets injected into the code one way or
another. Most languages make this implicit, Python makes it explicit.
So what ?

>  which would seem to
> suggest that Python's OO model is inherently broken (albeit by design,
> and perhaps occasionally to good effect).

Here again, you are being overly dogmatic IMHO. Being explicit, or just 
being different from mainstream, is not the same as being "broken".

>> The fact that a function is defined within a class statement doesn't 
>> imply any "magic", 
> 
> It does indeed -- it does more than imply.  It states outright that
> the function is defined within the namespace of that object,

s/object/class/

> and as
> such that it is inherently part of that object.

s/object/class/

>  So why should it need
> to be explicitly told about the object of which it is already a part?

Because it's the simplest thing to do ?-)

More seriously, methods are usually (in Python - always in most OOPLs)
part of a class, not of its instances - IOW, the same code is shared by
all instances of the same class. And the language implementation needs to
make the instance accessible to the method code one way or another.

From this POV, Python doesn't behave differently - except that it
chose to expose the fact and make it part of the API.
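A small sketch of what "shared by all instances" means in practice, for Python 3 (Counter is a made-up class, and __func__ is the Python 3 name of what was im_func):

```python
class Counter(object):
    def __init__(self, start):
        self.value = start

    def bump(self):
        self.value += 1
        return self.value


a, b = Counter(0), Counter(10)

# One function object, stored once on the class, serves every instance.
shared = a.bump.__func__ is b.bump.__func__ is Counter.__dict__['bump']
not_in_instances = 'bump' not in vars(a) and 'bump' not in vars(b)

# The instance reaches the code as an argument, whether the language
# injects it for you or you pass it yourself.
implicit = a.bump()         # instance injected by the method object
explicit = Counter.bump(b)  # instance passed by hand
```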

> It further does indeed imply, to hordes of programmers experienced
> with OO programming in other languages, that as a member, property,
> attribute, or what ever you care to call it, of the object, it should
> have special knowledge about the object of which it is a part.

class Foo(object):
     some_dict = dict()

     def __init__(self, some_int, some_list, some_string):
         self.int = some_int
         self.list = some_list
         self.string = some_string

foo = Foo(42, range(3), "yadda")

Where do you see that 42, range(3) and "yadda" have any knowledge of foo?

Translate this to any other OOPLs and tell me if the answer is different.

>> IOW : there's one arguably good reason to drop the target object from 
>> functions used as methods implementation, which is to make Python looks 
>> more like Java
> 
> No, that's not the reason.  I don't especially like Java, nor do I use
> it. 

Sorry, I usually use 'Java' as a generic branding for the whole static
struct-based class-oriented mindset - UML / C++ / Java / C# etc - as
opposed to dynamic OOPLs.

Anyway, in this particular case, it was not even what I meant, so please 
accept my apologies and s/Java/canonical/ in the above remark.

> The reason is to make the object model behave more intuitively.

I understand that having to specify the target object can seem
disturbing, at least at first. Now once you know why, I personally find
it more "intuitive" to not have different constructs for functions,
methods, and functions-to-be-used-as-methods. I write functions, period.
IOW:

>> , and there's at least two good reason to keep it the way it is,
>> which are simplicity (no special case) and consistency (no special
>> case).
> 
> Clearly a lot of people find that it is less simple TO USE.

I would say it requires a bit more typing. Does that make it less simple
to use ? I'm not sure. Am I biased here ? After many years of Python
programming, I don't even notice typing 'self' in the argument list any
more, so very probably: yes.

>  The point
> of computers is to make hard things easier... if there is a task that
> is annoying, or tedious, or repetitive, it should be done by code, not
> humans.

Which BTW is why I really enjoy having the possibility to modify a class 
at runtime - believe me, it can save quite a lot of boilerplate...

>  This is something that Python should do automatically for its
> users.

This point is debatable, indeed. Still, the only serious reason I see 
here is to make Python look more like most mainstream OOPLs, and while 
it may be a good idea - I'm not making any judgement on this - I can 
happily live with the current state of things as far as I'm concerned. 
Anyway, the decision doesn't belong to me.