Could Python supplant Java?

Fri Aug 23 07:12:48 EDT 2002

Paul Foley wrote:

> On Wed, 21 Aug 2002 06:22:24 -0700, James J Besemer wrote:
>
> > Some more zealous sources claim that "polymorphism" and other OOP
> > techniques are impossible with early binding and thus require late
> > binding to make it possible.  This is bullshit.
>
> Assuming the "this" that you say is bullshit is not self-referential
> (like "this statement is false"), please explain how it can be done.

C++ implements polymorphism in two ways.

First off, since all variables have a known type, most member references can
be resolved at compile time from the declared type of the object and the
member name.  In the case of name clashes, the ambiguity is resolved by using
the most recent declaration, i.e. the one in the most derived class.

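     class A {
     public:                 // class members are private by default
         int x;
         int bar(){...}
     };

     class B : public A {    // inheritance is private by default, too
     public:
         int y;
         int bar(){...}
     };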

     void foo( A a ){
         a.x;
         a.bar();    // calls A.bar()
     }

     void foo( B b ){
         b.x;
         b.y;
         b.bar();    // calls B.bar()
     }

In the above, references to a.x and b.y are unambiguous and a.y would be an
error.  b.x also is legal since B inherits from A.  Similarly, functions can
be distinguished if they have different argument lists.  a.bar() is
unambiguous; b.bar() has two candidates and is resolved in favor of the most
recent declaration, B's.  There's a scope resolution operator to override
the default (e.g. b.A::bar()).  In the case of multiple inheritance, a name
that is ambiguous between base classes is a compile-time error unless you
qualify it; unlike Python, C++ does not linearize the bases to pick one.
This is all completely static.
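
Here is a minimal compilable sketch of the static case.  The return values,
the 'struct' keyword (used so that everything is public) and the helper names
foo_base and demo_static are my own assumptions for illustration:

```cpp
#include <cassert>

// Hypothetical classes mirroring the A/B example; 'struct' makes members
// and inheritance public so the snippet compiles as written.
struct A {
    int x = 1;
    int bar() { return 10; }
};

struct B : A {
    int y = 2;
    int bar() { return 20; }   // hides A::bar for objects of static type B
};

// Overloads are chosen at compile time from the declared argument type.
int foo(A a) { return a.bar(); }           // always A::bar
int foo(B b) { return b.bar(); }           // always B::bar
int foo_base(B b) { return b.A::bar(); }   // scope resolution picks A::bar

void demo_static() {
    A a;
    B b;
    assert(foo(a) == 10);
    assert(foo(b) == 20);
    assert(foo_base(b) == 10);
    assert(b.x == 1 && b.y == 2);          // b.x is inherited from A
}
```

Every call above is bound by the compiler; nothing is looked up at run time.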

To handle the more dynamic case C++ uses 'virtual' functions.  This allows a
function to call functions in subclasses it doesn't know exist.
Declarations are like before except for the addition of a 'virtual' keyword:

     class A {
     public:
         int x;
         virtual int bar(){...}
     };

     class B : public A {
     public:
         int y;
         virtual int bar(){...}
     };

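Now we can define a new function that accepts arguments of either type A or
type B but calls each instance's version of bar().  The argument has to be
passed by reference (or pointer); passed by value, a B would be copied into a
plain A and A.bar() would always be called:

    int foo( A &arg ){
        return arg.bar();    // calls A.bar() or B.bar() depending on the
                             // runtime type of arg
    }

    A a;
    B b;

    foo( a );    // will call A.bar()
    foo( b );    // will call B.bar()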

This works even if the class declarations for A and B exist in different
modules.  That is, the definition of foo above can be precompiled before the
code for B is even written.  And yet at runtime a call with the new object
type will automatically be routed to the proper virtual function.
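
As a rough single-file sketch of that point: foo is fully defined before the
subclass appears.  The class C, the return values and the helper
foo_by_value are hypothetical names I made up for illustration:

```cpp
#include <cassert>

struct A {
    virtual ~A() {}
    virtual int bar() { return 10; }
};

// foo is complete here, before any subclass of A has been written.
int foo(A &arg) { return arg.bar(); }

// For contrast: by-value passing copies only the A part of the argument.
int foo_by_value(A arg) { return arg.bar(); }

// Written "later": foo needs no change to reach C::bar.
struct C : A {
    int bar() override { return 30; }
};

void demo_late_subclass() {
    A a;
    C c;
    assert(foo(a) == 10);
    assert(foo(c) == 30);            // dispatched to C::bar at run time
    assert(foo_by_value(c) == 10);   // sliced to an A, so A::bar runs
}
```

In a real project foo would live in its own translation unit and only C's
file would need compiling when C is added.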

[Implementation details are an exercise for the reader.]  However, I will
point out that there is no Python- or Smalltalk-like dictionary lookup at
runtime.  Virtual function calls are *almost* as fast as regular ones
(there's an extra level of indirection, typically through a per-class table
of function pointers).
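
For the curious, the extra level of indirection can be imitated by hand.
This is only a sketch of the usual technique, not what any particular
compiler emits (the language standard does not mandate it), and every name
in it (Obj, VTable, call_bar, ...) is hypothetical:

```cpp
#include <cassert>

struct Obj;

// One table per class, with one slot per virtual function.
struct VTable {
    int (*bar)(Obj *self);
};

// Every object carries a pointer to the table of its class.
struct Obj {
    const VTable *vptr;
};

int A_bar(Obj *) { return 10; }
int B_bar(Obj *) { return 20; }

const VTable A_vtable = { A_bar };
const VTable B_vtable = { B_bar };

// The call site loads the table pointer, then a fixed slot.
// No function name is looked up at run time.
int call_bar(Obj *o) { return o->vptr->bar(o); }

void demo_vtable() {
    Obj a = { &A_vtable };
    Obj b = { &B_vtable };
    assert(call_bar(&a) == 10);
    assert(call_bar(&b) == 20);
}
```

The slot position is fixed at compile time, which is why the types have to
be declared in advance.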

Virtual functions are explained in more detail here:

    http://www.glenmccl.com/virt_cmp.htm

So you can have 'true' Polymorphism without late binding.  The only 'trick'
is that the types of all objects have to be pre-declared (and be used
consistently).

In all fairness there are minor limitations in C++ and Java:

1. there are no virtual variables.  Only functions are virtual.  Good OOP
practice discourages exposing variables, so this is not a terrible hardship.

2. you can't access arbitrary pieces of an object.  E.g., in Python you can
write a function that "works" with, say, "any object that implements
hash()".  In C++ and Java you either declare your intent to access an entire
object or interface or else not at all.

#2 doesn't mean you can't access arbitrary subsets, only that you have to
specify explicitly, in advance, exactly which subset you'll provide.  In Java
(and sometimes in C++) these subsets are called "interfaces".  So in the
hash() example, some wise programmer would have recognized early on that
hash() and possibly some other functions form an interesting set of
functions and defined a HASH interface.  In C++, classes would then inherit
from the "hash interface".  Java has a separate "implements" declaration for
interfaces.  Then in your code, you implement hash() and any other functions
that make up the interface.
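
In C++ such a "hash interface" might look like the sketch below.  The names
Hashable, Name, Point and bucket_for, and the toy hash functions, are
assumptions of mine for illustration only:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical "hash interface": an abstract class with a pure virtual.
struct Hashable {
    virtual ~Hashable() {}
    virtual std::size_t hash() const = 0;
};

struct Name : Hashable {
    std::string text;
    explicit Name(const std::string &t) : text(t) {}
    std::size_t hash() const override { return text.size(); }   // toy hash
};

struct Point : Hashable {
    int x, y;
    Point(int x_, int y_) : x(x_), y(y_) {}
    std::size_t hash() const override {
        return static_cast<std::size_t>(x * 31 + y);             // toy hash
    }
};

// Works with any object whose class declared that it implements Hashable.
std::size_t bucket_for(const Hashable &h, std::size_t buckets) {
    return h.hash() % buckets;
}

void demo_interface() {
    Name n("abc");
    Point p(1, 2);
    assert(bucket_for(n, 2) == 1);   // 3 % 2
    assert(bucket_for(p, 8) == 1);   // 33 % 8
}
```

bucket_for touches only the declared subset; a class that happens to have a
hash() member but does not inherit from Hashable is rejected at compile time.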

Anyway, Polymorphism can be done in either language.  The trade-off is having
to declare your intentions in advance vs. the runtime overhead of having a
dictionary lookup for each function call.

If I overlooked anything in your lengthy question, please feel free to ask
and I'll be happy to elaborate further.

> > Thus, true Pythonistas will argue that late binding (by any other
> > name) is superior.
>
> Early binding is clearly superior -- but too-early binding is not, and
> compile-time is often too early.  [Hence the proliferation of "times"
> in Lisp: you can have things happen at at least five distinct times:
> read time, macroexpansion time, compile time, load time and run time;
> and I suppose you could prefix design time and write time to those]
>
> During development, compile-time is almost guaranteed to be too early.

I think it depends.  If you're embarked on some exploratory XP voyage of
discovery then this probably is true.  In other circumstances it may not be
the case.

Regards

--jb

--
James J. Besemer  503-280-0838 voice
http://cascade-sys.com  503-280-0375 fax
mailto:jb at cascade-sys.com