Could Python supplant Java?

Fri Aug 23 13:13:32 EDT 2002

Duncan Booth <duncan at NOSPAMrcp.co.uk> wrote in message news:<Xns9272B21CFD328duncanrcpcouk at 127.0.0.1>...
> joeking at merseymail.com (FISH) wrote in 
> news:dbc5020.0208220730.39536ae0 at posting.google.com:
> 
> > But why remove one layer of the testing - that is to say a 
> > compiler's ability to check you are putting the right data
> > in the right places?
> 
> There is a cost to static typing as well as a benefit. I hope we can agree 
> on that.
> 
> Part of the cost is that for static typing to work (at least in languages I 
> know), you have to tell the compiler the types of values that you expect to 
> use when declaring a function. This means that you have the cost of adding 
> in those extra type declarations, and when your design changes you have the 
> cost of changing the declarations everywhere they occurred.

Of course.  But firstly, you shouldn't need to change the type of
data you are storing in a given variable all over the place - not on
a regular basis anyway.  And if you are doing so on a regular basis,
then it does rather suggest you haven't thought about the task at 
hand.  "It's an int, erm no a string, erm no it's a float, no wrong
again - it's a small off-duty Czechoslovakian traffic warden!"  ;-)

And secondly, a *good* programmer will document the types of data 
being put into a variable anyway - rather than leaving it up to their
successor to trawl through the code to find out.  If you don't
use static typing, then you *should* put a comment in somewhere to
at least give a clue to the poor chump who will inherit your code
what type of data the variable is intended to hold.

Now, using a language like Java, when I see a variable in the code
which is unfamiliar, I know there is only a limited number of 
places that I need to look to find out what type of data is stored
in it.  In a dynamically typed language I have to track backwards
through the flow of execution to arrive at the last assignment.

If you don't like the idea of static type declarations as an extra
layer of testing, then just think of them as mandatory in-line
documentation.

And again I must repeat my caveat that, for languages aimed mainly
at producing small snippets of programmability, static typing 
is a tad overkill.  (Such languages are often lumped together under
the vague term 'scripting languages'.)

> Another part of the cost comes from the time spent working out exactly 
> which type you want to pass to a function. Do you want to pass a File 
> object, or a ReadableFile object, or a 
> ReadableAndSeekableAndTellableAndCloseableButNotWriteableFile object? Do 
> you already have an interface that describes exactly what you want to pass 
> in this circumstance. If not, do you define a new interface, and modify 
> lots of classes to say they implement it, or do you just use an existing 
> but wider interface, and make sure you classes all implement the wider 
> interface even when most of the methods are stubbed out. If you need to 
> define a new interface do you need team meetings to agree it, new design 
> documents signed of etc. All of these things take time.

Right - but the trade off here is that once you've finished you 
should have a solid solution which is more readable.  For example,
I *could* write a method like this:

void wordWrap(Vector result,String breakchars,String source)

or I could write it like this:

void wordWrap(Object result, Object breakchars,Object source)

Granted - the first means I have to *think* about the types of data
I am passing in, but WRT the method I know that 

(a) anyone looking at the 'signature' knows exactly what type of 
data is expected - without ambiguity.  They don't have to hope 
I've documented it somewhere, anywhere... perhaps on the back 
of an envelope buried deep under a pile of crap on my desk.

(b) the compiler will be nice enough to check for them that they
are passing the datatypes I expect.  (Passing the wrong type of
params to a function is a common mistake, I hope you'll agree.
Particularly transposition.)

(c) the amount of run-time testing I have to do inside the method 
to *guarantee* the caller passed in the correct type of data
is limited, because I can guarantee the type will (at the very 
least) be what I expect - leaving me to just check if it is 'within 
range'.  (This is obviously a major factor in compiled languages,
not so much interpreted ones.  In the latter you can switch testing
on and off; in the former you either ship two versions, or decide 
between testing on (slow/safe) and testing off (faster/dangerous).)

(d) the number of possible points of failure within the method is
thereby reduced, meaning that my test suite is smaller, takes less
time to run, takes me less time to write and maintain.  And costs
me less overall in keyboards.  :-)
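
To make (b) and (c) concrete, here is a minimal sketch (not from
the original exchange).  The body of wordWrap is only a stand-in
that splits on the break characters, and wordWrapUntyped and
transposedCallFailsAtRuntime are hypothetical names made up for
the illustration.  With the typed signature a transposed call is
rejected by the compiler; with the Object version the same mistake
compiles and only surfaces as a ClassCastException at run time.

```java
import java.util.Vector;

public class WordWrapDemo {
    // Typed signature: argument types are checked at compile time.
    // Stand-in body (assumption): split source on any break character.
    static void wordWrap(Vector<String> result, String breakchars, String source) {
        StringBuilder word = new StringBuilder();
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            if (breakchars.indexOf(c) >= 0) {
                if (word.length() > 0) {
                    result.add(word.toString());
                    word.setLength(0);
                }
            } else {
                word.append(c);
            }
        }
        if (word.length() > 0) {
            result.add(word.toString());
        }
    }

    // Untyped signature: accepts anything, so the casts can fail at run time.
    @SuppressWarnings("unchecked")
    static void wordWrapUntyped(Object result, Object breakchars, Object source) {
        wordWrap((Vector<String>) result, (String) breakchars, (String) source);
    }

    // Calls the untyped version with result and source transposed.
    static boolean transposedCallFailsAtRuntime() {
        try {
            wordWrapUntyped("one two", " ", new Vector<String>());
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Vector<String> out = new Vector<String>();
        wordWrap(out, " ", "one two  three");
        System.out.println(out);
        // wordWrap("one two", " ", out);   // transposed: does not compile
        System.out.println(transposedCallFailsAtRuntime());
    }
}
```

The typed call never needs the try/catch, because the mistake it
guards against cannot get past the compiler in the first place.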

> There is also the cost from suddenly needing to write multiple copies of 
> the same function, or from having to use C++ templates to avoid writing 
> multiple copies of the code. Any time you have to write a C++ template, the 
> cost is high.

Now this is true enough - although even if you combined all those
different overloaded methods into one generic method, you *still*
have to write a huge switch statement to figure out which types of
data you got passed and what you should do with them.

void printInt(int i) { System.out.println("Your int is "+i); }
void printString(String s) { System.out.println("Your string is "+s); }

...versus...

void print(Object o)
{  if(o instanceof Integer)
   {  System.out.println("Your int is "+o);
   }
   else if(o instanceof String)
   {  System.out.println("Your string is "+o);
   }
   // Note - you need to add some more code here... it is 
   // possible to call this method with neither of the above 
   // types, and no error will occur.  If this was a set of 
   // statically typed methods, it wouldn't be an issue  :-) 
}

Which is more readable/maintainable?  And *that* was only a trivial
example ;-)
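
A hedged sketch of the same comparison in a form you can actually
run: the methods return their text instead of printing it, and the
names describe and describeDynamic are hypothetical.  It shows the
gap mentioned in the comment above - the Object version quietly
accepts a type it does not handle, whereas the overloaded version
refuses to compile such a call.

```java
public class PrintDemo {
    // Statically typed overloads: the compiler picks one, or rejects the call.
    static String describe(int i) {
        return "Your int is " + i;
    }

    static String describe(String s) {
        return "Your string is " + s;
    }

    // Single generic method: the type has to be inspected at run time.
    static String describeDynamic(Object o) {
        if (o instanceof Integer) {
            return "Your int is " + o;
        } else if (o instanceof String) {
            return "Your string is " + o;
        }
        // Unhandled type: nothing stops the caller from getting here.
        return null;
    }

    public static void main(String[] args) {
        System.out.println(describe(42));
        System.out.println(describeDynamic("fish"));
        System.out.println(describeDynamic(3.5));  // accepted, but unhandled
        // describe(3.5);  // does not compile: no overload takes a double
    }
}
```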

> There is another cost when a programmer who should know better sees their 
> code compiles, therefore by definition 'it must be working' so they ship it 
> to the test team who manage to report back 3 weeks later that 500 of their 
> (manual?) tests failed, all because of one little bug that the programmer 
> should have spotted to begin with. Yes, I know that we aren't talking 
> either/or for compiler tests and unit tests, but some programmers really do 
> this. Writing without a safety net is scary, but perhaps there is an 
> advantage to keeping your programmers scared?

I don't see what this has to do with static/dynamic typing.  More to
do with good programming practice.

It is almost as if you are admitting that lacking the security 
which static typing offers makes the language so much more inadequate 
(for non-trivial work) that even bad programmers will need to do some 
testing...  But surely you can't be saying that... can you?  :-)

> The benefit to static typing is that I expect that when I hit the 'compile 
> and test' button, I will be told that the code failed to compile, whereas 
> without it I expect to be told that a unit test failed. Sorry, I must have 
> missed something there. Let me try that again. If my unit tests are 
> sufficiently comprehensive I get an error at the same point in time as I 
> would have got anyway. Nope, I'm not sure I see a benefit.

void wordWrap(Vector result,String breakchars,String source)

...versus...

void wordWrap(Object result,Object breakchars,Object source)
{  assert result instanceof Vector;
   assert breakchars instanceof String;
   assert source instanceof String;
   // ... rest of the method ...
}

First one seems more readable to me.  A matter of choice I 
guess.  Maybe you've got shares in a keyboard manufacturing
company?  :-)
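
One caveat on the assert version: Java's assert statements do
nothing unless the JVM is started with -ea, so in a default run
those three lines check nothing at all.  Below is a minimal sketch
of an always-on alternative.  The helper names require and
checkArgs are hypothetical, not part of any standard library.

```java
import java.util.Vector;

public class RuntimeTypeCheck {
    // Hypothetical helper: unlike assert, this check cannot be switched off.
    static <T> T require(Object value, Class<T> type, String name) {
        if (!type.isInstance(value)) {
            throw new IllegalArgumentException(
                name + " must be a " + type.getSimpleName());
        }
        return type.cast(value);
    }

    // The checks an untyped wordWrap would have to make by hand.
    static void checkArgs(Object result, Object breakchars, Object source) {
        require(result, Vector.class, "result");
        require(breakchars, String.class, "breakchars");
        require(source, String.class, "source");
    }

    public static void main(String[] args) {
        checkArgs(new Vector<String>(), " ", "some text");   // passes
        try {
            checkArgs(" ", "some text", new Vector<String>());  // transposed
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Either way it is extra code that the typed signature gets for free.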

> Actually, most of the time I have to work with code where the unit tests 
> aren't sufficiently comprehensive. Most of the time Python bombs straight 
> out if you pass the wrong type of arguments to a function. Occasionally a 
> problem like this might get through. Not as often as you might expect 
> though.

An aside...

You seem to place a lot of faith in unit testing to make up for
the checking your compiler doesn't do (not that I'm
trying to suggest that static types are the answer to all 
programming woes - or even the majority!)

The problem with unit testing is that it is up to the humans
to decide what needs to be tested - and the importance of each
result.  NASA, for all its millions and millions of dollars, 
hundreds of staff, and endless testing procedures right up 
until two seconds before launch, still failed to note that
a humble test for the temperature of the O-rings on their
shuttle boosters was important.

When planes fall out of the sky, it is usually (in the West at
least) not through lack of testing.  It is often because the
engineers didn't realise their plane could fail in a particular
way - and therein lies the problem.  In order for unit tests to
be successful, the programmer has to anticipate all the possible
ways a system can fail.  And humans simply cannot do that (nor
can machines!)

I have nothing against dynamic typing - but I would only use a
dynamically typed language for small local script-like applications.
I would never tackle a major million-line monolithic project in one.  
Because I know my limitations, and I want as much backup from the 
tools as I can possibly get.  I'm only human.  I can (and do) screw
up (read my postings elsewhere and you'll see - hehehe :-)

It is interesting that, for all the trumpeting of dynamic typing,
most (all?) modern dynamically typed languages are not usually 
implemented in dynamically typed languages.  Javascript is 
implemented in C++ (or Java, if you prefer Rhino).  Jython is in
Java.  Perl is in C (IIRC).  And Python....?  ;-)  (Python also
draws a lot of its power from its libraries - many of which are
C based!)

I think each language has its place - for example, I wouldn't use Java 
to write scripts (JSP - yuk yuk yuk!) and I wouldn't use Javascript to 
write a major app - no matter how good you made its libraries.

-FISH-   ><>