Could Python supplant Java?

Wed Aug 21 15:28:36 EDT 2002

Will Newton wrote:

> On Wednesday 21 Aug 2002 5:13 pm, James J. Besemer wrote:
>
> > The trade-off is that in exchange for pre declaring your types
> > the compiler detects and helps eliminate this class of bugs even
> > before you run your unit test.  The cost of declaring ahead of
> > time is small and the gain to be had (vs. debugging each of the
> > corresponding runtime errors) can be significant.
>
> IMO it's another case of the 80/20 rule. Syntax errors in C++ generally would
> fall firmly in the 80% of bugs that take only 20% of the time to locate and
> fix. As anyone who has debugged large quantities of C/C++ can tell you, just
> because a program compiles it is not bug free, and the bugs that are likely
> left are the hard ones to track down.

I wasn't referring to all C++ syntax errors.  I identified a class of errors
common both to Python and to languages like C++, where the trade-off is a net
win for programmers using declarations.

> If it came down to a straight trade-off
> between a garbage-collected, pointer-less language and a statically typed
> language with pointers I can guarantee that the debugging time of the former
> will be far less than the latter.

I think garbage collection actually may be *THE* key advantage in Python vs.
C++.  It is also one reason LISP lives on and attracts so many adherents after so
many decades.  Perl too, FWIW.  Oh yeah, and Java.

However, there's no reason we can't have static declarations in a language with
these other benefits (which are unrelated to dynamic vs. static typing).  That
would be the best of both worlds.

I'm picturing a version of Python where you can optionally declare the data types
of, say, function arguments, and get the benefits of both approaches.  The new
object/type consolidation helps a lot in this regard.  At a minimum, the semantics
would be the equivalent of an assert() testing the type of the object, so you
are alerted with a specific error at the point it occurs.  For extra credit,
compilers could do flow analysis and detect some mismatches at compile time.
Also, given a fixed, known object type, some nifty optimizations would become
possible that otherwise might not be.  I feel like I may be ready to write my
first PEP.  ;o)
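
To make the "equivalent of an assert()" idea concrete, here is a rough sketch,
not an actual proposal.  It assumes the function annotation syntax of modern
Python 3 (which did not exist when this was written), and check_types and
rename_all are hypothetical names made up for the illustration.

```python
import functools
import inspect


def check_types(func):
    """Hypothetical decorator: treat declared argument types as runtime asserts."""
    sig = inspect.signature(func)
    variadic = (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            param = sig.parameters[name]
            expected = param.annotation
            # Declarations are optional: skip undeclared and *args/**kwargs.
            if expected is inspect.Parameter.empty or param.kind in variadic:
                continue
            # Only plain classes are checked in this sketch.
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(
                    f"{func.__name__}(): argument {name!r} must be "
                    f"{expected.__name__}, got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


@check_types
def rename_all(names: list, suffix: str):
    """Append `suffix` to every name in the list."""
    return [n + suffix for n in names]
```

Calling rename_all("ab", "_x") then fails immediately with a TypeError that
names the offending argument, instead of quietly iterating over the characters
of the string.  The compile-time flow analysis and the optimizations are beyond
what a runtime check like this can show.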

C++ actually isn't all that bad, because there are simple formulas for keeping
storage problems to a minimum.  Unlike in C, it is relatively easy to write, say,
a String class that never leaks storage.  The fact that "auto" variables of class
type construct and destruct themselves automatically, exactly like chars and
ints, is, in the hands of the experienced practitioner, almost as good as garbage
collection.

But I have to concede that "not bad" still falls short, as you do have to get the
formula right and you do still have, at some level, an explicit malloc/free.  I
agree that a garbage collector is even better, and that completely freeing the
user from such concerns is a big win.

> When you factor in interpreted versus
> compile-link-debug type of systems you will see Python does not suffer from
> being dynamically typed at all.

BZZZIT!  Wrong.  That may be true with trivial projects, and it may be true with
old-fashioned technology like GCC, but with MS's Visual C++, which does
incremental compilation AND linking, builds can be much FASTER than Python when
rebuilding large projects in the middle of the development cycle.  E.g., with one
150K LOC VC++ project I worked on, I could change a few modules and rebuild in
just a few seconds.  In contrast, I just generated a fairly trivial 150K-line
Python program and it took almost 3 minutes to compile (on a machine 4X faster
than the one I used for that C++ project).  I bet a real Python program
(something other than 'pass' for every other line) would take a lot longer.
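
For anyone who wants to repeat that experiment, here is a minimal sketch of how
such a measurement could be set up.  The generator is an assumption: the post
only says that every other line was 'pass', so the trivial assignments and the
names make_source and time_compile are invented for the illustration.  It times
only the bytecode compilation of a single synthetic source string.

```python
import time


def make_source(n_lines):
    """Build a synthetic module: trivial assignments alternating with 'pass'."""
    lines = []
    for i in range(n_lines // 2):
        lines.append(f"x{i} = {i}")   # assumed filler statement
        lines.append("pass")          # every other line is 'pass'
    return "\n".join(lines) + "\n"


def time_compile(n_lines):
    """Return the seconds taken to compile n_lines of synthetic source."""
    source = make_source(n_lines)
    start = time.perf_counter()
    compile(source, "<synthetic>", "exec")
    return time.perf_counter() - start


if __name__ == "__main__":
    for n in (1_000, 10_000, 150_000):
        print(f"{n:>7} lines: {time_compile(n):.3f} s")
```

The numbers will depend heavily on the interpreter version and the hardware, so
they should not be expected to match the figure quoted above.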

> So yes, I agree that static typing can catch bugs, but they are usually quite
> trivial typos or cut and paste errors (const excepted).

I partly agree but you are overlooking significant exceptions.  The particular
examples I gave of argument type mismatches ARE common programmer errors that are
automatically detected in C++ but can be very hard to track down in Python.
E.g., passing a string instead of a list of strings or vice versa.  Or a tuple
instead of a list.  Instead of an error message you get

    ['n', 'e', 'w', 'n', 'a', 'm', 'e']

or you get an obscure exception far from the actual error.  Often Python's
traceback doesn't give you a clue where the actual mistake may reside.
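
A small illustration of how the output above can arise.  The function add_names
is a hypothetical stand-in, not code from any real project; the point is only
that a string is itself a sequence of one-character strings, so the loop
accepts it without complaint.

```python
def add_names(registry, names):
    """Append each entry of `names` (expected: a list of strings) to `registry`."""
    for name in names:
        registry.append(name)
    return registry


print(add_names([], ["newname"]))  # intended call: ['newname']
print(add_names([], "newname"))    # bare string passed by mistake:
                                   # ['n', 'e', 'w', 'n', 'a', 'm', 'e']
```

Nothing fails at the call site; the bad data simply travels on until something
downstream trips over it.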

In larger projects, this class of error is doubly costly, because the problem is
created by another developer but the problem initially shows up in your code.  So
two people have to grind on it instead of one (especially if the other developer
is your customer and then even if it's 'his' fault).

> However I also
> believe that the reduced lines of code and development time mean that to code
> and debug in Python is, in my experience, much faster than to code and debug
> in C++ or C.

I suspect the "Pythonista Experience" is heavily colored by mostly single-person
projects.  Which is fair.  I don't dispute Python's advantage on the low end.

However, my main point is that for larger, multi-person projects, the ability
to declare types becomes increasingly important.

Incidentally, how many >100K line Python programs do we know about?  Any?

I measure the entire 2.2 Lib source at 129,078 lines, but it's spread over 544
modules, thus averaging less than 240 lines per module.  A bunch of unrelated
tiny pieces is suitable for a library but it's not the same experience as a large
application.
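
For reference, a count like this can be reproduced with a few lines of Python.
This is only a sketch under an assumption: the post does not say how the lines
were counted, so the count_lines helper (a made-up name) simply counts physical
lines in every .py file under a directory.

```python
import os


def count_lines(root):
    """Return (total_lines, module_count) for every .py file under `root`."""
    total = modules = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.endswith(".py"):
                modules += 1
                with open(os.path.join(dirpath, filename), "rb") as f:
                    total += sum(1 for _ in f)
    return total, modules


if __name__ == "__main__":
    # Figures quoted above: 129,078 lines spread over 544 modules.
    print(round(129078 / 544, 1))  # 237.3
```

Pointing count_lines at a Lib directory gives the two numbers needed for the
average.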

--jb
--
James J. Besemer  503-280-0838 voice
http://cascade-sys.com  503-280-0375 fax
mailto:jb at cascade-sys.com