Could Python supplant Java?

Wed Aug 21 12:13:58 EDT 2002

brueckd at tbye.com wrote:

> Have you done any analysis or classification of your bugs that actually
> shows that the bugs found in production would equate to syntax errors in a
> compiled language?

I agree that bugs that make it out to the field likely would be
similar in both cases and furthermore usually would not be
of a type readily detectable by the compiler.  The two classes
of errors are quite a bit different.

The bugs found by the compiler in an Early Binding language
generally are bugs that would show up during unit testing in a
Late Binding language.  Either way the errors likely would not
make it out to the field.

The trade-off is that in exchange for predeclaring your types
the compiler detects and helps eliminate this class of bugs even
before you run your unit test.  The cost of declaring ahead of
time is small and the gain to be had (vs. debugging each of the
corresponding runtime errors) can be significant.

The trade-off materially affects programmer productivity.
Common errors such as type mismatches and wrong-type
args to functions are compile-time errors in C++.  But in Python
they produce some kind of aberrant runtime behavior.  In lucky
instances, an exception occurs close to the function definition
and the traceback shows you a list of candidate offending callers
to search back from.  However, error symptoms may be
arbitrarily distant from the actual mistakes and obscure their
cause, and thus they can take arbitrarily large amounts of time
to diagnose and resolve in Python.  Even though many
in fact are easy to fix, they're generally not as easy as
fixing a clearly diagnosed type mismatch.
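
To illustrate the "arbitrarily distant" case, here is a minimal
sketch.  The names record_price and total are hypothetical, made
up for this example.  The wrong-type argument is accepted silently
where the mistake is made, and the exception only appears later,
in a function that is itself correct.

```python
def record_price(prices, item, price):
    # Stores whatever it is given; nothing checks the type here.
    prices[item] = price


def total(prices):
    # The mistake only surfaces here, away from the faulty call.
    return sum(prices.values())


prices = {}
record_price(prices, "apple", 3)
record_price(prices, "pear", "4")  # the actual mistake: str, not int

error_message = None
try:
    total(prices)
except TypeError as exc:
    # The traceback points at total(), not at the bad call above.
    error_message = str(exc)

print(error_message)
```

In a language with declared parameter types, the second
record_price call would be rejected at compile time instead.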

This is essentially the 30-year-old debate of why Pascal
is better than Fortran.

So there IS a class of bugs common to C++ and Python
programming that are more work for the programmer
to track down and fix in Python.

Furthermore, I don't see a huge benefit to being vague about types.
Very few Python programs make use of the fact that a variable
or a function parameter can take on a variety of types.  I'd wager
that in 90% of the cases new code only works with a single data
type or perhaps two very closely related ones.  So really, we're
just talking about the penny-wise savings of not having to be
explicit about our intentions.  I can see how that would help rapid
prototyping and be nice for beginner programmers and for
writing little one-off scripts.  But it can hurt on large projects
in the long run.  Sometimes, Pythoners do cute things like
define a library function that works on "any object that looks
like a file".  But you don't have to give up polymorphism in
an Early Binding environment.  You simply declare the interface
explicitly at compile time and find out before you run if there's
a semantic mismatch in your code -- rather than wait an
arbitrary amount of time to bump into the corresponding
runtime error.
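
As a rough sketch of declaring such an interface explicitly, the
example below uses typing.Protocol, which exists in modern Python
(3.8 and later) and postdates this message.  The names Readable
and count_words are hypothetical.  The assumption is that a
separate static checker such as mypy is run over the code; Python
itself does not enforce the annotation at runtime.

```python
import io
from typing import Protocol


class Readable(Protocol):
    """Explicit statement of what 'looks like a file' means here."""

    def read(self, size: int = -1) -> str: ...


def count_words(source: Readable) -> int:
    # Accepts any object with a compatible read(); no inheritance needed.
    return len(source.read().split())


word_count = count_words(io.StringIO("any object that looks like a file"))
print(word_count)
```

A call such as count_words(42) would still start to run, but a
static checker should report the mismatch before execution, which
is the early detection argued for above.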

None of this is to argue C++ is better overall or that being explicit
about types solves all problems.  I'm only saying that being explicit
where possible can be a big win for the programmer and improve
the quality of the resulting code.

I don't understand why this all is so controversial.  I haven't studied
the official propaganda for a while but I was under the distinct
impression that some form of early binding/type declarations -- as
an optional feature -- was slated for Python 3.0.

I think it would be a big win, the best of both worlds.  As an option,
who could complain?

> The perception that dynamically-typed languages don't work for large
> applications is common, but it is a common *mis*conception (for example,
> Google for one of the recent threads about successful large Python
> applications - despite Python's limited popularity there are actually
> quite a few large and successful Python projects - certainly too many to
> be a fluke!).

First off, I don't think 10K lines is that big of a project.  Although
the industry mean is something like 200 lines per programmer month
I know programmers who can produce a 10K-line application by
themselves in a month.  100K lines and you're breaking out of
the range of small projects, approaching what a good programmer
can do in a year.

I hear Zope is the one big Python app and everything else is an
also-ran.  I accept this may be obsolete data but where is the
current data?  Is there a reliable enumeration somewhere for
Python?  Ideally one would like to see a histogram of lines of
Python indexed by project.

Your paragraph is pretty garbled.  Am I reading it properly that you
are claiming Google as one of Python's large project successes?

If so, a quick scan of their job openings page suggests quite the
opposite.  Most of the software openings and all the 'Senior' ones
require "extensive experience in C++".  If they mention Python
at all, it's usually as "a plus" and even then usually as one of
several 'scripting' alternatives.

So Google appears to be a large C++ shop where Python is largely
incidental.  I'm guessing Python wasn't "fast enough" for their high
volume core product.  ;o)

If that's not what you're saying, where would I find a list of all
the Python projects by teams greater than 20 people, or more
simply the 20 largest Python endeavors (and their size)?  Or the
top 10 or whatever you have.

As a second stab at your meaning, I tried Google( "large python
projects" ) and it churned up a bunch of garbage.

> I'll go so far as to say that languages such as C++, VB, Java, are
> actually *less* suitable for very large projects than Python, and their
> suitability *decreases* as the size of the project increases.

As a statement of opinion, nobody can argue.  But as a statement
of fact (as you've written it) I think it's more accurate to say the jury
is still out on this trade-off.  Certainly the substantially larger installed
bases of each of the three alternatives are a partial counter-argument.

--jb

--
James J. Besemer  503-280-0838 voice
http://cascade-sys.com  503-280-0375 fax
mailto:jb at cascade-sys.com