Does Python really follow its philosophy of "Readability counts"?
Wed Jan 14 20:16:23 CET 2009
"Russ P." <Russ.Paielli at gmail.com> writes:
> I know some researchers in software engineering who believe that the
> ultimate solution to software reliability is automatic code
> generation. They don't really care much which language is used, because
> it would only be an intermediate form that humans don't interact with
> directly. In that scenario, humans would essentially use a "higher
> level" language such as UML or some such thing.
> I personally have a hard time seeing how that could work, but that may
> just be due to my own lack of understanding or vision.
The usual idea is that you would write a specification, and a
constructive mathematical proof that a certain value meets that
specification. The compiler then verifies the proof and turns it into
code. Coq (http://coq.inria.fr) is an example of a language that
works like that. There is a family of jokes that goes:
Q. How many $LANGUAGE programmers does it take to change a lightbulb?
A. [funny response that illustrates some point about $LANGUAGE].
The instantiation for Coq goes:
Q. How many Coq programmers does it take to change a lightbulb?
A. Are you kidding? It took two postdocs six months just to prove
that the bulb and socket are threaded in the same direction.
Despite this, a compiler for a fairly substantial C subset has been
written mostly in Coq (http://compcert.inria.fr/doc/index.html). But
this stuff is far, far away from Python.
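To make the "specification plus constructive proof" idea concrete,
here is a toy sketch. It is written in Lean 4 rather than Coq, purely
for illustration, and the names Spec and witness are made up for this
example. The specification is "a natural number greater than 3"; the
value 4 is supplied together with a proof that the checker verifies
before accepting the definition.

```lean
-- Specification: a natural number strictly greater than 3.
-- (Spec is a hypothetical name chosen for this sketch.)
def Spec : Type := { n : Nat // n > 3 }

-- A value packaged with a proof that it meets the specification.
-- `by decide` asks Lean to check the decidable fact 4 > 3.
def witness : Spec := ⟨4, by decide⟩
```

Replacing 4 with 2 makes the definition fail to type-check, which is
the whole point: a value that does not meet the specification never
becomes code.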
I have a situation which I face almost every day, where I have some
gigabytes of data that I want to slice and dice somehow and get some
numbers out of. I spend 15 minutes writing a one-off Python program
and then several hours waiting for it to run. If I used C instead,
I'd spend several hours writing the one-off program and then 15
minutes waiting for it to run, which is not exactly better. (Or, I
could spend several hours writing a parallel version of the Python
program and running it on multiple machines, also not an improvement.)
Often, the Python program crashes halfway through, even though I
tested it on a few megabytes of data before starting the full
multi-gigabyte run, because it hit some unexpected condition in the
data. More compile-time checking could have prevented that, by making
sure the structures understood by the one-off script matched the ones
in the program that generated the input data.
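Today the closest thing available is a runtime check, not a
compile-time one. As a minimal sketch, assuming tab-separated input
and a made-up three-field schema (user_id, timestamp and bytes_sent
are hypothetical names, as are parse_record and validate), a cheap
first pass over the data can at least report every malformed record
before the expensive run starts:

```python
# Hypothetical schema: (field name, converter). These names are
# assumptions for illustration, not taken from any real data set.
SCHEMA = [("user_id", int), ("timestamp", float), ("bytes_sent", int)]


def parse_record(line, lineno):
    """Parse one tab-separated line; raise ValueError naming the line."""
    parts = line.rstrip("\n").split("\t")
    if len(parts) != len(SCHEMA):
        raise ValueError(
            "line %d: expected %d fields, got %d"
            % (lineno, len(SCHEMA), len(parts))
        )
    record = {}
    for (name, convert), raw in zip(SCHEMA, parts):
        try:
            record[name] = convert(raw)
        except ValueError:
            raise ValueError("line %d: bad %s: %r" % (lineno, name, raw))
    return record


def validate(lines):
    """Check every line without doing any real work; return error messages."""
    errors = []
    for lineno, line in enumerate(lines, 1):
        try:
            parse_record(line, lineno)
        except ValueError as exc:
            errors.append(str(exc))
    return errors
```

Running validate() over the whole input first costs one extra read of
the file, but it surfaces the unexpected records up front instead of
several hours into the job. It still cannot guarantee that the script
and the generating program agree on the structures, which is what
static checking would give.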
I would be ecstatic with a version of Python where I might have to
spend 20 minutes instead of 15 minutes writing the program, but then
it runs in half an hour instead of several hours and doesn't crash. I
think the Python community should be aiming towards this.