Python vs. C/C++/Java: quantitative data?

Thu Feb 21 18:57:50 EST 2002

Doug Bagley wrote:

> Magnus Lyckå <magnus at thinkware.se> writes:
> 
>>There is a comparison at http://www.bagley.org/~doug/shootout/
>>but it's really just a pointless play. Don't fall for this one.
>>
> 
> Harumph! The shootout isn't asking anyone to fall for anything. I hope
> I have enumerated the many deficiencies it has on the methodology
> page. I hope it's clear I'm not trying to pull the wool over anyone's
> eyes.

I agree with everything you write, and I'm sorry for my
careless formulation. And of course I'm aware of the
problem. A "real" application might take man-years to
develop in ONE language. Who on earth would have the
resources to do this in 30 languages? IBM maybe?

I should have said "Don't draw any conclusions regarding
real world application performance from these benchmarks."

I do think that your benchmark can be very misleading if
people think the numbers are relevant for real world
applications. Obviously any system that loads a significant
runtime system has a huge disadvantage when the test programs
are much smaller than typical, real programs. Also, testing
only CPU-bound programs highly exaggerates the gains from
using languages like C. I rarely write programs that never
wait for file I/O, the network or user interaction.
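
To put a rough number on that fixed startup cost, one could time an
empty program against a tiny CPU-bound job. This is only a sketch for
a current Python 3; the names startup_seconds and work_seconds and the
loop size are made up for illustration, and the numbers will differ a
lot between machines.

```python
import subprocess
import sys
import time


def startup_seconds(runs=5):
    """Best wall-clock time to start the interpreter and run an empty program."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        # "-c pass" does no work, so this mostly measures runtime startup.
        subprocess.run([sys.executable, "-c", "pass"], check=True)
        best = min(best, time.perf_counter() - start)
    return best


def work_seconds(n=10_000):
    """Time a tiny CPU-bound loop inside an already running interpreter."""
    start = time.perf_counter()
    total = 0
    for i in range(n):
        total += i * i
    return time.perf_counter() - start


if __name__ == "__main__":
    print("startup:", startup_seconds())
    print("work:   ", work_seconds())
```

If the first number is not small compared to the second, the test
program is mostly measuring the runtime system, not the language.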

I think the main value of your site is to give a flavour
of all these languages, but frankly, I don't think the
benchmarking values are meaningful. And people will get
the wrong idea about Python. They will see some syntax,
but they won't realize how the language is really used
by Python programmers.

I think it would be more valuable to try to show a few
applications where you really try to do something meaningful,
and not just implement basic algorithms. Instead of molding
all programs in the same form, try to make them utilize the
strengths (and expose the weaknesses) of each language.
But that would be a different benchmark from yours...

E.g. the pseudo random generator will not teach us anything about
how to use random numbers in Python. It's just a misleading example
of correct syntax. Why do you want to show something that the normal
Python programmer would hardly ever code? Using the rich standard
libraries is one of the essential ingredients in Python programming!
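
For illustration, here is roughly what the two approaches look like
side by side: a hand-rolled linear congruential generator of the kind
a benchmark might require, and the standard library call a Python
programmer would normally reach for. The constants IM, IA and IC and
the names SimpleLCG, lcg_sample and stdlib_sample are illustrative
assumptions, not copied from the shootout code. The sketch targets a
current Python 3 and makes no claim about relative speed.

```python
import random

# Illustrative constants for a small linear congruential generator.
IM = 139968
IA = 3877
IC = 29573


class SimpleLCG:
    """Hand-written generator: state is one integer in [0, IM)."""

    def __init__(self, seed=42):
        self.last = seed

    def gen_random(self, maximum):
        # Advance the state, then scale it into [0, maximum).
        self.last = (self.last * IA + IC) % IM
        return maximum * self.last / IM


def lcg_sample(n, seed=42):
    """n pseudo random floats in [0, 1) from the hand-written generator."""
    gen = SimpleLCG(seed)
    return [gen.gen_random(1.0) for _ in range(n)]


def stdlib_sample(n, seed=42):
    """The same job using the standard library."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]
```

The second function is the one that shows how random numbers are
normally used in Python; the first mostly shows that the syntax works.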

Actually, since your random generator is simpler than the
standard, and since the standard one is also implemented in
Python, yours is faster! But your heap sort takes 6 times
longer than Python's standard sort method. (And if speed were
an important requirement, the Python programmer would actually
code that little part in C. But maybe throwing in mixed language
solutions in the benchmark as well would be too much... At least
if you want to try a decent amount of permutations for thirty
languages. :-) ) Anyway, running your code on my computer gives
the "impression" that C is 26 times faster than Python in sorting
doubles. But if I'm lazy and use .sort() instead of a Python
implementation of heapsort, the difference is only a factor of 4.
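
A comparison of this kind can be reproduced with something like the
following. It is a minimal sketch for a current Python 3, not the
shootout's code: heapsort and compare are hypothetical names, and the
list size and repeat count are arbitrary. No expected timings are
given, since they depend entirely on the machine and Python version.

```python
import random
import timeit


def heapsort(values):
    """Return a new sorted list using a hand-written binary max-heap."""
    a = list(values)
    n = len(a)

    def sift_down(root, end):
        # Push a[root] down until the heap property holds within a[:end].
        while True:
            child = 2 * root + 1
            if child >= end:
                return
            if child + 1 < end and a[child] < a[child + 1]:
                child += 1  # pick the larger of the two children
            if a[root] < a[child]:
                a[root], a[child] = a[child], a[root]
                root = child
            else:
                return

    # Build the heap, then repeatedly move the maximum to the end.
    for start in range(n // 2 - 1, -1, -1):
        sift_down(start, n)
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(0, end)
    return a


def compare(n=20_000, repeat=3):
    """Best-of-repeat times (heapsort, built-in sort) on n random doubles."""
    data = [random.random() for _ in range(n)]
    t_heap = min(timeit.repeat(lambda: heapsort(data), number=1, repeat=repeat))
    t_sort = min(timeit.repeat(lambda: sorted(data), number=1, repeat=repeat))
    return t_heap, t_sort


if __name__ == "__main__":
    t_heap, t_sort = compare()
    print("heapsort: %.4fs  built-in: %.4fs" % (t_heap, t_sort))
```

Both functions copy the input before sorting, so the two timings
cover the same work apart from the sorting algorithm itself.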

I guess quicksort would be faster in C as well, but the point
is to show what each language has to offer. Surely in performance,
but first of all in how much effort it requires of the programmer!
Using the standard library's random module and the built-in sort
surely makes that Python program nice and short.

Perhaps a few of those "give what you've got" tests could be
woven in with your tests? It seems very arbitrary to suddenly
require a certain implementation which is very far from ideal
for some languages. The least one could hope for is to have two
versions of the Python programs for, for instance, random numbers and
sorting. At least it will give those who get their first exposure
to Python a chance to see what the language really has to offer.
I'm sure it's not only Python which is "mistreated" in this way.

It would be great to have more studies like
http://wwwipd.ira.uka.de/~prechelt/Biblio/jccpprt_computer2000.pdf

From my perspective, Python compared to the "big ones", such as
Java, Perl, C, C++, ksh and maybe VB would suffice. (I'm not joking,
VB is used to write a lot of code!) I'm curious about other languages
like Icon, Dylan, OCaml and Haskell as well, but I could sacrifice
them.

I do think that small specific benchmarks similar to yours can have
value. For instance, they can show us certain weaknesses, which
might give us clues about what to improve in a language. But they
don't really play a role in deciding what language to use in a
real world application. Not unless we plan to build a system
with only tiny, short-running and entirely CPU-bound programs.