[Tutor] Python and Speed

Tue, 17 Apr 2001 08:11:15 +1000

Guys,

         I noticed that the Advocacy thread contained reference to the
question of Perl's assumed speed advantage over Python. As someone who
championed Python's acceptance for use in implementing a replacement for the
shell scripts which support our bread-and-butter application, I was keen to
quantify the difference. I was particularly keen to see how well Python went
with a common shell operation - reading in a file, sorting it and then
writing the new file.

	Let's assume we have two text files, each containing variable length
strings (I think I used Linux HOWTOs, from memory). We want to merge the
files and sort the merged result by ASCII collating sequence. The
resulting 'newmaster' contained over 70,000 lines of text. Trivial stuff for
a shell script :

cat master.txt > /tmp/temp.txt
cat trans.txt >> /tmp/temp.txt
cat /tmp/temp.txt | sort | uniq > newmaster.txt
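
A rough Python equivalent of the shell pipeline above might look like the
sketch below. It is illustrative only, not the script used for the timings,
and the function name merge_sort_unique is made up for the example. It
assumes both files fit comfortably in memory.

```python
def merge_sort_unique(input_paths, output_path):
    """Merge text files, sort the lines and drop duplicates (like sort | uniq)."""
    lines = []
    for path in input_paths:
        with open(path) as handle:
            # Strip the newline so a final line without one still compares equal.
            lines.extend(line.rstrip("\n") for line in handle)

    # The default string ordering compares character codes, which matches
    # the ASCII collating sequence for plain ASCII text.
    lines.sort()

    # After sorting, duplicates are adjacent, so keep a line only when it
    # differs from the previous one kept.
    unique = []
    for line in lines:
        if not unique or line != unique[-1]:
            unique.append(line)

    with open(output_path, "w") as handle:
        handle.writelines(line + "\n" for line in unique)
    return len(unique)
```

Calling merge_sort_unique(["master.txt", "trans.txt"], "newmaster.txt") would
mirror the three shell commands, without the temporary file.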

	I wrote code in C, Perl and Python (each using the default
'quicksort' algorithm to sort the data), and ran tests on a Celeron 566
(SuSE Linux), SPARC Enterprise 65000 (Solaris 2.6) and Compaq AlphaServer
4100. Under Solaris and Linux, Python actually outperformed Perl on each
machine, with the Sun machine (much bigger than the Alpha..) achieving the
best overall result. The C implementation took a total of 1.10, Python 2.07
and Perl 2.21 secs. I'd like to make the following points :
	
	1. These are just numbers - I'm sure that better programmers than
myself could achieve better numbers for their chosen language. I doubt
whether the Perl or Python code would ever begin to approach the C execution
times without some sort of C extension.
	2. The vast majority of the elapsed time, in each case, was spent
performing file ops ('User'), whilst 'System' times are what the serious
benchmarkers seem to focus on. For the record, the 'System' times were .05,
.11 and .15 secs for C, Python and Perl respectively.
	3. I tried to use the same algorithmic approach for each language,
and relied on the default 'quicksort' approach. 
	4. If you read almost any treatise on sorting, you will see that
different sort algorithms suit different patterns of data. I believe that
Perl uses deep recursion in its version of the quicksort algorithm, which
would give it an advantage with smaller text files. I think I prefer the
approach taken by the C and Python guys. 
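
On point 2, one way to collect separate 'User' and 'System' figures from
inside a Python script is os.times(), which reports the CPU time the process
has spent in user mode and in the kernel. The helper below is a hypothetical
sketch (the name cpu_times is not from the original tests) and says nothing
about how the numbers above were actually gathered.

```python
import os


def cpu_times(func, *args):
    """Run func(*args) and return (result, user_seconds, system_seconds)."""
    before = os.times()
    result = func(*args)
    after = os.times()
    # user: CPU time spent in the process's own code.
    # system: CPU time the kernel spent working on the process's behalf.
    return result, after.user - before.user, after.system - before.system


if __name__ == "__main__":
    # Example: time the sorting of a small list of strings.
    _, user, system = cpu_times(sorted, ["pear", "apple", "fig"])
    print("user %.2f system %.2f" % (user, system))
```

For a job this small both figures will be close to zero; a meaningful
comparison needs input on the scale of the 70,000-line file described above.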

	For me, the big win was not in any perceived speed advantage : it
was the fact that I achieved my goal with 32 lines of Python (over 45 lines
of Perl and way too many lines of C...). Here are the comparisons between
the Perl sort code and the Python sort code :

Perl :
sub quicksort ($) {
    my ($arrayref) = @_;
    @$arrayref = sort { $a cmp $b } @$arrayref;
    return $arrayref;
}

Python :
list1.sort() # sort the list of strings

	In defence of Perl, we also use it for a variety of purposes, and it
is a fantastically powerful language with a wealth of available modules.
However, after mucking around for a while with the C and Perl solutions, I
was stunned by how quickly the Python solution fell into my lap. Debugging
it would also be a lot more bearable at 3am ... And, yes, we are now using
Python as part of our revenue raiser !
	Finally, I would like this to provide the catalyst for the Perl and
Python communities to each agree on a set of common tasks which we could use
for benchmark testing each new release of these interpreters. I know that
Guido has never denied that Python is slower, but I'd like to be able to
quantify it against something a little less trivial than my own manufactured
example.
Enjoy,
Arthur
Arthur Watts
Software Engineer GBST Automation
Global Banking & Securities Transactions

Telephone + 61 7 3331 5555
mailto: arthur.watts@gbst.com
www.gbst.com