Speed problems with Python vs. Perl
Georg Umgiesser
georg at isdgm.ve.cnr.it
Wed Mar 28 10:12:58 EST 2001
Last week I wrote a simple Python program and found out that it was
terribly slow. I then rewrote it in Perl and got much better
performance. The programs simply read a file and split the lines on white
space, something I have to do very often for data processing.
Here are the two programs:
----------------------------------------------------------
#!/usr/bin/python
import sys
import re

whitespace = re.compile("\s+")

def main():
    icount = 0
    for line in sys.stdin.readlines():
        icount = icount + 1
        f = whitespace.split(line)
    print "Total lines read: " + `icount`

if __name__ == '__main__':
    main()
--------------------------------------------------------
#!/usr/bin/perl
$icount = 0;
while(<>) {
    $icount++;
    @f = split;
}
print "Total lines read: $icount\n";
---------------------------------------------------------
I ran both programs twice: once with the line splitting, and once with the
line that splits on whitespace commented out. Here are the results:
Perl:
with line splitting
Total lines read: 12212
0.34user 0.00system 0:00.34elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (264major+34minor)pagefaults 0swaps
without line splitting
Total lines read: 12212
0.10user 0.00system 0:00.10elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (250major+34minor)pagefaults 0swaps
Python:
with line splitting
Total lines read: 12212
1.93user 0.01system 0:01.94elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (236major+311minor)pagefaults 0swaps
without line splitting
Total lines read: 12212
0.20user 0.00system 0:00.20elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (233major+311minor)pagefaults 0swaps
As you can see, without the splitting (only reading and counting the lines)
Perl is twice as fast, but once I split on whitespace, Python gets more
than 5 times slower than Perl.
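As a check on these ratios, here is a small sketch that only redoes the arithmetic on the user times reported above. It is written for a current Python interpreter (an assumption, since the program above uses older syntax), and the variable names are made up for illustration. Subtracting the run without splitting from the run with splitting gives a rough figure for the cost of the split alone.

```python
# Timings (user seconds) copied from the runs above.
perl_split, perl_nosplit = 0.34, 0.10
python_split, python_nosplit = 1.93, 0.20

# Overall ratios: Python time divided by Perl time.
ratio_nosplit = python_nosplit / perl_nosplit
ratio_split = python_split / perl_split

# Time attributable to the split itself
# (run with splitting minus run without splitting).
perl_split_cost = perl_split - perl_nosplit
python_split_cost = python_split - python_nosplit
ratio_split_cost = python_split_cost / perl_split_cost

print("read only:  %.1fx" % ratio_nosplit)
print("with split: %.1fx" % ratio_split)
print("split only: %.1fx" % ratio_split_cost)
```

Taken on its own, the splitting step comes out at roughly 7 times slower in Python on these numbers, which is a larger gap than the overall factor of more than 5.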
These results were obtained on a 650 MHz AMD. On my home
machine, a 266 MHz Celeron, the performance gap is worse. Without line
splitting the programs are about the same speed, but with splitting
enabled, the Python version becomes 10 times slower than the Perl
version (5 secs against 0.5 secs).
I wonder why this happens. Both languages are interpreted. Is it the
fault of the implementation of the re module? Has anybody had a similar
experience?
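One possible way to test whether the re module is responsible would be to time the regular-expression split against the built-in string split on the same lines. The following is only a sketch under stated assumptions: it targets a current Python interpreter rather than the one used above, the helper time_split is a hypothetical name, and the input is synthetic rather than my real data file. It makes no claim about what the timings will be.

```python
import re
import time

whitespace = re.compile(r"\s+")

def time_split(split_func, lines, repeat=5):
    """Return the best wall-clock time of `repeat` passes over `lines`."""
    best = None
    for _ in range(repeat):
        start = time.perf_counter()
        for line in lines:
            split_func(line)
        elapsed = time.perf_counter() - start
        if best is None or elapsed < best:
            best = elapsed
    return best

# Synthetic input (an assumption): 12212 lines of five whitespace-separated
# columns, the same line count as in the runs above.
lines = ["1.0  2.5\t3.75 4 5e-3\n"] * 12212

t_regex = time_split(whitespace.split, lines)  # split via the re module
t_plain = time_split(str.split, lines)         # built-in split on whitespace

print("re split:  %.4f s" % t_regex)
print("str split: %.4f s" % t_plain)
```

The two calls are not exactly equivalent: splitting on "\s+" leaves an empty trailing field because of the newline at the end of each line, while the built-in split with no arguments drops it, as Perl's bare split does.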
Georg Umgiesser
--
__
^/..\^
----------------------------------------------------------m( 00 )m---
Georg Umgiesser | e-mail : georg at isdgm.ve.cnr.it
Oceanography, ISDGM-CNR | web : http://www.isdgm.ve.cnr.it/~georg/
1364 S. Polo | tel : ++39 - 041 - 5216 875
30125 Venezia, Italia | fax : ++39 - 041 - 2602 340
---------------------------------------------------------------------