Speed problems with Python vs. Perl

Georg Umgiesser georg at isdgm.ve.cnr.it
Wed Mar 28 10:12:58 EST 2001


Last week I wrote a simple Python program and found that it was 
terribly slow. I therefore rewrote it in Perl and got much better 
performance. The programs simply read a file and split the lines on 
whitespace, something I have to do very often for data processing.

Here are the two programs:

----------------------------------------------------------

#!/usr/bin/python
 
import sys
import re
 
whitespace = re.compile(r"\s+")
 
def main():
 
    icount = 0
    for line in sys.stdin.readlines():
        icount = icount + 1
        f = whitespace.split(line)
    print "Total lines read: " + `icount`
 
if __name__ == '__main__':
    main()

--------------------------------------------------------

#!/usr/bin/perl
 
$icount = 0;

while(<>) {
  $icount++;
  @f = split;
}
print "Total lines read: $icount\n";
                                                                            
---------------------------------------------------------

I ran both programs twice: once with the line splitting, and once with the
line that splits on whitespace commented out. Here are the results:


Perl:
 
with line splitting
 
Total lines read: 12212
0.34user 0.00system 0:00.34elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (264major+34minor)pagefaults 0swaps
 
without line splitting
 
Total lines read: 12212
0.10user 0.00system 0:00.10elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (250major+34minor)pagefaults 0swaps
 
 
Python:
 
with line splitting
 
Total lines read: 12212
1.93user 0.01system 0:01.94elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (236major+311minor)pagefaults 0swaps
 
without line splitting
 
Total lines read: 12212
0.20user 0.00system 0:00.20elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (233major+311minor)pagefaults 0swaps
 
                                                                           
As you can see, without the splitting (only reading and counting the lines)
Perl is twice as fast (0.10 s against 0.20 s), but once I split on
whitespace, Python becomes more than 5 times slower than Perl (1.93 s
against 0.34 s).

These results were obtained on a 650 MHz AMD. On my home machine, a
266 MHz Celeron, the gap with splitting is even larger. Without line
splitting the two programs run at about the same speed, but with splitting
enabled the Python version becomes 10 times slower than the Perl version
(5 secs against 0.5 secs).
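
To separate the cost of the split from file reading and interpreter
start-up, the with/without comparison could also be repeated inside a
single process. The following is only a rough sketch, written for a
current Python 3 rather than the interpreter used for the timings above.
The synthetic lines and the names make_lines, count_only and
count_and_split are assumptions for illustration; the real data file is
not reproduced here, so absolute numbers will differ.

```python
import re
import timeit

whitespace = re.compile(r"\s+")


def make_lines(n=12212, fields=8):
    # Hypothetical stand-in for the data file: n lines of
    # whitespace-separated integers, each ending in a newline.
    return ["  ".join(str(i * k) for k in range(fields)) + "\n"
            for i in range(n)]


def count_only(lines):
    # Baseline: loop over the lines and count them, no splitting.
    icount = 0
    for line in lines:
        icount += 1
    return icount


def count_and_split(lines):
    # Same loop, plus the regex split used in the program above.
    icount = 0
    for line in lines:
        icount += 1
        f = whitespace.split(line)
    return icount


if __name__ == "__main__":
    lines = make_lines()
    for func in (count_only, count_and_split):
        # Best of 3 runs, 10 passes per run, reported per pass.
        best = min(timeit.repeat(lambda: func(lines), number=10, repeat=3))
        print(f"{func.__name__}: {best / 10:.4f} s per pass")
```

The difference between the two printed figures is an estimate of the time
spent in the regex split alone.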

I wonder why this happens. Both languages are interpreted. Is it the
fault of the implementation of the re module? Has anybody had a similar
experience?
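
One thing that may be worth checking is whether the regular expression is
needed at all. Perl's bare split skips leading whitespace and drops
trailing empty fields, and the string method split() called without
arguments behaves the same way, whereas the compiled pattern returns empty
strings at both ends of a line that starts or ends with whitespace. The
sketch below (Python 3 syntax; the function names and the sample line are
made up for illustration) shows only the difference in results. Whether the
plain method also closes the speed gap is something I have not measured
here.

```python
import re

whitespace = re.compile(r"\s+")


def split_regex(line):
    # Regex split, as in the program above: keeps the empty strings
    # produced by leading and trailing whitespace.
    return whitespace.split(line)


def split_plain(line):
    # String method without arguments: splits on runs of whitespace
    # and discards leading and trailing whitespace.
    return line.split()


sample = "  1.5   2.5\t3.5\n"  # hypothetical data line
print(split_regex(sample))  # ['', '1.5', '2.5', '3.5', '']
print(split_plain(sample))  # ['1.5', '2.5', '3.5']
```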

Georg Umgiesser


-- 

                                                             __
                                                           ^/..\^   
----------------------------------------------------------m( 00 )m---
Georg Umgiesser         | e-mail :              georg at isdgm.ve.cnr.it
Oceanography, ISDGM-CNR | web    : http://www.isdgm.ve.cnr.it/~georg/
1364 S. Polo            | tel    :              ++39 - 041 - 5216 875
30125 Venezia, Italia   | fax    :              ++39 - 041 - 2602 340
---------------------------------------------------------------------




