Python IO performance?

Mike Romberg romberg at smaug.fsl.noaa.gov
Thu Jun 5 19:35:31 EDT 2003


>>>>> " " == Ganesan R <rganesan at myrealbox.com> writes:

     > Hi,

     > I apologize for bringing up this topic again. I am sure I am
     > missing something obvious. I am a relatively recent perl to
     > python convert and I always had a nagging feeling that
     > processing text files with python was slow. I set about
     > experimenting. Here's my version of cat in python

     > ==== #!/usr/bin/python2.2

     > import fileinput
     > for line in fileinput.input(): 
     >     print line

[snip]

     > Here's my equivalent perl code:

     > ==== #!/usr/bin/perl -w
     > while (<>) 
     >    { print; }

  I'm not so certain that these are equivalent.  I'm not really a perl
expert, but I bet the perl version behaves more like the traditional
cat in that it just sends the data from one source to the sink.  But
the python fileinput module is reading the data a *line* at a time.
This means that it has to examine each character to find the newlines.

  I whipped up this example, which is probably not equivalent either
(benchmarks being what they are):

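import sys

# Copy the file to stdout in fixed-size blocks instead of line by line.
f = open(sys.argv[1], 'rb')
s = f.read(4096)
while s:                  # read() returns an empty string at end of file
    sys.stdout.write(s)   # unlike print, write() appends no extra newline
    s = f.read(4096)
f.close()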
  Here we just read the data a block at a time and write it back to
stdout.  This avoids the newline check.  With this version, I get:

> ls -l /var/lib/rpm/Packages
-rw-r--r--    1 rpm      rpm      48234496 Apr 11 10:22 /var/lib/rpm/Packages
> time perl cat.pl /var/lib/rpm/Packages  > /dev/null
9.120u 0.770s 0:23.37 42.3%     0+0k 0+0io 266pf+0w
> time python cat.py /var/lib/rpm/Packages > /dev/null
0.410u 0.570s 0:02.27 43.1%     0+0k 0+0io 479pf+0w
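
  For anyone who wants to compare the two reading strategies without
the shell, here is a minimal sketch. It is written for a current
Python 3 rather than 2.2, and the names copy_lines, copy_blocks and
the in-memory test data are made up for illustration; they are not
part of either script above. It only shows that both loops copy the
same bytes while doing very different numbers of iterations, and it
makes no claim about actual timings.

```python
import io

def copy_lines(src, dst):
    """Copy src to dst one line at a time (the fileinput-style loop)."""
    count = 0
    for line in src:          # each iteration scans for the next newline
        dst.write(line)
        count += 1
    return count

def copy_blocks(src, dst, block_size=4096):
    """Copy src to dst in fixed-size blocks, ignoring line boundaries."""
    count = 0
    while True:
        block = src.read(block_size)
        if not block:         # an empty bytes object means end of file
            break
        dst.write(block)
        count += 1
    return count

# Hypothetical in-memory stand-in for a real file, so the sketch runs anywhere.
data = b"some text\n" * 10000     # 100000 bytes in 10000 lines

line_out, block_out = io.BytesIO(), io.BytesIO()
n_lines = copy_lines(io.BytesIO(data), line_out)
n_blocks = copy_blocks(io.BytesIO(data), block_out)
```

  Here the line loop runs 10000 times and the block loop 25 times for
the same data. To time them on a real file, one could wrap each call
with the standard timeit module.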

Mike Romberg (romberg at fsl.noaa.gov)



