python disk i/o speed
bokr at oz.net
Thu Aug 8 02:53:21 CEST 2002
On Wed, 07 Aug 2002 15:52:13 +0000, Martin Franklin <mfranklin1 at gatwick.westerngeco.slb.com> wrote:
>On Wednesday 07 Aug 2002 2:40 pm, Jeff Epler wrote:
>> On Wed, Aug 07, 2002 at 07:21:28AM -0700, nnes wrote:
>> > I generated a file about 7MB long, with 3 numbers on each line. Then I
>> > wrote a program in Python, Java and ANSI C, generating a second file
>> > based on the first one, with 4 numbers: the original 3 plus the sum of
>> > these.
>> > e.g. "2","5","1" ----> "2","5","1","8"
>> > I wondered about the reason for the almost 10-fold difference between C
>> > and Python, since the programs should be mostly I/O bound and not CPU
>> > bound. Is there also a way of improving the speed for Python in this
>> > situation? If somebody wants to comment on the C and the Java
>> > code, that would be OK too, since I am not an expert programmer.
>> On any modern machine, reading a 7MB file a second time will not be "I/O
>> bound", because it will be in cache, and should be read at nearly the
>> speed of memcpy(), if not mmap().
>> BTW, here's my attempt at a Python program. Not having your programs, I
>> can't compare performance:
>> import sys, re
>> pat = re.compile('"([\d]+)","([\d]+)","([\d]+)"')
>> for line in sys.stdin:
>>     match = pat.match(line)
>>     # if not match:
>>     #     sys.stdout.write(line)
>>     a, b, c = map(int, match.group(1, 2, 3))
>>     sys.stdout.write('"%s","%s","%s","%s"\n' % (a,b,c, a+b+c))
>> Remember that you can shave another ~5% off of Python runtime by using
>> 'python -O'. Also, you could attempt to measure the startup time, which
>> is likely to be smaller for C, and larger for Python and Java.
>And here is my python version:-
>for line in file:
> a, b, c=map(int, line.split())
> fout.write("%i %i %i %i\n" %(a, b, c, d))
A nit: the OP wanted quotes and commas. Also, borrowing from above, you don't need 'd':
fout.write('"%s","%s","%s","%s"\n' % (a,b,c, a+b+c))
Don't know whether %i is faster than %s, but I guess it could be.
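One way to check rather than guess is to time both format strings on a sample line. The following is only a minimal sketch, not a measurement: the names convert_s and convert_i are made up for illustration, it assumes every input line holds exactly three quoted integers separated by commas (as in the OP's example), and it uses the standard timeit module. Any numbers it prints will depend on the machine and the Python version.

```python
import timeit


def convert_s(line):
    """Append the sum as a fourth quoted field, formatting with %s."""
    # Assumption: the line looks like '"2","5","1"\n' (three quoted integers).
    a, b, c = [int(field.strip('"')) for field in line.strip().split(',')]
    return '"%s","%s","%s","%s"\n' % (a, b, c, a + b + c)


def convert_i(line):
    """Same conversion, formatting with %i instead of %s."""
    a, b, c = [int(field.strip('"')) for field in line.strip().split(',')]
    return '"%i","%i","%i","%i"\n' % (a, b, c, a + b + c)


if __name__ == "__main__":
    sample = '"2","5","1"\n'
    for func in (convert_s, convert_i):
        # Time 100000 conversions of the same sample line.
        seconds = timeit.timeit(lambda: func(sample), number=100000)
        print("%s: %.3f s per 100000 lines" % (func.__name__, seconds))
```

Both functions do the same parsing, so any difference in the timings should come from the format string alone. Whether that difference matters next to the cost of int() and the file I/O is a separate question.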
>And the results from my 'c' version:-
>speedTest2.c:19:1: warning: no newline at end of file
>Yes, that's right, I could not compile the 'c' version <wink>