How can I speed this function up?

Fredrik Lundh fredrik at pythonware.com
Sat Nov 18 05:37:49 EST 2006


nnorwitz at gmail.com wrote:

> Generally, don't create objects, don't perform repeated operations.  In
> this case, batch up I/O.
> 
>> def write_data1(out, data):
>>      for i in data:
>>          if i[0] == 'ELEMENT':
>>              out.write("%s %06d " % (i[0], i[1]))
>>              for j in i[2]:
>>                  out.write("%d " % (j))
>>              out.write("\n")
> 
> def write_data1(out, data, map=map, str=str):
>      SPACE_JOIN = ' '.join
>      lines = [("ELEMENT %06d " % i1) + SPACE_JOIN(map(str, i2))
>               for i0, i1, i2 in data if i0 == 'ELEMENT']
>      out.write('\n'.join(lines))
> 
> While perhaps a bit obfuscated, it's a bit faster than the original.
> Part of what makes this hard to read is the crappy variable names.  I
> didn't know what to call them.  This version assumes that data will
> always be a sequence of 3-element items.
> 
> The original version took about 11.5 seconds, the version above takes
> just over 5 seconds.

footnote: your version doesn't print the final "\n".  here's a variant 
that does, and leaves the batching to the I/O subsystem:

def write_data3(out, data, map=map, str=str):
    # map and str are bound as default arguments to make them fast locals
    SPACE_JOIN = ' '.join
    out.writelines(
        "ELEMENT %06d %s\n" % (i1, SPACE_JOIN(map(str, i2)))
        for i0, i1, i2 in data if i0 == 'ELEMENT'
    )

this runs exactly as fast as your example on my machine, but uses less 
memory, since it never builds the full list of lines.  and if you, for 
benchmarking purposes, pass in a "sink" file object that ignores the 
data you pass it, it runs in no time at all ;-)
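
a minimal sketch of both points, for anyone who wants to check them.  the
names write_data2, Sink and tracked are made up for this illustration (they
are not from the thread), and the sample data is invented.  it compares the
join-based version with the writelines version, then passes in a sink whose
writelines never iterates over its argument:

```python
import io


def write_data2(out, data):
    # join-based version from the quoted message: builds every line first,
    # then does a single write (no trailing newline)
    lines = ["ELEMENT %06d " % i1 + " ".join(map(str, i2))
             for i0, i1, i2 in data if i0 == 'ELEMENT']
    out.write("\n".join(lines))


def write_data3(out, data):
    # generator version: each line is formatted only when writelines asks for it
    out.writelines(
        "ELEMENT %06d %s\n" % (i1, " ".join(map(str, i2)))
        for i0, i1, i2 in data if i0 == 'ELEMENT'
    )


class Sink:
    """Hypothetical file-like object that ignores everything it is given."""

    def write(self, s):
        pass

    def writelines(self, lines):
        pass  # never iterates, so the generator passed in is never run


consumed = []


def tracked(items):
    # hypothetical helper: records each item as it is pulled from the input
    for item in items:
        consumed.append(item)
        yield item


# invented sample data: (kind, number, values)
data = [("ELEMENT", 1, [10, 20, 30]), ("OTHER", 2, [0]), ("ELEMENT", 3, [7])]

buf2 = io.StringIO()
write_data2(buf2, data)

buf3 = io.StringIO()
write_data3(buf3, data)

# the only difference between the two outputs is the final newline
print(repr(buf2.getvalue()))
print(repr(buf3.getvalue()))

# with the sink, no input item is ever consumed and no line is ever formatted
write_data3(Sink(), tracked(data))
print(consumed)
```

the last call is why a do-nothing sink is a poor benchmark for the generator
version: the formatting work is deferred until something iterates, and this
sink never does.  a sink that loops over its argument and discards each line
would be a fairer comparison.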

</F>



