"Faster" I/O in a script

Michael Torrie torriem at gmail.com
Wed Jun 4 02:37:15 CEST 2008


kalakouentin wrote:
> I use Python to analyze my data, which are in text form. The
> script is fairly simple. It reads a line from the input file, computes
> what it must compute, and then writes the result to a buffer/list. When
> the whole input file has been processed (essentially all lines), the
> algorithm goes ahead and writes the results one by one to the output
> file. It works fine, but because of the continuous I/O it takes a lot
> of time to execute.

Sounds like generators might help.  They let you process your data a
chunk at a time, rather than reading it all in at once.  For some
powerful tips and examples, see:

http://www.dabeaz.com/generators/Generators.pdf
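
As a rough illustration of the idea (not taken from the slides), here is
a minimal sketch of a line-at-a-time pipeline.  The function names, the
whitespace-separated numeric input format, and the "sum the fields"
computation are all assumptions standing in for whatever your script
really does:

```python
import io


def parse_lines(lines):
    """Yield one parsed record per non-blank input line."""
    for line in lines:
        line = line.strip()
        if line:
            # Assumed format: whitespace-separated numbers.
            yield [float(field) for field in line.split()]


def compute(records):
    """Yield one result per record (summing is only a placeholder)."""
    for record in records:
        yield sum(record)


def run(infile, outfile):
    """Stream results from infile to outfile; return the number written."""
    count = 0
    # Nothing is read until the loop asks for the next result, so only
    # one line is held in memory at a time.
    for result in compute(parse_lines(infile)):
        outfile.write(f"{result}\n")
        count += 1
    return count


if __name__ == "__main__":
    # In-memory files keep the demo self-contained; real code would
    # pass objects returned by open() instead.
    src = io.StringIO("1 2 3\n\n4 5\n")
    dst = io.StringIO()
    run(src, dst)
    print(dst.getvalue(), end="")
```

Because a file object is itself an iterator over its lines, the same
run() should work unchanged with files opened via open().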

To me this principle was very enlightening, especially since you can
then optimize your data processing flow by changing where in the
pipeline each generator runs.  Much as pushing selects down the tree in
relational algebra reduces runtime dramatically, filtering early with
generators means the later, more expensive stages see far less data.


