parallel csv-file processing

Paul Boddie paul at
Fri Nov 9 13:48:42 CET 2007

On 9 Nov, 12:02, Paul Rubin <http://phr...@NOSPAM.invalid> wrote:
> Why not pass the disk offsets to the job server (untested):
>    n = 1000
>    for i,_ in enumerate(reader):
>      if i % n == 0:
>        job_server.submit(calc_scores, reader.tell(), n)
> the remote process seeks to the appropriate place and processes n lines
> starting from there.

This is similar to a lot of the smarter solutions for Tim Bray's "Wide
Finder" - a problem apparently in the same domain. See here for more

Lots of discussion about more than just parallel processing/
programming, too.
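
For what it's worth, the offset-passing idea quoted above could be sketched roughly as follows. This is only an illustration under some assumptions: the file is opened in binary mode so that tell() and seek() deal in plain byte offsets, no quoted CSV field contains an embedded newline (otherwise splitting on lines is not safe), and the data is UTF-8. The names chunk_offsets and run_parallel are made up for the example, the "score" computed by calc_scores is just a placeholder field count, and multiprocessing.Pool stands in for whatever job server is actually in use.

```python
import csv
from itertools import islice
from multiprocessing import Pool


def chunk_offsets(path, n):
    """Return the byte offset of every n-th line (lines 0, n, 2n, ...)."""
    offsets = []
    with open(path, "rb") as f:
        i = 0
        while True:
            pos = f.tell()          # offset of the line about to be read
            if not f.readline():    # empty bytes means end of file
                break
            if i % n == 0:
                offsets.append(pos)
            i += 1
    return offsets


def calc_scores(path, offset, n):
    """Hypothetical worker: seek to offset and process at most n lines.

    The "score" here is a placeholder: the total number of CSV fields seen.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        lines = [raw.decode("utf-8") for raw in islice(f, n)]
    return sum(len(row) for row in csv.reader(lines))


def run_parallel(path, n=1000, workers=4):
    """Hand one (path, offset, n) job per chunk to a pool of processes."""
    jobs = [(path, offset, n) for offset in chunk_offsets(path, n)]
    with Pool(workers) as pool:
        return pool.starmap(calc_scores, jobs)
```

The parent process still has to scan the file once to find the line boundaries, but it only ships small (offset, count) pairs to the workers rather than the rows themselves. On platforms that start workers by spawning a fresh interpreter, run_parallel should be called from under an `if __name__ == "__main__":` guard.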

