Multi thread reading a file

Mag Gam magawake at gmail.com
Tue Jun 30 21:52:18 EDT 2009


Hello All,

I am very new to Python and am in the process of converting a very
large compressed CSV file into another format.  I was wondering whether
I can do this with a multithreaded approach.

Here is the pseudo code I was thinking about:

Let T = total number of lines in the file, for example 1000000 (1 million lines)
Let B = number of lines in a buffer, for example 10000 lines


Create a thread to read the first B lines
Create another thread to read the next B lines (so we have 2 threads
now, but since the file is zipped the second thread has to wait until
the first thread is completed, unless someone knows of a clever
technique)
Write the content of thread 1 into a numpy array
Write the content of thread 2 into a numpy array

But I don't think this task can really be split across multiple
threads or processes...
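
The pseudocode above could be sketched in Python roughly as follows.
This is one possible approach, not a definitive one. It assumes the
file is gzip-compressed, has no header row, and holds only numeric
fields; the names read_batches, convert_file, handle_batch, and
max_pending are made up for illustration. Because a gzip stream has to
be decompressed from the start, a single reader thread pulls B lines at
a time and hands each batch to the main thread through a bounded queue,
so reading the next batch can overlap with converting the current one.

```python
import csv
import gzip
import itertools
import queue
import threading


def read_batches(path, batch_size, out_queue):
    """Producer: read the gzip stream in order, batch_size rows at a time."""
    try:
        with gzip.open(path, "rt", newline="") as handle:
            reader = csv.reader(handle)
            while True:
                batch = list(itertools.islice(reader, batch_size))
                if not batch:
                    break
                out_queue.put(batch)  # blocks while the queue is full
    finally:
        out_queue.put(None)  # sentinel: tells the consumer to stop


def convert_file(path, handle_batch, batch_size=10000, max_pending=4):
    """Consumer: convert each batch while the reader fetches the next one."""
    pending = queue.Queue(maxsize=max_pending)
    reader_thread = threading.Thread(
        target=read_batches, args=(path, batch_size, pending), daemon=True
    )
    reader_thread.start()
    while True:
        batch = pending.get()
        if batch is None:
            break
        # Assumption: every field is numeric, so float() is enough here.
        rows = [[float(field) for field in row] for row in batch]
        handle_batch(rows)
    reader_thread.join()
```

Each batch passed to handle_batch is a list of rows of floats, so
numpy.array(rows) would turn it into a 2-D array; numpy is left out
only to keep the sketch standard-library-only. The bounded queue keeps
at most max_pending unread batches in memory at once.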


Any ideas? Has anyone ever tackled a problem like this before?


