[Tutor] concurrent file reading using python

Abhishek Pratap abhishek.vit at gmail.com
Mon Mar 26 20:05:09 CEST 2012


Hi Guys


I want to use all the cores on my server to read big files
(> 50 GB) in parallel: seek to N locations, process each chunk
separately, and merge the output. This is very similar to the
MapReduce concept.
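
To make the idea concrete, here is a minimal sketch of the
seek/process/merge pattern, not a tested solution for 50 GB inputs.
The names chunk_ranges, count_lines and parallel_line_count are
hypothetical, and counting lines stands in for whatever the real
per-chunk processing would be. It assumes a newline-delimited file
and assigns each line to the chunk in which its first byte falls.

```python
import os
from multiprocessing import Pool


def chunk_ranges(path, n_chunks):
    """Split the file into up to n_chunks (path, start, end) byte ranges."""
    size = os.path.getsize(path)
    step = max(1, size // n_chunks)
    starts = list(range(0, size, step))[:n_chunks]
    ends = starts[1:] + [size]  # the last chunk absorbs any remainder
    return [(path, s, e) for s, e in zip(starts, ends)]


def count_lines(args):
    """Count the lines whose first byte lies in [start, end)."""
    path, start, end = args
    count = 0
    # Each worker opens its own handle, so seek positions are not shared.
    with open(path, "rb") as f:
        if start > 0:
            # Step back one byte and read to the next newline. This skips
            # a line that began in the previous chunk, but keeps a line
            # that begins exactly at `start`.
            f.seek(start - 1)
            f.readline()
        while f.tell() < end:
            if not f.readline():
                break  # reached end of file
            count += 1
    return count


def parallel_line_count(path, n_chunks=4):
    """Map count_lines over the chunks, then merge by summing."""
    with Pool(n_chunks) as pool:
        return sum(pool.map(count_lines, chunk_ranges(path, n_chunks)))
```

On platforms where multiprocessing starts workers by spawning a new
interpreter, parallel_line_count() would need to be called from under
an `if __name__ == "__main__":` guard. Whether this actually beats a
single sequential read probably depends on whether the job is limited
by the disk or by the per-chunk processing.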

What I want to know is the best way to read a file concurrently. I
have read about the file object's seek() method and os.lseek(), but
I am not sure if that's the way to go. Any example use cases would
be of help.
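
For comparison, here is a minimal sketch of the two calls reading the
same byte range. The helper names read_with_seek and read_with_lseek
are hypothetical. The first goes through Python's buffered file
object; the second works on a raw OS-level file descriptor.

```python
import os


def read_with_seek(path, offset, size):
    """Read `size` bytes at `offset` via a buffered file object."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(size)


def read_with_lseek(path, offset, size):
    """Read up to `size` bytes at `offset` via a raw file descriptor."""
    # O_BINARY only exists on Windows; fall back to 0 elsewhere.
    flags = os.O_RDONLY | getattr(os, "O_BINARY", 0)
    fd = os.open(path, flags)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        # os.read() may return fewer bytes than requested.
        return os.read(fd, size)
    finally:
        os.close(fd)
```

Either way, each worker would open the file itself, so that no two
workers share a file position.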

PS: I did find some links on Stack Overflow, but it was not clear to
me whether I had found the right solution.


Thanks!
-Abhi

