Pickle based workflow - looking for advice
fabien.maussion at gmail.com
Mon Apr 13 19:35:28 CEST 2015
On 13.04.2015 19:08, Peter Otten wrote:
> How about a file-based workflow?
> Write distinct scripts, e. g.
> a2b.py that reads from *.a and writes to *.b
> and so on. Then use a plain old makefile to define the dependencies.
> Whether .a uses pickle, .b uses json, and .z uses csv is but an
> implementation detail that only its producers and consumers need to know.
> Testing an arbitrary step is as easy as invoking the respective script with
> some prefabricated input and checking the resulting output file(s).
I think I like the idea because it is more durable. The data I
manipulate comes in specific formats that are very efficient. With
the pickle I was kind of "lazy" and, well, saved a couple of read/write
Still, your idea is probably more elegant.
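For reference, a minimal sketch of what one such step script might look like. The names here (a2b.py, the a2b function, the pickle-in/JSON-out choice) are only illustrative assumptions taken from the example in the quoted message, not an existing tool:

```python
import json
import pickle
import sys
from pathlib import Path


def a2b(src: Path, dst: Path) -> None:
    """Read a pickled object from src and write it to dst as JSON."""
    with src.open("rb") as f:
        data = pickle.load(f)
    # Write under a temporary name first, then rename, so that a crash
    # midway never leaves a half-written target for make to pick up.
    tmp = dst.with_name(dst.name + ".tmp")
    with tmp.open("w") as f:
        json.dump(data, f)
    tmp.replace(dst)


# Hypothetical makefile rule that would drive this script:
#   %.b: %.a a2b.py
#           python a2b.py $< $@
if __name__ == "__main__" and len(sys.argv) == 3:
    a2b(Path(sys.argv[1]), Path(sys.argv[2]))
```

Testing the step then amounts to handing it a small prefabricated .a file and checking the resulting .b file.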
With multiprocessing, do I have to care about processes writing
simultaneously to *different* files? I guess the OS takes good care of
this stuff but I'm not an expert.
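To make the question concrete, here is a minimal sketch of the pattern I have in mind, assuming each worker writes only to its own output file. The function names, the item_N.p file naming and the placeholder computation are all made up for illustration. As far as I understand, separate processes writing to separate files do not interfere with each other; trouble would only start if two of them wrote to the same file:

```python
import multiprocessing as mp
import pickle
from pathlib import Path


def process_item(args):
    """Worker: do some work and write the result to its own file."""
    index, out_dir = args
    result = {"index": index, "square": index * index}  # placeholder work
    # The file name depends on the item index, so no two workers
    # ever open the same path.
    out_path = Path(out_dir) / f"item_{index}.p"
    with out_path.open("wb") as f:
        pickle.dump(result, f)
    return str(out_path)


def run_all(n_items, out_dir, n_procs=2):
    """Distribute the items over a pool of worker processes."""
    tasks = [(i, out_dir) for i in range(n_items)]
    with mp.Pool(n_procs) as pool:
        return pool.map(process_item, tasks)
```

run_all would be called from under an `if __name__ == "__main__":` guard, which multiprocessing needs on platforms that spawn rather than fork.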