[IPython-dev] Building an integrated measurement system with IPython

Mon Feb 17 13:27:10 EST 2014

Hi,

My name is Zahari Dimitrov, and I am a final-year physics student. My
final-year project consists of developing a system on top of IPython that
integrates the configuration of laboratory measurements with the analysis
of the data from those measurements (which could be done on remote
clusters). I would appreciate it very much if you could tell me what you
think of the overall design, and answer some (I'm afraid too many) specific
questions I have. Apologies for the long message.

Some information on this project can be found on zigzah.com (but it's not
really up to date...)

I have already done the part dedicated to talking with instruments (more or
less), and now I would like to do the following:

- I'd like to represent the computation process as a polytree (
http://en.wikipedia.org/wiki/Polytree ) where the nodes (which I call
IObjects) would be functions that run on some ipengine. These functions
have some parameters (inputs) and return a dict of outputs. The outputs of
one node can be connected to the inputs of the next, and to execute a child
node, all the parents must be executed and have some results.

-These structures are saved in a MongoDB database (it seems really easy to
use so far) and managed with the mongoengine ORM.

-I'd like to be able to turn the inputs of the graph that are not
connected to an output (and have no constant value set) into an HTML form
(the inputs declare types like str, int or range, which are mapped to
different JS input widgets where possible). The free outputs are turned
into display widgets as well.

-When you type ipython notebook --iograph=mygraph, the form appears at the
top of your IPython notebook, and you can set the parameters and execute
different parts of the graph. The results are accessible as variables in
the notebook.

-Things like the instrument commands (e.g., asking an oscilloscope for its
frequency) are just classes that extend the IObject class and run on a
dedicated engine with id __instruments on the computer where the instrument
is connected (which could be different for different instruments).

-Other more complicated classes like RangedExperiment (measure something
over a given range, which is an input) are also IObjects. These classes
would be able to report the measurements as they are produced and fill in
a plot sequentially with them.

-Other IObjects can take the results from the experiment and process them
in a remote cluster.

-For a given IObject, all results (outputs) it produces are logged by
default in the mongo database. You can always set the value of the outputs
to a previous result and execute its children with this input.

-There will be a decorator that converts any function into an IObject
(possibly using Python 3 annotations for that).
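
As a rough illustration of the design described in the list above, here is
a minimal in-process sketch. Only the name IObject comes from the
description; the decorator name iobject, the methods connect, free_inputs
and execute, and the two example functions are hypothetical. It does not
touch ipengines, MongoDB or the notebook; it only shows nodes that take
inputs, return a dict of outputs, and refuse to run before their parents.

```python
import inspect


class IObject:
    """Hypothetical node wrapping a function that returns a dict of outputs."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__
        # Input names and declared types are read from the annotations.
        params = inspect.signature(func).parameters
        self.inputs = {n: p.annotation for n, p in params.items()}
        self.links = {}       # input name -> (parent IObject, output name)
        self.constants = {}   # input name -> fixed value
        self.results = None   # dict of outputs once the node has executed

    def connect(self, input_name, parent, output_name):
        """Feed one output of a parent node into one input of this node."""
        self.links[input_name] = (parent, output_name)

    def free_inputs(self):
        """Inputs that are neither connected nor constant (the form fields)."""
        return [n for n in self.inputs
                if n not in self.links and n not in self.constants]

    def execute(self, **free_values):
        kwargs = dict(self.constants)
        kwargs.update(free_values)
        for name, (parent, output_name) in self.links.items():
            # A child can only run once every parent has produced results.
            if parent.results is None:
                raise RuntimeError("parent %r has no results" % parent.name)
            kwargs[name] = parent.results[output_name]
        self.results = self.func(**kwargs)
        return self.results


def iobject(func):
    """Hypothetical decorator turning a plain function into an IObject."""
    return IObject(func)


@iobject
def measure(frequency: float, points: int):
    return {"samples": [frequency * i for i in range(points)]}


@iobject
def analyse(samples: list, scale: float):
    return {"mean": scale * sum(samples) / len(samples)}


analyse.connect("samples", measure, "samples")
measure.execute(frequency=2.0, points=3)
print(analyse.free_inputs())        # ['scale']
print(analyse.execute(scale=1.0))   # {'mean': 2.0}
```

In this sketch free_inputs() is what a form generator would iterate over,
and the annotations stored in inputs are what it would map to widgets.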

As said, I'd be very happy if I could complete this. But here are the
questions:

-At what level do I hack the IPython notebook to add the forms? I haven't
found an extension mechanism capable of doing what I need, so I think I
would have to fork the project. Is there any place that explains the
conventions used for the templates and the JavaScript code in the notebook?

-I believe the IPython controller has some logging system. Could it be
integrated with the one I want for the IObjects?

-I am thinking about using MongoDB to deploy a shared filesystem on the
computers where IPython is running, so that it is possible to load data
from anywhere. Is this a good solution, particularly for saving things like
text-pickled Python objects?

-How do I avoid moving around large amounts of data when the output and
input IObjects are on the same computer, which is not the one running the
controller?

-What is the best way to obtain partial results from the engines, like the
ones needed for the RangedExperiment example (i.e., display each
measurement as soon as it is produced and then take the next one)?

-What is currently the best way to display interactive plots in IPython?
I am thinking again about the RangedExperiment.

-I understand each ipengine will see the hardware of the computer it runs
on. Is this right?

-Is there any problem with a node itself running a parallel algorithm on
other engines?
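
To make the partial-results question concrete, the behaviour wanted from
RangedExperiment can be sketched in-process with a plain Python generator.
The names ranged_experiment and fake_instrument are hypothetical, and the
sketch deliberately leaves open how an engine would ship each reading back
to the notebook, which is exactly what the question asks.

```python
def ranged_experiment(measure, values):
    """Yield (value, reading) pairs one at a time, as they are measured."""
    for value in values:
        yield value, measure(value)


def fake_instrument(frequency):
    # Stand-in for a real instrument query; returns a made-up reading.
    return frequency ** 2


readings = []
for value, reading in ranged_experiment(fake_instrument, range(4)):
    # A live plot would be updated here before the next measurement starts.
    readings.append((value, reading))

print(readings)   # [(0, 0), (1, 1), (2, 4), (3, 9)]
```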

Thank you again for your patience,

Best regards,

Zahari.