[IPython-dev] Some new cell types for describing data analyses in IPy. Notebook

Mon Jul 1 14:20:42 EDT 2013

Hey all,

As part of my research into capturing the data analysis process in
documents, I have been working on some extensions to the IPython Notebook
as a way of implementing proofs of concept for certain ideas my advisors
and I have had. I think the subscribers of this list might find them
interesting and would love to hear what you guys think.

I have posted a screencast showcasing and explaining the work here:
https://www.youtube.com/watch?v=iQPagwhad_8 and will "briefly" describe it
in text below.

I've implemented three fundamentally different new cell types in a fork of
the IPython codebase: interactive code cells, task cells, and alternatives
set cells. To be clear, my goal is absolutely not to replace IPython
Notebook. I am simply leveraging the IPython team's excellent core
application to explore some new ideas about representing data analyses in
documents. Descriptions of the cell types follow.

*interactivecode cells:* Interactive code cells are code cells that carry
additional information, which lets them render a UI control bound to one
(or more) values within the code. When the control is used to change a
value, the code is re-executed. Example: controlling the bandwidth of a
kernel regression estimator via a slider.
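
To make the idea concrete, here is a minimal sketch in plain Python. It is
not the code in the fork: the class, its fields, and the toy "bandwidth"
window are all hypothetical, and the slider is stood in for by a method
call. It only shows the pattern of binding a control value to a name in the
code and re-executing when that value changes.

```python
# Hypothetical sketch: none of these names come from the fork.
from dataclasses import dataclass, field


@dataclass
class InteractiveCodeCell:
    source: str        # code to run; it reads the controlled variable by name
    variable: str      # name of the value the UI control manipulates
    value: float       # current value of the control
    namespace: dict = field(default_factory=dict)

    def execute(self):
        # Inject the control's current value, then run the cell's code.
        self.namespace[self.variable] = self.value
        exec(self.source, self.namespace)
        return self.namespace

    def on_control_change(self, new_value):
        # Stand-in for a slider callback: store the new value and re-execute.
        self.value = new_value
        return self.execute()


# Toy stand-in for a bandwidth: keep the points within `bandwidth` of 5.0.
cell = InteractiveCodeCell(
    source="window = [x for x in data if abs(x - 5.0) <= bandwidth]",
    variable="bandwidth",
    value=1.0,
    namespace={"data": [3.0, 4.5, 5.0, 6.2, 8.0]},
)
cell.execute()
narrow = cell.namespace["window"]

cell.on_control_change(2.0)   # "moving the slider"
wide = cell.namespace["window"]
```

Here the wider setting pulls more points into the window without the cell's
source being edited by hand.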

*task cells*: Task cells are cells that can contain other cells (including
nested task cells or altset cells). They are used to group conceptually
linked content, and executing a task cell executes all the cells it
contains with a single command. They are primarily for organization.
Example: the data cleaning task during a data analysis would likely
contain multiple code and exposition blocks which fit conceptually within a
single goal.
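
A minimal sketch of the container idea, again in plain Python with
hypothetical names (this is not how the fork represents cells, and running
the children in document order is an assumption): a task cell holds a list
of child cells, and running it recurses through them.

```python
# Hypothetical sketch: class and function names are invented for illustration.
from dataclasses import dataclass, field


@dataclass
class CodeCell:
    source: str


@dataclass
class MarkdownCell:
    text: str


@dataclass
class TaskCell:
    title: str
    children: list = field(default_factory=list)


def run(cell, namespace):
    """Execute a cell; a task cell executes every cell it contains."""
    if isinstance(cell, CodeCell):
        exec(cell.source, namespace)
    elif isinstance(cell, TaskCell):
        for child in cell.children:   # assumed: document order
            run(child, namespace)
    # Exposition (markdown) cells have nothing to execute.
    return namespace


cleaning = TaskCell("data cleaning", [
    MarkdownCell("Drop missing values, then rescale."),
    CodeCell("raw = [1.0, None, 3.0]"),
    TaskCell("rescale", [             # nested task cell
        CodeCell("clean = [x for x in raw if x is not None]"),
        CodeCell("scaled = [x / max(clean) for x in clean]"),
    ]),
])

ns = run(cleaning, {})   # one call runs the whole task
```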

*altset cells:* Alternatives set (altset) cells represent a point in an
analysis where multiple approaches were tried before the analyst decided on
a final strategy. An altset contains two or more branches representing
these different approaches, only one of which can be active at a time. This
allows an analyst to capture the entire research process in their IPython
notebook in a structurally meaningful way, rather than just the final
approach.
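
As a rough sketch of the structure (hypothetical names throughout, not the
fork's implementation): an altset keeps every branch in the document, but
only the active branch is executed.

```python
# Hypothetical sketch: AltSetCell and its methods are invented for illustration.
from dataclasses import dataclass


@dataclass
class AltSetCell:
    branches: dict   # branch name -> list of code strings
    active: str      # name of the single active branch

    def __post_init__(self):
        if len(self.branches) < 2:
            raise ValueError("an altset needs two or more branches")
        if self.active not in self.branches:
            raise KeyError(self.active)

    def activate(self, name):
        # Switch branches; the others stay in the document, just inactive.
        if name not in self.branches:
            raise KeyError(name)
        self.active = name

    def execute(self, namespace):
        # Only the active branch's code runs.
        for source in self.branches[self.active]:
            exec(source, namespace)
        return namespace


center = AltSetCell(
    branches={
        "mean": ["center = sum(data) / len(data)"],
        "median": ["s = sorted(data)", "center = (s[1] + s[2]) / 2"],
    },
    active="mean",
)

data = [1.0, 2.0, 2.5, 40.0]
by_mean = center.execute({"data": data})["center"]

center.activate("median")   # the analyst settles on the other approach
by_median = center.execute({"data": data})["center"]
```

Both approaches remain recorded in the altset; switching the active branch
changes which one later cells would see.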

Finally, when the structure of a document actually contains information
about the research process, there are a bunch of really cool things we can
do when querying, processing, executing, and rendering the document which
are difficult or impossible without this extra information. This email has
already gotten quite long, however, so I will leave discussion of those to
another time.

I'd love to hear what people think of these concepts, so please share your
thoughts.

Thanks for reading and thanks to the IPython core team for their great work.
~G

Gabriel Becker
Graduate Student
Statistics Department
University of California, Davis