[IPython-dev] [FWD] An interesting take on the Notebook Problem
Toni Alatalo
antont at an.org
Fri Jul 22 10:05:18 EDT 2005
Fernando Perez wrote:
> [Chris, note that ipython-dev discards non-subscriber posts (too much
> spam). I've manually
thanks for posting this - was quite a good read, and something that
makes the notebook project even more interesting
>> takes, although not precisely focused on the notion of an interactive
>> notebook, might still be an interesting read to anyone involved in
>> (..)
>> Robert Gentleman (2005) "Reproducible Research: A Bioinformatics Case
>> Study", Statistical Applications in Genetics and Molecular Biology:
>> Vol. 4: No. 1, Article 2.
>> http://www.bepress.com/sagmb/vol4/iss1/art2
>
in fact, the question i was left pondering, regarding the work on python
notebooks and the discussions about literate programming vs. the 'REPL'
(Read-Eval-Print-Loop, i.e. ipython) modes of working, was how the
interactive use fits the picture there.
in that paper, they define the so-called Compendiums as 'navigable
documents', referring to the reader being able to 'explore and
reproduce' the content. is such exploration of e.g. a dataset or a model
a case of 'REPL', when e.g. changing variables or perhaps defining
functions to explore the data/model in different cases? i would think so.
that way such compendiums and ipython would be a perfect match,
something that literate programming (which is of course one of the
underlying basic ideas, as Gentleman also mentions) perhaps does not
facilitate in itself? (although checking out Leo etc. is still on my
to-do list so can't really tell).
installed R the other day to get a feel of that environment, seemed like
a rich world and the interactive use was pretty nice (though i still
think it is & will be better with Python), but unfortunately could not
get the GolubRR package to work (yet). seems to be some version issue:
perhaps that GolubRR is packaged for R < 2.0, while the current (in e.g.
Debian) is > 2.0. a question about it had been posted to the
Bioconductor mailing list, but i could not find any answer (nor have i
gotten one from the poster of that mail) -
https://stat.ethz.ch/pipermail/bioconductor/2004-November/006906.html .
Besides the SciPy tutorial, perhaps that seminal Golub paper and the R
compendium should also be among the 'test cases' for Python notebooks
(hopefully the dataset is not too tricky to convert to whatever the
format in our system would be).
oh and one technical issue relating to reproducibility (although that is
probably not among the top questions for us yet): as Gentleman notes,
some research methods use randomness, and to be able to reproduce the
exact same computations, a heavy way is to include the random table
used in the compendium. quick Googling now didn't tell me if the Python
random generator is guaranteed to give the same results with the same
seed and in what conditions (i know from working on procedural modelling
that it at least works on the same computer..), but that can be looked
at later.
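
a minimal sketch of the seeding question, assuming only the standard
library random module: two generators created from the same seed give
the same sequence within one interpreter run. it does not show anything
about reproducibility across Python versions or platforms, which would
still need checking. the helper name 'draw' and the seed values are
made up for illustration.

```python
import random


def draw(seed, n=5):
    """Return n pseudo-random floats from a generator seeded with `seed`."""
    # A private Random instance avoids touching the module-level global state.
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


first = draw(12345)   # hypothetical seed, would be recorded in the compendium
second = draw(12345)  # same seed again
other = draw(54321)   # a different seed for comparison

print(first == second)  # same seed, same sequence in this run
print(first == other)
```

storing just the seed in the notebook would be much lighter than storing
the whole random table, if the guarantee turns out to hold where we need it.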
ah and one simple issue: how in our format do we include values of
variables in a text? Gentleman shows on p. 11 how a result from a
computation is included in a sentence using the Sexpr command in their
system. in Python it would of course be, supposing that 'genes' would be
the list of selected genes, simply:
".. filtering process selected %d genes" % len(genes)
OR
"selected " + str(len(genes)) + " genes"
OR
"selected", len(genes), "genes" -- if we somehow support the form that
the print statement handles
, but how in the XML? (i guess it's trivial but just don't know yet).
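
one way the inline case could work, as a rough sketch only: the running
text carries placeholders and they are replaced with evaluated Python
expressions when the document is rendered. the {{ ... }} syntax, the
'render' function and the sample gene names are all assumptions for
illustration, not anything from the notebook XML format or from
Gentleman's system.

```python
import re

# Hypothetical placeholder syntax: {{ expression }} inside running text.
PLACEHOLDER = re.compile(r"\{\{(.+?)\}\}")


def render(text, namespace):
    """Replace each {{ expr }} with str() of expr evaluated in `namespace`."""
    def substitute(match):
        # eval is only reasonable here for trusted notebook content.
        return str(eval(match.group(1).strip(), {}, namespace))
    return PLACEHOLDER.sub(substitute, text)


genes = ["g1", "g2", "g3"]  # made-up stand-in for the list of selected genes
sentence = render(
    ".. filtering process selected {{ len(genes) }} genes",
    {"genes": genes},
)
print(sentence)
```

in the XML the placeholder could just as well be an element of its own;
the point is only that the text holds the expression and the value is
computed at render time.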
there may be other issues in the paper that would be important to note,
but those are what i was left thinking.
~Toni