[IPython-dev] Distributing evaluated notebooks that require big-ish data

Antonino Ingargiola tritemio at gmail.com
Fri May 16 15:24:54 EDT 2014


Hi,

I'm working on software for single-molecule FRET analysis (FRETBursts) [1]
that relies heavily on the IPython notebook to run the analysis.

I provide some evaluated notebooks serving as tutorials in a separate
repository (FRETBursts_notebooks) [2]. The reference documentation (mostly
installation instructions and the API) is hosted on ReadTheDocs [3].

I have concerns about the viability of this approach, since the notebooks
repository can easily grow to hundreds of MB given the large number of
images. Maintaining a second repository is also an additional burden.

Ideally I would like to generate the evaluated notebooks dynamically from
unevaluated notebooks in the original source repository. However, the data
necessary to reproduce the analysis is ~150 MB (hosted on figshare [4]), and
I may add more datasets in the future.
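One way to keep the data out of the repositories is to have each notebook fetch its dataset on first run and cache it locally. A minimal sketch, using only the standard library; the URL and cache directory here are placeholders, not FRETBursts' actual layout:

```python
import os
import urllib.request

def download_file(url, cache_dir="data"):
    """Return the local path for `url`, downloading it only if missing.

    Caching means re-running a notebook does not re-fetch ~150 MB
    every time; only the first execution pays the download cost.
    """
    os.makedirs(cache_dir, exist_ok=True)
    # Derive the local filename from the last component of the URL.
    fname = url.rstrip("/").rsplit("/", 1)[-1]
    path = os.path.join(cache_dir, fname)
    if not os.path.exists(path):
        urllib.request.urlretrieve(url, path)
    return path
```

A notebook's first cell would then call something like `download_file("http://files.figshare.com/.../photon_data.hdf5")` (hypothetical URL) and pass the returned path to the analysis functions.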

So I'm asking for suggestions.

I like the simple concept of a notebook repository with links to nbviewer,
but it seems that this solution does not scale.

I don't think RTD can handle downloading the data and running an analysis
that takes several minutes to execute on a modern desktop.

So, what's left? Does anybody have a similar issue?


Best,
Antonio

[1] FRETBursts: https://github.com/tritemio/FRETBursts
[2] FRETBursts_notebooks: https://github.com/tritemio/FRETBursts_notebooks
[3] FRETBursts documentation: http://fretbursts.readthedocs.org/index.html
[4] FRETBursts datasets: http://dx.doi.org/10.6084/m9.figshare.1019906

