[IPython-dev] Distributing evaluated notebooks that require big-ish data

Antonino Ingargiola tritemio at gmail.com
Wed May 21 17:54:33 EDT 2014


Hi Nathan,

thanks for the reply. Your approach is very compelling indeed. I'll try it
whenever I switch to my own server to build the docs.

For now I'll stick to ReadTheDocs and manually run the notebooks for
testing/updating. I'll try not to update the images too frequently, so
that the repository does not explode.
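
A manual run like this could be scripted as a crude smoke test. The sketch
below is only an illustration, not part of FRETBursts or RunNotebook: the
function name run_notebook_cells is hypothetical, it assumes the nbformat 4
JSON layout (a top-level "cells" list; older v3 files with "worksheets"
would need converting first), and it assumes plain Python cells without
IPython magics, since it uses exec() rather than a real kernel.

```python
import json
import traceback
from pathlib import Path


def run_notebook_cells(path):
    """Execute the code cells of an .ipynb file in order.

    Returns a list of (cell_index, traceback_text) tuples; an empty list
    means every code cell ran without raising.

    Assumptions (not from the original thread): nbformat 4 layout and
    plain Python source, i.e. no %magics or !shell lines.
    """
    nb = json.loads(Path(path).read_text(encoding="utf-8"))
    namespace = {"__name__": "__main__"}  # shared by all cells, like a kernel
    failures = []
    for index, cell in enumerate(nb.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        if isinstance(source, list):  # source may be stored as a list of lines
            source = "".join(source)
        try:
            code = compile(source, f"{path}[cell {index}]", "exec")
            exec(code, namespace)
        except Exception:
            failures.append((index, traceback.format_exc()))
            break  # later cells usually depend on earlier ones
    return failures
```

Calling run_notebook_cells("tutorial.ipynb") (the file name is a placeholder)
and checking that the result is empty gives a quick pass/fail signal. It does
not store outputs or images back into the notebook, so it only covers the
testing half, not the updating half.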

Best,
Antonio



On Mon, May 19, 2014 at 12:16 AM, Nathan Goldbaum <nathan12343 at gmail.com> wrote:

> For the yt documentation [1] we do this using a Jenkins buildbot and a
> Sphinx extension that I wrote called RunNotebook [2].  To follow this route
> you'll need to host your own docs builds on your project's website and also
> have a server that can dynamically generate the docs builds.  If you don't
> need to build the docs with each commit to your codebase, you could also
> generate the docs manually as part of your release process.
>
> A nice bonus to incorporating notebooks into your docs in this fashion is
> that the notebooks serve as a form of testing: broken code in the docs
> leads to broken results embedded in the docs build.
>
>  [1] http://yt-project.org/doc, see e.g. the yt bootcamp at
> http://yt-project.org/doc/bootcamp/index.html
>  [2] https://github.com/ngoldbaum/runnotebook
>
>
> On Fri, May 16, 2014 at 12:24 PM, Antonino Ingargiola <tritemio at gmail.com> wrote:
>
>> Hi,
>>
>> I'm working on software for single-molecule FRET analysis
>> (FRETBursts) [1] that relies heavily on the IPython notebook to run the analysis.
>>
>> I provide some evaluated notebooks serving as tutorials in a separate
>> repository (FRETBursts_notebooks) [2]. The reference documentation (mostly
>> installation instructions and API) is hosted on ReadTheDocs [3].
>>
>> I have concerns about the viability of this approach, since the notebooks
>> repository can easily grow to hundreds of MB given the high number of
>> images. Maintaining a second repository is also an additional burden.
>>
>> Ideally I would like to generate the evaluated notebooks dynamically from
>> unevaluated notebooks in the original source repository. However, the data
>> necessary to reproduce the analysis is ~150MB (hosted on figshare [4]),
>> and I may add more datasets in the future.
>>
>> So I'm asking for suggestions.
>>
>> I like the simple concept of a notebook repository with links to
>> nbviewer, but it seems that this solution does not scale.
>>
>> I don't think RTD can handle downloading the data and running processing
>> that takes several minutes on a modern desktop.
>>
>> So, what's left? Does anybody have a similar issue?
>>
>>
>> Best,
>> Antonio
>>
>> [1] FRETBursts: https://github.com/tritemio/FRETBursts
>> [2] FRETBursts_notebooks:
>> https://github.com/tritemio/FRETBursts_notebooks
>> [3] FRETBursts documentation:
>> http://fretbursts.readthedocs.org/index.html
>> [4] FRETBursts datasets: http://dx.doi.org/10.6084/m9.figshare.1019906
>>
>>
>> _______________________________________________
>> IPython-dev mailing list
>> IPython-dev at scipy.org
>> http://mail.scipy.org/mailman/listinfo/ipython-dev
>>
>>
>