[IPython-dev] tracking nbconvert-generated images in version control
gvwilson at third-bit.com
Fri Apr 18 15:00:35 EDT 2014
The answer to this question may well be, "You shouldn't be trying to do
that," but here goes anyway:
1. The websites for Software Carpentry bootcamps are hosted on GitHub,
which generates them automatically by running a tool called Jekyll
whenever content is committed to a repository's gh-pages branch.
2. Jekyll knows how to convert Markdown and HTML, but doesn't
understand IPython Notebooks, so if people have notebooks in their
bootcamp repository, they have to run nbconvert on their own machine and
add the generated .md file to the repository. (Yes, we could do
something clever with post-commit hooks and continuous integration
systems, but this seems simpler for our users.)
3. When nbconvert runs, it creates image files on disk for the plots and
other code-generated visuals in the notebook. These image files have
auto-generated names like 01-numpy_76_0.png, and the Markdown/HTML
generated by nbconvert links to them.
4. We can easily add those images to the version control repository as
well - but if we move cells around in the notebook, nbconvert will give
them different names the next time it runs. We can add *those* images
to version control too, but what do we do about cleaning out the old
ones? One suggestion is to 'git rm' all the generated images before
re-running nbconvert and trust git to detect the new images and infer
that we meant to 'git mv', but that feels dangerous.
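
To make the cleanup problem in point 4 concrete, here is a minimal
sketch of one way to find the stale images without wildcards: list only
the images that the freshly generated Markdown no longer links to. It
assumes the links use plain Markdown image syntax and that the images
sit together in one directory. The function names and the `image_dir`
argument are hypothetical, not part of nbconvert or git.

```python
import re
from pathlib import Path

# Assumption: images are linked with Markdown syntax, e.g.
# ![png](01-numpy_files/01-numpy_76_0.png). HTML <img> tags are not handled.
IMAGE_LINK = re.compile(r"!\[[^\]]*\]\(([^)\s]+)\)")


def referenced_images(markdown_text):
    """Return the set of image file names linked from the Markdown text."""
    return {Path(target).name for target in IMAGE_LINK.findall(markdown_text)}


def stale_images(markdown_text, image_dir, suffix=".png"):
    """Return sorted paths in image_dir that the Markdown no longer links to."""
    keep = referenced_images(markdown_text)
    candidates = Path(image_dir).glob("*" + suffix)
    return sorted(p for p in candidates if p.name not in keep)
```

The resulting paths could be reviewed by hand and then passed to
'git rm' one by one, so nothing is removed that the current Markdown
still uses.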
Is there a cleaner solution? One that we can explain and justify to
people who are relatively new to both the notebook and version control,
and is unlikely to go horribly, horribly wrong (which 'git rm' with
wildcards well could)?