[IPython-dev] Some Thoughts on Notebook Security

Mon Dec 10 20:48:44 EST 2012

The IPython Notebook's vulnerability to cross-site scripting (XSS) and
cross-site request forgery (XSRF) is a serious problem that
provides baddies with a range of attack vectors, each with almost
unlimited potential for harm to the user. Attempting to find a single
solution to so many problems is overwhelming and almost bound to fail.
Therefore, we will likely benefit from breaking the problem down,
insofar as we're able.

The most obvious place to start is to distinguish between static views
of a notebook and a notebook that has a running kernel, which I'll
call static notebooks and kernelled notebooks for lack of a
convention.

A static notebook is just a webpage, so it should ideally behave like
one. Any webpage can execute arbitrary JavaScript, so the fact that
static notebooks have this ability is not a concern in itself.

Serving all static notebooks from a separate domain should prevent XSS
and XSRF attacks because of the Same Origin Policy.
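
To make the "separate domain" idea concrete, here is a minimal Python
sketch of the comparison the Same Origin Policy is built on: two URLs
share an origin only if scheme, host and port all match. The domain
names are hypothetical placeholders, and the sketch only illustrates
the comparison itself, not everything a browser enforces.

```python
from urllib.parse import urlsplit

# Default ports browsers assume when a URL does not state one.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that defines an origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(url_a, url_b):
    """True if both URLs belong to the same origin."""
    return origin(url_a) == origin(url_b)


# Hypothetical domains, assumed purely for illustration.
APP = "https://notebooks.example.com/tree"
STATIC = "https://static-notebooks.example.net/view/abc"

print(same_origin(APP, "https://notebooks.example.com:443/api"))  # True
print(same_origin(APP, STATIC))  # False
```

Under this rule, a script running in a page from the static domain is
on a different origin from the main application.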

Static notebooks, served from a different domain, could be rendered
inside iframes, enabling us to embed them inside other webpages and
applications. These notebooks would still be superficially served by
our own servers, so the UX wouldn't be affected.
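
As a rough sketch of the embedding idea, the main site could emit an
iframe whose source points at the static domain. The function name,
the URL and the default dimensions below are assumptions made for
illustration only.

```python
import html


def iframe_for(static_url, width=800, height=600):
    """Build an <iframe> tag that embeds a static notebook.

    The URL is escaped so that it cannot break out of the src attribute.
    """
    safe_url = html.escape(static_url, quote=True)
    return '<iframe src="{}" width="{}" height="{}"></iframe>'.format(
        safe_url, width, height
    )


# Hypothetical static-notebook URL on the separate domain.
print(iframe_for("https://static-notebooks.example.net/view/abc?x=1&y=2"))
```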

Using randomised URLs, or some other scheme that does not validate
requests by stored credentials, may allow us to serve notebooks from
the separate domain while keeping access to the notebooks private.
This may be useful when users wish to share a static notebook
selectively.
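
One way the randomised-URL idea might look, as a sketch only: generate
an unguessable token with Python's standard `secrets` module and use
it as the path of the shared notebook. The base URL, the function
names and the in-memory table are hypothetical; a real service would
need persistent storage and a way to revoke links.

```python
import secrets

# Hypothetical base URL on the separate static-notebook domain.
BASE_URL = "https://static-notebooks.example.net/view/"

# Maps each random token to the notebook it unlocks (in memory only).
shared = {}


def share(notebook_path, nbytes=32):
    """Register a notebook and return an unguessable URL for it."""
    token = secrets.token_urlsafe(nbytes)
    shared[token] = notebook_path
    return BASE_URL + token


def resolve(url):
    """Return the notebook behind a shared URL, or None if unknown."""
    if not url.startswith(BASE_URL):
        return None
    return shared.get(url[len(BASE_URL):])


url = share("carl/analysis.ipynb")
print(resolve(url))  # carl/analysis.ipynb
```

Access here depends only on knowing the URL, not on cookies or other
stored credentials, which is the property the paragraph above relies
on.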

Other approaches all seem to rest on attempts to cripple JavaScript
execution, either by rendering the JavaScript source as text or by
removing it altogether. This seems like a bad idea as many static
notebooks, particularly in the long run, will need to be able to use
JavaScript to work properly. Will we refuse to render a user's graph
or widget in a static notebook because it uses JavaScript? This is a
never-ending spiral.

I think we need to build on browser security, and therefore trust it,
rather than build a heap of nasty hacks, which may be circumvented
anyway.

Just taking it back to basics for a moment: If Mallory signs up for a
dating site, she might decide to put some dodgy JS inside some script
tags, and submit that as part of her profile. If the site doesn't
sanitise that HTML, we all know how it plays out; any poor chap that
thinks she's a hotty gets pwned. Sanitising her input is textbook.
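
For reference, the textbook fix in the dating-site case can be
sketched with Python's standard `html.escape`, which turns markup into
inert text. The function name and the profile string are made up for
illustration.

```python
import html


def render_profile(text):
    """Escape user-supplied text so the browser shows it instead of running it."""
    return html.escape(text)


profile = "Hi! <script>steal(document.cookie)</script>"
print(render_profile(profile))
# Hi! &lt;script&gt;steal(document.cookie)&lt;/script&gt;
```

This is the "render the JavaScript source as text" approach: it is
fine for a profile field, where no script is ever expected to run.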

Static notebooks are a totally different scenario. Users will want to
include JS and have it work when publishing a static notebook. This is
a feature of the Notebook. Furthermore, there are a number of ways to
get JS into a static notebook, some more subtle than others, while a
dating site is only dealing with text from an input field.

I don't have much to say about kernelled notebooks right now, but on
statics, I'd like to ask: If we serve a static notebook from a domain
which has zero features beyond hosting static notebooks, would those
notebooks be able to do anything to a user that a regular webpage
couldn't? I'm not a security guy, so I really don't know, but it seems
that we'd be totally free to let users publish whatever they liked,
with no more restrictions than browsers impose already.

I just don't want to see hours of work ploughed into crippling a super
powerful feature, if we could just use an extra domain with a simple
API instead.

Carl