[IPython-dev] Some Thoughts on Notebook Security

Mon Dec 10 23:12:02 EST 2012

Carl,

I appreciate your thinking about this.  It is really important.  But I
think it *may* be simple to fix:

We have two issues:

* In markdown cells users can put arbitrary HTML and JS.

To fix this, we need to enable the HTML sanitizer that comes with the
JS Markdown renderer that we are using.  This is what StackOverflow
uses to sanitize their markdown and should completely remove any
security risks coming from within markdown cells.

* In CodeCell output, the Javascript repr is dynamically passed
into eval.  This only happens when code is run, not when the notebook
is loaded, so it is less critical, but still needs to be fixed.

To fix this, we need to disable the Javascript representation of
objects altogether.
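
To make the first fix concrete, here is a rough sketch of what
allowlist-based sanitizing of markdown cell HTML does.  This is NOT the
sanitizer that ships with the JS Markdown renderer; it is a Python
illustration built on the standard library, and the names ALLOWED_TAGS,
DROP_CONTENT, AllowlistSanitizer and sanitize are made up for this
example.  A real deployment should rely on a vetted sanitizer.

```python
from html import escape
from html.parser import HTMLParser

# Assumption: a deliberately small allowlist, chosen only for illustration.
ALLOWED_TAGS = {"b", "i", "em", "strong", "code", "pre", "p", "ul", "ol", "li"}
# Elements whose text content is dropped entirely, not just the tags.
DROP_CONTENT = {"script", "style"}


class AllowlistSanitizer(HTMLParser):
    """Keep allowlisted tags (without attributes), escape text, drop the rest."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
        elif tag in ALLOWED_TAGS and not self.skip_depth:
            # Attributes are discarded, so onclick=, style=, href= never survive.
            self.out.append("<%s>" % tag)

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in ALLOWED_TAGS and not self.skip_depth:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data))


def sanitize(html_text):
    parser = AllowlistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


cleaned = sanitize('<b onclick="evil()">hi</b><script>alert(1)</script>')
```

The point of the allowlist design is that anything not explicitly
permitted (script tags, event-handler attributes, unknown elements) is
removed by default, instead of trying to enumerate everything dangerous.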

Will these two things not completely fix the security problems we
currently have?

Now the question is how to enable all of the nice things you can do
with Javascript.  I think the answer is Javascript plugins, JSON reprs
and JSON handlers:

https://github.com/ipython/ipython/pull/2518

The idea is that the extra Javascript cool-stuff will be installed by
the person who runs the notebook server once and for all notebooks on
that server.  Similar to how python packages are installed: you do
this before you start python.  To get data from python to the
Javascript plugins we will use JSON objects and trigger the callbacks
to handle them.
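
A minimal sketch of that data flow, under stated assumptions: the real
handlers would be Javascript plugins in the browser, and this does not
reproduce the API proposed in the pull request.  The names
HandlerRegistry, json_repr and "bar_chart" are hypothetical.  The sketch
only models the idea that the kernel sends plain JSON data, and that
code installed ahead of time decides what to do with it.

```python
import json


class HandlerRegistry:
    """Hypothetical stand-in for plugins installed once on the server."""

    def __init__(self):
        self._handlers = {}

    def register(self, name, callback):
        # Done by the person running the server, before any notebook loads.
        self._handlers[name] = callback

    def dispatch(self, message):
        # The message is parsed as data; nothing in it is ever eval'd.
        payload = json.loads(message)
        callback = self._handlers.get(payload.get("handler"))
        if callback is None:
            return None  # no installed handler: the payload is ignored
        return callback(payload.get("data"))


def json_repr(handler, data):
    """What an object's JSON repr might emit: a handler name plus data."""
    return json.dumps({"handler": handler, "data": data})


registry = HandlerRegistry()
registry.register(
    "bar_chart",
    lambda data: "bar chart with %d bars" % len(data["values"]),
)

message = json_repr("bar_chart", {"values": [3, 1, 4]})
result = registry.dispatch(message)
```

Because a notebook can only name a handler and supply data, a malicious
notebook cannot introduce new code this way; the worst it can do is feed
unexpected data to handlers the server owner already chose to install.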

Unless I am misunderstanding the nature of the security risks, I think
this is what we should do.

Cheers,

Brian

On Mon, Dec 10, 2012 at 5:48 PM, Carl Smith <carl.input at gmail.com> wrote:
> The IPython Notebook's vulnerability to cross-site scripting and
> cross-site request forgery, XSS and XSRF, is a serious problem that
> provides baddies with a range of attack vectors, each with almost
> unlimited potential for harm to the user. Attempting to find a single
> solution to so many problems is overwhelming and almost bound to fail.
> Therefore, we will likely benefit from breaking the problem down,
> insofar as we're able.
>
> The most obvious place to start is to distinguish between static views
> of a notebook and a notebook that has a running kernel, which I'll
> call static notebooks and kernelled notebooks for lack of a
> convention.
>
> A static notebook is just a webpage, so it should ideally behave like
> one. Any webpage can execute arbitrary JavaScript, so the fact that
> static notebooks have this ability is not a concern in itself.
>
> Serving all static notebooks from a separate domain should prevent XSS
> and XSRF attacks because of the Same Origin Policy.
>
> Static notebooks, served from a different domain, could be rendered
> inside iframes, enabling us to embed them inside other webpages and
> applications. These notebooks would still be superficially served by
> our own servers, so the UX wouldn't be affected.
>
> Using randomised URLs, or some other scheme that does not validate
> requests by stored credentials, may allow us to serve notebooks from
> the separate domain while keeping access to the notebooks private.
> This may be useful when users wish to share a static notebook
> selectively.
>
> Other approaches all seem to rest on attempts to cripple JavaScript
> execution, by either rendering the JavaScript source as text or
> removing it altogether. This seems like a bad idea as many static
> notebooks, particularly in the long run, will need to be able to use
> JavaScript to work properly. Will we refuse to render a user's graph
> or widget in a static notebook because it uses JavaScript? This is a
> never-ending spiral.
>
> I think we need to build on browser security, and therefore trust it,
> rather than build a heap of nasty hacks, which may be circumvented
> anyway.
>
> Just taking it back to basics for a moment: If Malory signs up for a
> dating site, she might decide to put some dodgy JS inside some script
> tags, and submit that as part of her profile. If the site doesn't
> sanitise that HTML, we all know how it plays out; any poor chap that
> thinks she's a hotty gets pwned. Sanitising her input is textbook.
>
> Static notebooks are a totally different scenario. Users will want to
> include JS and have it work when publishing a static notebook. This is
> a feature of the Notebook. Furthermore, there's a number of ways to
> get JS into a static notebook, some more subtle than others, while a
> dating site is only dealing with text from an input field.
>
> I don't have much to say about kernelled notebooks right now, but on
> statics, I'd like to ask: If we serve a static notebook from a domain
> which has zero features beyond hosting static notebooks, would those
> notebooks be able to do anything to a user that a regular webpage
> couldn't? I'm not a security guy, so I really don't know, but it seems
> that we'd be totally free to let users publish whatever they liked,
> with no more restrictions than browsers impose already.
>
> I just don't want to see hours of work ploughed into crippling a super
> powerful feature, if we could just use an extra domain with a simple
> API instead.
>
> Carl

-- 
Brian E. Granger
Cal Poly State University, San Luis Obispo
bgranger at calpoly.edu and ellisonbg at gmail.com