[IPython-dev] Some Thoughts on Notebook Security

Mon Dec 10 23:18:23 EST 2012

+1

Sent from my iPhone

On Dec 10, 2012, at 11:12 PM, Brian Granger <ellisonbg at gmail.com> wrote:

> Carl,
> 
> I appreciate your thinking about this.  It is really important.  But I
> think it *may* be simple to fix:
> 
> There are two issues we have:
> 
> * In markdown cells users can put arbitrary HTML and JS.
> 
> To fix this, we need to enable the HTML sanitizer that comes with the
> JS Markdown renderer that we are using.  This is what StackOverflow
> uses to sanitize their markdown and should completely remove any
> security risks coming from within markdown cells.
> 
> * In CodeCell output, the Javascript repr is dynamically passed
> into eval.  This only happens when code is run, not when the notebook
> is loaded, so it is less critical, but still needs to be fixed.
> 
> To fix this, we need to disable the Javascript representation of
> objects altogether.
> 
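The first fix above (sanitizing the HTML that Markdown cells produce) can be sketched roughly as follows. This is not the sanitizer that ships with the JS Markdown renderer mentioned above; it is a minimal Python illustration of the allowlist idea, and the tag list, attribute list and function names are assumptions made for the example. A real deployment should rely on a vetted sanitizer rather than this sketch.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Hypothetical allowlist for illustration; not the list any real renderer ships.
ALLOWED_TAGS = {"p", "em", "strong", "code", "pre", "ul", "ol", "li", "a"}
ALLOWED_ATTRS = {"a": {"href"}}
DROP_WITH_CONTENT = {"script", "style"}


def is_safe_url(url):
    """Accept only relative, http and https URLs (rejects javascript: etc.)."""
    # Strip spaces and control characters, which browsers may ignore in a scheme.
    cleaned = "".join(ch for ch in url if ch > " ")
    try:
        scheme = urlsplit(cleaned).scheme.lower()
    except ValueError:
        return False
    return scheme in ("", "http", "https")


class AllowlistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # unknown tags are dropped, their text is kept
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_ATTRS.get(tag, set()):
                continue  # drops onclick=, style=, and so on
            if name == "href" and not is_safe_url(value):
                continue
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html):
    parser = AllowlistSanitizer()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return "".join(parser.out)
```

Under these assumptions, `sanitize('<p>Hi <script>alert(1)</script><b onclick="x()">there</b></p>')` returns `<p>Hi there</p>`: the script and its body are removed, and the disallowed `<b>` tag is stripped while its text survives.
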
> Will these two things not completely fix the security problems we
> currently have?
> 
> Now the question is how to enable all of the nice things you can do
> with Javascript.  I think the answer is Javascript plugins, JSON reprs
> and JSON handlers:
> 
> https://github.com/ipython/ipython/pull/2518
> 
> The idea is that the extra Javascript cool-stuff will be installed by
> the person who runs the notebook server once and for all notebooks on
> that server.  Similar to how Python packages are installed: you do
> this before you start Python.  To get data from Python to the
> Javascript plugins we will use JSON objects and trigger the callbacks
> to handle them.
> 
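The JSON-handler idea described above can be sketched as a small dispatch table. The actual design in the linked pull request may differ; `register_handler`, `dispatch` and the `"type"`/`"data"` message layout are hypothetical names chosen for this example. The point it illustrates is that the payload is only parsed as data and routed to code that was installed ahead of time, rather than being passed to eval.

```python
import json

# Hypothetical registry mapping a payload "type" to a trusted callback.
_handlers = {}


def register_handler(kind, callback):
    """Called once at server setup, by whoever installs the plugin."""
    _handlers[kind] = callback


def dispatch(message):
    """Parse a JSON message and pass its data to the matching handler.

    The message is only ever parsed as data; nothing in it is evaluated.
    Returns None when the message is malformed or no handler is registered.
    """
    try:
        payload = json.loads(message)
    except ValueError:
        return None
    if not isinstance(payload, dict):
        return None
    kind = payload.get("type")
    if not isinstance(kind, str):
        return None
    handler = _handlers.get(kind)
    if handler is None:
        return None
    return handler(payload.get("data"))


# Example: a pre-installed "table" plugin; an unknown type is simply ignored.
register_handler("table", lambda rows: f"table with {len(rows)} rows")
print(dispatch('{"type": "table", "data": [[1, 2], [3, 4]]}'))
print(dispatch('{"type": "alert", "data": "<script>...</script>"}'))
```
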
> Unless I am misunderstanding the nature of the security risks, I think
> this is what we should do.
> 
> Cheers,
> 
> Brian
> 
> 
> On Mon, Dec 10, 2012 at 5:48 PM, Carl Smith <carl.input at gmail.com> wrote:
>> The IPython Notebook's vulnerability to cross-site scripting and
>> cross-site request forgery, XSS and XSRF, is a serious problem that
>> provides baddies with a range of attack vectors, each with almost
>> unlimited potential for harm to the user. Attempting to find a single
>> solution to so many problems is overwhelming and almost bound to fail.
>> Therefore, we will likely benefit from breaking the problem down,
>> insofar as we're able.
>> 
>> The most obvious place to start is to distinguish between static views
>> of a notebook and a notebook that has a running kernel, which I'll
>> call static notebooks and kernelled notebooks for lack of a
>> convention.
>> 
>> A static notebook is just a webpage, so it should ideally behave like
>> one. Any webpage can execute arbitrary JavaScript, so the fact that
>> static notebooks have this ability is not a concern in itself.
>> 
>> Serving all static notebooks from a separate domain should prevent XSS
>> and XSRF attacks because of the Same Origin Policy.
>> 
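For reference, browsers treat two URLs as the same origin only when scheme, host and port all match. The check can be sketched as below; the domain names are placeholders, and the function names are assumptions made for the example rather than any existing API.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that defines a URL's origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = parts.hostname or ""  # hostname is already lower-cased
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(a, b):
    return origin(a) == origin(b)


# Placeholder domains: the notebook app and a separate static-notebook host.
print(same_origin("https://app.example.org/nb/1", "https://app.example.org:443/nb/2"))
print(same_origin("https://app.example.org/nb/1", "https://static.example.net/nb/1"))
```

Under this rule, script running in a page served from the static-notebook host is a different origin from the main application, which is the separation the paragraph above relies on.
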
>> Static notebooks, served from a different domain, could be rendered
>> inside iframes, enabling us to embed them inside other webpages and
>> applications. These notebooks would still be superficially served by
>> our own servers, so the UX wouldn't be affected.
>> 
>> Using randomised URLs, or some other scheme that does not validate
>> requests by stored credentials, may allow us to serve notebooks from
>> the separate domain while keeping access to the notebooks private.
>> This may be useful when users wish to share a static notebook
>> selectively.
>> 
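One way to read the randomised-URL suggestion above is as an unguessable token embedded in the link, so that access depends on knowing the URL rather than on stored credentials. The sketch below assumes an in-memory store and a placeholder domain; `share_notebook` and `resolve` are hypothetical names, not part of any existing service.

```python
import secrets

# Hypothetical in-memory store; a real service would persist this mapping.
_shared = {}


def share_notebook(notebook_id, base_url="https://static-notebooks.example.org"):
    """Create an unguessable URL for a static notebook and remember it."""
    token = secrets.token_urlsafe(32)  # 32 random bytes encoded as URL-safe text
    _shared[token] = notebook_id
    return f"{base_url.rstrip('/')}/{token}"


def resolve(url_token):
    """Return the notebook id for a token, or None if the token is unknown."""
    return _shared.get(url_token)


url = share_notebook("nb-1")
token = url.rsplit("/", 1)[1]
print(resolve(token))
print(resolve("guessed-token"))
```

Anyone who obtains the link can open the notebook, so this only suits selective sharing where the link itself is treated as the secret.
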
>> Other approaches all seem to rest on attempts to cripple JavaScript
>> execution, either by rendering the JavaScript source as text or by
>> removing it altogether. This seems like a bad idea as many static
>> notebooks, particularly in the long run, will need to be able to use
>> JavaScript to work properly. Will we refuse to render a user's graph
>> or widget in a static notebook because it uses JavaScript? This is a
>> never-ending spiral.
>> 
>> I think we need to build on browser security, and therefore trust it,
>> rather than build a heap of nasty hacks, which may be circumvented
>> anyway.
>> 
>> Just taking it back to basics for a moment: If Mallory signs up for a
>> dating site, she might decide to put some dodgy JS inside some script
>> tags, and submit that as part of her profile. If the site doesn't
>> sanitise that HTML, we all know how it plays out; any poor chap that
>> thinks she's a hotty gets pwned. Sanitising her input is textbook.
>> 
>> Static notebooks are a totally different scenario. Users will want to
>> include JS and have it work when publishing a static notebook. This is
>> a feature of the Notebook. Furthermore, there are a number of ways to
>> get JS into a static notebook, some more subtle than others, while a
>> dating site is only dealing with text from an input field.
>> 
>> I don't have much to say about kernelled notebooks right now, but on
>> statics, I'd like to ask: If we serve a static notebook from a domain
>> which has zero features beyond hosting static notebooks, would those
>> notebooks be able to do anything to a user that a regular webpage
>> couldn't? I'm not a security guy, so I really don't know, but it seems
>> that we'd be totally free to let users publish whatever they liked,
>> with no more restrictions than browsers impose already.
>> 
>> I just don't want to see hours of work ploughed into crippling a super
>> powerful feature, if we could just use an extra domain with a simple
>> API instead.
>> 
>> Carl
>> _______________________________________________
>> IPython-dev mailing list
>> IPython-dev at scipy.org
>> http://mail.scipy.org/mailman/listinfo/ipython-dev
> 
> 
> 
> -- 
> Brian E. Granger
> Cal Poly State University, San Luis Obispo
> bgranger at calpoly.edu and ellisonbg at gmail.com