[IPython-dev] Some Thoughts on Notebook Security

Wed Dec 12 14:37:51 EST 2012

Le 12 déc. 2012 à 18:46, Brian Granger a écrit :

> OK trying to catch up with this thread...
> 
> * Matthias brings up a great point that we need to consider
> forged/hostile notebooks that were not created with our notebook
> server.  The solution to this is to make sure that our notebook server
> can "clean" notebooks before displaying them.  The solutions that I
> posted above will address this completely and allow people to simply
> open notebooks from the web without worrying about them.
> 
> * I still feel like I don't have a straight answer to my question:
> will the solutions I proposed solve all of the security problems while
> allowing us to serve authenticated notebooks on single domains?  If
> now, what security problems would remain?
> 
>> 
>> To fix this, we need to enable the HTML sanitizer that comes with the
>> JS Markdown rendered that we are using.  This is what StackOverflow
>> uses to sanitize their markdown and should completely remove any
>> security risks coming from within markdown cells.
> 
> 
>> * In CodeCell output, the Javascript repr is dynamically passed
>> into eval.  This only happens when code is run, not when the notebook
>> is loaded, so it is less critical, but still needs to be fixed.
> 
>> Oh, yes forgot about that, we will have to clean that HTML as well.
> 
Which can be resumes in clean html/js in all output. 
(as rendered md cell are actually "output")

Then yes.

> 
> * I am confused about the multiple domain solution that is being
> proposed.  The idea is that notebooks containing arbitrary Javascript
> would be served from a separate domain that does not offer any
> authentication.  The authenticated notebooks would live on a second
> domain that doesn't allow Javascript.  But then how is a user supposed
> to author a notebook with Javascript and have that notebook not be
> anonymous?  What if a user wants to author a notebook with Javascript
> that needs to be private? 

let see. 
Multi domain just allow to have fine-grained privileges in cookies.

You can login to wordpress with your google account right ? 
But then it only have access to your identity, not your mails right ?

Then you do the same under the hood with subdomains. 
(for the sake of simplicity i'll use domains names that have meanings, and a simplified logic.) 

user logs in https://iamroot.ipython.org.
He/She have acces to his/her dashboard. 

Get to notification "Other User want you to see a notebook QUX [open]"

click on Open opens the notebook in another tabs.

https://comment-only.ipython.org/view/QUX.ipynb
The user is still "logged in", but any request to https://iamroot.ipython.org will fail because of xss policies. 
And obviously the only thing malicious JS could do is post comment on the current notebook.

The point is to restrain the number of actions available from a certain domain name. 
As you control the authorization of a domain name server side, nothing prevent a user to grant more rights to a specific "domain" or notebook. 

It does not prevent collaborating in anyways, 
But if you share kernel you share filesystem, so there are no point in really using subdomain at this stage.
Still it gives another layer of security as if  there is a way to inject javascript, this javascript will not find any
malicious things to do.

Or it allows you "test" untrusted plugins..

I let you imagine other stuff. 
-- 
Matthias

> * I don't see any fundamental different between "static notebooks" and
> "notebooks with kernels" - both can have Javascript and the new
> Javascript plugins will work on both.  Both have the same overall
> security issues and I don't think it makes sense to try and handle
> them separately.
> 
> Cheers,
> 
> Brian
> 
> 
> 
> 
> On Tue, Dec 11, 2012 at 9:15 AM, Matthias BUSSONNIER
> <bussonniermatthias at gmail.com> wrote:
>> 
>> Le 11 déc. 2012 à 16:10, Carl Smith a écrit :
>> 
>>> Hi Brian
>>>> 
>>>> The idea is that the extra Javascript cool-stuff will be installed by
>>>> the person who runs the notebook server once and for all notebooks on
>>>> that server.  Similar to how python packages are installed = you do
>>>> this before you start python.  To get data from python to the
>>>> Javascript plugins we will use JSON objects and trigger the callbacks
>>>> to handle them.
>>> 
>>> This seems to be dependent on a kernel, which static notebooks don't
>>> have? If I generate a static notebook, which is just a web page, then
>>> post that page to a hosting service, or email it to someone, how would
>>> the plugins work? Maybe we're looking at two slightly different
>>> scenarios. I'm focussed on static views only. The host should not have
>>> to allow anything more than posting and getting HTML documents.
>> 
>> IIRC, you can embed several repr in the ipynb file.
>> So you could provide a plugin that can "render" object on static view.
>> (like d3.js graph, you don't need the kernel to do that)
>> 
>>> ================
>>> 
>>> Hi Matthias
>>> 
>>>> Static notebooks, served from a different domain, could be rendered
>>>> inside iframes, enabling us to embed them inside other webpages and
>>>> applications. These notebooks would still be superficially served by
>>>> our own servers, so the UX wouldn't be effected.
>>>> 
>>>> keep in mind that iframe are not sandboxed, and you can inject js on parent
>>>> frame.
>>>> Unless you use the sandbox attributes, which is part of html5 but not
>>>> implemented in every
>>>> browser… And not yet infallible, it is more a "we'll help you embed other
>>>> pages by providing a separate
>>>> js namespace but we don't guaranty yes that the VM is unbreachable"
>>> 
>> 
>>> I pretty sure iframes are sandboxed in the sense that a parent page
>>> and an iframe can not communicate unless they have the same origin,
>>> and this is an old feature. The new sandbox attribute in HTML5 is for
>>> a different purpose.
>> 
>> But this is still kind of a problem, as usually the static view will be served from the "same origin"
>> as the rest of the website.
>> 
>> If you don't want to make a notebook fully public, you have to have some kind of authentication
>> that allows you to load it.
>> 
>> I'm still doubt a little about what frames are supposed to do and what they actually do.
>> I'm not an expert on that, but it is still worth digging.
>> 
>>>> Responsible disclosure don't want to say much more but but having a
>>>> statically display
>>>> notebook is often link to having a "sharing/import" button which is
>>>> dangerous.
>>>> And could lead to self propagating notebook through account that can infect
>>>> other
>>>> notebooks, or share itself on twitter...
>>> 
>>> Any buttons, like for importing a notebook, would live in the parent
>>> page and would have no access to, nor allow access from, the iframe.
>>> The parent page would know which static notebook it embeds in the
>>> iframe though, so it could provide buttons that connect to the actual
>>> notebook in question, which is a totally different file to the static
>>> notebook being rendered in the iframe anyway.
>> 
>> I understand what you want to do, I guess the definition of "static" is blurry.
>> If you want a perfectly static version (does it make sense in html) you can go with iframe.
>> If you want the ability to comment on a particular cell, then you have to build iframe for
>> every cell.
>> and you lose the ability to comment "inline" as github does.
>> 
>> 
>>> 
>>>> Multi domain is a real good idea. I have a clear view in my head on how we
>>>> could use that in a way close to OAuth to allow javascript by still having
>>>> "logged-in" users.
>>>> It wouldn't be as seamless a something like github, but close.
>>> 
>>> I think we're looking at things differently: You seem to be
>>> considering static views as something generated on the fly and on
>>> demand, nbviewer style. I'm thinking about running nbconvert on a
>>> notebook, then keeping the output as a webpage to be copied and passed
>>> around freely. Once the static notebook exists, it's a done deal.
>>> There's no chance of any changes to IPython breaking it. It's a
>>> independent webpage. Updating it would amount to deleting it and
>>> replacing it with a new version.
>> 
>> I don't think those are quite different.
>> You can have a "perfectly static" version that embeds bad js and require some kind of authentication to be seen.
>> The line between "on the fly" and static is thin.
>> 
>>> 
>>>> The **big** question is:
>>>> Are viewer logged in (in any way) to the given server, and if so do they
>>>> have the right to do anything else with those credentials ?
>>>> If it is just a public notebook viewer, then it's fine.
>>>> 
>>>> If you want something more "interactive" (sharing/ permissions…etc, and the
>>>> display any JS ) you won't have much choice.
>>>> Or you will have a painful multi-login.
>>> 
>>> I'm very much against hosting user submitted notebooks on any domain
>>> with cookie based authentication. It needs to be divided into 'trusted
>>> domain', where no user's JS will ever be served, and 'hosting domain'
>>> that has no account system of it's own. The trusted domain would
>>> control the hosting domain, as a kind of slave.
>> 
>> Yep, kind of what I have in mind.
>> The hosting domain can have "tokens"
>> Publish this comment on this notebook on the behalf of ...
>> The you have to "validate" those action on the "trusted domain".
>> --
>> Matthias
>> 
>> 
>> 
>>> 
>>> That's just my take on all this.
>>> 
>>> Cheers
>>> 
>>> Carl
>>> _______________________________________________
>>> IPython-dev mailing list
>>> IPython-dev at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/ipython-dev
>> 
>> _______________________________________________
>> IPython-dev mailing list
>> IPython-dev at scipy.org
>> http://mail.scipy.org/mailman/listinfo/ipython-dev
> 
> 
> 
> -- 
> Brian E. Granger
> Cal Poly State University, San Luis Obispo
> bgranger at calpoly.edu and ellisonbg at gmail.com
> _______________________________________________
> IPython-dev mailing list
> IPython-dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/ipython-dev