From: Samuele Pedroni [mailto:pedronis@bluewin.ch]
the first candidate would be a generalization of 'class' (although that makes it redundant with 'class' and meta-classes) so that
KEYW-TO-BE kind name [ '(' expr,... ')' ] [ maybe [] extended syntax ]: suite
would be equivalent to
name = kind(name-as-string,(expr,...),dict-populated-executing-suite)
[fixed up to exclude the docstring, as per the followup message]

I like this - it's completely general, and easy to understand. Then again, I always like constructs defined in terms of code equivalence; it seems to be a good way to make the semantics completely explicit. The nice thing, to me, is that it solves the immediate problem (modulo a suitable "kind" to work for properties), as well as being extensible to allow it to be used in more general contexts. The downside may be that it's *too* general - I've no feel for how it would look if overused - it might feel like people end up defining their own application language.
the remaining problem would be to pick a suitable KEYW-TO-BE
"block"? Someone, I believe, suggested reusing "def" - this might be nice, but IIRC it won't work because of the grammar's strict lookahead limits. (If it does work, then "def" looks good to me.) If def won't work, how about "define"? The construct is sort of an extended form of def. Or is that too cute?

By the way, can I just say that I am +1 on Michael Hudson's original patch for [...] on definitions. Even though it doesn't solve the issue of properties, I think it's a nice solution for classmethod and staticmethod, and again I like the generality.

Paul.
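[To make Samuele's proposed equivalence concrete, here is a sketch of what a "kind" for properties might look like, with the desugaring written out by hand. The name prop_kind and the namespace plumbing are purely illustrative, not part of any proposal.]

```python
# Hypothetical "kind" callable: it receives the name, the parenthesised
# expressions, and the dict populated by executing the suite.
def prop_kind(name, args, namespace):
    return property(namespace.get('__get__'),
                    namespace.get('__set__'),
                    namespace.get('__delete__'),
                    namespace.get('__doc__'))

# What "KEYW-TO-BE prop_kind x: <suite>" would desugar to, by hand:
def _get(self):
    return self._x

def _set(self, value):
    self._x = value

namespace = {'__get__': _get, '__set__': _set, '__doc__': 'x property'}

class C(object):
    x = prop_kind('x', (), namespace)

c = C()
c.x = 42
print(c.x)  # 42
```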
"Moore, Paul" <Paul.Moore@atosorigin.com> writes:
the remaining problem would be to pick a suitable KEYW-TO-BE
"block"?
Someone, I believe, suggested reusing "def" - this might be nice, but IIRC it won't work because of the grammar's strict lookahead limits. (If it does work, then "def" looks good to me).
I think you can left-factor it so it works:

    func        :: func_start func_middle ':'
    func_start  :: "def" NAME
    func_middle :: '(' arglist ')'
    func_middle :: NAME '(' testlist ')'
If def won't work, how about "define"? The construct is sort of an extended form of def. Or is that too cute?
It's a new keyword.
By the way, can I just say that I am +1 on Michael Hudson's original patch for [...] on definitions. Even though it doesn't solve the issue of properties, I think it's a nice solution for classmethod and staticmethod, and again I like the generality.
Yes, I do think this patch is in danger of being forgotten...

Cheers,
M.

-- 
The Oxford Bottled Beer Database heartily disapproves of the excessive
consumption of alcohol. No, really.
               -- http://www.bottledbeer.co.uk/beergames.html
Paul, I'd like to see what you think of my alternate proposal that does away with the keyword altogether.
By the way, can I just say that I am +1 on Michael Hudson's original patch for [...] on definitions. Even though it doesn't solve the issue of properties, I think it's a nice solution for classmethod and staticmethod, and again I like the generality.
I hope that everyone involved in this discussion understands that none of this goes into Python 2.3. We've promised syntactic stability, and that's what people will get. --Guido van Rossum (home page: http://www.python.org/~guido/)
On Thu, 30 Jan 2003, Guido van Rossum wrote:
Paul, I'd like to see what you think of my alternate proposal that does away with the keyword altogether.
By the way, can I just say that I am +1 on Michael Hudson's original patch for [...] on definitions. Even though it doesn't solve the issue of properties, I think it's a nice solution for classmethod and staticmethod, and again I like the generality.
I hope that everyone involved in this discussion understands that none of this goes into Python 2.3. We've promised syntactic stability, and that's what people will get.
I hope it will never go into Python at all. Most suggestions reminded me of... LaTeX. Some others are a hidden way to introduce properties into Python syntax like they are in Java or C++. [grrr!]

It works now, so why add fancities?

    def f(x):
        ...
    f.prop1 = val1
    ...
    f.propN = valN

For mass-production:

    def a(): ...
    def b(): ...
    def c(): ...

    for o in a, b, c:
        o.prop1 = val1
        o.prop2 = val2
        o.prop3 = val3

(I am sorry I have not closely followed every message of this discussion, so maybe I missed some good proposals.)
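[For comparison, here is the spelling that already works today - the post-definition assignment idiom Roman describes, applied to properties and classmethods. The Circle class and its names are just an example.]

```python
class Circle(object):
    def __init__(self, radius=0.0):
        self._radius = radius

    def _get_radius(self):
        return self._radius

    def _set_radius(self, value):
        self._radius = value

    # Today's idiom: define plain functions, then wrap by assignment.
    radius = property(_get_radius, _set_radius)

    def from_diameter(cls, d):
        return cls(d / 2.0)
    from_diameter = classmethod(from_diameter)

c = Circle.from_diameter(10.0)
print(c.radius)  # 5.0
```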
Sincerely yours, Roman A.Suzi -- - Petrozavodsk - Karelia - Russia - mailto:rnd@onego.ru -
From: "Moore, Paul" <Paul.Moore@atosorigin.com>
From: Samuele Pedroni [mailto:pedronis@bluewin.ch]
the first candidate would be a generalization of 'class' (although that makes it redundant with 'class' and meta-classes) so that
KEYW-TO-BE kind name [ '(' expr,... ')' ] [ maybe [] extended syntax ]: suite
would be equivalent to
name = kind(name-as-string,(expr,...),dict-populated-executing-suite)
[fixed up to exclude the docstring, as per the followup message]
an alternative (if parseable, I have not fully thought about that) would be to leave out the KEYW-TO-BE and try to parse directly

    kind name [ '(' expr,... ')' ] [ maybe [] extended syntax ]:

where kind could be any general expr, or better only a qualified name, that are NOT keywords, so we would possibly have:

    property foo:
        <suite>

    interface.interface I(J,K):
        <suite>

all working as specified like 'class' and with its scope rules. Control flow statements would still have to be added to the language one by one (I find that ok and pythonic), also because specifying and implementing implicit thunks with proper scoping, non-local return etc. does not (to me) seem worth the complication. About extending or generalizing function 'def' beyond [] extended syntax, I don't see a compelling case.
Some more thoughts on property syntax...

(1) It's a pity "access" was revoked as a keyword, since it would have fitted quite well:

    access foo:
        def __get__(self):
            ...
        def __set__(self, x):
            ...

Is there any chance it could be reinstated?

(2) Maybe "property" could be recognised as a pseudo-keyword when it's the first word of a statement inside a class definition.

Greg Ewing, Computer Science Dept,  +--------------------------------------+
University of Canterbury,           | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand           | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz          +--------------------------------------+
Some more thoughts on property syntax...
Actually I was attempting to find a solution not just for properties but for other situations as well. E.g. someone might want to define capabilities, or event handlers, or ...
(1) It's a pity "access" was revoked as a keyword, since it would have fitted quite well:
    access foo:
        def __get__(self):
            ...
        def __set__(self, x):
            ...
Is there any chance it could be reinstated?
Not really. Anyway, I don't know what you expect the above to do.
(2) Maybe "property" could be recognised as a pseudo-keyword when it's the first word of a statement inside a class definition.
See above. I'd like to find something that goes beyond properties.

--Guido van Rossum (home page: http://www.python.org/~guido/)
Guido:
Actually I was attempting to find a solution not just for properties but for other situations as well. E.g. someone might want to define capabilities, or event handlers, or ...
I'm not sure what a capability is, exactly, so I don't know what would be required to provide one. Or how an event handler differs from a method, for that matter.

But anyway, here's another idea:

    def foo as property:
        def __get__(self):
            ...
        def __set__(self, x):
            ...

which would be equivalent to

    foo = property(<dict-from-the-suite>)

or perhaps

    foo = property(<thunk-doing-the-suite>)

You might also want to allow for some arguments somewhere (not sure exactly where, though).

Greg Ewing, greg@cosc.canterbury.ac.nz
Greg Ewing wrote:
    def foo as property:
        def __get__(self):
            ...
        def __set__(self, x):
            ...
The above syntax seems to be particularly easy to read and understand, and that is a Good Thing. ;) If I understand it correctly, here is a longer sample::

    class Fun(object):
        def original(self, a, b):
            return a * b

        def foo(self, a, b) as method:
            return a + b

        def bar(klass, a, b) as classmethod:
            return klass(a, b)

        def push(self, item) as sync_method:
            self._queue.append(item)

        def pop(self) as sync_method:
            self._queue.pop()

        def circumference as property:
            """Property docstring"""
            def __get__(self):
                return self.radius*6.283
            def __set__(self, value):
                self.radius = value/6.283
            def __delete__(self):
                del self.radius

Similar to Guido's behind-the-scenes implementation peek::

    def v(a0, a1, a2) as e:
        S

    T = <a thunk created from S>
    v = e(T)
    v(a0, a1, a2)  # would be the same as e(T).__call__(a0, a1, a2)

Seems very clean and extensible, since 'e' can interpret the thunk 'T' as it sees fit. If the implementation chooses to execute the thunk and pull out the __get__/__set__/__delete__ methods to implement a property, it can. In the same way, a different implementation may choose to execute 'T' as a code block when called, perhaps modifying the parameter list.

Issues-that-will-steal-my-sleep-tonight:

* The scoping issues seem very hairy.
* The definition of what the thunk is remains nebulous.
* Additional parameters for 'e' in the 'e(T)' would be *very* useful.
* This syntax does not address the inline semantics (and potential usefulness!) of Guido's proposal::

    > foo = property:
    >     ...

Even with all these outstanding issues, the potentials start to tantalize the imagination!

-Shane Holloway
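[The 'sync_method' kind in the sample can already be approximated today as a plain wrapper function plus reassignment. A rough sketch - the Box class and the one-lock-per-wrapped-function choice are illustrative, not part of the proposal:]

```python
import threading

def sync_method(func):
    # Serialize all calls to func on a single shared lock.
    lock = threading.Lock()
    def wrapper(*args, **kwargs):
        lock.acquire()
        try:
            return func(*args, **kwargs)
        finally:
            lock.release()
    return wrapper

class Box(object):
    def __init__(self):
        self._queue = []

    def push(self, item):
        self._queue.append(item)
    push = sync_method(push)

    def pop(self):
        return self._queue.pop()
    pop = sync_method(pop)

b = Box()
b.push(1)
b.push(2)
print(b.pop())  # 2
```

Under the proposed syntax, `push = sync_method(push)` would disappear into `def push(self, item) as sync_method:`.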
Shane wrote:
Greg Ewing wrote:
    def foo as property:
        def __get__(self):
            ...
        def __set__(self, x):
            ...
The above syntax seems to be particularly easy to read and understand
except that "binding-statement name1 as name2" already has a different meaning in Python. </F>
Fredrik Lundh wrote:
Shane wrote:
Greg Ewing wrote:
    def foo as property:
        def __get__(self):
            ...
        def __set__(self, x):
            ...
The above syntax seems to be particularly easy to read and understand
except that "binding-statement name1 as name2" already has a different meaning in Python.
What would you think of 'is' then?

    def name(...) is classmethod:

doesn't ring confusing bells IMO. I certainly like it more than some brackets which you don't even know how to pronounce. Moreover the original chaining, e.g.

    def name(...) is classmethod, syncmethod:

still reads well IMO.

holger
What would you think of 'is' then?
def name(...) is classmethod:
Doesn't read nearly as well to my ears. "Define foo as property" is a grammatical and meaningful English sentence, whereas "Define foo is property" isn't. Simply "foo is property" would be better, but unfortunately it already has a meaning. Hmmm... maybe

    whereas foo is property:
        ...

:-?

Greg Ewing, greg@cosc.canterbury.ac.nz
Guido:
Actually I was attempting to find a solution not just for properties but for other situations as well. E.g. someone might want to define capabilities, or event handlers, or ...
[Greg]
I'm not sure what a capability is, exactly, so I don't know what would be required to provide one.
Me neither. :-) One person tried to convince me to change the language to allow 'capclass' and 'capability' as keywords (alternatives for 'class' and 'def'). In the end I convinced them that 'rexec' is good enough (if the implementation weren't flawed by security holes, all of which are theoretically fixable). I *still* don't know what a capability is.
Or how an event handler differs from a method, for that matter.
Probably by being hooked up to an event loop automatically.
But anyway, here's another idea:
    def foo as property:
        def __get__(self):
            ...
        def __set__(self, x):
            ...
which would be equivalent to
foo = property(<dict-from-the-suite>)
or perhaps
foo = property(<thunk-doing-the-suite>)
You might also want to allow for some arguments somewhere (not sure exactly where, though).
I don't like things that reuse 'def', unless the existing 'def' is a special case and not just an alternative branch in the grammar.

--Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
Guido:
Actually I was attempting to find a solution not just for properties but for other situations as well. E.g. someone might want to define capabilities, or event handlers, or ...
[Greg]
I'm not sure what a capability is, exactly, so I don't know what would be required to provide one.
Me neither. :-) One person tried to convince me to change the language to allow 'capclass' and 'capability' as keywords (alternatives for 'class' and 'def'). In the end I convinced them that 'rexec' is good enough (if the implementation weren't flawed by security holes, all of which are theoretically fixable). I *still* don't know what a capability is.
I'll admit to being that person. A capability is, in essence, an opaque bound method. Of course, for them to be much use, you want the system to provide some other stuff, like not being able to recreate capabilities (i.e. you get hold of them from on high, and that's the _only_ way to get them).

If I'd known from the start that rexec restricted access to bound method attributes, I'd've saved a lot of time and typing, particularly since I've recognised from early on that rexec is needed to get anywhere at all :-) OTOH, it's taken me a while to realise that the only other thing you need is opaque bound methods, so maybe it wouldn't have helped to have known that about rexec from the start. And maybe more is needed; I won't know until I get more deeply into it.

Anyway, I'm now interested in fixing rexec. Guido's already told me some ways in which it is broken. If people know of others, I'll start gathering them together.

Cheers,
Ben.

-- 
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/
"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff
"BL" == Ben Laurie <ben@algroup.co.uk> writes:
BL> I'll admit to being that person. A capability is, in essence, an
BL> opaque bound method. Of course, for them to be much use, you
BL> want the system to provide some other stuff, like not being able
BL> to recreate capabilities (i.e. you get hold of them from on
BL> high, and that's the _only_ way to get them).

That seems like a funny definition of a capability. That is, you seem to have chosen a particular way to represent capabilities and are using the representation to describe the abstract idea. A more general definition might be: A capability is an object that carries with it the authorization to perform some action. Informally, a capability can be thought of as a ticket. The ticket is good for some event and possession of the ticket is sufficient to attend the event.

A capability system must have some rules for creating and copying capabilities, but there is more than one way to realize those rules in a programming language. I assume you're suggesting that methods be thought of as capabilities, so that possession of a bound method object implies permission to invoke it. That seems like a reasonable design, but what about classes or functions or instances?

The problem, which rexec solves after a fashion, is to prevent unauthorized copying of the capabilities, or more specifically of the access rights contained in the capability. That is, there's some object that checks tickets (the reference monitor). It needs to be able to inspect the ticket, but it shouldn't be possible for someone else to use that mechanism to forge a ticket.

The problem for Python is that its introspection features make it impossible (?) for pure Python code to hide something. In Java, you could declare an instance variable private and know that the type system will prevent client code from accessing the variable directly. In Python, there is no private. Rexec provides a way to turn off some of the introspection in order to allow some confinement.
If you can't extract the im_class, im_func, and im_self attributes of a bound method, then you can use a bound method as a capability without risk that the holder will break into the object. On the other hand, if you want to use some other kind of object as a capability, you must be sure that there isn't some introspection mechanism that allows the holder to get into the representation. If there is, rexec needs to turn it off.

The problem with rexec is that the security code is smeared across the entire interpreter. Each object or introspection facility has to have a little bit of code that participates in rexec, and every change to the language needs to take rexec into account. What's worse is that rexec turns off a set of introspection features, so you can't use any of those features in code that might get loaded in rexec.

The Zope proxy approach seems a little more promising, because it centralizes all the security machinery in one object, a security proxy. A proxy for an object can appear virtually indistinguishable from the object itself, except that type(proxy) != type(object_being_proxied). The proxy guarantees that any object returned through the proxy is wrapped in its own proxy, except for simple immutable objects like ints or strings.

This approach seems promising because it is fairly self-contained. The proxy code can be largely independent of the rest of the interpreter, so that you can analyze it without scouring the entire source tree. It's also quite flexible. If you want to use instances as capabilities, you could, for example, use a proxy that only allows access to methods, not to instance variables. It's a simple mechanism that allows many policies, as opposed to rexec, which couples policy and mechanism.

Jeremy
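[A toy version of the proxy idea, just to fix intuitions. This is not the Zope implementation: the attribute whitelist, the decision to pass callables through unwrapped, and the Account example are all simplifications, and a pure-Python proxy can still be broken into via object.__getattribute__ or implicit special-method lookup - which is why the real machinery wants interpreter support.]

```python
class SecurityProxy(object):
    """Expose only a whitelist of attribute names of the wrapped object."""

    def __init__(self, obj, allowed):
        object.__setattr__(self, '_obj', obj)
        object.__setattr__(self, '_allowed', frozenset(allowed))

    def __getattribute__(self, name):
        allowed = object.__getattribute__(self, '_allowed')
        if name not in allowed:
            raise AttributeError('access to %r denied' % name)
        value = getattr(object.__getattribute__(self, '_obj'), name)
        # Simple immutables and callables pass through; anything else
        # comes back wrapped in an opaque proxy of its own.
        if isinstance(value, (int, float, str, tuple, type(None))) \
           or callable(value):
            return value
        return SecurityProxy(value, ())

    def __setattr__(self, name, value):
        raise AttributeError('proxy is read-only')

class Account(object):
    def __init__(self, balance):
        self.balance = balance
        self.pin = 1234

    def deposit(self, amount):
        self.balance += amount

p = SecurityProxy(Account(10), ['balance', 'deposit'])
p.deposit(5)
print(p.balance)  # 15
```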
Wow, how did this topic end up crossing over to this list while i wasn't looking? :0 Ben Laurie wrote:
I'll admit to being that person. A capability is, in essence, an opaque bound method. Of course, for them to be much use, you want the system to provide some other stuff, like not being able to recreate capabilities (i.e. you get hold of them from on high, and that's the _only_ way to get them).
Jeremy Hylton wrote:
That seems like a funny defintion of a capability.
A better definition of "capability" is "object reference".
A capability system must have some rules for creating and copying capabilities, but there is more than one way to realize those rules in a programming language.
I suppose there could be, but there is really only one obvious way: creating a capability is equivalent to creating an object -- which you can only do if you hold the constructor. A capability is copied into another object (for the security folks, "object" == "protection domain") when it is transmitted as an argument to a method call.

To build a capability system, all you need to do is to constrain the transfer of object references such that they can only be transmitted along other object references. That's all. The problem for Python, as Jeremy explained, is that there are so many other ways of crawling into objects and pulling out bits of their internals.

Off the top of my head, i only see two things that would have to be fixed to turn Python into a capability-secure system:

  1. Access to each object is limited to its declared exposed interface; no introspection allowed.

  2. No global namespace of modules (sys.modules etc.).

If there is willingness to consider a "secure mode" for Python in which these two things are enforced, i would be interested in making it happen.
The problem, which rexec solves after a fashion, is to prevent unauthorized copying of the capabilities, or more specifically of the access rights contained in the capability.
No, this isn't necessary. There's no point in restricting copying. The real problem is the unlimited introspective access ("there is no private").
In Python, there is no private.
Side note (probably irrelevant): in some sense there is, but nobody uses it. Scopes are private. If you were to implement classes and objects using lambdas with message dispatch (i.e. the Scheme way, instead of having a separate "class" keyword), then the scoping would take care of all the private-ness for you.
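[The Scheme-style point can be shown directly: state held in an enclosing scope is reachable only through the dispatcher the maker returns. make_counter is of course just a toy example.]

```python
def make_counter():
    state = {'n': 0}  # private: lives only in this enclosing scope

    def incr():
        state['n'] += 1
        return state['n']

    def value():
        return state['n']

    def dispatch(message):
        # The only doorway into the state above.
        return {'incr': incr, 'value': value}[message]

    return dispatch

counter = make_counter()
counter('incr')()
counter('incr')()
print(counter('value')())  # 2
```

Nothing outside make_counter can bind or rebind `state` - which is exactly the "scopes are private" observation (modulo poking at func_closure, per Jeremy's follow-up).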
The Zope proxy approach seems a little more promising, because it centralizes all the security machinery in one object, a security proxy. A proxy for an object can appear virtually indistinguishable for the object itself, except that type(proxy) != type(object_being_proxied). The proxy guarantees that any object returned through the proxy is wrapped in its own proxy, except for simple immutable objects like ints or strings.
The proxy mechanism is interesting, but not for this purpose. A proxy is how you implement revocation of capabilities: if you insert a proxy in front of an object and grant access to that proxy, then you can revoke the access just by telling the proxy to stop responding. The mechanism by which the proxy also proxies objects passing through is what we capability-nerds call a "membrane", and it's a way of doing more powerful revocation. -- ?!ng
Ka-Ping Yee wrote:
Off the top of my head, i only see two things that would have to be fixed to turn Python into a capability-secure system:
1. Access to each object is limited to its declared exposed interface; no introspection allowed.
2. No global namespace of modules (sys.modules etc.).
If there is willingness to consider a "secure mode" for Python in which these two things are enforced, i would be interested in making it happen.
FWIW, I think that would be a useful model. Neil
Ka-Ping Yee wrote:
Wow, how did this topic end up crossing over to this list while i wasn't looking? :0
Ben Laurie wrote:
I'll admit to being that person. A capability is, in essence, an opaque bound method. Of course, for them to be much use, you want the system to provide some other stuff, like not being able to recreate capabilities (i.e. you get hold of them from on high, and that's the _only_ way to get them).
Jeremy Hylton wrote:
That seems like a funny defintion of a capability.
A better definition of "capability" is "object reference".
"Bound method" isn't enough, because a capability should be able to have state, and your system should have the flexibility to represent capabilities that share state with other capabilities, or ones that don't. The simple, straightforward way to model this is to just equate the word "capability" with "object reference".
You can do it that way, too, but it strikes me as more unwieldy in practice. A bound method also has state because it is, of course, bound to an object reference, so I find that a more elegant way to do it.
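[A small illustration of the bound-method view, and of exactly why rexec has to restrict the bound-method attributes. Door is a hypothetical example; the attribute is spelled __self__ in current Python, im_self in older versions.]

```python
class Door(object):
    def __init__(self):
        self.is_open = False

    def open(self):
        self.is_open = True

door = Door()
open_cap = door.open   # grant only the power to open this one door

open_cap()             # the holder can exercise the capability...
print(door.is_open)    # True

# ...but without rexec-style restrictions, introspection climbs
# straight back up to the whole object:
whole_door = open_cap.__self__
assert whole_door is door
```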
A capability system must have some rules for creating and copying capabilities, but there is more than one way to realize those rules in a programming language.
I suppose there could be, but there is really only one obvious way: creating a capability is equivalent to creating an object -- which you can only do if you hold the constructor. A capability is copied into another object (for the security folks, "object" == "protection domain") when it is transmitted as an argument to a method call.
To build a capability system, all you need to do is to constrain the transfer of object references such that they can only be transmitted along other object references. That's all.
The problem for Python, as Jeremy explained, is that there are so many other ways of crawling into objects and pulling out bits of their internals.
Off the top of my head, i only see two things that would have to be fixed to turn Python into a capability-secure system:
1. Access to each object is limited to its declared exposed interface; no introspection allowed.
2. No global namespace of modules (sys.modules etc.).
If there is willingness to consider a "secure mode" for Python in which these two things are enforced, i would be interested in making it happen.
I believe you just described rexec.

Cheers,
Ben.
"KPY" == Ka-Ping Yee <ping@zesty.ca> writes:
KPY> Wow, how did this topic end up crossing over to this list while
KPY> i wasn't looking? :0

You sure react quick for someone who isn't looking <wink>.
A capability system must have some rules for creating and copying capabilities, but there is more than one way to realize those rules in a programming language.
KPY> I suppose there could be, but there is really only one obvious
KPY> way: creating a capability is equivalent to creating an object
KPY> -- which you can only do if you hold the constructor. A
KPY> capability is copied into another object (for the security
KPY> folks, "object" == "protection domain") when it is transmitted
KPY> as an argument to a method call.

KPY> To build a capability system, all you need to do is to
KPY> constrain the transfer of object references such that they can
KPY> only be transmitted along other object references. That's all.

I don't follow you here. What does it mean to "transmit along other object references"? That is, everything in Python is an object and the only kind of references that exist are object references. I think, based on the rest of your mail, that we're largely on the same page, but I'd like to make sure I understand where you're coming from.

I don't quite follow the definition of protection domain either, as most of the literature I'm familiar with (not much of it about capabilities specifically) talks about a protection domain as the set of objects a principal has access to. The natural way to extend that to capabilities seems to me to be that a protection domain is the set of capabilities possessed by a principal. Are these questions off-topic for python-dev?

At any rate, it still seems like there are a variety of ways to realize capabilities in a programming language. For example, ZODB uses a special base class called Persistent to mark persistent objects. One could imagine using the same approach so that only some objects have capabilities associated with them.

KPY> The problem for Python, as Jeremy explained, is that there are
KPY> so many other ways of crawling into objects and pulling out
KPY> bits of their internals.

KPY> Off the top of my head, i only see two things that would have
KPY> to be fixed to turn Python into a capability-secure system:

KPY> 1. Access to each object is limited to its declared exposed
KPY>    interface; no introspection allowed.

KPY> 2. No global namespace of modules (sys.modules etc.).

KPY> If there is willingness to consider a "secure mode" for Python
KPY> in which these two things are enforced, i would be interested
KPY> in making it happen.

I think there is interest and I agree with your problem statement. I'd rephrase 2 to make it more general: control access to other modules. The import statement is just as much of a problem as sys.modules, right? In a secure environment, you have to control what code can be loaded in the first place.
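[Controlling what code can be loaded can be sketched with nothing more than a replacement __import__ in the restricted namespace. The whitelist and names here are illustrative, and this is not rexec's actual mechanism - a real sandbox needs far more than this.]

```python
ALLOWED_MODULES = {'math'}

def guarded_import(name, *args, **kwargs):
    # Refuse anything outside the whitelist.
    if name.split('.')[0] not in ALLOWED_MODULES:
        raise ImportError('import of %r is not permitted' % name)
    return __import__(name, *args, **kwargs)

# The import statement looks up __import__ in the frame's builtins,
# which come from the globals passed to exec.
restricted_globals = {'__builtins__': {'__import__': guarded_import}}

exec("import math\nroot = math.sqrt(16)", restricted_globals)
print(restricted_globals['root'])  # 4.0

try:
    exec("import os", restricted_globals)
except ImportError:
    print("blocked")  # blocked
```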
In Python, there is no private.
KPY> Side note (probably irrelevant): in some sense there is, but
KPY> nobody uses it. Scopes are private. If you were to implement
KPY> classes and objects using lambdas with message dispatch
KPY> (i.e. the Scheme way, instead of having a separate "class"
KPY> keyword), then the scoping would take care of all the
KPY> private-ness for you.

I was aware of Rees's dissertation when I did the nested scopes and, partly as a result, did not provide any introspection mechanism for closures. That is, you can get at a function's func_closure slot but there's no way to look inside the cells from Python. I was thinking that closures could replace Bastions. It still seems possible, but on several occasions I've wished I could introspect about closures from Python code.

I'm also unsure that the idea flies so well for Python, because you really want secure Python to be as much like regular Python as possible. If the mechanism is based on functions, it seems hard to make it work naturally for classes and instances.
The Zope proxy approach seems a little more promising, because it centralizes all the security machinery in one object, a security proxy. A proxy for an object can appear virtually indistinguishable for the object itself, except that type(proxy) != type(object_being_proxied). The proxy guarantees that any object returned through the proxy is wrapped in its own proxy, except for simple immutable objects like ints or strings.
KPY> The proxy mechanism is interesting, but not for this purpose.
KPY> A proxy is how you implement revocation of capabilities: if you
KPY> insert a proxy in front of an object and grant access to that
KPY> proxy, then you can revoke the access just by telling the proxy
KPY> to stop responding.

Sure, you can use proxies for revocation, but that's not what I was trying to say. I think the fundamental problem for rexec is that you don't have a security kernel. The code for security gets scattered throughout the interpreter. It's hard to have much assurance in the security when it's tangled up with everything else in the language.

You can use a proxy for an object to deal with goal #1 above -- enforce an interface for an object. I think about this much like a hardware capability architecture. The protected objects live in the capability segment and regular code can't access them directly. The only access is via a proxy object that is bound to the capability.

Regardless of proxy vs. rexec, I'd be interested to hear what you think about a sound way to engineer a secure Python.

Jeremy
My attention was drawn to this unanswered email, so here goes... Jeremy Hylton wrote:
"KPY" == Ka-Ping Yee <ping@zesty.ca> writes:
KPY> Wow, how did this topic end up crossing over to this list while KPY> i wasn't looking? :0
You sure react quick for someone who isn't looking <wink>.
A capability system must have some rules for creating and copying capabilities, but there is more than one way to realize those rules in a programming language.
KPY> I suppose there could be, but there is really only one obvious KPY> way: creating a capability is equivalent to creating an object KPY> -- which you can only do if you hold the constructor. A KPY> capability is copied into another object (for the security KPY> folks, "object" == "protection domain") when it is transmitted KPY> as an argument to a method call.
KPY> To build a capability system, all you need to do is to KPY> constrain the transfer of object references such that they can KPY> only be transmitted along other object references. That's all.
I don't follow you here. What does it mean to "transmit along other object references"? That is, everything in Python is an object and the only kind of references that exist are object references.
He's actually going slightly in circles here. The idea is that in order to acquire an object reference you either create the object, or are given the reference by another object you already have a reference to, or are given it by another object that has a reference to you. Where "you" is some object, of course. What is _not_ supposed to happen is finding objects by poking around in the symbol table, for example.
I think, based on your the rest of your mail, that we're largely on the same page, but I'd like to make sure I understand where you're coming from.
I don't quite follow the definition of protection domain either, as most of the literature I'm familiar with (not much of it about capabilities specifically) talks about a protection domain as the set of objects a principal has access to. The natural way to extend that to capabilities seems to me to be that a protection domain is the set of capabilities possessed by a principal.
That sounds right. The transitive closure of the capabilities possessed by a principal is also interesting, though the code in the objects determines whether you have access to any particular member of that set in practice.
Are these questions off-topic for python-dev?
At any rate, it still seems like there are a variety of ways to realize capabilities in a programming language. For example, ZODB uses a special base class called Persistent to mark persistent objects. One could imagine using the same approach so that only some objects have capabilities associated with them.
This was the approach I took initially, but it's substantially messier than using bound methods.
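For readers following along, the bound-method approach can be sketched in a few lines of present-day Python; the class and names here are illustrative, not taken from any of the code under discussion:

```python
# Hypothetical sketch: a bound method used as a capability. The holder
# of `read_cap` can read records but gets no direct reference to the
# store (ignoring the introspection leaks discussed later in this thread).
class RecordStore:
    def __init__(self):
        self._records = {"42": "secret"}

    def read(self, key):
        return self._records.get(key)

store = RecordStore()
read_cap = store.read      # the capability is just the bound method
print(read_cap("42"))      # -> secret
```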
KPY> The problem for Python, as Jeremy explained, is that there are
KPY> so many other ways of crawling into objects and pulling out
KPY> bits of their internals.
KPY> Off the top of my head, i only see two things that would have
KPY> to be fixed to turn Python into a capability-secure system:
KPY> 1. Access to each object is limited to its declared exposed
KPY> interface; no introspection allowed.
KPY> 2. No global namespace of modules (sys.modules etc.).
KPY> If there is willingness to consider a "secure mode" for Python
KPY> in which these two things are enforced, i would be interested
KPY> in making it happen.
I think there is interest and I agree with your problem statement. I'd rephrase 2 to make it more general: control access to other modules. The import statement is just as much of a problem as sys.modules, right? In a secure environment, you have to control what code can be loaded in the first place.
Correct.
In Python, there is no private.
KPY> Side note (probably irrelevant): in some sense there is, but
KPY> nobody uses it. Scopes are private. If you were to implement
KPY> classes and objects using lambdas with message dispatch
KPY> (i.e. the Scheme way, instead of having a separate "class"
KPY> keyword), then the scoping would take care of all the
KPY> private-ness for you.
I was aware of Rees's dissertation when I did the nested scopes and, partly as a result, did not provide any introspection mechanism for closures. That is, you can get at a function's func_closure slot but there's no way to look inside the cells from Python. I was thinking that closures could replace Bastions. It still seems possible, but on several occasions I've wished I could introspect about closures from Python code. I'm also unsure that the idea flies so well for Python, because you really want secure Python to be as much like regular Python as possible. If the mechanism is based on functions, it seems hard to make it work naturally for classes and instances.
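For flavour, the Scheme-style hiding Ping alludes to looks like this in Python (note that modern CPython does let you reach cell contents through func_closure/__closure__, so treat this as illustration, not a security claim):

```python
# Sketch: state hidden in a closure, with only a capability escaping.
def make_counter():
    count = [0]               # lives in a closure cell
    def increment():
        count[0] += 1
        return count[0]
    return increment          # the only reference that gets out

inc = make_counter()
print(inc(), inc())           # -> 1 2
```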
The Zope proxy approach seems a little more promising, because it centralizes all the security machinery in one object, a security proxy. A proxy for an object can appear virtually indistinguishable from the object itself, except that type(proxy) != type(object_being_proxied). The proxy guarantees that any object returned through the proxy is wrapped in its own proxy, except for simple immutable objects like ints or strings.
KPY> The proxy mechanism is interesting, but not for this purpose.
KPY> A proxy is how you implement revocation of capabilities: if you
KPY> insert a proxy in front of an object and grant access to that
KPY> proxy, then you can revoke the access just by telling the proxy
KPY> to stop responding.
Sure, you can use proxies for revocation, but that's not what I was trying to say.
I think the fundamental problem for rexec is that you don't have a security kernel. The code for security gets scattered throughout the interpreter. It's hard to have much assurance in the security when it's tangled up with everything else in the language.
You can use a proxy for an object to deal with goal #1 above -- enforce an interface for an object. I think about this much like a hardware capability architecture. The protected objects live in the capability segment and regular code can't access them directly. The only access is via a proxy object that is bound to the capability.
Regardless of proxy vs. rexec, I'd be interested to hear what you think about a sound way to engineer a secure Python.
I'm told that proxies actually rely on rexec, too. So, I guess whichever approach you take, you need rexec.

The problem is that although you can think about proxies as being like a segmented architecture, you have to enforce that segmentation. And that means doing so throughout the interpreter, doesn't it? I suppose it might be possible to abstract things in some way to make that less widespread, but probably not without having an adverse impact on speed.

Cheers,

Ben.

--
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff
I'm told that proxies actually rely on rexec, too. So, I guess whichever approach you take, you need rexec.
Yes and no. It's unclear what *you* mean when you say "rexec". There is a standard module by that name that employs Python's support for tighter security and sets up an entire restricted execution environment. And then there's the underlying facilities in Python, which allow you to override __import__ and all other built-ins; this facility is often called "restricted execution." Zope security proxies rely on the latter facilities, but not on the rexec module. I suggest that in order to avoid confusion, you should use "restricted execution" when that's what you mean, and use "rexec" only to refer to the standard module by that name.
The problem is that although you can think about proxies as being like a segmented architecture, you have to enforce that segmentation. And that means doing so throughout the interpreter, doesn't it? I suppose it might be possible to abstract things in some way to make that less widespread, but probably not without having an adverse impact on speed.
The built-in restricted execution facilities indeed do distinguish between two security domains: restricted and unrestricted. In restricted mode, certain introspection APIs are disallowed. Restricted execution is enabled as soon as a particular global scope's __builtins__ is not the standard __builtins__, which is by definition the __dict__ of the __builtin__ module (note __builtin__, which is a module, vs. __builtins__, which is a global).

--Guido van Rossum (home page: http://www.python.org/~guido/)
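Concretely, the trigger Guido describes is just a different __builtins__ in a code object's globals. A sketch of the mechanism (note: CPython has since dropped restricted mode, so today this only controls which built-in names the code can see and is not a sandbox by itself):

```python
# Give untrusted code a globals dict whose __builtins__ is NOT the
# standard one; only the names we grant are reachable.
safe_builtins = {"len": len, "range": range}
namespace = {"__builtins__": safe_builtins}

exec("n = len(range(10))", namespace)
print(namespace["n"])                     # -> 10

try:
    exec("open('/etc/passwd')", namespace)  # open was not granted
except NameError as e:
    print("blocked:", e)
```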
Guido van Rossum wrote:
I'm told that proxies actually rely on rexec, too. So, I guess whichever approach you take, you need rexec.
Yes and no. It's unclear what *you* mean when you say "rexec". There is a standard module by that name that employs Python's support for tighter security and sets up an entire restricted execution environment. And then there's the underlying facilities in Python, which allow you to override __import__ and all other built-ins; this facility is often called "restricted execution." Zope security proxies rely on the latter facilities, but not on the rexec module.
I suggest that in order to avoid confusion, you should use "restricted execution" when that's what you mean, and use "rexec" only to refer to the standard module by that name.
OK, I mean restricted execution.
The problem is that although you can think about proxies as being like a segmented architecture, you have to enforce that segmentation. And that means doing so throughout the interpreter, doesn't it? I suppose it might be possible to abstract things in some way to make that less widespread, but probably not without having an adverse impact on speed.
The built-in restricted execution facilities indeed do distinguish between two security domains: restricted and unrestricted. In restricted mode, certain introspection APIs are disallowed. Restricted execution is enabled as soon as a particular global scope's __builtins__ is not the standard __builtins__, which is by definition the __dict__ of the __builtin__ module (note __builtin__, which is a module, vs. __builtins__, which is a global).
Oh, I understand that, but the complaint was that it is spread all over the interpreter. One of the nice things about hardware enforced segmentation is that you have a high assurance that it really is segmented.

Cheers,

Ben.

--
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff
On Mon, 2003-03-03 at 08:42, Ben Laurie wrote:
I think the fundamental problem for rexec is that you don't have a security kernel. The code for security gets scattered throughout the interpreter. It's hard to have much assurance in the security when it's tangled up with everything else in the language.
You can use a proxy for an object to deal with goal #1 above -- enforce an interface for an object. I think about this much like a hardware capability architecture. The protected objects live in the capability segment and regular code can't access them directly. The only access is via a proxy object that is bound to the capability.
Regardless of proxy vs. rexec, I'd be interested to hear what you think about a sound way to engineer a secure Python.
I'm told that proxies actually rely on rexec, too. So, I guess whichever approach you take, you need rexec.
The problem is that although you can think about proxies as being like a segmented architecture, you have to enforce that segmentation. And that means doing so throughout the interpreter, doesn't it? I suppose it might be possible to abstract things in some way to make that less widespread, but probably not without having an adverse impact on speed.
The boundary between the interpreter and the proxy is the generic type object API. The Python code does not know anything about the representation of a proxy object, except that it is a PyObject *. As a result, the only way to invoke operations on it is to go through the various APIs in the type object's table of function pointers.

There are surely limits to how far the separation can go. I expect you can't inherit from a proxy for a class, such that the base class is in a different protection domain than the subclass. But I think there are fewer ad hoc restrictions than there are in rexec.

I think this provides a pretty clean separation of concerns, even if the proxy object were a standard part of Python. The only code that should manipulate the proxy representation is its implementation. The only other step would be to convince yourself that Python does not inspect arbitrary parts of a concrete PyObject * in an unsafe way.

Jeremy
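A pure-Python caricature of that design, for flavour only (Zope's real proxy is a C type and far more thorough; this sketch still leaks through _obj, special methods, and type(), among other holes):

```python
# Every attribute fetched through the proxy comes back wrapped in a
# proxy of its own, except simple immutables.
_SAFE = (int, float, str, bytes, bool, type(None))

class Proxy(object):
    def __init__(self, obj):
        object.__setattr__(self, "_obj", obj)

    def __getattr__(self, name):
        value = getattr(object.__getattribute__(self, "_obj"), name)
        return value if isinstance(value, _SAFE) else Proxy(value)

class Inner:
    token = "hidden"

class Outer:
    inner = Inner()

p = Proxy(Outer())
print(type(p.inner) is Proxy)   # -> True: objects stay behind proxies
print(p.inner.token)            # -> hidden (a plain str passes through)
```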
Jeremy Hylton wrote:
On Mon, 2003-03-03 at 08:42, Ben Laurie wrote:
I think the fundamental problem for rexec is that you don't have a security kernel. The code for security gets scattered throughout the interpreter. It's hard to have much assurance in the security when it's tangled up with everything else in the language.
You can use a proxy for an object to deal with goal #1 above -- enforce an interface for an object. I think about this much like a hardware capability architecture. The protected objects live in the capability segment and regular code can't access them directly. The only access is via a proxy object that is bound to the capability.
Regardless of proxy vs. rexec, I'd be interested to hear what you think about a sound way to engineer a secure Python.
I'm told that proxies actually rely on rexec, too. So, I guess whichever approach you take, you need rexec.
The problem is that although you can think about proxies as being like a segmented architecture, you have to enforce that segmentation. And that means doing so throughout the interpreter, doesn't it? I suppose it might be possible to abstract things in some way to make that less widespread, but probably not without having an adverse impact on speed.
The boundary between the interpreter and the proxy is the generic type object API. The Python code does not know anything about the representation of a proxy object, except that it is a PyObject *. As a result, the only way to invoke operations on it is to go through the various APIs in the type object's table of function pointers.
There are surely limits to how far the separation can go. I expect you can't inherit from a proxy for a class, such that the base class is in a different protection domain than the subclass. But I think there are fewer ad hoc restrictions than there are in rexec.
I think this provides a pretty clean separation of concerns, even if the proxy object were a standard part of Python. The only code that should manipulate the proxy representation is its implementation. The only other step would be to convince yourself that Python does not inspect arbitrary parts of a concrete PyObject * in an unsafe way.
I'm obviously missing something - surely you can say pretty much exactly the same thing about a bound method, just replace "type object" with "PyMethodObject"? And in either case, you also need to restrict access to the underlying libraries and (presumably) some of the builtin functions?

BTW, Guido pointed out to me that I'm causing confusion by saying "rexec" when I really mean "restricted execution".

In short, it seems to me that proxies and capabilities via bound methods both do the same basic thing: i.e. prevent inspection of what is behind the capability/proxy. Proxies add access control to decide whether you get to use them or not, whereas in a capability system simple possession of the capability is sufficient (i.e. they are like a proxy where the security check always says "yes"). You do access control using capabilities, instead of inside them.

Am I not understanding proxies?

Cheers,

Ben.

--
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff
Ben Laurie wrote:
Jeremy Hylton wrote:
...
And in either case, you also need to restrict access to the underlying libraries and (presumably) some of the builtin functions?
You don't need restricted execution to make proxies work. In Zope, we choose to use restricted execution in cases where proxies don't work well. (For example, as I mentioned in another note, we can't currently proxy exceptions.)
BTW, Guido pointed out to me that I'm causing confusion by saying "rexec" when I really mean "restricted execution".
Right. I think that there is some confusion floating around wrt proxies (not your fault :) ...
In short, it seems to me that proxies and capabilities via bound methods both do the same basic thing: i.e. prevent inspection of what is behind the capability/proxy. Proxies add access control to decide whether you get to use them or not, whereas in a capability system simple possession of the capability is sufficient (i.e. they are like a proxy where the security check always says "yes"). You do access control using capabilities, instead of inside them.
Am I not understanding proxies?
You are understanding proxies as they are *applied* in Zope. This is understandable, since the information I sent you:

http://cvs.zope.org/Zope3/src/zope/security/readme.txt?rev=HEAD&content-type=text/vnd.viewcvs-markup

talks more about the higher-level application of proxies in Zope than about the basic proxy features.

Really, Zope proxies are on about the same level as bound methods. They are a lower-level abstraction than capabilities. You could use them to implement capabilities or you could use them to implement a different approach, as we have done in Zope.

As I mentioned in another note, I think proxies provide a better way to implement capabilities than bound methods because they provide access to objects with whole interfaces, rather than just individual functions or methods.

Jim

--
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (888) 344-4332            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org
[Jim]
You don't need restricted execution to make proxies work.
Um, I think that's a dangerous mistake, or a confusion in terminology. Without restricted execution, untrusted code would have access to sys.modules, and from there it would be able to access removeAllProxies.

--Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
[Jim]
You don't need restricted execution to make proxies work.
Um, I think that's a dangerous mistake, or a confusion in terminology.
All I'm saying is that the proxy mechanism itself doesn't rely on restricted execution.
Without restricted execution, untrusted code would have access to sys.modules, and from there it would be able to access removeAllProxies.
All we need to be able to do is control imports. It turns out that to prevent access to sys.modules, we have to replace __builtins__, which has the side-effect of enabling restricted execution.

You don't need anything but the ability to restrict imports and other unproxied access to sys.modules to use proxies.

Jim

--
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (888) 344-4332            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org
[Jim]
You don't need restricted execution to make proxies work.
[Guido]
Um, I think that's a dangerous mistake, or a confusion in terminology.
[Jim]
All I'm saying is that the proxy mechanism itself doesn't rely on restricted execution.
Without restricted execution, untrusted code would have access to sys.modules, and from there it would be able to access removeAllProxies.
All we need to be able to do is control imports. It turns out that to prevent access to sys.modules, we have to replace __builtins__, which has the side-effect of enabling restricted execution. You don't need anything but the ability to restrict imports and other unproxied access to sys.modules to use proxies.
Turns out this was another terminology misunderstanding. I think of the ability to overload __import__ and set __builtins__ as part of the restricted execution implementation, because that's why they were implemented. Jim thought that these were separate features, and that restricted execution in the interpreter only referred to the closing off of some introspection attributes (e.g. im_self, __dict__ and func_globals).

--Guido van Rossum (home page: http://www.python.org/~guido/)
On Sun, 2003-03-09 at 07:03, Guido van Rossum wrote:
[Jim]
You don't need restricted execution to make proxies work.
Um, I think that's a dangerous mistake, or a confusion in terminology.
Without restricted execution, untrusted code would have access to sys.modules, and from there it would be able to access removeAllProxies.
Guido and I discovered that we were not using the same terminology in our own discussions. Guido suggests the following terms:

rexec -- the rexec module in the Python standard library

restricted execution -- the features in the Python code depending on
PyEval_GetRestricted().

We still need a term to refer to an arbitrary mechanism for providing a secure environment for untrusted code. (I had been using "restricted execution" to mean this.) Perhaps a "safe interpreter"?

Jeremy
Really, Zope proxies are on about the same level as bound methods.
Another difference is that proxies were *designed* for sealing off all access. Bound methods have introspection facilities which allow you to go around them. Restricted execution tries to fence off those introspection facilities, but there may be a hole in the fence.

--Guido van Rossum (home page: http://www.python.org/~guido/)
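The hole Guido means is easy to demonstrate; the attributes are im_self/im_func in Python 2 and __self__/__func__ today (class and names below are invented for illustration):

```python
# A bound method leaks a reference back to the object it is bound to,
# so the "capability" can be unwrapped unless introspection is fenced off.
class Vault:
    secret = "combination"
    def peek(self):
        return "nothing to see"

cap = Vault().peek
print(cap())                   # -> nothing to see
print(cap.__self__.secret)     # -> combination (the hole in the fence)
```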
Jeremy Hylton wrote:
"BL" == Ben Laurie <ben@algroup.co.uk> writes:
BL> I'll admit to being that person. A capability is, in essence, an
BL> opaque bound method. Of course, for them to be much use, you
BL> want the system to provide some other stuff, like not being able
BL> to recreate capabilities (i.e. you get hold of them from on
BL> high, and that's the _only_ way to get them).
That seems like a funny definition of a capability. That is, you seem to have chosen a particular way to represent capabilities and are using the representation to describe the abstract idea. A more general definition might be:

A capability is an object that carries with it the authorization to perform some action.

Informally, a capability can be thought of as a ticket. The ticket is good for some event and possession of the ticket is sufficient to attend the event.
I agree - I was trying to choose an example (just as you have) that would get the flavour of a capability across to any Python programmer.
A capability system must have some rules for creating and copying capabilities, but there is more than one way to realize those rules in a programming language. I assume you're suggesting that methods be thought of as capabilities, so that possession of a bound method object implies permission to invoke it. That seems like a reasonable design, but what about classes or functions or instances?
The idea I had was that these are all icing on the cake. If I can secure bound methods, I have what I want, which is something that enforces the properties needed to have a capability that imposes minimum overhead on the programmer. If we also get access to classes, functions or instances in a way that is secure, then that's great. But if we don't, it's not a huge loss.
The problem, which rexec solves after a fashion, is to prevent unauthorized copying of the capabilities, or more specifically of the access rights contained in the capability. That is, there's some object that checks tickets (the reference monitor). It needs to be able to inspect the ticket, but it shouldn't be possible for someone else to use that mechanism to forge a ticket.
I don't like this. There is no "reference monitor" that checks tickets, if you implement as I have suggested. You either have them or you don't. The ticket is the method. The method is the ticket. But, of course, you can have implementations that are less direct, I agree.
The problem for Python is that its introspection features make it impossible (?) for pure Python code to hide something. In Java, you could declare an instance variable private and know that the type system will prevent client code from accessing the variable directly. In Python, there is no private.
Rexec provides a way to turn off some of the introspection in order to allow some confinement. If you can't extract the im_class, im_func, and im_self attributes of a bound method, then you can use a bound method as a capability without risk that the holder will break into the object. On the other hand, if you want to use some other kind of object as a capability, you must be sure that there isn't some introspection mechanism that allows the holder to get into the representation. If there is, rexec needs to turn it off.
Quite so.
The problem with rexec is that the security code is smeared across the entire interpreter. Each object or introspection facility has to have a little bit of code that participates in rexec. And every change to the language needs to take rexec into account. What's worse is that rexec turns off a set of introspection features, so you can't use any of those features in code that might get loaded in rexec.
I agree that this is a problem.
The Zope proxy approach seems a little more promising, because it centralizes all the security machinery in one object, a security proxy. A proxy for an object can appear virtually indistinguishable from the object itself, except that type(proxy) != type(object_being_proxied). The proxy guarantees that any object returned through the proxy is wrapped in its own proxy, except for simple immutable objects like ints or strings.
This approach seems promising because it is fairly self-contained. The proxy code can be largely independent of the rest of the interpreter, so that you can analyze it without scouring the entire source tree. It's also quite flexible. If you want to use instances as capabilities, you could, for example, use a proxy that only allows access to methods, not to instance variables. It's a simple mechanism that allows many policies, as opposed to rexec which couples policy and mechanism.
I will admit to not being thoroughly familiar with Zope proxies, but I'd love to be persuaded. Right now, I have two instant issues with them:

a) They do not appear to be simple to use, in the slightest. One of the beauties of using opaque bound methods is that they are trivial and natural to use. No programmer should find it difficult, or even a noticeable overhead. In fact, I observed whilst doing limited experiments in this area that it led to what seemed to me to be an improvement in style (for example, if I wanted to have something read a configuration file, rather than passing the name of the file, a capability approach is to pass a file reader already bound to that file - this is, IMO, rather more elegant).

b) I don't understand how they avoid all the problems that are left lying around in the language itself without rexec. I did look at the Zope proxy stuff. I found it very hard to understand, so I may well be totally missing the point.

Cheers,

Ben.

--
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff
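Ben's configuration-file example in point a) would look something like this (file contents and names invented for illustration; a real caller would pass open(path).read, StringIO just keeps the sketch self-contained):

```python
import io

def load_config(read):
    # This function's entire authority is the `read` capability
    # it was handed; it never sees a path or the filesystem.
    return read()

config_file = io.StringIO("debug = true\n")
config = load_config(config_file.read)
print(config)                  # -> debug = true
```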
[Guido van Rossum]
Actually I was attempting to find a solution not just for properties but for other situations as well. E.g. someone might want to define capabilities, or event handlers, or ...
[Greg Ewing]
But anyway, here's another idea:
def foo as property:
    def __get__(self): ...
    def __set__(self, x): ...
[Guido van Rossum]
I don't like things that reuse 'def', unless the existing 'def' is a special case and not just an alternative branch in the grammar.
I think Greg's idea ("as" or "is") could satisfy these conditions well (generic solution, and unadorned "def" as special case).

A standard function definition (simple, unadorned "def") is equivalent to:

    def f as function: suite

A standard method definition is as above, or if a distinction is required or useful it could be equivalent to:

    def m(self) as method: suite

This syntax could be used to remove a recent wart, making generators explicit:

    def g as generator: suite

Of course, it would be a syntax error if the generator suite didn't contain a "yield" statement.

This syntax has aesthetic appeal. I know nothing about its implementability. What namespace these new modifier terms would live in, I also don't know. The syntax proposal reads well and seems like it could be a good general-purpose solution. +1

--
David Goodger    http://starship.python.net/~goodger
Programmer/sysadmin for hire: http://starship.python.net/~goodger/cv
David Goodger <goodger@python.org> writes:
This syntax could be used to remove a recent wart, making generators explicit:
def g as generator: suite
Of course, it would be a syntax error if the generator suite didn't contain a "yield" statement.
You have to talk faster than that to convince me this is a wart.

Cheers,
M.

--
  Finding a needle in a haystack is a lot easier if you burn down the
  haystack and scan the ashes with a metal detector.
      -- the Silicon Valley Tarot (another one nicked from David Rush)
[Guido van Rossum]
Actually I was attempting to find a solution not just for properties but for other situations as well. E.g. someone might want to define capabilities, or event handlers, or ...
[Greg Ewing]
But anyway, here's another idea:
def foo as property:
    def __get__(self): ...
    def __set__(self, x): ...
[Guido van Rossum]
I don't like things that reuse 'def', unless the existing 'def' is a special case and not just an alternative branch in the grammar.
I think Greg's idea ("as" or "is") could satisfy these conditions well (generic solution, and unadorned "def" as special case).
A standard function definition (simple, unadorned "def") is equivalent to:
def f as function: suite
In this syntax, where does the list of formal parameters go? Also, this requires that the scope rules for the suite are exactly as they currently are for functions (including references to non-locals). That's all fine and dandy, but then I don't understand how the poor implementation of property is going to extract __get__ etc. from the local variables of the function body after executing it.
A standard method definition is as above, or if a distinction is required or useful it could be equivalent to:
def m(self) as method: suite
OK, I can buy that. I've got a little bit of philosophy on the use of keywords vs. punctuation. While punctuation is concise, and often traditional for things like arithmetic operations, giving punctuation too much power can lead to loss of readability. (So yes, I regret significant trailing commas in print and tuples -- just a little bit.) So I can see the attraction of placing "as foo, bar" instead of "[foo, bar]" between the argument list and the colon.
This syntax could be used to remove a recent wart, making generators explicit:
def g as generator: suite
Of course, it would be a syntax error if the generator suite didn't contain a "yield" statement.
But then 'generator' would have to be recognized by the parser as a magic keyword. I thought that the idea was that there could be a list of arbitrary expressions after the 'as' (or inside the square brackets). If it has to be keywords, you lose lots of flexibility.
This syntax has aesthetic appeal. I know nothing about its implementability.
That's too bad, because implementabilty makes or breaks a language feature.
What namespace these new modifier terms would live in, I also don't know. The syntax proposal reads well and seems like it could be a good general-purpose solution.
The devil is in the namespace details. Unless we can sort those out, all proposals are created equal.

--Guido van Rossum (home page: http://www.python.org/~guido/)
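For what it's worth, one runtime-level answer to Guido's extraction question can be sketched in today's Python: execute the suite into a dict and let the "kind" callable fish out __get__/__set__ itself. The def_as helper name is invented; this shows only the runtime half of the proposal, not the syntax.

```python
# How `property` as a "kind" could consume a dict populated by
# executing the suite.
def def_as(kind, ns):
    return kind(ns.get("__get__"), ns.get("__set__"))

ns = {}
exec("def __get__(self): return self._x\n"
     "def __set__(self, v): self._x = v", {}, ns)

class C:
    x = def_as(property, ns)

c = C()
c.x = 5
print(c.x)    # -> 5
```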
On Fri, Jan 31, 2003, Guido van Rossum wrote:
David Goodger:
A standard function definition (simple, unadorned "def") is equivalent to:
def f as function: suite
In this syntax, where does the list of formal parameters go?
def f(spam, eggs) as function: suite
Also, this requires that the scope rules for the suite are exactly as they currently are for functions (including references to non-locals).
That's all fine and dandy, but then I don't understand how the poor implementation of property is going to extract __get__ etc. from the local variables of the function body after executing it.
Sort of. Each keyword can handle the thunk differently. For the property keyword, it'd be handled more similarly to a class. In fact, class then becomes:

    def C as class: suite
This syntax could be used to remove a recent wart, making generators explicit:
def g as generator: suite
Of course, it would be a syntax error if the generator suite didn't contain a "yield" statement.
But then 'generator' would have to be recognized by the parser as a magic keyword. I thought that the idea was that there could be a list of arbitrary expressions after the 'as' (or inside the square brackets). If it has to be keywords, you lose lots of flexibility.
You allow both. The square brackets operate on the thunk (which is probably all that's needed for classmethod and staticmethod); different kinds of thunk parsing/compiling require keywords:

    def C as class:
        def foo(x) [staticmethod]:
            return x*2
        def bar(self, y) as generator:
            for i in y:
                yield self.foo(i)
        def spam as property [classmethod]:
            def __get__(cls):
                ...

The classmethod object would of course need to be able to differentiate thunk kinds, assuming you wanted to allow this at all -- but it doesn't require a syntax change to add class properties if they didn't exist.
What namespace these new modifier terms would live in, I also don't know. The syntax proposal reads well and seems like it could be a good general-purpose solution.
The devil is in the namespace details. Unless we can sort those out, all proposals are created equal.
The keywords go in the "defining thunk" namespace. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "Argue for your limitations, and sure enough they're yours." --Richard Bach
That's all fine and dandy, but then I don't understand how the poor implementation of property is going to extract __get__ etc. from the local variables of the function body after executing it.
Sort of. Each keyword can handle the thunk differently. For the property keyword, it'd be handled more similarly to a class. In fact, class then becomes
def C as class: suite
Um, the whole point of this syntax is that property, synchronized etc. do *not* have to be keywords -- they are just callables. So the compiler cannot look at what the thunk is used for. We need uniform treatment of all thunks. (Even if it means that the thunk's consumer has to work a little harder.)
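Guido's "the thunk's consumer has to work a little harder" can be illustrated in current Python by abusing a class suite as the thunk and letting a small helper dig the accessors out of the resulting namespace. A sketch; make_property and the fget/fset names are illustrative, not an existing API:

```python
def make_property(cls):
    # The class suite played the role of the thunk; its namespace
    # dict is where the accessor functions ended up. The consumer
    # (this helper) does the extra work of pulling them out.
    ns = cls.__dict__
    return property(ns.get("fget"), ns.get("fset"), ns.get("fdel"),
                    ns.get("__doc__"))

class Point:
    def __init__(self):
        self._x = 0

    @make_property
    class x:
        "the x coordinate"
        def fget(self):
            return self._x
        def fset(self, value):
            self._x = value
```

Here `p = Point(); p.x = 5` goes through fset, and `p.x` through fget, exactly as if a thunk-based property syntax had been used.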
But then 'generator' would have to be recognized by the parser as a magic keyword. I thought that the idea was that there could be a list of arbitrary expressions after the 'as' (or inside the square brackets). If it has to be keywords, you lose lots of flexibility.
You allow both. The square brackets operate on the thunk (which is probably all that's needed for classmethod and staticmethod); different kinds of thunk parsing/compiling require keywords:
    def C as class:
        def foo(x) [staticmethod]:
            return x*2
        def bar(self, y) as generator:
            for i in y:
                yield self.foo(i)
        def spam as property [classmethod]:
            def __get__(cls):
                ...
I don't think this is a problem we're trying to solve. --Guido van Rossum (home page: http://www.python.org/~guido/)
From: "Guido van Rossum" <guido@python.org>
So the compiler cannot look at what the thunk is used for. We need uniform treatment of all thunks. (Even if it means that the thunk's consumer has to work a little harder.)
is it a correct assumption that generalized thunks will be it, and so arguing against them is wasting my time?
From: "Guido van Rossum" <guido@python.org>
So the compiler cannot look at what the thunk is used for. We need uniform treatment of all thunks. (Even if it means that the thunk's consumer has to work a little harder.)
is it a correct assumption that generalized thunks will be it, and so arguing against them is wasting my time?
Not at all. This is still wide open. I happen to like generalized thunks because they remind me of Ruby blocks. But I realize there are more ways to skin this cat. Keep it coming! And remember, I'm arguing in my spare time. :-) --Guido van Rossum (home page: http://www.python.org/~guido/)
From: "Guido van Rossum" <guido@python.org>
From: "Guido van Rossum" <guido@python.org>
So the compiler cannot look at what the thunk is used for. We need uniform treatment of all thunks. (Even if it means that the thunk's consumer has to work a little harder.)
is it a correct assumption that generalized thunks will be it, and so arguing against them is wasting my time?
Not at all. This is still wide open. I happen to like generalized thunks because they remind me of Ruby blocks. But I realize there are more ways to skin this cat. Keep it coming!
question, do you want thunks to be able to take arguments? is being able to write something like this a target?

    iterate(list): (x):
        print x
Not at all. This is still wide open. I happen to like generalized thunks because they remind me of Ruby blocks. But I realize there are more ways to skin this cat. Keep it coming!
question, do you want thunks to be able to take arguments? is being able to write something like this a target?
    iterate(list): (x):
        print x
Since we already have a for loop, I'm not sure what the use case would be, but it might be nice, yes. Feel free to call YAGNI on it until a better use case is found. But I imagine it would be easy enough to add on later in the design -- thunks can have free variables anyway, so providing a way to bind some of those ahead of time (and designate them as bindable) should be easy. --Guido van Rossum (home page: http://www.python.org/~guido/)
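For reference, the thunk-with-arguments pattern asked about above can be approximated in current Python by passing the block as a callable; the decorator spelling below is a known idiom for making the block's body read like a suite. The iterate helper here is a sketch, not a real builtin:

```python
def iterate(seq):
    # Returns a decorator that immediately runs the decorated block
    # once per item, approximating:  iterate(seq): (x): <body>
    def run(block):
        for item in seq:
            block(item)
        return block
    return run

seen = []

@iterate([1, 2, 3])
def _(x):
    seen.append(x)

assert seen == [1, 2, 3]
```

The block is a closure, so it sees the enclosing function's variables for free; what it cannot do is rebind them or use break/return non-locally, which is exactly the gap the thunk proposals are probing.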
From: "Guido van Rossum" <guido@python.org>
Not at all. This is still wide open. I happen to like generalized thunks because they remind me of Ruby blocks. But I realize there are more ways to skin this cat. Keep it coming!
question, do you want thunks to be able to take arguments? is being able to write something like this a target?
    iterate(list): (x):
        print x
Since we already have a for loop, I'm not sure what the use case would be, but it might be nice, yes. Feel free to call YAGNI on it until a better use case is found. But I imagine it would be easy enough to add on later in the design -- thunks can have free variables anyway, so providing a way to bind some of those ahead of time (and designate them as bindable) should be easy.
Go for it, write a PEP. Seriously: if redundancy is not an issue for you, then it's just a matter of taste (and I don't want to argue about that) and readability/transparency of what some code does (maybe someone else will argue about that). Some final remarks from my part:
- With nested scope cells any kind of scoping can be achieved, so that's not a problem, up to speed-wise unbalance with normal inline suites.
- Non-local 'return' and 'break' can be achieved with special exceptions and uniquely identifying thunk sites, BUT yield inside a thunk cannot be implemented in Jython.
- I think that there should be a *distinguishable* syntax for introducing/using thunks with inline-suite scoping vs. 'class'-like scoping. I'm very very very uncomfortable with the idea of variable-geometry scoping, which means that I would have to go read some hairy code defining 'foo' in order to know:

    def f():
        x=3
        foo:
            x=2

whether the second x= is modifying the local x to f, or is just local to the foo suite.
- For completeness you may want to add a keyword to directly return something from a thunk to its caller, instead of non-locally returning with 'return' from the thunk site:

    def f(lst):
        iterate(lst): (x):
            if pred(x):
                return x   # non-local return out of f

    def change(lst):
        substitute(lst): (x):
            value 2*x      # return value to thunk caller

    L=[1,2,3]
    change(L)
    # L is now [2,4,6]
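Samuele's exception-based non-local return can be sketched in present-day Python: the thunk becomes a nested function, and a private exception carries the value out past the consumer. The names iterate, find_first, and _NonLocalReturn are illustrative, not an actual API:

```python
class _NonLocalReturn(Exception):
    """Carries a value out of a thunk to the enclosing function."""
    def __init__(self, value):
        self.value = value

def iterate(seq, block):
    # Stand-in for the proposed thunk-calling 'iterate'.
    for item in seq:
        block(item)

def find_first(lst, pred):
    # Emulates:  iterate(lst): (x): if pred(x): return x
    def block(x):
        if pred(x):
            raise _NonLocalReturn(x)   # non-local return out of find_first
    try:
        iterate(lst, block)
    except _NonLocalReturn as r:
        return r.value
    return None

assert find_first([1, 2, 3, 4], lambda x: x % 2 == 0) == 2
```

Uniquely identifying the thunk site (so only the right frame catches the exception) is the part this sketch glosses over, as Samuele notes.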
From: "Samuele Pedroni" <pedronis@bluewin.ch>
- I think that there should be a *distinguishable* syntax for introducing/using thunks with inline-suite scoping vs. 'class'-like scoping. I'm very very very uncomfortable with the idea of variable-geometry scoping, which means that I would have to go read some hairy code defining 'foo' in order to know:
    def f():
        x=3
        foo:
            x=2
whether the second x= is modifying the local x to f, or is just local to the foo suite.
So syntax-wise I personally would go for:

    do iterate(lst): (x):
        print x

introducing a new keyword 'do' (or something similar), which would imply inline-suite-like scoping. And

    [KEYW-TO-ESTABLISH-OR-NOTHING] property foo:
        ...

    [KEYW-TO-ESTABLISH-OR-NOTHING] interface.interface I(J,K):
        ...

implying 'class'-like scoping, and maybe not allowing the thunk to take arguments or have break/continue/return(/value) in it. regards
----- Original Message ----- From: "Samuele Pedroni" <pedronis@bluewin.ch>
So syntax-wise I personally would go for:
    do iterate(lst): (x):
        print x
introducing a new-keyword 'do' (or something similar) and that would imply inline-suite-like scoping.
this clarifies scoping. Still, while what 'return' should do is more or less clear cut (non-local return), what break or continue should do is ambiguous:

    for lst in lists:
        do iterate(lst): (x):
            if not x:
                break  # should this break 'do' or 'for'?
            print x

vs

    for lst in lists:
        do synchronized(lock):
            if not x:
                break  # should probably break 'for'
            print x

so maybe we need a way to differentiate also this, so two different keywords. regards.
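The ambiguity Samuele points out can be made concrete with an exception-based emulation: a looping consumer like iterate can swallow the break itself, while a non-looping one like synchronized would let it escape to the caller's loop. All names here are illustrative:

```python
class _BreakLoop(Exception):
    """Stands in for 'break' raised inside a thunk."""

def iterate(seq, block):
    # A looping consumer: it treats a break inside its thunk as
    # breaking the 'do' itself, so the caller's for-loop continues.
    for item in seq:
        try:
            block(item)
        except _BreakLoop:
            return

def synchronized(lock, block):
    # A non-looping consumer: it lets the break escape to the
    # nearest real loop in the caller instead of catching it.
    with lock:
        block()

results = []
for lst in [[1, 0, 2], [3]]:
    def block(x):
        if not x:
            raise _BreakLoop   # ends only this iterate(), not the for
        results.append(x)
    iterate(lst, block)

assert results == [1, 3]
```

Which behavior is right depends entirely on the consumer, which is exactly why Samuele suggests two different keywords.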
Guido:
I happen to like generalized thunks because they remind me of Ruby blocks. But I realize there are more ways to skin this cat.
Seems to me it's more a case of there being several different animals we're considering skinning, and we're trying to find one tool to skin them all, and in the darkness find them... oops, sorry, mixing a metaphor with an allusion... Anyway, I think I agree with the comment made earlier that maybe we're trying to unify too many things. All of these things seem like they would benefit from some kind of code-block mechanism, but have differing namespace requirements. Let's see if I can categorise them:

* Defining a function: The code block executes in a new optimised local namespace. The local namespace is discarded when the block returns.
* Defining a class: The code block executes with a new dictionary as its local namespace. The local namespace is retained when the block returns.
* Defining a property: Same as defining a class.
* New control structures: The code block does not have a local namespace of its own, but shares one with the surrounding code.

So there appear to be at least two, and possibly three, different ways that the code block will need to be compiled, depending on its intended usage. Therefore, the intended usage will have to be made known somehow at compile time. David Goodger's solution to this is to have the compiler treat each "as xxx" as a special case. I can understand Guido wanting something more general than this. But it looks to me at the moment like we're going to need about three different syntaxes, one for each of the above code block usages.

(1) We already have one for defining a function:

    def foo(args):
        ...

We just have to be clear that this is *not* equivalent to any instance of (2) below, because the thunk usage is different.

(2) Class-definition-like usages:

    def foo as something:
        ...

Defining a class could be a special case of this, e.g.

    def myclass as type:
        ...

I can understand Guido's reluctance to re-use "def" for this, given that (1) is not a special case of it. But I haven't thought of anything better yet.
The best I've come up with so far is

    namespace foo(something):
        ...

with "namespace" being a new keyword. For example, a property would be defined with

    namespace foo(property):
        ...

But I don't like it much myself for various reasons: it's too verbose, and it wouldn't really sound right calling a property a namespace (it's not; the namespace would only be an implementation detail of the mechanism of its creation).

(3) Control-structure-like usages:

    expression:
        ...
    var = expression:
        ...

Greg Ewing, Computer Science Dept, University of Canterbury, Christchurch, New Zealand
greg@cosc.canterbury.ac.nz
"A citizen of NewZealandCorp, a wholly-owned subsidiary of USA Inc."
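Greg's "class-like" category can be demonstrated in today's Python: a class suite runs with a fresh dictionary as its locals, and that dictionary is retained afterwards, which is essentially what exec with an explicit namespace does. A minimal sketch:

```python
# Class-like scoping: run a suite in a fresh dict and keep the dict.
suite = """
x = 2
y = x * 3
"""
ns = {}
exec(suite, {"__builtins__": {}}, ns)
assert ns == {"x": 2, "y": 6}

# Function-like scoping discards its locals instead:
def f():
    x = 2   # local; gone once the call returns
f()
```

The third, "shares the surrounding namespace" category has no such runtime emulation; it genuinely requires compiling the block against the enclosing scope, which is why the intended usage must be known at compile time.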
On Fri, Jan 31, 2003, Guido van Rossum wrote:
Aahz:
Guido:
That's all fine and dandy, but then I don't understand how the poor implementation of property is going to extract __get__ etc. from the local variables of the function body after executing it.
Sort of. Each keyword can handle the thunk differently. For the property keyword, it'd be handled more similarly to a class. In fact, class then becomes
def C as class: suite
Um, the whole point of this syntax is that property, synchronized etc. do *not* have to be keywords -- they are just callables.
So the compiler cannot look at what the thunk is used for. We need uniform treatment of all thunks. (Even if it means that the thunk's consumer has to work a little harder.)
I'm not sure I agree with this assertion. I think that there *do* need to be different kinds of thunks. Perhaps it's only necessary for there to be two or three different kinds of generic thunks (at least "creates new local scope" and "does not create new local scope"). Note that what follows after the "as" need not be an actual Python keyword, but it does need to be something significant to the parser/compiler, similar to the way "from __future__" works. I think that requiring thunk types to be built into the language is reasonable. If we're going the generic route, we could do something like this:

    class C:
        def foo as (immediate, local) [property]:
            def __get__(self):
                ...

I suppose we could agree that all thunks create a new local scope and that all thunks get executed immediately (like classes and modules, but unlike functions), but I would prefer not to build that restriction into the syntax. If we don't agree on this (or on some other set of properties that all thunks have), we can't have uniform treatment of thunks as you desire. It occurs to me that "thunk" would make a good keyword if we're adding one. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "Argue for your limitations, and sure enough they're yours." --Richard Bach
On Fri, Jan 31, 2003, Greg Ewing wrote:
But anyway, here's another idea:
    def foo as property:
        def __get__(self):
            ...
        def __set__(self, x):
            ...
+1 I know Guido has already said he doesn't like this, but it reads very Pythonically to me, and IMO there's already precedent for abusing "def" because of generators. It's too bad this idea didn't show up before generators existed. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "Argue for your limitations, and sure enough they're yours." --Richard Bach
[Samuele]
an alternative (if parseable, I have not fully thought about that) would be to leave out the KEYW-TO-BE and try to parse directly
kind name [ '(' expr,... ')' ] [ maybe [] extended syntax ]:
where kind could be any general expr or better only a qualified name that are NOT keywords, so we would possible have:
property foo: <suite>
interface.interface I(J,K): <suite>
all working as specified like 'class' and with its scope rules.
The problem is that this is much harder for the LL(1) parser; otherwise I like it fine. The name being defined (foo and I above) would be available to the code implementing 'kind', which is useful. What's still missing is a way to add formal parameters to the thunk -- I presume that J and K are evaluated before interface.interface is called. The thunk syntax could be extended to allow this; maybe this can too. E.g.:

    e:(x, y):
        S

would create a thunk with two formal parameters, x and y; and when e calls the thunk, it has to call it with two arguments which will be placed in x and y. But this is a half-baked idea, and the syntax I show here is ambiguous.
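For what it's worth, present-day Python's metaclass machinery gives exactly the 'kind name(bases): suite' behaviour described here: the kind receives the name being defined, the evaluated bases, and the dict populated by executing the suite. A sketch under that reading; InterfaceMeta is an illustrative name, not a real library:

```python
class InterfaceMeta(type):
    # The 'kind' receives the name being defined, the already-evaluated
    # bases, and the namespace dict populated by executing the suite.
    def __new__(mcls, name, bases, ns):
        cls = super().__new__(mcls, name, bases, ns)
        cls.declared = [k for k in ns if not k.startswith("__")]
        return cls

class J(metaclass=InterfaceMeta):
    pass

class K(metaclass=InterfaceMeta):
    pass

# plays the role of:  interface.interface I(J, K): <suite>
class I(J, K, metaclass=InterfaceMeta):
    def ping(self):
        return "pong"
```

After this, `I.declared` is `["ping"]`: the kind saw the suite's contents, just as Guido says the code implementing 'kind' would.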
Control flow statements would still have to be added to the language one by one (I find that ok and pythonic).
I disagree; there's a number of abstractions like synchronized that would be very useful to have.
Also because specifying and implementing implicit thunk with proper scoping and non-local return etc does not (to me) seem worth the complication.
See my other post.
About extending or generalizing function 'def' beyond [] extended syntax, I don't see a compelling case.
Me neither. --Guido van Rossum (home page: http://www.python.org/~guido/)
[Guido van Rossum]
What's still missing is a way to add formal parameters to the thunk -- I presume that J and K are evaluated before interface.interface is called. The thunk syntax could be extended to allow this; maybe this can too. E.g.:
e:(x, y): S
Would something along the lines of::

    foo = e(x,y)::
        S

work? The double colons (assuming the parser can handle this; I don't see why not, but my knowledge of parsers amounts to squat) would allow for an obvious differentiation between a pseudo inline block that is being used in an assignment statement and where a single colon is normally used. And since there is no keyword here specifically delineating that it is a define or something, it kind of lets it look like a call on e, which is sort of what it is doing. A stretch, I know, but hey, every bit helps. =) -Brett
On Thu, 30 Jan 2003, Brett Cannon wrote:
[Guido van Rossum]
What's still missing is a way to add formal parameters to the thunk -- I presume that J and K are evaluated before interface.interface is called. The thunk syntax could be extended to allow this; maybe this can too. E.g.:
e:(x, y): S
Would something along the lines of::
    foo = e(x,y)::
        S
What about:

    foo = e:
        : x,y
        S

? Or something like the above but with some keyword:

    foo = e:
        with x,y
        S

or

    foo = e:
        for x,y
        S

(where 'with' or 'for' work like global). Anything else on the same line as "foo = e" is not nice, IMHO. Sincerely yours, Roman Suzi -- rnd@onego.ru =\= My AI powered by Linux RedHat 7.3
participants (19)
- Aahz
- barry@python.org
- Ben Laurie
- Brett Cannon
- David Goodger
- Fredrik Lundh
- Greg Ewing
- Guido van Rossum
- holger krekel
- Jeremy Hylton
- jeremy@zope.com
- Jim Fulton
- Ka-Ping Yee
- Michael Hudson
- Moore, Paul
- Neil Schemenauer
- Roman Suzi
- Samuele Pedroni
- Shane Holloway (IEEE)