Re: [Python-ideas] make __closure__ writable

Yury Selivanov wrote:
Which you can do with a decorator. Would this do what you want? def f_with_new_closure(f, closure): return types.FunctionType(f.__code__, f.__globals__, f.__name__, f.__defaults__, closure) Cheers, Mark.

Yes, your approach will work if your decorator is the only one applied. But, as I said, if you have many of them (see below), you can't just return a new function out of your decorator, you need to change the underlying "in-place". Consider the following: def modifier(func): orig_func = func while func.__wrapped__: func = func.__wrapped__ # patch func.__code__ and func.__closure__ return orig_func # no need to wrap anything def some_decorator(func): def wrapper(*args, **kwargs): # some code return func(*args, **kwargs) functools.wraps(wrapper, func) return wrapper @modifier @some_decorator def foo(): # this code needs to be verified/augmented/etc So, in the above snippet, if you don't want to discard the @some_decorator by returning a new function object, you need to modify the 'foo' from the @modifier. In a complex framework, where you can't guarantee that your magic decorator will always be called first, rewriting the __closure__ attribute is the only way. Again, since the __code__ attribute is modifiable, and __closure__ works in tight conjunction with it, I see no point in protecting it. - Yury On 2012-03-16, at 3:24 PM, Mark Shannon wrote:

On 2012-03-16, at 9:44 PM, Andrew Svetlov wrote:
I'm ok with mutable __closure__ but can you point the real use case?
Well, we need mutable __closure__ to be able to inject some constants or objects to the namespace the corresponding __code__ works with. If you want to know why on earth did we need to mess with the __code__ object at all, that's to control the execution of the 'finally' statement in generator-based coroutines. We modify the __code__ of generators to signal when they are executing in their 'finally' blocks, and when they are, we never abort the execution (by timeout, for instance). The code we inject needs to call one function, and for now, we just inject that function to the generator's __globals__, but the cleaner solution would be to just modify its __closure__. BTW, that's the real problem many coroutine-based frameworks will encounter some day. While this may all sound too complicated, the case is real. We had an option to either patch CPython (and later PyPy), or to inject the needed opcodes in the __code__ object directly. We found that the latter is more preferable. So as I said: I see no reason in protecting the __closure__ attribute, when the __code__ object is writeable. - Yury Selivanov http://sprymix.com

I've created an issue: http://bugs.python.org/issue14369 - Yury

Yury Selivanov wrote:
I've created an issue: http://bugs.python.org/issue14369
I think that creating an issue may be premature, given that you have had no positive feedback on the idea. I still think making __closure__ mutable is unnecessary. If you insist that it is it, then please provide an example which would work with your proposed change, but cannot be made to work using types.FunctionType() to create a new closure. Cheers, Mark.

I did provide such example earlier in this thread. I'm copying and pasting it to this mail. Please read the example carefully, as it explains why returning new types.FunctionType() is not enough. ---- Yes, your approach will work if your decorator is the only one applied. But, as I said, if you have many of them (see below), you can't just return a new function out of your decorator, you need to change the underlying "in-place". Consider the following: def modifier(func): orig_func = func while func.__wrapped__: func = func.__wrapped__ # patch func.__code__ and func.__closure__ return orig_func # no need to wrap anything def some_decorator(func): def wrapper(*args, **kwargs): # some code return func(*args, **kwargs) functools.wraps(wrapper, func) return wrapper @modifier @some_decorator def foo(): # this code needs to be verified/augmented/etc So, in the above snippet, if you don't want to discard the @some_decorator by returning a new function object, you need to modify the 'foo' from the @modifier. In a complex framework, where you can't guarantee that your magic decorator will always be called first, rewriting the __closure__ attribute is the only way. Again, since the __code__ attribute is modifiable, and __closure__ works in tight conjunction with it, I see no point in protecting it. On 2012-03-20, at 5:34 AM, Mark Shannon wrote:

On Tue, Mar 20, 2012 at 11:06 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Again, since the __code__ attribute is modifiable, and __closure__ works in tight conjunction with it, I see no point in protecting it.
FWIW, I'm current +0 on the idea (based on Yury's reasoning here about updating already wrapped functions), but it's going to be a while before I can dig into the code and see if there are any *good* reasons we protect __closure__ (rather than that protection merely being an implementation artifact). I can't recall any off the top of my head, but that part of the code is complex enough that I don't completely trust my memory on that point. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 2012-03-20, at 9:15 AM, Nick Coghlan wrote:
Well, it seems that it is an implementation artifact. I've run across the CPython code, and got an impression that changing __closure__ to be writeable won't break anything. Writing malicious values to it may segfault python, but that's no different from doing so with __code__. In some way, __closure__ is already writeable, since you can pass it to the 'types.FunctionObject', and create a broken function, hence I don't see a reason in making it readonly. The patch attached to the issue in the tracker passes all unittests, and adds one more - specifically for testing __closure__ modification on a live function object. - Yury

On Tue, Mar 20, 2012 at 7:31 AM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
-0 (for now :) "it won't break anything" isn't the only metric that should be considered here. Would this change make it too easy for people to write hard-to-debug code without realizing it? Would this create an attractive nuisance? I don't have enough Guido-fu to answer these questions. Though __code__ is already writable, I don't think it's a good comparison about which to reason. The object bound to __code__ is much more complex (and guarded) than the one bound to __closure__. I expect that it would be much easier to do the wrong thing with __closure__. As well, it seems to me that hacking __closure__ would be much more tempting. Will we see a "significantly" higher number of bugs about segfaults where we have to respond with "don't do that"? Probably not. But should any solution here guard (at some expense) against such mistakes that currently are much more difficult to make? Nick already alluded to double-checking the code somewhat to that effect. I'm not opposed in principle to making __closure__ writable, but worry that not all the angles are being considered here. -eric

On Fri, Mar 23, 2012 at 1:44 PM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
Yes, while I'm in favour of the writable closure attribute idea in principle, the details of how we access the closure array are the kind of thing I'm worried about when I say I need to check the source code before commenting on the implementation details. Setting "f.__closure__ = []" is a lot easier than crafting the necessary bytecode to cause problems with the current setup, so "Can the new behaviour be abused to segfault CPython with pure Python code?" is exactly the right question to be asking. With Victor's recent work to close some longstanding segfault vulnerabilities, I really don't want us to be adding anything that goes in the other direction. However, I won't be doing that investigation myself until my broadband provider finally finishes setting up the connection at my new place, so if anyone wants to cast an appropriately paranoid eye over Yury's patch in the meantime, please go ahead :) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 2012-03-23, at 12:08 AM, Nick Coghlan wrote:
Well, it does look easier, but on the other hand you can substitute the __code__ object with just two more lines, and it may be as fatal as writing wrong __closure__.
I get the point of segfault vulnerabilities and don't want to introduce any new ways either. Will adding two checks: *) in closure_setter check that all items in the tuple are cells AND that the length of new __closure__ greater or equal than __code__ freevars; *) in code_setter check that __code__ freevars are less or equal to the __closure__ length; be sufficient? This way you won't be able to segfault python, but will have a freedom to manipulate the __closure__ by introducing more vars to it. - Yury

As one of the resident crackpot/idealists I don't know that my +1 means much, but I have a decent amount of code where I jump through a lot of hoops to emulate writable __closure__, including copying a function using FunctionType(), then replace one instance of a function with another multiple places in a function graph; I also do a lot of lambda wrapping and unwrapping for the same purpose. This is primarily relevant relative to symbolic computation graphs, such as dataflow structures, computer algebra systems, etc.

Yury Selivanov wrote:
So why won't this work? def f_with_new_closure(f, closure): return types.FunctionType(f.__code__, f.__globals__, f.__name__, f.__defaults__, closure) def modifier(func, closure): if func.__wrapped__: while func.__wrapped__.__wrapped__: func = func.__wrapped__ func.__wrapped__ = f_with_new_closure(func.__wrapped__, closure) else: return f_with_new_closure(func, closure) if func.__wrapped__: return f_with_new_closure(func, f_with_new_closure(func.__wrapped__)) else: return f_with_new_closure(func, closure) Cheers, Mark.

Because usually you write decorators as functions, not classes. And when you do the former style, you usually do it in the following way: def decorator(func): def wrapper(*args, **kwargs): return func(*args, **kwargs) functools.wraps(wrapper, func) return wrapper Now, let's use it: @decorator def some_func(): pass OK. At this point, 'some_func' object has a '__wrapped__' attribute, that points to original 'some_func' function. But whatever you write to 'some_func.__wrapped__' won't change anything, as the 'wrapper' will continue to call old 'some_func'. Instead of assigning something to __wrapped__, we need to change it in-place, by doing '__wrapped__.__closure__ = new_closure'. On 2012-03-20, at 10:43 AM, Mark Shannon wrote:

On 2012-03-20, at 10:56 AM, Yury Selivanov wrote:
And as I told you in the first example: there is no problem when you have only one decorator. You can surely just return a new FunctionType(). But when you have many of them, such as: @decorator3 @your_magic_decorator_that_modifies_the_closure @decorator2 @decorator1 def some_func(): pass The only way to modify the __closure__ it to write to the __wrapped__ attribute of 'decorator1' wrapper. - Yury

While overriding __closure__ is not common case it's hard to do that if you really need. If __code__ is already writable there are no reason to prevent writing to __closure__ as well. I am +1 to push the patch in 3.3. On Tue, Mar 20, 2012 at 4:59 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
-- Thanks, Andrew Svetlov

On Tue, Mar 20, 2012 at 6:06 AM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Couldn't something like the following work? def modifier(func): """Traverse the decorator "stack" and patch the bottom-most wrapped function.""" # relies on __wrapped__ being set at each level of the decorator stack and on the # wrapped function being bound in func.__closure__. if not hasattr(func, "__wrapped__"): # patch func.__code__ and func.__closure__ code = ... closure = ... else: code = func.__code__ closure = list(func.__closure__) closure[closure.index(func.__wrapped__)] = modifier(func.__wrapped__) return type(func)(code, func.__globals__, func.__name__, func.__defaults__, tuple(closure)) Also, I'm guessing that your actual use-case looks more like the following: from some_third_party_module import foo #assert foo.__wrapped__ == foo.__closure__[0] foo = modifier(foo) # hacks up foo.__wrapped__ Hacking the innards of an existing function object is touchy stuff, probably the riskiest kind of monkey-patching. You're basically taking the chance of breaking (in ugly, unexpected ways) other code that uses that function you just hacked. Still, there are certainly valid use-cases (and we're all consenting adults here). However, I'd hate for projects to start getting blamed for difficult-to-debug problems that are the result of some other project that did this sort of hack. It's nice when your expectations for a function's behavior (or any code for that matter) can remain stable, regardless of what libraries are installed along-side. -eric p.s. I want to reiterate my understanding that nearly everything involving the internals of functions is pretty delicate (read: fragile), in part due to being the focal point for optimization. Hacking it like this is therefore a delicate undertaking and definitely skirts the edges of creating implementation-specific code. Don't shy away. Just be extra cautious.

On 2012-03-23, at 2:19 AM, Eric Snow wrote:
Well, it certainly possible to hack on this level too, but I wouldn't do that ;) The only case I came up with after looking at your code was a way of extracting the decorated function if the decorator didn't set __wrapped__ attribute. But even that is just an idea.
I described our exact use case here: http://mail.python.org/pipermail/python-ideas/2012-March/014550.html
You're right. Once you start to work on that level you are on your own. But if you really forced to decide between the following three options: - introduce some really ugly syntax to support normal try..finally semantics in coroutine - patch cpython and restrict your framework from ever being used by anyone, and introduce additional hassles over deployment - work with python __code__ object to get around the problem I vote for the last option. If you have good unittests you shouldn't experience any problems, at least between python major releases.
Well, that's much more complicated problem. I don't think that making __closure__ writable will anyhow make it worse. Once again, you can screw up the __code__ object even now ;) - Yury

Yes, your approach will work if your decorator is the only one applied. But, as I said, if you have many of them (see below), you can't just return a new function out of your decorator, you need to change the underlying "in-place". Consider the following: def modifier(func): orig_func = func while func.__wrapped__: func = func.__wrapped__ # patch func.__code__ and func.__closure__ return orig_func # no need to wrap anything def some_decorator(func): def wrapper(*args, **kwargs): # some code return func(*args, **kwargs) functools.wraps(wrapper, func) return wrapper @modifier @some_decorator def foo(): # this code needs to be verified/augmented/etc So, in the above snippet, if you don't want to discard the @some_decorator by returning a new function object, you need to modify the 'foo' from the @modifier. In a complex framework, where you can't guarantee that your magic decorator will always be called first, rewriting the __closure__ attribute is the only way. Again, since the __code__ attribute is modifiable, and __closure__ works in tight conjunction with it, I see no point in protecting it. - Yury On 2012-03-16, at 3:24 PM, Mark Shannon wrote:

On 2012-03-16, at 9:44 PM, Andrew Svetlov wrote:
I'm ok with mutable __closure__ but can you point the real use case?
Well, we need mutable __closure__ to be able to inject some constants or objects to the namespace the corresponding __code__ works with. If you want to know why on earth did we need to mess with the __code__ object at all, that's to control the execution of the 'finally' statement in generator-based coroutines. We modify the __code__ of generators to signal when they are executing in their 'finally' blocks, and when they are, we never abort the execution (by timeout, for instance). The code we inject needs to call one function, and for now, we just inject that function to the generator's __globals__, but the cleaner solution would be to just modify its __closure__. BTW, that's the real problem many coroutine-based frameworks will encounter some day. While this may all sound too complicated, the case is real. We had an option to either patch CPython (and later PyPy), or to inject the needed opcodes in the __code__ object directly. We found that the latter is more preferable. So as I said: I see no reason in protecting the __closure__ attribute, when the __code__ object is writeable. - Yury Selivanov http://sprymix.com

I've created an issue: http://bugs.python.org/issue14369 - Yury

Yury Selivanov wrote:
I've created an issue: http://bugs.python.org/issue14369
I think that creating an issue may be premature, given that you have had no positive feedback on the idea. I still think making __closure__ mutable is unnecessary. If you insist that it is it, then please provide an example which would work with your proposed change, but cannot be made to work using types.FunctionType() to create a new closure. Cheers, Mark.

I did provide such example earlier in this thread. I'm copying and pasting it to this mail. Please read the example carefully, as it explains why returning new types.FunctionType() is not enough. ---- Yes, your approach will work if your decorator is the only one applied. But, as I said, if you have many of them (see below), you can't just return a new function out of your decorator, you need to change the underlying "in-place". Consider the following: def modifier(func): orig_func = func while func.__wrapped__: func = func.__wrapped__ # patch func.__code__ and func.__closure__ return orig_func # no need to wrap anything def some_decorator(func): def wrapper(*args, **kwargs): # some code return func(*args, **kwargs) functools.wraps(wrapper, func) return wrapper @modifier @some_decorator def foo(): # this code needs to be verified/augmented/etc So, in the above snippet, if you don't want to discard the @some_decorator by returning a new function object, you need to modify the 'foo' from the @modifier. In a complex framework, where you can't guarantee that your magic decorator will always be called first, rewriting the __closure__ attribute is the only way. Again, since the __code__ attribute is modifiable, and __closure__ works in tight conjunction with it, I see no point in protecting it. On 2012-03-20, at 5:34 AM, Mark Shannon wrote:

On Tue, Mar 20, 2012 at 11:06 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Again, since the __code__ attribute is modifiable, and __closure__ works in tight conjunction with it, I see no point in protecting it.
FWIW, I'm current +0 on the idea (based on Yury's reasoning here about updating already wrapped functions), but it's going to be a while before I can dig into the code and see if there are any *good* reasons we protect __closure__ (rather than that protection merely being an implementation artifact). I can't recall any off the top of my head, but that part of the code is complex enough that I don't completely trust my memory on that point. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 2012-03-20, at 9:15 AM, Nick Coghlan wrote:
Well, it seems that it is an implementation artifact. I've run across the CPython code, and got an impression that changing __closure__ to be writeable won't break anything. Writing malicious values to it may segfault python, but that's no different from doing so with __code__. In some way, __closure__ is already writeable, since you can pass it to the 'types.FunctionObject', and create a broken function, hence I don't see a reason in making it readonly. The patch attached to the issue in the tracker passes all unittests, and adds one more - specifically for testing __closure__ modification on a live function object. - Yury

On Tue, Mar 20, 2012 at 7:31 AM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
-0 (for now :) "it won't break anything" isn't the only metric that should be considered here. Would this change make it too easy for people to write hard-to-debug code without realizing it? Would this create an attractive nuisance? I don't have enough Guido-fu to answer these questions. Though __code__ is already writable, I don't think it's a good comparison about which to reason. The object bound to __code__ is much more complex (and guarded) than the one bound to __closure__. I expect that it would be much easier to do the wrong thing with __closure__. As well, it seems to me that hacking __closure__ would be much more tempting. Will we see a "significantly" higher number of bugs about segfaults where we have to respond with "don't do that"? Probably not. But should any solution here guard (at some expense) against such mistakes that currently are much more difficult to make? Nick already alluded to double-checking the code somewhat to that effect. I'm not opposed in principle to making __closure__ writable, but worry that not all the angles are being considered here. -eric

On Fri, Mar 23, 2012 at 1:44 PM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
Yes, while I'm in favour of the writable closure attribute idea in principle, the details of how we access the closure array are the kind of thing I'm worried about when I say I need to check the source code before commenting on the implementation details. Setting "f.__closure__ = []" is a lot easier than crafting the necessary bytecode to cause problems with the current setup, so "Can the new behaviour be abused to segfault CPython with pure Python code?" is exactly the right question to be asking. With Victor's recent work to close some longstanding segfault vulnerabilities, I really don't want us to be adding anything that goes in the other direction. However, I won't be doing that investigation myself until my broadband provider finally finishes setting up the connection at my new place, so if anyone wants to cast an appropriately paranoid eye over Yury's patch in the meantime, please go ahead :) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 2012-03-23, at 12:08 AM, Nick Coghlan wrote:
Well, it does look easier, but on the other hand you can substitute the __code__ object with just two more lines, and it may be as fatal as writing wrong __closure__.
I get the point of segfault vulnerabilities and don't want to introduce any new ways either. Will adding two checks: *) in closure_setter check that all items in the tuple are cells AND that the length of new __closure__ greater or equal than __code__ freevars; *) in code_setter check that __code__ freevars are less or equal to the __closure__ length; be sufficient? This way you won't be able to segfault python, but will have a freedom to manipulate the __closure__ by introducing more vars to it. - Yury

As one of the resident crackpot/idealists I don't know that my +1 means much, but I have a decent amount of code where I jump through a lot of hoops to emulate writable __closure__, including copying a function using FunctionType(), then replace one instance of a function with another multiple places in a function graph; I also do a lot of lambda wrapping and unwrapping for the same purpose. This is primarily relevant relative to symbolic computation graphs, such as dataflow structures, computer algebra systems, etc.

Yury Selivanov wrote:
So why won't this work? def f_with_new_closure(f, closure): return types.FunctionType(f.__code__, f.__globals__, f.__name__, f.__defaults__, closure) def modifier(func, closure): if func.__wrapped__: while func.__wrapped__.__wrapped__: func = func.__wrapped__ func.__wrapped__ = f_with_new_closure(func.__wrapped__, closure) else: return f_with_new_closure(func, closure) if func.__wrapped__: return f_with_new_closure(func, f_with_new_closure(func.__wrapped__)) else: return f_with_new_closure(func, closure) Cheers, Mark.

Because usually you write decorators as functions, not classes. And when you do the former style, you usually do it in the following way: def decorator(func): def wrapper(*args, **kwargs): return func(*args, **kwargs) functools.wraps(wrapper, func) return wrapper Now, let's use it: @decorator def some_func(): pass OK. At this point, 'some_func' object has a '__wrapped__' attribute, that points to original 'some_func' function. But whatever you write to 'some_func.__wrapped__' won't change anything, as the 'wrapper' will continue to call old 'some_func'. Instead of assigning something to __wrapped__, we need to change it in-place, by doing '__wrapped__.__closure__ = new_closure'. On 2012-03-20, at 10:43 AM, Mark Shannon wrote:

On 2012-03-20, at 10:56 AM, Yury Selivanov wrote:
And as I told you in the first example: there is no problem when you have only one decorator. You can surely just return a new FunctionType(). But when you have many of them, such as: @decorator3 @your_magic_decorator_that_modifies_the_closure @decorator2 @decorator1 def some_func(): pass The only way to modify the __closure__ it to write to the __wrapped__ attribute of 'decorator1' wrapper. - Yury

While overriding __closure__ is not common case it's hard to do that if you really need. If __code__ is already writable there are no reason to prevent writing to __closure__ as well. I am +1 to push the patch in 3.3. On Tue, Mar 20, 2012 at 4:59 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
-- Thanks, Andrew Svetlov

On Tue, Mar 20, 2012 at 6:06 AM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Couldn't something like the following work? def modifier(func): """Traverse the decorator "stack" and patch the bottom-most wrapped function.""" # relies on __wrapped__ being set at each level of the decorator stack and on the # wrapped function being bound in func.__closure__. if not hasattr(func, "__wrapped__"): # patch func.__code__ and func.__closure__ code = ... closure = ... else: code = func.__code__ closure = list(func.__closure__) closure[closure.index(func.__wrapped__)] = modifier(func.__wrapped__) return type(func)(code, func.__globals__, func.__name__, func.__defaults__, tuple(closure)) Also, I'm guessing that your actual use-case looks more like the following: from some_third_party_module import foo #assert foo.__wrapped__ == foo.__closure__[0] foo = modifier(foo) # hacks up foo.__wrapped__ Hacking the innards of an existing function object is touchy stuff, probably the riskiest kind of monkey-patching. You're basically taking the chance of breaking (in ugly, unexpected ways) other code that uses that function you just hacked. Still, there are certainly valid use-cases (and we're all consenting adults here). However, I'd hate for projects to start getting blamed for difficult-to-debug problems that are the result of some other project that did this sort of hack. It's nice when your expectations for a function's behavior (or any code for that matter) can remain stable, regardless of what libraries are installed along-side. -eric p.s. I want to reiterate my understanding that nearly everything involving the internals of functions is pretty delicate (read: fragile), in part due to being the focal point for optimization. Hacking it like this is therefore a delicate undertaking and definitely skirts the edges of creating implementation-specific code. Don't shy away. Just be extra cautious.

On 2012-03-23, at 2:19 AM, Eric Snow wrote:
Well, it certainly possible to hack on this level too, but I wouldn't do that ;) The only case I came up with after looking at your code was a way of extracting the decorated function if the decorator didn't set __wrapped__ attribute. But even that is just an idea.
I described our exact use case here: http://mail.python.org/pipermail/python-ideas/2012-March/014550.html
You're right. Once you start to work on that level you are on your own. But if you really forced to decide between the following three options: - introduce some really ugly syntax to support normal try..finally semantics in coroutine - patch cpython and restrict your framework from ever being used by anyone, and introduce additional hassles over deployment - work with python __code__ object to get around the problem I vote for the last option. If you have good unittests you shouldn't experience any problems, at least between python major releases.
Well, that's much more complicated problem. I don't think that making __closure__ writable will anyhow make it worse. Once again, you can screw up the __code__ object even now ;) - Yury
participants (7)
-
Andrew Svetlov
-
Eric Snow
-
Ethan Furman
-
Mark Shannon
-
Nathan Rice
-
Nick Coghlan
-
Yury Selivanov