Overloading assignment concrete proposal (Re: Re: Operator as first class citizens -- like in scala -- or yet another new operator?)
The thread on operators as first-class citizens keeps collecting vague ideas about assignment overloading that wouldn't actually work, or don't even make sense. I think it's worth writing down the simplest design that would actually work, so people can see why it's not a good idea (or explain why they think it would be anyway).

In pseudocode, just as x += y means this:

    xval = globals()['x']
    try:
        result = xval.__iadd__(y)
    except AttributeError:
        result = xval.__add__(y)
    globals()['x'] = result

… x = y would mean this:

    try:
        xval = globals()['x']
        result = xval.__iassign__(y)
    except (LookupError, AttributeError):
        result = y
    globals()['x'] = result

If you don't understand why this would work, or why it wouldn't be a great idea (or want to nitpick details), read on; otherwise, you can skip the rest of this message.

---

First, why is there even a problem? Because Python doesn't even have "variables" in the same sense that languages like C++ that allow assignment overloading do.

In C++, a variable is an "lvalue": a location with identity and type, and an object is just a value that lives in a location. So assignment is an operation on variables: x = 2 is the same as XClass::operator=(&x, 2).

In Python, an object is a value that lives wherever it wants, with identity and type, and a variable is just a name that can be bound to a value in a namespace. So assignment is an operation on namespaces, not on variables: x = 2 is the same as dict.__setitem__(globals(), 'x', 2).

The same thing is true for more complicated assignments. For example, a.x = 2 is just an operation on a's namespace instead of the global namespace: type(a).__setattr__(a, 'x', 2). Likewise, a.b['x'] = 2 is type(a.b).__setitem__(a.b, 'x', 2). And so on.

---

But Python allows overloading augmented assignment. How does that work? There's a perfectly normal namespace lookup at the start and namespace store at the end, but in between, the existing value of the target gets to specify the value being assigned.
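The two pseudocode snippets above can be modeled directly in today's Python, using an explicit namespace dict. To be clear, `Handle` and `__iassign__` below are inventions of this proposal for illustration, not a real Python protocol; `assign` just executes the pseudocode literally:

```python
def assign(ns, name, value):
    # Model of the proposed "x = y": let the existing value intercept
    # the assignment via __iassign__, falling back to a plain store when
    # the name is unbound or the value defines no __iassign__.
    try:
        xval = ns[name]
        result = xval.__iassign__(value)
    except (LookupError, AttributeError):
        result = value
    ns[name] = result

class Handle:
    """Toy transparent-handle type that intercepts assignment (hypothetical)."""
    def __init__(self, target=None):
        self.target = target
    def __iassign__(self, value):
        self.target = value   # mutate in place...
        return self           # ...and get rebound as the same object

ns = {}
assign(ns, 'x', 2)       # 'x' unbound: plain store
h = Handle()
assign(ns, 'h', h)       # 'h' unbound: plain store
assign(ns, 'h', 99)      # intercepted: h.target becomes 99, 'h' stays bound to h
```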
Immutable types like int don't define __iadd__, and __add__ creates and returns a new object. So x += y ends up the same as x = x + y. But mutable types like list define an __iadd__ that mutates self in place and then returns self, so x gets harmlessly rebound to the same object it was already bound to. So x += y ends up the same as x.extend(y); x = x.

The exact same technique would work for overloading normal assignment. The only difference is that x += y is illegal if x is unbound, while x = y obviously has to be legal (and mean there is no value to intercept the assignment). So the fallback happens when xval doesn't define __iassign__, but also when x isn't bound at all.

So, for immutable types like int, for almost all mutable types like list, and when x is unbound, x = y does the same thing it always did. But special types that want to act like transparent mutable handles define an __iassign__ that mutates self in place and returns self, so x gets harmlessly rebound to the same object. So x = y ends up the same as, say, x.set_target(y); x = x.

This all works the same if the variables are local rather than global, or for more complicated targets like attributes or subscripts, and even for target lists; the intercept still happens the same way, between the (more complicated) lookup and storage steps.

---

Now, why is this a bad idea?

First, the benefit of __iassign__ is a lot smaller than that of __iadd__. A sizable fraction of "x += y" statements are for mutable "x" values, but only a rare handful of "x = y" statements would be for special handle "x" values. Even the same cost for a much smaller benefit would be a much harder sell.

But the runtime performance cost difference is huge. If augmented assignment weren't overloadable, it would still have to look up the value, look up and call a special method on it, and store the value. The only cost overloading adds is trying two special methods instead of one, which is tiny.
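The mutable/immutable split described above is easy to observe directly with an alias:

```python
# Mutable: list.__iadd__ extends in place and returns self,
# so the name is rebound to the very same object.
lst = [1, 2]
alias = lst
lst += [3]

# Immutable: int defines no __iadd__, so int.__add__ creates a new
# object and the name is rebound to it; the alias is untouched.
n = 1
m = n
n += 1
```

After this runs, `lst` and `alias` are still the same (now longer) list, while `n` is 2 and `m` is still 1.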
But regular assignment doesn't have to do a value lookup or a special method call at all, only a store; adding those steps would roughly double the cost of every new variable assignment, and add even more for every reassignment. And assignments are very common in Python, even within inner loops, so we're talking about a huge slowdown to almost every program out there.

Also, the fact that assignment always means assignment makes Python code easier both for humans to skim and for automated programs to process. Consider, for example, a static type checker like mypy. Today, x = 2 means that x must now be an int, always. But if x could be a Signal object with an overloaded __iassign__, then x = 2 might mean that x must now be an int, or it might mean that x must now be whatever type(x).__iassign__ returns.

Finally, the complexity of __iassign__ is at least a little higher than that of __iadd__. Notice that in my pseudocode above, I cheated: obviously the xval = and result = lines are not supposed to recursively call the same pseudocode, but to directly store a value in a new temporary local variable. In the real implementation there wouldn't even be such a temporary variable (in CPython, the values would just be pushed on the stack), but for documenting the behavior, teaching it to students, etc., that doesn't matter. Being precise here wouldn't be hugely difficult, but it is a little more difficult than with __iadd__, where there's no similar potential confusion even possible.

On Wednesday, June 19, 2019, 10:54:04 AM PDT, Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:

On Jun 18, 2019, at 12:43, nate lust <natelust@linux.com> wrote:

I have been following this discussion for a long time, and coincidentally I recently started working on a project that could make use of assignment overloading.
(As an aside, it is a configuration system for an astronomical data analysis pipeline that makes heavy use of descriptors to work around historical decisions and backward compatibility.) Our system makes use of nested chains of objects, descriptors, and proxy objects to manage where state is actually stored. The whole system could collapse down nicely if there were assignment overloading. This works OK most of the time, but sometimes at the end of the chain things can become quite complicated. I was new to this code base and tasked with making some additions to it, and wished for an assignment operator, but knew the data binding model of Python was incompatible with it. This got me thinking. I didn't actually need to overload assignment per se; data binding could stay just how it was, but if there were a magic method that worked similarly to how __get__ works for descriptors, but was called on any variable lookup (if the method was defined), it would allow for something akin to assignment.

What counts as "variable lookup"? In particular:

For example:

    class Foo:
        def __init__(self):
            self.value = 6
            self.myself = weakref.ref(self)
        def important_work(self):
            print(self.value)

… why doesn't every one of those "self" lookups call self.__get_self__()? It's a local variable being looked up by name, just like your "foo" below, and it finds the same value, which has the same __get_self__ method on its type. The only viable answer seems to be that it does. So, to avoid infinite circularity, your class needs to use the same kind of workaround used for attribute lookup in classes that define __getattribute__ and/or __setattr__:

    def important_work(self):
        print(object.__get_self__(self).value)

    def __get_self__(self):
        return object.__get_self__(self).myself

But even that won't work here, because you still have to look up self to call the superclass method on it.
I think it would require some new syntax, or at least something horrible involving locals(), to allow you to write the appropriate methods.

    def __get_self__(self):
        return self.myself

Besides recursively calling itself for that "self" lookup, why doesn't this also call weakref.ref.__get_self__ for that "myself" lookup? It's an attribute lookup rather than a local namespace lookup, but surely you need that to work too, or as soon as you store a Foo instance in another object it stops overloading. For this case there's at least an obvious answer: because weakref.ref doesn't override that method, the lookup doesn't get intercepted. But notice that this means every single value access in Python now has to do an extra special-method lookup that almost always does nothing, which is going to be very expensive.

    def __setattr__(self, name, value):
        self.value = value

You can't write __setattr__ methods this way. That assignment statement just calls self.__setattr__('value', value), which will endlessly recurse. That's why you need something like the object method call to break the circularity. Also, this will take over the attribute assignments in your __init__ method. And, because it ignores the name and always sets the value attribute, it means that self.myself = ... is just going to override value rather than setting myself. To solve both of these problems, you want a standard __setattr__ body here:

    def __setattr__(self, name, value):
        object.__setattr__(self, name, value)

But that immediately makes it obvious that your __setattr__ isn't actually doing anything, and could just be left out entirely.

    foo = Foo()   # Create an instance
    foo           # The interpreter would return foo.myself
    foo.value     # The interpreter would return foo.myself.value
    foo = 19      # The interpreter would run foo.myself = 19, which would invoke foo.__setattr__('myself', 19)

For this last one, why would it do that? There's no lookup here at all, only an assignment.
The only way to make this work would be for the interpreter to look up the current value of the target on every assignment before assigning to it, so that the lookup could be overloaded. If that were doable, then assignment would already be overloadable, and this whole discussion wouldn't exist.

But even if you did add that, __get_self__ is just returning the value self.myself, not some kind of reference to it. How can the interpreter figure out that the weakref.ref value it got came from looking up the name "myself" on the Foo instance? (This is the same reason __getattr__ can't help you override attribute setting, and a separate method __setattr__ is needed.) To make this work, you'd need a __set_self__ to go along with __get_self__. Otherwise, your changes not only don't provide a way to do assignment overloading, they'd break assignment overloading if it existed.

Also, all of the extra stuff you're trying to add on top of assignment overloading can already be done today. You just want a transparent proxy: a class whose instances act like a reference to some other object and delegate all methods (and maybe attribute lookups and assignments) to it. This is already pretty easy; you can define __getattr__ (and __setattr__) to do it dynamically, or you can do some clever stuff to create static delegating methods (and properties) explicitly at object-creation or class-creation time. Then foo.value returns foo.myself.value, foo.important_work() calls the Foo method but foo.__str__() calls foo.myself.__str__(), and you can even make it pass isinstance checks if you want. The only thing it can't do is overload assignment.

I think the real problem here is that you're thinking about references to variables rather than values, and overloading operators on variables rather than values, and neither of those makes sense in Python.
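A minimal sketch of the dynamic transparent proxy described above (illustrative only; `Proxy` and `Point` are made-up names, and a production version would also delegate dunder methods defined on the class):

```python
class Proxy:
    """Delegates attribute gets and sets to a wrapped target object."""
    def __init__(self, target):
        # Bypass our own __setattr__ so '_target' lands in the instance dict.
        object.__setattr__(self, '_target', target)
    def __getattr__(self, name):
        # Only called when normal lookup fails, so no recursion on '_target'.
        return getattr(object.__getattribute__(self, '_target'), name)
    def __setattr__(self, name, value):
        # Every attribute store is forwarded to the target.
        setattr(object.__getattribute__(self, '_target'), name, value)

class Point:
    def __init__(self, x):
        self.x = x

p = Proxy(Point(5))
before = p.x   # 5, delegated to the wrapped Point
p.x = 7        # also delegated; the wrapped Point is mutated
```

Note that `p = something_else` still just rebinds the name `p`; assignment itself is the one thing the proxy cannot intercept.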
Looking up, or assigning to, a local variable named "foo" is not an operation on "the foo variable", because there is no such thing; it's an operation on the locals namespace.

_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-leave@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/4JMNZE...
Code of Conduct: http://python.org/psf/codeofconduct/
On Thu, Jun 20, 2019 at 8:14 AM Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
… x = y would mean this:
    try:
        xval = globals()['x']
        result = xval.__iassign__(y)
    except (LookupError, AttributeError):
        result = y
    globals()['x'] = result
... Notice that in my pseudocode above, I cheated—obviously the xval = and result = lines are not supposed to recursively call the same pseudocode, but to directly store a value in a new temporary local variable.
I'm rather curious how this would behave in a class context. Consider the following code:

    num = 10
    lst = [20, 30, 40]

    class Spam:
        num += 1
        lst += [50]

    print(num, lst, Spam.num, Spam.lst)

Do you know what this will do in current Python, and what is your intention for this situation if we add a third name that uses the new __iassign__ protocol?

ChrisA
On Jun 19, 2019, at 16:57, Chris Angelico <rosuav@gmail.com> wrote:
I'm rather curious how this would behave in a class context. Consider the following code:
    num = 10
    lst = [20, 30, 40]

    class Spam:
        num += 1
        lst += [50]

    print(num, lst, Spam.num, Spam.lst)
Do you know what this will do in current Python, and what is your intention for this situation if we add a third name that uses the new __iassign__ protocol?
At least with CPython, I'm 99% sure that (unless you do something weird like exec this class body with custom namespaces) you'll get 10, [20, 30, 40, 50], 11, and [20, 30, 40, 50]. The reason is that each += will compile to LOAD_NAME/INPLACE_ADD/STORE_NAME, and LOAD_NAME will fall back to the global when executed. I'm not quite as sure that Python the language definition requires that behavior, but I'd be a little surprised if any major implementation did anything different.

So, I suppose this could be a problem:

    sig = Signal()  # a type that defines __iassign__

    class Spam:
        sig = 2

Here, the plausible alternative to calling sig.__iassign__(2) and then creating a Spam.sig that binds the result (which should be identical to sig, as with lst above) isn't a NameError, but creating an unrelated Spam.sig bound to 2. And people might actually want or expect that. I'm not sure which one people _would_ want or expect. The former is certainly easier to implement, but that isn't a good reason to choose it. I think we'd need a realistic example of where this is actually useful before trying to decide what it should do.

Anyway, since I'm proposing this idea specifically so it can be rejected, I don't need to come up with a good answer; the fact that it's potentially confusing in this context is just another reason to reject it, right? :)
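A quick check of Chris's snippet under CPython confirms the prediction; the comments spell out why each target ends up as it does:

```python
num = 10
lst = [20, 30, 40]

class Spam:
    num += 1      # LOAD_NAME finds the global 10; STORE_NAME binds 11 as Spam.num
    lst += [50]   # LOAD_NAME finds the global list; __iadd__ mutates it in
                  # place, and STORE_NAME binds that same object as Spam.lst

print(num, lst, Spam.num, Spam.lst)
# 10 [20, 30, 40, 50] 11 [20, 30, 40, 50]
```

So the global int is untouched while the global list is mutated, and Spam ends up with its own `num` but shares the one `lst` object.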
There is nothing more humbling than sending nonsense out to an entire mailing list. It will teach me to stop and process my stream of consciousness critically instead of just firing off a message. You were right that I was not at all considering how Python variables must of course work (both assignment and lookup). Your message was quite thorough; I hope it was written out of a sense of teaching and not frustration. If the latter, I am sorry to be the cause.

Thinking about things the right way around, I dug into the interpreter and hacked something up to add an assignment overload dunder method. A diff against Python 3.7 can be found here: https://gist.github.com/natelust/063f60a4c2c8ad0d5293aa0eb1ce9514

There are other supporting changes, but the core idea is just that the type struct (typeobject.c) now has one more field (I called it tp_setself) that under normal circumstances is just 0. Then in the insertdict function (dictobject.c) (which does just what it sounds like), after looking up the old value and before setting anything new, I added a block to check if tp_setself is defined on the type of the old value. If it is, this means the user defined a __setself__ method on a type; it is called with the value that was to be assigned, causing whatever side effects the user chose for that function, and the rest of the insertdict body is never run.

This is by no means pull-request-level coding (plenty of cleanup, tests to run, and seeing how this works in nestings etc.), but it is a proof of concept. Out of a sense of transparency I will note that there is an issue when __repr__ is called in the interpreter: it causes insertdict to be called, which could lead to infinite recursion. This is the reason for the "if old_value != value" check in the diff, as it prevents the recursion. However, it would probably be better to not let the user build infinite cycles anyway.
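As I read the description of the diff, the patched store behaves roughly like this pure-Python model (the `insertdict` function and `_MISSING` sentinel here are my sketch of the described C change, not the actual patch):

```python
_MISSING = object()

def insertdict(d, key, value):
    # Sketch of the patched CPython insertdict: if the *old* value's type
    # defines __setself__, call it for its side effects and skip the store.
    old = d.get(key, _MISSING)
    setself = getattr(type(old), '__setself__', None)
    # The identity guard stands in for the diff's "old_value != value"
    # recursion check.
    if setself is not None and old is not value:
        setself(old, value)
        return
    d[key] = value

class Foo:
    def __init__(self, o):
        self.o = o
    def __setself__(self, v):
        self.v = v

ns = {}
insertdict(ns, 'f', Foo(5))    # no old value with __setself__: normal store
insertdict(ns, 'f', 'hello')   # intercepted: sets ns['f'].v, no rebinding
```

This reproduces the surprising transcript below: after the second "assignment", `ns['f']` is still the original Foo instance.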
This has a runtime penalty on every executed Python store: a type struct field lookup (a pointer offset on the struct to reach the tp_setself field) and a comparison to null, which is to say not much at all. A side effect is that the type struct grows by the width of a pointer; as other type fields have been added for things like async, I am guessing this is not a large issue. There is of course a runtime cost added by whatever is done in the __setself__ method, but that is something a user would expect, since they are the one adding the method.

The question about confusion when reading code with this behavior still remains, i.e. this might be very surprising when using a library that defines this. On this point I see the argument both ways and don't really have an opinion. In some ways it is surprising that you can overload other operators but not assignment (as people might be familiar with from other languages). People don't seem to expect that + always works the same way in any given library's code; however, Python has long had its standing assignment behavior, so it could be a surprise to change it now.

Below is how this runs on the Python I have built on my system with the above patch:

Python 3.7.4rc1+ (heads/3.7-dirty:7b2a913bd8, Jun 20 2019, 14:43:10)
[GCC 7.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> class Foo:
...     def __init__(self, o):
...         self.o = o
...     def __setself__(self, v):
...         self.v = v
...
>>> f = Foo(5)
>>> print(f)
<__main__.Foo object at 0x7f486bb8d300>
>>> print(f.o)
5
>>> print(f.v)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Foo' object has no attribute 'v'
>>> f = "hello world"
>>> print(f.v)
hello world
>>> print(f)
<__main__.Foo object at 0x7f486bb8d300>
>>> class Bar:
...     def __init__(self, o):
...         self.o = o
...
>>> b = Bar(5)
>>> print(b)
<__main__.Bar object at 0x7f486bb8d6f0>
>>> print(b.o)
5
>>> b = "hello world"
>>> print(b)
hello world
On Thu, Jun 20, 2019 at 10:28 PM nate lust <natelust@linux.com> wrote:
WoW ... I like it :) and this patch is going to be a good tutorial for me to further dive into CPython. This is more than 10K words to say "it is possible at least". Thanks nate.
On 06/20/2019 01:25 PM, nate lust wrote:
--> class Foo:
...     def __init__(self, o):
...         self.o = o
...     def __setself__(self, v):
...         self.v = v
...
--> f = Foo(5)
--> print(f)
<__main__.Foo object at 0x7f486bb8d300>
--> print(f.o)
5
--> print(f.v)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Foo' object has no attribute 'v'
--> f = "hello world"
--> print(f.v)
hello world
--> print(f)
<__main__.Foo object at 0x7f486bb8d300>
Thank you for doing the work of a proof-of-concept (and to Andrew Barnert for his excellent write-up). I think this shows exactly why it's a bad idea -- even though I knew what you were doing, having `f` not be a string after the assignment was extremely surprising.

-1

--
~Ethan~
I think it is only surprising because it is not something that is familiar. It's not like Python doesn't already have other assignments with different behaviors, for instance:

    a = "hello world"
    a = 6
    print(a)  # -> 6

    class Foo:
        @property
        def x(self):
            return self._internal
        @x.setter
        def x(self, value):
            self._internal *= value
        def __init__(self, x):
            self._internal = x

    a = Foo(4)
    print(a.x)
    a.x = 6
    print(a.x)  # -> 24

This is surprising, but it doesn't mean properties are not useful. The same goes for generic use of descriptors. Most overloaded operators could have surprising behavior:

    a = Bar()
    b = Bar()
    a += b
    type(a)   # -> str
    print(a)  # -> I eat all your kittens.

It means that if you use code, you should understand what it is doing, or at least what side effects it has.

A few interesting things I thought to do with this behavior are:
- True consts: if, say, at the module level you create instances of a class that defines __setattr__ and __setself__ (or __assign__, I just went for symmetry), and they both ignored the assignment or raised a ValueError, then your consts would always be consts.
- Expression templates: deferring chained operations, which might each be a loop, into one more efficient loop.
- More natural syntax for coroutine send, or I guess any pipe- or proxy-like object.
- Like the above, context variables could make use of this.
- Disk-backed variables: on assignment, things get synced to disk.
- Copy/assign-like construction; this would be really useful to the pybind11 community.
That last one would allow you to take the properties of another instance of something without actually changing what you are pointing to, like below:

    class Foo:
        def __init__(self, a, b):
            self.a = a
            self.b = b
        def __setself__(self, other):
            self.a = other.a
            self.b = other.b

    class Bar:
        def __init__(self, foo):
            self.foo = foo

    one, two = Foo(5, 6), Foo(8, 9)
    bar, baz = Bar(one), Bar(one)
    one = two  # copy-like assignment
    # now bar.foo.a == 8 and bar.foo.b == 9 and baz.foo.a == 8 and baz.foo.b == 9

Some of these can already be done through method calls on things now, but that doesn't always "read" as well, and ends up looking very Java-like to me. I am also sure that, given the opportunity, people more creative than I can come up with all sorts of uses, similar to how @classmethod or @staticmethod make good use of descriptors. If this had existed from the start or early on, it would be perfectly natural to read and use, but for now it would be somewhat shocking (insofar as it actually gets used), and I think that is the biggest minus.

On Fri, Jun 21, 2019 at 4:17 PM Ethan Furman <ethan@stoneleaf.us> wrote:
--
Nate Lust, PhD.
Astrophysics Dept.
Princeton University
On Jun 21, 2019, at 14:36, nate lust <natelust@linux.com> wrote:
I think it is only surprising because it is not something that is familiar.
Part of the problem may be that in your toy example there really is no reason to overload assignment, so it's surprising even after you get what it's doing. If you worked out a more realistic example and demonstrated it with your patch, it might feel a lot less disconcerting. I feel even less convinced by your other potential uses, but again, if one of them were actually worked out, it might be quite different.
It's not like Python doesn't already have other assignments with different behaviors, for instance:

    a = "hello world"
    a = 6
    print(a)  # -> 6
    class Foo:
        @property
        def x(self):
            return self._internal
        @x.setter
        def x(self, value):
            self._internal *= value
        def __init__(self, x):
            self._internal = x
    a = Foo(4)
    print(a.x)
    a.x = 6
    print(a.x)  # -> 24
This is surprising, but it doesn't mean properties are not useful.
If this were the example given for why properties with setters should exist, it would probably get a -1 from everyone. But it doesn't take much to turn this into an example that's a lot more convincing. For example:

    self.ui.slider = BoundedIntegralSlider(0, 100)
    self.ui.slider.value = 101
    print(self.ui.slider.value)  # 100
    self.ui.slider.value = sqrt(10)
    print(self.ui.slider.value)  # 3

The same kind of thing may be true for your change.
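BoundedIntegralSlider is a hypothetical name, but a plain property-based version matching those printed results is only a few lines:

```python
class BoundedIntegralSlider:
    """Hypothetical widget value: clamps to [lo, hi] and rounds to an int."""
    def __init__(self, lo, hi):
        self._lo, self._hi = lo, hi
        self._value = lo
    @property
    def value(self):
        return self._value
    @value.setter
    def value(self, v):
        # Round first, then clamp into the configured bounds.
        self._value = min(self._hi, max(self._lo, round(v)))

slider = BoundedIntegralSlider(0, 100)
slider.value = 101
print(slider.value)        # 100
slider.value = 10 ** 0.5   # sqrt(10) ~= 3.162
print(slider.value)        # 3
```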
A few interesting things I thought to do with this behavior are:
- True consts: if, say, at the module level you create instances of a class that defines __setattr__ and __setself__ (or __assign__, I just went for symmetry), and they both ignored the assignment or raised a ValueError, then your consts would always be consts.
I think this is confusing consts (values that can’t be replaced) with immutable values (values that can’t be changed). Overriding __setattr__ has nothing to do with constness, while it goes 80% of the way toward immutability. But also, I think it’s a bit weird for constness to be part of the value in the first place. (In C++ terms, a const variable doesn’t necessarily have a const value or vice-versa.) I think trying to fit it into the value might be what encourages confusing const and immutable. Also, shouldn’t classes and instances (even of __slots__ or @dataclass) “declare” constants just like modules? If you have to write the same thing completely differently, with different under-the-covers behavior, to accomplish the same basic concept, that’s a bit weird. A module-level @property or module __setattr__ seems like it would be a lot more consistent. (Although I’m not sure how well that ports to locals.)
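For what it's worth, a module-level __setattr__ can already be had today by swapping the module's class for a ModuleType subclass via sys.modules (a known trick; `_ConstModule` and its ALL_CAPS policy below are illustrative choices, demonstrated here on a throwaway module object rather than a real module):

```python
import sys
from types import ModuleType

class _ConstModule(ModuleType):
    """Refuses to rebind ALL_CAPS attributes once they are set."""
    def __setattr__(self, name, value):
        if name.isupper() and hasattr(self, name):
            raise AttributeError(f"cannot rebind constant {name!r}")
        super().__setattr__(name, value)

# In a real module you would write:
#     sys.modules[__name__].__class__ = _ConstModule
mod = ModuleType('demo')
mod.__class__ = _ConstModule
mod.SAFETY_THRESHOLD = 100   # first binding: allowed
try:
    mod.SAFETY_THRESHOLD = 0  # rebinding: rejected
except AttributeError as e:
    print(e)
```

Note this keeps constness in the namespace, where the store happens, rather than in the value.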
- More natural syntax for coroutine sendto, or I guess any pipe or proxy like object
Making send look like assignment feels a lot less natural, not more. I'd expect something like <- (as used by most Erlang-inspired languages), but, more importantly, something different from =. Especially since you often do want to pass around and store coros, pipes, channels, etc., and if doing so actually sent a value to the coro because you'd reused the name, that would be very confusing.
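For comparison, the explicit form is short and unambiguous today (`accumulator` is just a toy generator-based coroutine for illustration):

```python
def accumulator():
    # Each send() adds to a running total, which is yielded back.
    total = 0
    while True:
        received = yield total
        total += received

acc = accumulator()
next(acc)             # prime the coroutine to the first yield
print(acc.send(10))   # 10
print(acc.send(5))    # 15

other = acc           # plainly stores the coroutine; nobody could
                      # mistake this for sending 'acc' a value
```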
If this had existed from the start or early on, it would be perfectly natural to read and use, but for now it would be somewhat shocking (in so far as it gets actually used) and I think that is the biggest minus.
I’m not sure that’s true. If we were talking about a language that has an lvalue objects-and-variables model but doesn’t allow assignment overloading (like Scala, I think?), sure. Some such languages allow it, some don’t, and it’s a pretty simple readability trade off. But languages with Smalltalk-style object models (especially those without declarations), not being able to overload assignment feels like a natural consequence: variables don’t have types, so a variable’s type can’t take control. You can’t do it in Smalltalk, or Ruby, or any of the Lisp object systems I know of, etc. So, even if Python had a clever workaround to that from the start, I think it would still feel surprising to most people. Of course the way descriptors, metaclasses, and a few other things work under the hood feels surprising until you get the point, but Python only has a very small number of such things—and they’re all used to build less surprising surface behavior (e.g., @property makes total sense to a novice, even if the implementation of it looks like black magic).
I will think about more interesting examples this weekend, though my time will probably be more family focused, so I might not get any messages of appreciable length out until sometime next week.

I do get what you are saying on the const vs immutability thing, but I was thinking that because Python works the way it does, they would need to be related. If I have a module-level variable that was intended to be a constant, like below, it would need both:

    SAFETY_THRESHOLD = 100

    def func(sys_temp):
        if sys_temp > SAFETY_THRESHOLD:
            turn_on_fan()

You would not want someone outside the module (or someone adding new code years later in the module) redefining SAFETY_THRESHOLD. Now in something like C, a const on SAFETY_THRESHOLD would be fine without interior immutability, because a number is just a plain value and thus can't be changed in place. In Python it is a class instance with methods and such (substitute in string, aka char[], or something). However, in this case __setself__ (__assign__) would be plenty, as the instances are immutable already. I assumed that there would be no __setself__ on an int, so to get the same sort of const-like behavior (const + fundamental type) one would need to define MyInt, make it interior-immutable like Python str or int, and also const via __setself__.

Now it's possible that I am being silly and not thinking things through correctly again; I keep stealing moments here or there to work on this project. If so, I am happy to have someone correct me.

Typing this out, though, does make me think of an interesting idea. If there were something like __getself__ in addition to __setself__, you could implement things like MyInt.
__getself__ would look something like:

    class MyInt:
        def __init__(self, value):
            self.value = value
        def __getself__(self):
            return self.value
        def __setself__(self, value):
            raise ValueError("Can't set MyInt")

    x = MyInt(2)
    print(x) -> 2
    type(x) -> MyInt

Now I have not really thought through how this would work, if it could work, how __setself__ and __getself__ may play with each other in execution, and whether that would introduce even more weird behavior. It was just something that occurred to me while typing the above. If they did play well together, though, you could do something interesting like creating a variable that tracked its history, such as:

    class HistoricVar:
        def __init__(self, initval):
            self._value = initval
            self.history = []
        def __getself__(self):
            return self._value
        def __setself__(self, value):
            self.history.append(self._value)
            self._value = value
        def go_back_n(self, n):
            for i in range(n):
                self.history.pop()
            self._value = self.history[-1]

    x = HistoricVar(1)
    print(x) -> 1
    x = "hello world"
    print(x) -> "hello world"
    set_var_to_empty_dict(x)
    print(x) -> Dict
    x = "14"
    getattr(x, 'go_back_n')(2)  # <- might need special handling for getattr to account for getself
    print(x) -> "hello world"

    try:
        for i in range(10):
            x = x + i
            if x > 17:
                raise ValueError("x can't be larger than 17")
    except:
        getattr(x, 'go_back_n')(1)

Caveat to anyone not paying attention: that code has not been tried in any interpreter, and the changes to support it have not been made anywhere, if it is possible at all.

Now I know these are silly examples that could be done in other ways, but they are musings that fit into the scope of typing out an email. Like I said, I will see if more interesting things come to me over the weekend. If anything, this thread is a fun and interesting thought experiment, and a useful learning experiment for me to better understand the workings of Python; for that I thank you for your feedback and attention.

Nate

On Fri, Jun 21, 2019 at 6:24 PM Andrew Barnert <abarnert@yahoo.com> wrote:
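For what it's worth, the history-tracking idea is expressible today without any new magic methods, at the cost of writing an explicit `.value` attribute instead of bare assignment. A rough sketch (this HistoricVar is my own reimplementation for illustration, not the proposal's code, and go_back_n here pops the value back rather than following the original's indexing):

```python
class HistoricVar:
    """History-tracking value, usable today via an explicit .value attribute."""

    def __init__(self, initval):
        self._value = initval
        self.history = []

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new):
        # Record the old value before every rebinding.
        self.history.append(self._value)
        self._value = new

    def go_back_n(self, n):
        # Undo the last n assignments.
        for _ in range(n):
            self._value = self.history.pop()

x = HistoricVar(1)
x.value = "hello world"
x.value = 14
x.go_back_n(1)
print(x.value)  # hello world
```

The trade-off is exactly the one under debate: `x.value = ...` is explicit and unsurprising, while the proposal would make plain `x = ...` do this invisibly.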
On Jun 21, 2019, at 14:36, nate lust <natelust@linux.com> wrote:
I think it is only surprising because it is not something that is familiar.
Part of the problem may be that in your toy example, there really is no reason to overload assignment, so it’s surprising even after you get what it’s doing. If you worked out a more realistic example and demonstrated that with your patch, it might feel a lot less disconcerting.
I feel even less convinced by your other potential uses, but again, if one of them were actually worked out, it might be quite different.
It's not like Python doesn't already have other assignments with different behaviors, for instance:

    a = "hello world"
    a = 6
    print(a) -> 6
    class Foo:
        @property
        def x(self):
            return self._internal

        @x.setter
        def x(self, value):
            self._internal *= value

        def __init__(self, x):
            self._internal = x
    a = Foo(4)
    print(a.x)
    a.x = 6
    print(a.x) -> 24
This is surprising, but it doesn't mean properties are not useful.
If this were the example given for why properties with setters should exist, it would probably get a -1 from everyone.
But it doesn’t take much to turn this into an example that’s a lot more convincing. For example:
    self.ui.slider = BoundedIntegralSlider(0, 100)
    self.ui.slider.value = 101
    print(self.ui.slider.value)  # 100
    self.ui.slider.value = sqrt(10)
    print(self.ui.slider.value)  # 3
The same kind of thing may be true for your change.
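For concreteness, Andrew's hypothetical slider is achievable today with a plain property. A minimal sketch (the class name comes from his example; the clamping and rounding rules are assumptions, since only the behavior was shown):

```python
from math import sqrt

class BoundedIntegralSlider:
    """Sketch: a value clamped to [lo, hi] and rounded to the nearest int."""

    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
        self._value = lo

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, v):
        # Round to an integer, then clamp into the configured bounds.
        self._value = max(self.lo, min(self.hi, round(v)))

s = BoundedIntegralSlider(0, 100)
s.value = 101
print(s.value)  # 100
s.value = sqrt(10)
print(s.value)  # 3
```

The point being made in the thread is that the property version only intercepts attribute assignment (`s.value = ...`), whereas the proposal would intercept plain name rebinding (`s = ...`).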
A few interesting things I thought to do with this behavior are:
- True consts: if, say at the module level, you create instances of a class that defines __setattr__ and __setself__ (or __assign__, I just went for symmetry), and both of them either pass or raise a ValueError, then your consts would always be consts.
I think this is confusing consts (values that can’t be replaced) with immutable values (values that can’t be changed). Overriding __setattr__ has nothing to do with constness, while it goes 80% of the way toward immutability.
But also, I think it’s a bit weird for constness to be part of the value in the first place. (In C++ terms, a const variable doesn’t necessarily have a const value or vice-versa.) I think trying to fit it into the value might be what encourages confusing const and immutable.
Also, shouldn’t classes and instances (even of __slots__ or @dataclass) “declare” constants just like modules? If you have to write the same thing completely differently, with different under-the-covers behavior, to accomplish the same basic concept, that’s a bit weird. A module-level @property or module __setattr__ seems like it would be a lot more consistent. (Although I’m not sure how well that ports to locals.)
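To make the module __setattr__ idea concrete: a module's __class__ can be reassigned to a types.ModuleType subclass, which then intercepts attribute assignment through the module object. A hedged sketch (ConstModule and the _consts registry are my own invention). The big caveat, and part of what the thread is wrestling with, is that bare `NAME = value` statements executed inside the module body write to the globals dict directly and bypass __setattr__; only access through the module object (e.g. from other modules) is intercepted:

```python
import types

class ConstModule(types.ModuleType):
    """Hypothetical module subclass that refuses to rebind registered constants."""
    _consts = frozenset({"SAFETY_THRESHOLD"})

    def __setattr__(self, name, value):
        # Allow the first binding; reject any rebinding of a registered const.
        if name in self._consts and hasattr(self, name):
            raise AttributeError(f"can't rebind constant {name!r}")
        super().__setattr__(name, value)

# In a real module you would do: sys.modules[__name__].__class__ = ConstModule
mod = types.ModuleType("demo")
mod.__class__ = ConstModule
mod.SAFETY_THRESHOLD = 100       # first binding succeeds
try:
    mod.SAFETY_THRESHOLD = 5     # rebinding through the module object fails
except AttributeError as exc:
    print(exc)
print(mod.SAFETY_THRESHOLD)      # 100
```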
- More natural syntax for coroutine sendto, or I guess any pipe or proxy like object
Making send look like assignment feels a lot less natural, not more. I'd expect something like <- (as used by most Erlang-inspired languages), but, more importantly, something different from =. Especially since you often do want to pass around and store coros, pipes, channels, etc., and if doing so actually sent the value to another coro because you'd reused the name, that would be very confusing.
If this had existed from the start or early on, it would be perfectly natural to read and use, but for now it would be somewhat shocking (in so far as it gets actually used) and I think that is the biggest minus.
I’m not sure that’s true.
If we were talking about a language that has an lvalue objects-and-variables model but doesn’t allow assignment overloading (like Scala, I think?), sure. Some such languages allow it, some don’t, and it’s a pretty simple readability trade off.
But in languages with Smalltalk-style object models (especially those without declarations), not being able to overload assignment feels like a natural consequence: variables don't have types, so a variable's type can't take control. You can't do it in Smalltalk, or Ruby, or any of the Lisp object systems I know of, etc. So, even if Python had had a clever workaround for that from the start, I think it would still feel surprising to most people.
Of course the way descriptors, metaclasses, and a few other things work under the hood feels surprising until you get the point, but Python only has a very small number of such things—and they’re all used to build less surprising surface behavior (e.g., @property makes total sense to a novice, even if the implementation of it looks like black magic).
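To illustrate that last point: the machinery behind @property is just the descriptor protocol, and a stripped-down pure-Python version fits in a few lines. This my_property is a simplified sketch for illustration, not CPython's actual implementation:

```python
class my_property:
    """Simplified sketch of property: a descriptor declared in the class."""

    def __init__(self, fget=None, fset=None):
        self.fget, self.fset = fget, fset

    def __get__(self, obj, objtype=None):
        if obj is None:              # accessed on the class itself
            return self
        return self.fget(obj)

    def __set__(self, obj, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(obj, value)

    def setter(self, fset):
        # Return a new descriptor with the setter attached (decorator style).
        return type(self)(self.fget, fset)

class Temperature:
    def __init__(self):
        self._c = 0.0

    @my_property
    def celsius(self):
        return self._c

    @celsius.setter
    def celsius(self, v):
        self._c = float(v)

t = Temperature()
t.celsius = 21
print(t.celsius)  # 21.0
```

The surface behavior (`t.celsius = 21` just works, with conversion) stays simple even though the plumbing is not, which is the pattern Andrew is describing.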
On Fri, Jun 21, 2019 at 4:17 PM Ethan Furman <ethan@stoneleaf.us> wrote:
On 06/20/2019 01:25 PM, nate lust wrote:
--> class Foo:
...     def __init__(self, o):
...         self.o = o
...     def __setself__(self, v):
...         self.v = v
...
--> f = Foo(5)
--> print(f)
<__main__.Foo object at 0x7f486bb8d300>
--> print(f.o)
5
--> print(f.v)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Foo' object has no attribute 'v'
--> f = "hello world"
--> print(f.v)
hello world
--> print(f)
<__main__.Foo object at 0x7f486bb8d300>
Thank you for doing the work of a proof-of-concept (and to Andrew Barnert for his excellent write-up). I think this shows exactly why it's a bad idea -- even though I knew what you were doing, having `f` not be a string after the assignment was extremely surprising.
-1
-- ~Ethan~ _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/AFPG55... Code of Conduct: http://python.org/psf/codeofconduct/
-- Nate Lust, PhD. Astrophysics Dept. Princeton University
On Sat, Jun 22, 2019 at 11:19 AM nate lust <natelust@linux.com> wrote:
Typing this out though does make me think of an interesting idea. If there was something like __getself__ in addition to __setself__, you could implement things like MyInt. __getself__ would look something like:
    class MyInt:
        def __init__(self, value):
            self.value = value
        def __getself__(self):
            return self.value
        def __setself__(self, value):
            raise ValueError("Can't set MyInt")

    x = MyInt(2)
    print(x) -> 2
    type(x) -> MyInt
Now I have not really thought through how this would work, if it could work...
How does print know to call getself, but type know not to? ChrisA
It probably doesn't; this was just something I typed up on the fly, so it's unlikely the end result would be what you see above if it were actually implemented. The only way around that I can think of right now would be two functions: an impl_dictget that actually did the lookup, which type (and possibly getattr and the like) could use, and which would be called by the normal dict get. The normal get would just return the value if the type did not define __getself__, and would call it and return the result if it did. This is not at all dissimilar to how dict setting works now.

On Fri, Jun 21, 2019, 9:27 PM Chris Angelico <rosuav@gmail.com> wrote:
This message is related to two previous threads, but has sufficiently evolved to warrant a new topic.

I am proposing that two new magic methods be added to Python that will control assignment and loading of class instances. This means that if an instance is bound to a variable name, any attempt to rebind that name will result in a call to the __setself__ (name negotiable) method of the instance already bound to that name. Likewise, when a class instance bound to a name is loaded by the interpreter, if present, the __getself__ method of that instance will be called and its result will be returned instead. I have been internally calling these "cloaking variables", as they "cloak" the underlying instance, paralleling the idea of shadowing. Feel free to suggest better names.

On first read, that may be surprising, but it extends a behavior pattern that already exists for things like properties (and, generically, descriptors) to object instances themselves. Similar caveats and behaviors will apply here as well.

A working implementation built against Python 3.7 can be found here: https://github.com/natelust/cpython/tree/cloakingVars. This is not pull-ready quality code, but the diffs may be interesting to read.

An example of what is possible with this new behavior is instance-level properties, as seen in the demo at the end of this message.

These changes have minimal impact on the runtime of existing code, and require no modifications to existing syntax other than the use of the names __setself__ and __getself__.
A more detailed write-up with more examples can be found at https://github.com/natelust/CloakingVarWriteup/blob/master/writeup.md, with the example executable demo here: https://github.com/natelust/CloakingVarWriteup/blob/master/examples.py

The demos include:
* Variables which keep track of their assignment history, with the ability to roll back (possibly useful with try/except blocks)
* Variables which write out their value to disk when assigned to
* An implementation of context variables using only this new framework (does not implement tokens, but they could be added)
* const variables that can be used to protect module-level 'constants'
* Instance properties (reproduced below) that allow dynamically adding properties
* An implementation of templated expressions, to defer the addition of many arrays to a single for loop, saving possibly expensive Python iterations

I am sure the community can come up with many more interesting ideas.

    class InstanceProperty:
        def __init__(self, wrapped, getter, setter=None):
            self.wrapped = wrapped
            self.getter = getter
            self.setter = setter

        def __getself__(self):
            return self.getter(self.wrapped)

        def __setself__(self, value):
            if self.setter:
                return self.setter(self.wrapped, value)

    class MachineState:
        def __init__(self):
            self._fields = {}

        def add_input(self, name, start):
            def getter(slf):
                return slf._fields[name]

            def setter(slf, value):
                '''The state of a machine part can only be between zero and 100.'''
                if value < 0:
                    value = 0
                if value > 100:
                    value = 100
                slf._fields[name] = value

            setter(self, start)
            inst_prop = InstanceProperty(self, getter, setter)  # noqa: F841
            # Need to directly assign the instance property, or decloak it.
            setattr(self, name, getcloaked('inst_prop'))

    machine = MachineState()
    for letter, start in zip(['a', 'b', 'c'], [-1, 0, 1]):
        machine.add_input(letter, start)
    print(f"machine.a is {machine.a}")
    print(f"machine.b is {machine.b}")
    print(f"machine.c is {machine.c}")

    # Assign a value that is too high
    machine.c = 200
    print(f"machine.c is {machine.c}")

    # Omitted from this proposal, but present in the linked documentation, are
    # tools for getting the underlying variables, and/or rebinding them.

On Fri, Jun 21, 2019 at 9:34 PM nate lust <natelust@linux.com> wrote:
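For comparison, most of this particular demo seems reachable today with the existing __getattr__/__setattr__ hooks, without per-instance properties or any new syntax. A rough equivalent (my own sketch, not the proposal's code):

```python
class MachineState:
    """Bounded per-part state using today's __getattr__/__setattr__ hooks."""

    def __init__(self):
        # Bypass our own __setattr__ while _fields doesn't exist yet.
        object.__setattr__(self, "_fields", {})

    def add_input(self, name, start):
        self._fields[name] = min(100, max(0, start))

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for machine part names.
        try:
            return self._fields[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        if name in self._fields:
            # Clamp machine-part values into [0, 100].
            self._fields[name] = min(100, max(0, value))
        else:
            object.__setattr__(self, name, value)

machine = MachineState()
for letter, start in zip(["a", "b", "c"], [-1, 0, 1]):
    machine.add_input(letter, start)
print(machine.a)   # 0   (clamped up from -1)
machine.c = 200
print(machine.c)   # 100 (clamped down from 200)
```

The difference is that here the *class* opts in to the special behavior for all its instances, whereas the proposal lets an individual value smuggle the behavior into any attribute or variable it gets bound to.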
-- Nate Lust, PhD. Astrophysics Dept. Princeton University
On Tue, Jun 25, 2019 at 2:11 PM nate lust <natelust@linux.com> wrote:
if an instance is bound to a variable name, any attempts to rebind that name will result in a call to the __setself__ (name negotiable) of the instance already bound to that name.
I am very, very strongly opposed to this. It would mean that I couldn't trust variables any more. I want to be able to write

    for x in items:

without worrying about whether one of the items might "stick" to x and infect every subsequent iteration of the loop, and any later loop in the same function that also uses x as a loop variable. I want to be able to clarify code by assigning a subexpression to a local variable with a descriptive name, without worrying about whether that's going to change the meaning of the code.

I need local variables to be easy to reason about, because Python requires you to use them so much. The thought of the additional cognitive burden that the mere presence of this feature in the language would create, in practically everything I write, scares me.
On first read, that may be surprising, but it extends a behavior pattern that already exists for things like properties (and generically descriptors) to object instances themselves. Similar caveats and behaviors will apply here as well.
It seems very different to me. Magic attributes are declared in the class. Assignments to that attribute of an instance then go through the special code in the class. Your proposal doesn't have that split. Your proposal would be as if assigning a special value to an ordinary attribute permanently changed that attribute's behavior for that instance only, in a way controlled by the value, not by the class or even the instance.

I would be fine with a proposal to declare special variables whose loading and storing behavior is controlled by Python code. If a declaration like

    metavariable foo = obj

in a scope caused the compiler to generate calls to obj.__get__, obj.__set__ and obj.__delete__ instead of the usual LOAD_*, STORE_*, DELETE_* instructions for all mentions of foo in that scope, I would be fine with that. It would be more similar to descriptor attributes, it would have no runtime overhead if not used, and, most importantly, I wouldn't have to learn about it if I didn't want to use it. Also, you probably wouldn't need to invent new special method names for it; the existing ones would work.
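Ben's point about the class/instance split can be made concrete with the existing descriptor protocol: the class, not the value, declares that an attribute is special. Logged here is a hypothetical descriptor written for illustration:

```python
class Logged:
    """Descriptor: the class declares this attribute's binding behavior."""

    def __set_name__(self, owner, name):
        self.slot = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.slot)

    def __set__(self, obj, value):
        # Record every rebinding, then store the value normally.
        obj.log.append((self.slot[1:], value))
        setattr(obj, self.slot, value)

class C:
    foo = Logged()          # the special behavior is visible in the class body

    def __init__(self):
        self.log = []

c = C()
c.foo = 1
c.foo = 2
print(c.log)   # [('foo', 1), ('foo', 2)]
print(c.foo)   # 2
```

A reader of class C can see that foo is special; under the __setself__ proposal, nothing at the binding site or in any class would reveal that a plain assignment had been hijacked.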
* Variables which keep track of their assignment history, with ability to rollback (possibly useful with try except blocks) * Variables which write out their value to disk when assigned to * An implementation of context variables using only this new framework (does not implement tokens, but could be added) * const variables that can be used to protect module level 'constants' * Instance properties (reproduced below) that allow dynamically adding properties * An implementation of templated expression, to defer the addition of many arrays to a single for loop, saving possibly expensive python iterations.
I don't understand all of these but the ones I do understand seem like they only require some way to magic-ify the variable before using it. The part of your proposal that I strongly oppose is overloading the ordinary assignment syntax for this. Use a special declaration analogous to global/nonlocal and I think it's fine. -- Ben
On Tue, Jun 25, 2019 at 03:34:03PM -0700, Ben Rudiak-Gould wrote:
On Tue, Jun 25, 2019 at 2:11 PM nate lust <natelust@linux.com> wrote:
if an instance is bound to a variable name, any attempts to rebind that name will result in a call to the __setself__ (name negotiable) of the instance already bound to that name.
I am very, very strongly opposed to this. It would mean that I couldn't trust variables any more.
But you never really could, not unless they were local variables. If they were variables in another namespace, whether a class or a module, and you accessed them with a dot, then in theory you couldn't tell what assignment would do.

In practice, we all know what assignment will do, dot or no dot, nearly always, and when we don't, the code does the right thing. On this list we often worry about code that does weird things we don't expect, but how often is that a problem in practice?

You are right that if this proposal is successful, this will be one more thing that we have to take on faith that code *could* mess with but probably won't. But the reality is, so is nearly everything else except literals like 1234 or None: functions, operators, method calls, imports; the list of things that might not do what we expect is very close to "everything in Python".

In any case, I don't see this being used often outside of very narrow use-cases. I'm more concerned about every single assignment getting a performance hit. The compiler in C++ knows in advance which variables have this special method and which don't, and can do the right thing at compile time. The Python compiler can't: every single binding and unbinding needs to jump through hoops to check for this special method, whether it is used or not.

If Nate has a working prototype, I'd like to know how big a performance cost it carries.

-- Steven
Steven,

You may have seen the message I posted a little while ago detailing a bit more about the proposed changes, with an example of how the interpreter will handle things like __getself__. I have working code here: https://github.com/natelust/cpython/tree/cloakingVars. I worked very hard to keep all the new behavior behind C if-statements of the form if (x != NULL), so that the impact on existing code would be minimized. If you have a benchmark you prefer, I would be happy to run it against my changes and mainline Python 3.7 to see how they compare.

On Thu, Jun 27, 2019 at 4:03 AM Steven D'Aprano <steve@pearwood.info> wrote:
-- Nate Lust, PhD. Astrophysics Dept. Princeton University
On Thu, Jun 27, 2019 at 12:23:24PM -0400, nate lust wrote:
If you have a bench mark you prefer I would be happy to run it against my changes and mainline python 3.7 to see how they compare.
Ultimately it will probably need to be run against this: https://github.com/python/pyperformance for 3.8 and 3.9 alpha, but we can start with some quick and dirty tests using timeit. Let's say:

    def test():
        n = 0
        def inner():
            global g
            nonlocal n
            n = g
            x = n
            g = x
            y = x
            z = y
            return x
        for i in range(100):
            a = inner()

    from timeit import Timer
    t = Timer("test()", setup="from __main__ import test; g=0")
    t.repeat(repeat=5)

I'm not speaking officially, but I would say that if this slows down regular assignments by more than 5%, the idea is dead in the water; if it slows them down by less than 1%, the performance objection is satisfied; between 1 and 5 means we get to argue cost versus benefit.

(The above is just my opinion. Others may disagree.)

-- Steven
Steven,

Sorry about taking a few days to get back to you. Here is the exact code I ran:

    def test():
        n = 0
        def inner():
            global g
            nonlocal n
            n = g
            x = n
            g = x
            y = x
            z = y
            return x
        for i in range(100):
            a = inner()

    from timeit import Timer
    g = 0
    t = Timer("test()", setup="from __main__ import test; g=0")
    res = t.repeat(repeat=20)

    from statistics import mean, stdev
    print(f"All: {res}")
    print(f"mean {mean(res)}, std {stdev(res)}")

For my branch the results were:

    All: [8.909581271989737, 8.897892987995874, 9.055693186994176, 9.103679533989634, 9.0795843389933, 9.12056165598915, 9.125767157005612, 9.117257817997597, 9.113885553990258, 9.180963805003557, 9.239156291994732, 9.318854127981467, 9.296847557998262, 9.313092978001805, 9.284125670004869, 9.259817042999202, 9.244616173004033, 9.271513198997127, 9.335984965000534, 9.258596728992416]
    mean 9.176373602246167, std 0.12835175852148933

The results for mainline Python 3.7 are:

    All: [9.005807315988932, 9.081591005000519, 9.12138073798269, 9.174804927984951, 9.233709035004722, 9.267144601995824, 9.323436667007627, 9.314979821007, 9.265707976999693, 9.24289796501398, 9.236994076985866, 9.310381392017007, 9.206289929017657, 9.211337374988943, 9.206687778991181, 9.215082932991209, 9.221178130013868, 9.213595701992745, 9.206646608014125, 9.224334346014075]
    mean 9.214199416250631, std 0.07610134120369169

There is some variation in the mean and std each time the program is run, which I suspect is due to scheduling of which core the job is launched on; these results are typical, however. I could launch each job a number of times to create meta runtime distributions, but I felt that was better left to proper benchmarking programs. It seems these numbers are within statistical uncertainty of each other.

Nate

On Thu, Jun 27, 2019 at 3:11 PM Steven D'Aprano <steve@pearwood.info> wrote:
-- Nate Lust, PhD. Astrophysics Dept. Princeton University
Nate, I find this mightily interesting! I think it's worth discussing at length. Is there any chance you'll want to move the discussion to the richer context here? https://discuss.python.org/c/ideas Regards, On Tue, Jun 25, 2019 at 5:00 PM nate lust <natelust@linux.com> wrote:
This message is related to two previous threads, but was a sufficiently evolved to warrant a new topic.
I am proposing that two new magic methods be added to python that will control assignment and loading of class instances. This means that if an instance is bound to a variable name, any attempts to rebind that name will result in a call to the __setself__ (name negotiable) of the instance already bound to that name. Likewise when a class instance bound to a name is loaded by the interpreter, if present, the __getself__ method of that instance will be called and its result will be returned instead. I have been internally calling these cloaking variables as they "cloak" the underlying instance, parallelling the idea of shadowing. Feel free to suggest better names.
On first read, that may be surprising, but it extends a behavior pattern that already exists for things like properties (and, generically, descriptors) to object instances themselves. Similar caveats and behaviors apply here as well.
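For readers trying to picture the semantics: in today's Python, without the proposed hooks, the behavior being described can be emulated explicitly over a namespace dict. This is only a sketch under my own naming (cloaked_store, cloaked_load, and Constant are hypothetical names for illustration, not part of the proposal's implementation):

```python
# Hypothetical sketch of the proposed semantics, spelled out as explicit
# helper calls over an ordinary namespace dict. Under the proposal, the
# interpreter would do this on every name bind/load.

def cloaked_store(ns, name, value):
    """Emulate 'name = value': defer to __setself__ on the old value if present."""
    old = ns.get(name)
    if old is not None and hasattr(type(old), '__setself__'):
        old.__setself__(value)       # the existing object intercepts the rebind
    else:
        ns[name] = value             # ordinary namespace store

def cloaked_load(ns, name):
    """Emulate loading 'name': defer to __getself__ on the bound value if present."""
    value = ns[name]
    if hasattr(type(value), '__getself__'):
        return value.__getself__()   # the bound object substitutes a result
    return value

class Constant:
    """A value that refuses rebinding, in the spirit of the 'const' demo."""
    def __init__(self, value):
        self._value = value
    def __getself__(self):
        return self._value
    def __setself__(self, value):
        raise TypeError("cannot rebind a constant")

ns = {}
cloaked_store(ns, 'PI', Constant(3.14159))
print(cloaked_load(ns, 'PI'))        # 3.14159
try:
    cloaked_store(ns, 'PI', 2.71828)
except TypeError as exc:
    print(exc)                       # cannot rebind a constant
```

Note that when the old value defines __setself__, the namespace itself is never touched; the object consumes the assignment, which is exactly what makes the proposal powerful and contentious.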
A working implementation built against python 3.7 can be found here: https://github.com/natelust/cpython/tree/cloakingVars. This is not pull ready quality code, but the diffs may be interesting to read.
An example of what is possible with this new behavior is instance-level properties, as seen in the demo at the end of this message.
These changes have minimal impact on the runtime of existing code, and require no modifications to existing syntax other than the use of the names __setself__ and __getself__.
A more detailed write-up with more examples can be found at https://github.com/natelust/CloakingVarWriteup/blob/master/writeup.md, with the example executable demo here: https://github.com/natelust/CloakingVarWriteup/blob/master/examples.py
The demos include:
* Variables which keep track of their assignment history, with the ability to roll back (possibly useful with try/except blocks)
* Variables which write out their value to disk when assigned to
* An implementation of context variables using only this new framework (does not implement tokens, but they could be added)
* const variables that can be used to protect module-level 'constants'
* Instance properties (reproduced below) that allow dynamically adding properties
* An implementation of templated expressions, deferring the addition of many arrays to a single for loop, saving possibly expensive Python iterations
I am sure the community can come up with many more interesting ideas.
class InstanceProperty:
    def __init__(self, wrapped, getter, setter=None):
        self.wrapped = wrapped
        self.getter = getter
        self.setter = setter

    def __getself__(self):
        return self.getter(self.wrapped)

    def __setself__(self, value):
        if self.setter:
            return self.setter(self.wrapped, value)

class MachineState:
    def __init__(self):
        self._fields = {}

    def add_input(self, name, start):
        def getter(slf):
            return slf._fields[name]

        def setter(slf, value):
            '''The state of a machine part must stay between 0 and 100.'''
            if value < 0:
                value = 0
            if value > 100:
                value = 100
            slf._fields[name] = value

        setter(self, start)
        inst_prop = InstanceProperty(self, getter, setter)  # noqa: F841
        # Need to directly assign the instance property, or decloak it.
        setattr(self, name, getcloaked('inst_prop'))
machine = MachineState()

for letter, start in zip(['a', 'b', 'c'], [-1, 0, 1]):
    machine.add_input(letter, start)

print(f"machine.a is {machine.a}")
print(f"machine.b is {machine.b}")
print(f"machine.c is {machine.c}")

# Assign a value that is too high
machine.c = 200

print(f"machine.c is {machine.c}")
# Omitted from this proposal, but present in the linked documentation, are
# tools for getting the underlying variables and/or rebinding them.
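The assignment-history demo in the list above can likewise be emulated in stock Python with explicit method calls; under the proposal, the set/get calls below would instead be triggered by plain assignment and name lookup. HistoryVar and its methods are hypothetical names for illustration, not the actual demo code from the linked repository:

```python
class HistoryVar:
    """Tracks every value assigned, with rollback -- an explicit-call
    emulation of the 'assignment history' cloaking demo."""
    def __init__(self, value):
        self._history = [value]
    def set(self, value):         # stands in for __setself__
        self._history.append(value)
    def get(self):                # stands in for __getself__
        return self._history[-1]
    def rollback(self):
        """Discard the latest assignment and return the previous value."""
        if len(self._history) > 1:
            self._history.pop()
        return self.get()

x = HistoryVar(1)
x.set(2)
x.set(3)
print(x.get())       # 3
print(x.rollback())  # 2
print(x.rollback())  # 1
```

With the proposed hooks, `x = 2` would call `x.set(2)` implicitly and `print(x)` would print `x.get()`, which is both the appeal and the objection.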
On Fri, Jun 21, 2019 at 9:34 PM nate lust <natelust@linux.com> wrote:
It probably doesn't; this was just something I typed up on the fly, so the end result would be unlikely to match what you see above if it were actually implemented.
The only way around that that I can think of right now would be to have two functions: an impl_dictget that actually did the lookup, which type (and possibly getattr and the like) could use, and the normal dict get, which would simply return the value if its type did not define __getself__, and call that method and return the result if it did.
This is not at all dissimilar to how dict setting works now.
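The point that namespace stores already go through a mapping can be seen in current CPython: a dict subclass supplied as the locals namespace to exec() has its __setitem__ called for each name binding (the same mechanism metaclass __prepare__ relies on). LoggingNamespace is an illustrative name, not anything from the proposal:

```python
class LoggingNamespace(dict):
    """A namespace that reports every name binding -- showing that assignment
    is really a __setitem__ call on the enclosing namespace."""
    def __setitem__(self, name, value):
        print(f"binding {name!r} to {value!r}")
        super().__setitem__(name, value)

ns = LoggingNamespace()
# STORE_NAME on a non-exact-dict namespace dispatches through __setitem__
exec("x = 2\ny = x + 1", {}, ns)
print(ns['x'], ns['y'])            # 2 3
```

The proposal effectively asks for a second dispatch point at the same place, keyed on the old *value* rather than on the namespace.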
On Fri, Jun 21, 2019, 9:27 PM Chris Angelico <rosuav@gmail.com> wrote:
On Sat, Jun 22, 2019 at 11:19 AM nate lust <natelust@linux.com> wrote:

Typing this out though does make me think of an interesting idea. If there was something like __getself__ in addition to __setself__, you could implement things like MyInt. __getself__ would look something like:
class MyInt:
    def __init__(self, value):
        self.value = value
    def __getself__(self):
        return self.value
    def __setself__(self, value):
        raise ValueError("Cant set MyInt")

x = MyInt(2)
print(x)  # -> 2
type(x)   # -> MyInt
Now I have not really thought through how this would work, if it could work...
How does print know to call getself, but type know not to?
ChrisA
-- Nate Lust, PhD. Astrophysics Dept. Princeton University
-- Juancarlo *Añez*
I am happy to move this discussion to where ever is appropriate. I won't get to it in the next few hours (bed time for my kid) so if you would like feel free to move discussion there, and I guess I can watch this email thread for if you do. Otherwise I will do it when I am free. Nate On Tue, Jun 25, 2019, 6:36 PM Juancarlo Añez <apalala@gmail.com> wrote:
Nate,
I find this mightily interesting! I think it's worth discussing at length.
Is there any chance you'll want to move the discussion to the richer context here? https://discuss.python.org/c/ideas
Regards,
I created a discussion topic located here https://discuss.python.org/t/a-proposal-and-implementation-to-add-assignment... On Tue, Jun 25, 2019 at 6:41 PM nate lust <natelust@linux.com> wrote:
-- Nate Lust, PhD. Astrophysics Dept. Princeton University
On Tue, Jun 25, 2019 at 06:35:48PM -0400, Juancarlo Añez wrote:
Is there any chance you'll want to move the discussion to the richer context here? https://discuss.python.org/c/ideas
Please don't. -- Steven
Steven, I apologize I was unaware that this was not the best suggestion, and had already created a topic. I can close it out if that would be best. On Tue, Jun 25, 2019 at 10:25 PM Steven D'Aprano <steve@pearwood.info> wrote:
-- Nate Lust, PhD. Astrophysics Dept. Princeton University
On Jun 25, 2019, at 14:00, nate lust <natelust@linux.com> wrote:
Does this still have the same issues as your previous version, like sometimes working for locals and sometimes not, affecting values stored in dicts that aren’t namespaces, not working for namespaces that aren’t dicts (slots classes, custom modules, etc.), type doing the wrong thing, and so on?
I tried to address most of the above in the more detailed write-up I linked to. I didn't want to spam out a message that was too long, and the link provided a good way to get syntax highlighting etc.

The code that is available in the linked github repository now treats lookups and sets in fastlocals the same way it does for global and module-level namespaces. I changed the behavior so that it does not affect variables stored in a dictionary. If you have one of these "cloaking" variables in a dict, you will get the actual variable before __getself__ or __setself__ is called. This makes it behave the same way as it would in, say, a list. Only looking up a variable by name triggers these methods.

My current implementation does not handle slots (or closures). I have looked into it; I just didn't prioritize that work. I do not believe it will be too hard to add, though.

I did a little bit of "magic" (inside the new lookup) to ensure that self does not have __getself__ called on it when it is used within its own methods. That keeps the code reading the same as existing Python code.

As for type, it responds to whatever object is loaded. If __getself__ is called, then type will see the result that is returned. I have added a few (preliminarily implemented) "builtins" to my demo that help address this. One is getcloaked, which returns the underlying variable if there is one, and otherwise just returns the normal Python variable. This reads:

    type(x)                => result of x.__getself__
    type(getcloaked('x'))  => the cloaking type, i.e. the history variable in one example

I have also added setcloaked, to ignore __setself__ and rebind the name; cloaksset, which reports whether a variable implements __setself__; cloaksget, which does the same for get; and iscloaking, which reports whether a variable implements any cloaking behavior.

On Tue, Jun 25, 2019 at 7:08 PM Andrew Barnert <abarnert@yahoo.com> wrote:
-- Nate Lust, PhD. Astrophysics Dept. Princeton University
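In terms of the ordinary object model, the introspection "builtins" Nate describes (iscloaking, cloaksget, cloaksset) would amount to attribute checks on the type. The sketch below is a guess at their behavior as plain functions, not the patched interpreter's actual code:

```python
def cloaksget(obj):
    """Would report whether obj intercepts loads (the proposal's cloaksget)."""
    return hasattr(type(obj), '__getself__')

def cloaksset(obj):
    """Would report whether obj intercepts rebinding (the proposal's cloaksset)."""
    return hasattr(type(obj), '__setself__')

def iscloaking(obj):
    """Would report whether obj implements any cloaking behavior."""
    return cloaksget(obj) or cloaksset(obj)

class Plain:
    pass

class Cloaked:
    def __getself__(self):
        return 42

print(iscloaking(Plain()))    # False
print(iscloaking(Cloaked()))  # True
```

Checking the type rather than the instance mirrors how other dunders (e.g. __add__, __getattr__) are looked up.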
nate lust writes:
On first read, that may be surprising, but it extends a behavior pattern that already exists for things like properties (and generically descriptors) to object instances themselves.
I don't think that's a correct interpretation. In all cases of assignment, a name is rebound (in some sense) in a separate namespace. In the case of an attribute, that namespace is explicit. It *is* the object, and we know the object's type: the type of objects that have that attribute. (Circular, do you say? Well, that circularity is the essence of duck typing.)

What you want to do is make some object its own namespace, and that breaks the connection between "bare" names and the implicit namespace of the module or function that contains them. Your proposal will set magic loose in the world: it will no longer be possible to reason with confidence about the behavior of objects denoted by bare names locally. This is action at a distance on *every name in a program*.

Attribute notation makes action at a distance (i.e., the interaction with other attributes of that object inherited from the caller) explicit. That's good; the combination of mutable attributes which persist through a function invocation is the essence of object-oriented programming. But it needs to be explicit or brains will explode (more likely, they'll implode: instead of trying to figure out whether an involuntary namespace is present, developers will just go ahead and assume it isn't, and pray they have a new job before it blows up).

Ben's post parallel to this one gives detailed examples of How Things Can Go Wrong.

I'm -1 on this (truncated from -1000). Python's simple, easy-to-reason-about behavior for assignments should not be messed with. __getself__ and __setself__ are good names if we're going to have the behavior, but it should be explicitly attached to an operator, not invoked implicitly by assignment. (FWIW, I don't yet see a need for the behavior, but I'm waiting on promised examples.)

Steve
Stephen,

Thanks for the reply. It is a busy day at work today, so it is going to take me a little bit of time to sit down and really process all you have said. I wanted to drop you a message, though, and link you to the examples that I mentioned: https://github.com/natelust/CloakingVarWriteup/blob/master/examples.py . If you want to see them in context with their outputs (to save cloning and building my patched Python), they can be found at the end of this document: https://github.com/natelust/CloakingVarWriteup/blob/master/writeup.md. A direct link to the interpreter changes that support this can be found here: https://github.com/natelust/cpython/commit/3b3714694b9cd9e2b1b706661765050c3..., in case there is any question about how things are working, or just general interest.

I do appreciate you, and everyone, taking the time to weigh in on this, as it is very educational. Once I have a bit of spare brain power, I will fully process what you wrote and may have additional questions. Thank you.

On Wed, Jun 26, 2019 at 3:12 AM Stephen J. Turnbull < turnbull.stephen.fw@u.tsukuba.ac.jp> wrote:
-- Nate Lust, PhD. Astrophysics Dept. Princeton University
On Tue, Jun 25, 2019 at 11:01 PM nate lust <natelust@linux.com> wrote:
Very interesting, and this would be much better than introducing new operators for DSLs! This makes Python finally treat the "=" operator as symmetric with the rest of the operators, and I will definitely stop using descriptors forever if this feature gets accepted! I find the objection reasoning very strange: none of the default behavior changes, and yes, if you use this feature you do need to worry about an object's behavior regarding assignment, but the same is true for descriptors and all the other magics.
On 26/06/2019 08:34, Yanghao Hua wrote:
I find the objection reasoning very strange as none of the default behavior changed, and yet if you use this feature you do need to worry about the object behavior regarding assignment, this is true for descriptors and all other magics.
The problem is not the default behaviour. The problem is that the average reader of your code cannot know that something that appears to be an ordinary assignment has been redefined elsewhere to be something entirely different. Your code stops being understandable to other people. The thing I keep coming back to in this whole discussion is the Zen line "Explicit is better than implicit". -- Rhodri James *-* Kynesim Ltd
On 26 Jun 2019, at 14:28, Rhodri James <rhodri@kynesim.co.uk> wrote:
On 26/06/2019 08:34, Yanghao Hua wrote: I find the objection reasoning very strange as none of the default behavior changed, and yet if you use this feature you do need to worry about the object behavior regarding assignment, this is true for descriptors and all other magics.
The problem is not the default behaviour. The problem is that the average reader of your code cannot know that something that appears to be an ordinary assignment has been redefined elsewhere to be something entirely herring. Your code stops being understandable to other people.
I 100% agree that this proposal is a bad idea. But I do have to play Devil's advocate here. The code-is-understandable-at-face-value ship has already sailed: + doesn't mean add, it means calling a dunder function that can do anything. Foo.bar = 1 doesn't mean "set bar to 1" but calls a dunder method. In Python, code basically can't be understood at face value. / Anders
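Anders' point can be made concrete: both operator use and attribute assignment already dispatch to dunders that may do anything. A minimal illustration (Weird is an invented class, not from the thread):

```python
class Weird:
    def __add__(self, other):
        # '+' need not perform addition at all
        return "not addition at all"

    def __setattr__(self, name, value):
        # 'obj.bar = 1'-style assignment is really a method call,
        # here silently doubling whatever is stored
        object.__setattr__(self, name, value * 2)

w = Weird()
print(w + 1)      # not addition at all
w.bar = 1
print(w.bar)      # 2
```

The counter-argument elsewhere in the thread is that these interceptions are all anchored to an explicit object or operator, whereas __setself__ would hide behind a bare name.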
On Thu, Jun 27, 2019 at 12:37 AM Anders Hovmöller <boxed@killingar.net> wrote:
On 26 Jun 2019, at 14:28, Rhodri James <rhodri@kynesim.co.uk> wrote:
On 26/06/2019 08:34, Yanghao Hua wrote: I find the objection reasoning very strange as none of the default behavior changed, and yet if you use this feature you do need to worry about the object behavior regarding assignment, this is true for descriptors and all other magics.
The problem is not the default behaviour. The problem is that the average reader of your code cannot know that something that appears to be an ordinary assignment has been redefined elsewhere to be something entirely different. Your code stops being understandable to other people.
I 100% agree that this proposal is a bad idea. But I do have to play Devils advocate here.
The the-code-is-understandable-at-face-value ship has already sailed. + doesn't mean add, it means calling a dunder function that can do anything. Foo.bar = 1 doesn't mean set bar to 1 but calling a dunder method. In python code basically can't be understood at face value.
There are many things that can be implemented with dunders, yes, but in Python, I would expect these two functions to behave identically:

def f1(x):
    return frob(x).spam

def f2(x):
    f = frob(x)
    s = f.spam
    return s

This correlation is critical to sane refactoring. You should be able to transform f1 into f2, and then insert print calls to see what 'f' and 's' are, without affecting the behaviour of the function. The proposed magic would completely break this; as such, it violates programmer expectations *across many languages* regarding refactoring.

ChrisA
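Chris's invariant can be checked today: even for an object built entirely out of attribute magic, binding the intermediate result to a local name and reading the attribute in a second step gives the same answer. A minimal sketch, where Frobbed and frob are hypothetical names invented for illustration:

```python
class Frobbed:
    # Attribute access is still one well-defined operation on the
    # object's namespace, even when implemented dynamically.
    def __getattr__(self, name):
        if name == "spam":
            return 42
        raise AttributeError(name)

def frob(x):
    return Frobbed()

def f1(x):
    return frob(x).spam

def f2(x):
    f = frob(x)   # binding to a local name invokes no object code today
    s = f.spam
    return s
```

Under current semantics f1 and f2 agree for every input; a value-level __getself__/__setself__ hook is exactly what would let the two diverge.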
On 26 Jun 2019, at 16:46, Chris Angelico <rosuav@gmail.com> wrote:
On Thu, Jun 27, 2019 at 12:37 AM Anders Hovmöller <boxed@killingar.net> wrote:
On 26 Jun 2019, at 14:28, Rhodri James <rhodri@kynesim.co.uk> wrote:
On 26/06/2019 08:34, Yanghao Hua wrote: I find the objection reasoning very strange as none of the default behavior changed, and yet if you use this feature you do need to worry about the object behavior regarding assignment, this is true for descriptors and all other magics.
The problem is not the default behaviour. The problem is that the average reader of your code cannot know that something that appears to be an ordinary assignment has been redefined elsewhere to be something entirely different. Your code stops being understandable to other people.
I 100% agree that this proposal is a bad idea. But I do have to play Devils advocate here.
The the-code-is-understandable-at-face-value ship has already sailed. + doesn't mean add, it means calling a dunder function that can do anything. Foo.bar = 1 doesn't mean set bar to 1 but calling a dunder method. In python code basically can't be understood at face value.
There are many things that can be implemented with dunders, yes, but in Python, I would expect these two functions to behave identically:
def f1(x): return frob(x).spam
def f2(x):
    f = frob(x)
    s = f.spam
    return s
This correlation is critical to sane refactoring. You should be able to transform f1 into f2, and then insert print calls to see what 'f' and 's' are, without affecting the behaviour of the function. The proposed magic would change and completely break this; and as such, it violates programmer expectations *across many languages* regarding refactoring.
I'm out of the game for many years but isn't that potentially extremely different in C++ for example? / Anders
On Thu, Jun 27, 2019 at 1:41 AM Anders Hovmöller <boxed@killingar.net> wrote:
On 26 Jun 2019, at 16:46, Chris Angelico <rosuav@gmail.com> wrote:
There are many things that can be implemented with dunders, yes, but in Python, I would expect these two functions to behave identically:
def f1(x): return frob(x).spam
def f2(x):
    f = frob(x)
    s = f.spam
    return s
This correlation is critical to sane refactoring. You should be able to transform f1 into f2, and then insert print calls to see what 'f' and 's' are, without affecting the behaviour of the function. The proposed magic would change and completely break this; and as such, it violates programmer expectations *across many languages* regarding refactoring.
I'm out of the game for many years but isn't that potentially extremely different in C++ for example?
In C++, there are awkwardnesses with simply returning a struct, so I'm going to tweak it so it returns a pointer instead (which is more equivalent to what Python does; it also avoids questions of whether the object still exists, which is not an issue in Python because it's garbage collected).

int f1(const char *x) {
    return frob(x)->spam();
}

int f2(const char *x) {
    status *f = frob(x);
    int s = f->spam();
    return s;
}

I don't think you can define what "f->spam" means, so I turned that into an actual method call, too. In any case, the refactoring is still valid, and the disassembly of these two functions will probably be identical (since C++ compilers are allowed to optimize out local variables). Point is, breaking a single expression into multiple sub-expressions is a standard action [1] when refactoring, and it should always be safe. Any time it isn't, you probably have either an awful language or an awful codebase. And you should post something to TheDailyWTF.

ChrisA

[1] meaning you can do one of them each round of combat, and still have enough time for a move action
On Wednesday, June 26, 2019, 11:53:30 AM PDT, Chris Angelico <rosuav@gmail.com> wrote:
I don't think you can define what "f->spam" means

Well, you can, but only the first half of what it means, so I don't think this changes your point.

If f is a pointer, then f->spam means to dereference the pointer f, then access the spam attribute of the result. If f is an instance of a user-defined class rather than a raw pointer, it can redefine ->, but only to change the "dereference" part, not the "access the spam attribute" part. In other words, f->spam means something like (*(f.operator->())).spam, and you can overload operator->.

Of course, under very restricted conditions, you can do something like this:

#include <iostream>

struct status { int eggs, spam; };

struct pstatus {
    status *p;
    status* operator->() {
        return reinterpret_cast<status*>(
            reinterpret_cast<char*>(p) - sizeof(int));
    }
};

pstatus frob() {
    return pstatus{new status{1, 2}};
}

int main() {
    auto f = frob();       // eggs = 1, spam = 2
    std::cout << f->spam;  // accesses eggs, and prints 1 rather than 2
    return 0;
}

Compile that with g++ or clang++ with -std=c++11 (we don't actually need any C++11 feature here; the code's just a bit shorter and simpler with auto, etc.), and it should compile without warnings and print out 1 rather than 2 when you run it. The trick is that "access the spam attribute" actually means "access bytes 4-8 of the struct as an int", so if we have a pointer to 4 bytes before the start of the real struct, you get the int at bytes 0-4 of the real struct, which is the real eggs. (Actually, we should be using the difference between offsetof(spam) and offsetof(eggs), not sizeof(int), so it would be legal even with non-default alignment. But so many tiny things could turn this code into undefined behavior, even with that change, like just adding a virtual method to status, or referencing f->eggs in code that doesn't even run… so let's not worry about bulletproofing it.)
Anders Hovmöller wrote:
On 26 Jun 2019, at 16:46, Chris Angelico <rosuav@gmail.com> wrote:
it violates programmer expectations *across many languages* regarding refactoring.
isn't that potentially extremely different in C++ for example?
Yes, but it's pretty universal across languages with name-binding semantics for assignment (Lisp, Ruby, Javascript etc.) that assigning to a local name doesn't invoke any magic. -- Greg
On Jun 26, 2019, at 15:46, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Yes, but it's pretty universal across languages with name-binding semantics for assignment (Lisp, Ruby, Javascript etc.) that assigning to a local name doesn't invoke any magic.
I don’t think that’s quite true. Many languages with name-binding semantics simply don’t invoke any magic for any assignments. Among those that do, I don’t think there’s anything universal about allowing it in some scopes but not locally. For example, in Lisp, if you replace the setf macro, that replaces local symbol sets just as much as closure, global, and dynamic sets, and even non-symbol sets to generalized variables. The fact that Python’s namespaces are object-oriented, and that Python makes it very easy to replace or hook class instance namespaces, a bit harder to replace or hook global namespaces, and very hard to replace or hook local namespaces, is probably specific to Python. And think about this: If you had code involving “x = 2” that stopped working when you moved it from a module init function, where it was local, to the top level of the module, because it was no longer doing anything that could be construed as binding x, would you think “Something is broken”, or “That’s ok, x isn’t local, so I wasn’t expecting assignment to mean binding”? (For code that actually does something fancy with namespaces, your expectations on moving it might be different, of course; but in that case, you’re already thinking about the namespace semantics, so you’re prepared for that.)
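The asymmetry described above (class and global namespaces are hookable, function locals are not) can be shown with tools that exist today: exec() accepts a dict subclass as its globals, so in CPython, reads through that namespace go through the subclass's __getitem__. A sketch, where TracingNamespace is a hypothetical name invented for illustration:

```python
class TracingNamespace(dict):
    """Counts every name lookup made through this namespace."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.reads = {}

    def __getitem__(self, key):
        self.reads[key] = self.reads.get(key, 0) + 1
        return super().__getitem__(key)

ns = TracingNamespace(spam=21)
# In CPython, LOAD_NAME on a non-exact dict calls our __getitem__,
# so both reads of "spam" are counted and ns["x"] ends up as 42.
exec("x = spam + spam", ns)
```

No comparable hook exists for the locals of an ordinary function, which is the fast-locals point being made here.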
Andrew Barnert wrote:
in Lisp, if you replace the setf macro,
Okay, maybe Lisp wasn't a good example. In the presence of macros, all bets are off. :-( But at least if you haven't redefined the universe, local bindings in Lisp behave predictably.
If you had code involving “x = 2” that stopped working when you moved it from local to a module init function to the top level ... would you think “That’s ok, x isn’t local, so I wasn’t expecting assignment to mean binding”?
I wouldn't think "x isn't local", because it still *is* local from the point of view of that piece of code. If we're talking about current Python, I definitely would think something was broken, because assignment to a bare name, in any context, isn't supposed to be able to do anything weird. -- Greg
On Jun 27, 2019, at 16:18, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Andrew Barnert wrote:
in Lisp, if you replace the setf macro,
Okay, maybe Lisp wasn't a good example. In the presence of macros, all bets are off. :-( But at least if you haven't redefined the universe, local bindings in Lisp behave predictably.
Right, but if you haven’t redefined the universe, global (and dynamic and so on) bindings behave predictably too. There’s nothing special about locals that makes them untouchable, or even harder to touch, than any other namespace. And I think that’s more typical among name-binding languages than Python’s “locals are special, nobody can touch them”. Python’s behavior seems to be mainly a side effect of the fast-locals optimization. (That restrictive side effect might well be a good thing for people maintaining 10-year codebases and trying to keep them readable, but those are usually not the kind of benefits that other language designers steal from Python.)
If you had code involving “x = 2” that stopped working when you moved it from local to a module init function to the top level ... would you think “That’s ok, x isn’t local, so I wasn’t expecting assignment to mean binding”?
I wouldn't think "x isn't local", because it still *is* local from the point of view of that piece of code.
That’s a great way to put it. And that’s why I think that if it’s not acceptable to add these hooks to locals, it’s at least questionable behavior for globals (or, conversely, if it is acceptable to hook globals this way, it should be acceptable to hook locals too). There are ways in which locals and globals act differently, for good reasons; but otherwise, they act as similarly as possible, which is what allows you to think of globals as local to global code.
On Wed, Jun 26, 2019 at 4:47 PM Chris Angelico <rosuav@gmail.com> wrote: [...]
There are many things that can be implemented with dunders, yes, but in Python, I would expect these two functions to behave identically:
def f1(x): return frob(x).spam
def f2(x):
    f = frob(x)
    s = f.spam
    return s
This correlation is critical to sane refactoring. You should be able to transform f1 into f2, and then insert print calls to see what 'f' and 's' are, without affecting the behaviour of the function. The proposed magic would change and completely break this; and as such, it violates programmer expectations *across many languages* regarding refactoring.
Chris, I might need a bit more education here. I am trying to guess that what you meant to say is: if in f2() f or s is already defined, it then changes the "usual" understanding. In this case, I am assuming that when you assign, in f2(), a local variable f or s with an object that has __set/getself__(), you know that later on assigning to it will mean something different. How is this different from descriptors? When you do x.y = z and y is a descriptor, you don't expect x.y to now be pointing/bound to z; you have to understand the object's behavior anyway. I do not see how this is different in the set/getself() case.
On Thu, Jun 27, 2019 at 5:29 AM Yanghao Hua <yanghao.py@gmail.com> wrote:
On Wed, Jun 26, 2019 at 4:47 PM Chris Angelico <rosuav@gmail.com> wrote: [...]
There are many things that can be implemented with dunders, yes, but in Python, I would expect these two functions to behave identically:
def f1(x): return frob(x).spam
def f2(x):
    f = frob(x)
    s = f.spam
    return s
This correlation is critical to sane refactoring. You should be able to transform f1 into f2, and then insert print calls to see what 'f' and 's' are, without affecting the behaviour of the function. The proposed magic would change and completely break this; and as such, it violates programmer expectations *across many languages* regarding refactoring.
Chris, I might need a bit more education here. I am trying to guess what you meant to say here is if in f2(), f or s is already defined, it will then change the "usual" understanding. In this case, I am assuming when you start to assign in f2() a local variable f or s with an object that has __set/getself__(), then you know later on assigning to it will mean differently. How is this different from descriptors? when you do x.y = z and if y is a descriptor you don't expect x.y is now pointing/binding to z, you have to understand the object behavior anyway. I do not see how this is different in the set/getself() case.
Let's suppose that frob() returns something that has a __getself__ method. Will f1 trigger its call? Will f2? If the answer is "yes" to both, then when ISN'T getself called? If the answer is "no" to both, then when IS it called? And if they're different, then you have the refactoring headache that I described. That's a problem that simply doesn't happen with descriptors, because this example is using local variables - local variables are the obvious way to refactor anything. You're proposing allowing values to redefine local variables. That is completely different from descriptors and other forms of magic, which allow a namespace to define how it behaves. I'm basically done trying to explain this to you. Multiple people have tried to explain how this is NOT the same as descriptors, and you keep coming back to "but descriptors can do this too". They cannot. You are granting magical powers to *values*, not to namespaces. That makes basically ALL code impossible to reason about. ChrisA
On Wed, Jun 26, 2019 at 10:16 PM Chris Angelico <rosuav@gmail.com> wrote:
On Thu, Jun 27, 2019 at 5:29 AM Yanghao Hua <yanghao.py@gmail.com> wrote:
On Wed, Jun 26, 2019 at 4:47 PM Chris Angelico <rosuav@gmail.com> wrote: [...]
There are many things that can be implemented with dunders, yes, but in Python, I would expect these two functions to behave identically:
def f1(x): return frob(x).spam
def f2(x):
    f = frob(x)
    s = f.spam
    return s
This correlation is critical to sane refactoring. You should be able to transform f1 into f2, and then insert print calls to see what 'f' and 's' are, without affecting the behaviour of the function. The proposed magic would change and completely break this; and as such, it violates programmer expectations *across many languages* regarding refactoring.
Chris, I might need a bit more education here. I am trying to guess what you meant to say here is if in f2(), f or s is already defined, it will then change the "usual" understanding. In this case, I am assuming when you start to assign in f2() a local variable f or s with an object that has __set/getself__(), then you know later on assigning to it will mean differently. How is this different from descriptors? when you do x.y = z and if y is a descriptor you don't expect x.y is now pointing/binding to z, you have to understand the object behavior anyway. I do not see how this is different in the set/getself() case.
Let's suppose that frob() returns something that has a __getself__ method. Will f1 trigger its call? Will f2? If the answer is "yes" to both, then when ISN'T getself called? If the answer is "no" to both,
What's the problem for the "yes" case? If you define such an object of course __get/setself__() is always called, and f1() is still equal to f2().
then when IS it called? And if they're different, then you have the refactoring headache that I described. That's a problem that simply doesn't happen with descriptors, because this example is using local variables - local variables are the obvious way to refactor anything.
What's the point if it is never called? I cannot make any sense of this example, unfortunately; maybe I am too dumb ...
You're proposing allowing values to redefine local variables. That is completely different from descriptors and other forms of magic, which allow a namespace to define how it behaves.
First, this is not my proposal. The one I had is a new operator, and it is NOT allowing values to redefine local variables. But I do like this one a lot more indeed. Second, isn't a descriptor doing exactly the same thing, but just within your so-called "namespace"? Isn't the top level itself a namespace? Isn't every function's local scope itself a namespace? The entire descriptor concept, in my view, is an awkward one with too many oddities and special cases. And this one, on the other hand, is truly generic and universal.
I'm basically done trying to explain this to you. Multiple people have tried to explain how this is NOT the same as descriptors, and you keep coming back to "but descriptors can do this too". They cannot. You are granting magical powers to *values*, not to namespaces. That makes basically ALL code impossible to reason about.
Sheep can never explain to wolves why it is a good idea to be vegetarian, can they? And likewise vice versa. Please help to teach me, and trust that a developer with 10 years of Python could still learn a thing or two. And don't be too childish ;-) I really just don't get the point (if there is one).
On Thu, Jun 27, 2019 at 6:50 AM Yanghao Hua <yanghao.py@gmail.com> wrote:
On Wed, Jun 26, 2019 at 10:16 PM Chris Angelico <rosuav@gmail.com> wrote:
Let's suppose that frob() returns something that has a __getself__ method. Will f1 trigger its call? Will f2? If the answer is "yes" to both, then when ISN'T getself called? If the answer is "no" to both,
What's the problem for the "yes" case? If you define such an object of course __get/setself__() is always called, and f1() is still equal to f2().
Then in what circumstances will getself NOT be called? What is the point of having an object, if literally every reference to it will result in something else being used? The moment you try to return this object anywhere or do literally anything with it, it will devolve to the result of getself, and the original object is gone. ChrisA
On Wed, Jun 26, 2019 at 11:00 PM Chris Angelico <rosuav@gmail.com> wrote:
On Thu, Jun 27, 2019 at 6:50 AM Yanghao Hua <yanghao.py@gmail.com> wrote:
On Wed, Jun 26, 2019 at 10:16 PM Chris Angelico <rosuav@gmail.com> wrote:
Let's suppose that frob() returns something that has a __getself__ method. Will f1 trigger its call? Will f2? If the answer is "yes" to both, then when ISN'T getself called? If the answer is "no" to both,
What's the problem for the "yes" case? If you define such an object of course __get/setself__() is always called, and f1() is still equal to f2().
Then in what circumstances will getself NOT be called? What is the point of having an object, if literally every reference to it will result in something else being used? The moment you try to return this object anywhere or do literally anything with it, it will devolve to the result of getself, and the original object is gone.
No, it won't -- getself() will/can return self, and setself(self, other) will type-check other, reinterpret it into an integer, and do the magic (e.g. signal.next = integer). I implemented exactly the same thing using signal[:], overriding get/setitem(). I mean, how to use it is up to the user; there are endless possibilities. You can choose to return self, or something entirely different; the point is you now have control over the "=" operator as you do for the other operators.
On Thu, Jun 27, 2019 at 7:11 AM Yanghao Hua <yanghao.py@gmail.com> wrote:
On Wed, Jun 26, 2019 at 11:00 PM Chris Angelico <rosuav@gmail.com> wrote:
On Thu, Jun 27, 2019 at 6:50 AM Yanghao Hua <yanghao.py@gmail.com> wrote:
On Wed, Jun 26, 2019 at 10:16 PM Chris Angelico <rosuav@gmail.com> wrote:
Let's suppose that frob() returns something that has a __getself__ method. Will f1 trigger its call? Will f2? If the answer is "yes" to both, then when ISN'T getself called? If the answer is "no" to both,
What's the problem for the "yes" case? If you define such an object of course __get/setself__() is always called, and f1() is still equal to f2().
Then in what circumstances will getself NOT be called? What is the point of having an object, if literally every reference to it will result in something else being used? The moment you try to return this object anywhere or do literally anything with it, it will devolve to the result of getself, and the original object is gone.
No, it won't -- getself() will/can return self, and setself(self, other) will type-check other, reinterpret it into an integer, and do the magic (e.g. signal.next = integer). I implemented exactly the same thing using signal[:], overriding get/setitem(). I mean, how to use it is up to the user; there are endless possibilities. You can choose to return self, or something entirely different; the point is you now have control over the "=" operator as you do for the other operators.
Then I completely don't understand getself. Can you give an example of how it would be used? So far, it just seems like an utter total mess. ChrisA
This is the example I was talking about specifically: https://github.com/natelust/CloakingVarWriteup/blob/master/examples.py#L76. There are other possibilities as well; I would be happy to explain my ideas directly. I am not sure about everything Yanghao is saying, as I have not been able to follow it very closely. On Wed, Jun 26, 2019 at 5:20 PM Chris Angelico <rosuav@gmail.com> wrote:
On Thu, Jun 27, 2019 at 7:11 AM Yanghao Hua <yanghao.py@gmail.com> wrote:
On Wed, Jun 26, 2019 at 11:00 PM Chris Angelico <rosuav@gmail.com>
wrote:
On Thu, Jun 27, 2019 at 6:50 AM Yanghao Hua <yanghao.py@gmail.com>
wrote:
On Wed, Jun 26, 2019 at 10:16 PM Chris Angelico <rosuav@gmail.com>
wrote:
Let's suppose that frob() returns something that has a __getself__ method. Will f1 trigger its call? Will f2? If the answer is "yes"
to
both, then when ISN'T getself called? If the answer is "no" to both,
What's the problem for the "yes" case? If you define such an object of course __get/setself__() is always called, and f1() is still equal to f2().
Then in what circumstances will getself NOT be called? What is the point of having an object, if literally every reference to it will result in something else being used? The moment you try to return this object anywhere or do literally anything with it, it will devolve to the result of getself, and the original object is gone.
No, it won't -- getself() will/can return self, and setself(self, other) will type-check other, reinterpret it into an integer, and do the magic (e.g. signal.next = integer). I implemented exactly the same thing using signal[:], overriding get/setitem(). I mean, how to use it is up to the user; there are endless possibilities. You can choose to return self, or something entirely different; the point is you now have control over the "=" operator as you do for the other operators.
Then I completely don't understand getself. Can you give an example of how it would be used? So far, it just seems like an utter total mess.
ChrisA
-- Nate Lust, PhD. Astrophysics Dept. Princeton University
On 2019-06-26 14:27, nate lust wrote:
This is the example I was talking about specifically: https://github.com/natelust/CloakingVarWriteup/blob/master/examples.py#L76. There are other possibilities as well; I would be happy to explain my ideas directly. I am not sure about everything Yanghao is saying, as I have not been able to follow it very closely.
I've been reading this discussion off and on, and just looked at your example. It only confirms what I've thought since the beginning: none of the examples or use cases described provide even remotely sufficient justification for changing the behavior of assignment to a bare name. They are light years away from being sufficient justification. In the unlikely event that you want something that can track all the values that have been assigned to it, I just don't see any reason why you can't make that thing be an attribute of an object. Then the descriptor protocol already handles everything under discussion here. If you don't like typing you can name the object "x" so you just have to type "x.foo = 1". Adding descriptor-like behavior to bare names massively increases the potential confusion and, as far as I can see, the only "benefit" proposed is that you don't have to type a dot. I'm frankly quite surprised that this discussion has gone on so long. I just can't see any benefit to any of the proposals that gets anywhere close to justifying the increased difficulty of reasoning about code where any assignment to a local variable can call arbitrary code under the hood. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
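Brendan's alternative (put the tracked thing behind a dot and let the descriptor protocol do the work) can be sketched as follows; Tracked and NS are hypothetical names invented for illustration:

```python
class Tracked:
    """Data descriptor that records every value assigned to the attribute."""
    def __set_name__(self, owner, name):
        self.name = name

    def __set__(self, obj, value):
        # The *namespace* (the class) defines this magic; the values
        # being assigned stay perfectly ordinary.
        obj.__dict__.setdefault(self.name + "_history", []).append(value)
        obj.__dict__[self.name] = value

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__[self.name]

class NS:
    foo = Tracked()

x = NS()
x.foo = 1   # "you just have to type a dot": x.foo = ... instead of foo = ...
x.foo = 2
```

Afterwards x.foo is 2 and x.foo_history is [1, 2], with no change to the semantics of bare-name assignment anywhere.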
On Wed, Jun 26, 2019 at 11:16 PM Chris Angelico <rosuav@gmail.com> wrote: [...]
Then I completely don't understand getself. Can you give an example of how it would be used? So far, it just seems like an utter total mess.
Sure, below is my code snippet for signal get/set, using "L[:] = thing" syntax and overriding get/setitem(). In this case, signals follow a declaration (e.g. x = signal()) and use (e.g. x[:] = thing, thing = x[...]) paradigm. getitem always returns an integer or delegates to another object (self.current).

def __setitem__(self, key, value):
    if self.typ == IN:
        raise SignalError("IN type signal cannot be assigned")
    if self.edge != 0:
        raise SignalError("edge signal cannot be assigned")
    elif isinstance(value, self.__class__):
        self._set_next(key, value.current)
        delay = Delay(0, value.current, signal=self)
        self.sim.delta_next.append(delay)
    elif isinstance(value, Delay):
        value.next = value.delay + self.sim.current_time  # relative delay to absolute time
        value.signal = self
        self._set_next(key, value.value)
        if value.delay == 0:
            self.sim.delta_next.append(value)
        else:
            self.sim._event(value)
    else:
        self._set_next(key, value)
        delay = Delay(0, value, signal=self)
        delay.next = self.sim.current_time
        self.sim.delta_next.append(delay)

def __getitem__(self, key):
    '''Slicing of signals'''
    if self.value_type != int:
        # support generic payloads, not just int, so delegate to "current"
        return self.current.__getitem__(key)
    # an int signal always returns an int
    if key in [slice(None, None, None), ...]:
        return self.current  # behaves like an int if explicitly indexed
    elif isinstance(key, slice):
        start = key.start
        stop = key.stop
        step = key.step if key.step else 1
        if start < stop:  # the little-endian case
            if stop > self.width:
                raise SignalError("slice stop is too big: %d (> width %d)" % (stop, self.width))
            stop += 1
        else:  # the big-endian case
            if start >= self.width:
                raise SignalError("slice start is too big: %d (>= width %d)" % (start, self.width))
            stop -= 1
            step = 0 - step
        current = self.current
        value_list = []
        for i in range(start, stop, step):
            value_list.append((current >> i) & 0x1)
        return self._get_value(value_list)
    elif isinstance(key, int):
        if key >= self.width:
            raise SignalError("key is too big: %d (>= width %d)" % (key, self.width))
        return (self.current >> key) & 0x1
    elif isinstance(key, tuple):
        raise SignalError("signal indices must be integers or slices, not tuple")

And this is how it can be used by a user:

@block("hdl")
def ADD(name,
        a: signal(32, IN),
        b: signal(32, IN),
        out: signal(32, OUT),
        ):

    x = signal(1, OUT, "x")

    @always(a, b)
    def add():
        out[:] = (a + b)
Chris Angelico writes:
Then I completely don't understand getself. Can you give an example of how it would be used? So far, it just seems like an utter total mess.
It's possible that __getself__ would be implemented "halfway". That is, if __getself__ is present, it is invoked, and performs *side effects* (a simple example would be an access counter/logger for the object). Then the compiler loads that object, discarding any return value of the method. I think this is the semantics that the proponents are thinking of.
On Jun 26, 2019, at 21:45, Stephen J. Turnbull <turnbull.stephen.fw@u.tsukuba.ac.jp> wrote:
Chris Angelico writes:
Then I completely don't understand getself. Can you give an example of how it would be used? So far, it just seems like an utter total mess.
It's possible that __getself__ would be implemented "halfway". That is, if __getself__ is present, it is invoked, and performs *side effects* (a simple example would be an access counter/logger for the object). Then the compiler loads that object, discarding any return value of the method. I think this is the semantics that the proponents are thinking of.
The compiler has to load the object before calling __getself__, not after, or it has nothing to call __getself__ on, right? Anyway, I don’t think this really avoids most of the problems. It does get rid of the danger of infinite regress, but I think Nate already solved that. And I think everything else is an inherent consequence of trying to answer “I want to hook this operation on variables” with “here’s a hook on values instead”. Values aren’t the same thing as variables. And there doesn’t seem to be any way to avoid that with an OO dunder protocol, because variables aren’t objects; only values are. But why does it have to be OO? Tcl had this functionality decades ago: “trace add variable spam read counter”, and now every read of the variable “spam” (not every read of the current value in “spam”, no matter where from) calls counter(). It was used for all kinds of things besides the debugging purpose it was originally added for. In fact, Python developers still use it with Tk, to hook UI bits where Tk forgot to add events or validators. And actually, you can already do that in Python: call sys.settrace, set f_trace_opcodes on the frame, then on every opcode, if it’s the right kind of LOAD and the arg is the right name, trigger the callback. Of course this is slow and clumsy, but it means someone can build an example today and show “Look, this works, and it’s useful, except that it’s way too slow. Unless you’ve got a better way to make this efficient enough, that’s exactly why I need this feature.”
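The settrace sketch described above can be fleshed out into a rough CPython-specific proof of concept. This is a sketch under assumptions: count_loads and demo are hypothetical names, and the bytecode details (LOAD_FAST variants, co_code layout) vary between CPython versions, so treat the counting as approximate.

```python
import dis
import sys

def count_loads(func, name, *args):
    """Run func and count LOAD_FAST-style reads of the local `name` (CPython only)."""
    counts = [0]
    target = func.__code__

    def tracer(frame, event, arg):
        if event == "call" and frame.f_code is target:
            frame.f_trace_opcodes = True      # request per-opcode trace events
        elif event == "opcode":
            code = frame.f_code
            op = code.co_code[frame.f_lasti]  # opcode about to execute
            if dis.opname[op].startswith("LOAD_FAST"):
                idx = code.co_code[frame.f_lasti + 1]
                if idx < len(code.co_varnames) and code.co_varnames[idx] == name:
                    counts[0] += 1            # this is where a hook would fire
        return tracer

    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(None)
    return result, counts[0]

def demo():
    a = 21
    return int("2") * a   # reads the local `a`

result, reads = count_loads(demo, "a")
```

A real version would invoke a per-variable callback instead of counting, and the whole argument is that doing this without tracing overhead would need interpreter support.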
Andrew Barnert writes:
On Jun 26, 2019, at 21:45, Stephen J. Turnbull <turnbull.stephen.fw@u.tsukuba.ac.jp> wrote:
Chris Angelico writes:
Then I completely don't understand getself. Can you give an example of how it would be used? So far, it just seems like an utter total mess.
It's possible that __getself__ would be implemented "halfway". That is, if __getself__ is present, it is invoked, and performs *side effects* (a simple example would be an access counter/logger for the object). Then the compiler loads that object, discarding any return value of the method. I think this is the semantics that the proponents are thinking of.
The compiler has to load the object before calling __getself__, not after, or it has nothing to call __getself__ on, right?
Correct. I should have said "leaves the object alone rather than substituting the value of the method".
Anyway, I don’t think this really avoids most of the problems.
Agreed. I just thought that it was worth clarifying that __getself__ could be entirely about side effects by definition.
And I think everything else is an inherent consequence of trying to answer “I want to hook this operation on variables”
Yeah, I really don't understand why this is desirable in Python, but if it is,
“here’s a hook on values instead”
is not the way to do it. All roads lead to "We don't need a proof of concept implementation, we need a proof of utility application."
All,

Thanks for the feedback that everyone has provided. My time will be in short supply in the near term, so I am going to focus on taking all the input that people have provided and, if possible, fleshing out a stronger proposal. I figured this would also slow the flood of emails on this thread that people are receiving.

I am still more than happy to discuss this with anyone who wants to talk about it, either on this thread or by direct mail if anyone wants to drop me a line. Over the next day or so I will try to address any outstanding questions people have raised on this thread that are not related to work that I need to go off and do, or tests that need to be run.

Thank you all again for the input, and for helping to ask questions that are hard to see from my perspective.

On Fri, Jun 28, 2019 at 4:04 AM Stephen J. Turnbull < turnbull.stephen.fw@u.tsukuba.ac.jp> wrote:
Andrew Barnert writes:
On Jun 26, 2019, at 21:45, Stephen J. Turnbull < turnbull.stephen.fw@u.tsukuba.ac.jp> wrote:
Chris Angelico writes:
Then I completely don't understand getself. Can you give an example of how it would be used? So far, it just seems like an utter total mess.
It's possible that __getself__ would be implemented "halfway". That is, if __getself__ is present, it is invoked, and performs *side effects* (a simple example would be an access counter/logger for the object). Then the compiler loads that object, discarding any return value of the method. I think this is the semantics that the proponents are thinking of.
The compiler has to load the object before calling __getself__, not after, or it has nothing to call __getself__ on, right?
Correct. I should have said "leaves the object alone rather than substituting the value of the method".
Anyway, I don’t think this really avoids most of the problems.
Agreed. I just thought that it was worth clarifying that __getself__ could be entirely about side effects by definition.
And I think everything else is an inherent consequence of trying to answer “I want to hook this operation on variables”
Yeah, I really don't understand why this is desirable in Python, but if it is,
“here’s a hook on values instead”
is not the way to do it. All roads lead to "We don't need a proof of concept implementation, we need a proof of utility application." _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/VIQ522... Code of Conduct: http://python.org/psf/codeofconduct/
-- Nate Lust, PhD. Astrophysics Dept. Princeton University
Chris,

There are a lot of messages for me to catch up on from today, but I am heading home from work, and your most recent one is the easiest to address quickly.

I want to start out by saying I agree with many of the objections and concerns raised here, and there were some I had not thought about. This is why I put the proposal out: to learn more and perhaps make a stronger proposal (I like the idea raised last night of marking these variables with a keyword to denote that they are special and should be treated as such).

Yanhao, I do not think you are doing this proposal any good. I appreciate you trying to go to bat for it, but there are many good concerns here that would be better heard out and addressed than dismissed.

To your last message, Chris: I just wanted to point out one way I envision this being used (although it is still a bit of a toy). In my longer examples.py file I have a history variable that tracks all the values that were bound to the name (using __setself__) in a history list. __getself__ would then always return the most recently set value when loaded. I introduced a "built in" called getcloaked that would allow fetching the actual cloaking variable so that it could be used. In this case that would be something like getcloaked('var').rollback_history(2) to move back to a previous assignment. This could potentially be used in, say, a debugger, or a try/except, or the like. As you say, this would only work within a single scope, unless the return statement of the function was return getcloaked('var') (or, conversely, a function was called like foo(getcloaked('var')) to pass it into another scope).

I do think that this proposal needs work (or possibly should be thrown out altogether if it cannot be refined), and all the ideas and questions were exactly what I was hoping for, as there is more that others will be able to see than I can alone.

On Wed, Jun 26, 2019 at 5:04 PM Chris Angelico <rosuav@gmail.com> wrote:
On Thu, Jun 27, 2019 at 6:50 AM Yanghao Hua <yanghao.py@gmail.com> wrote:
On Wed, Jun 26, 2019 at 10:16 PM Chris Angelico <rosuav@gmail.com> wrote:
Let's suppose that frob() returns something that has a __getself__ method. Will f1 trigger its call? Will f2? If the answer is "yes" to both, then when ISN'T getself called? If the answer is "no" to both,
What's the problem for the "yes" case? If you define such an object, of course __get/setself__() is always called, and f1() is still equal to f2().
Then in what circumstances will getself NOT be called? What is the point of having an object, if literally every reference to it will result in something else being used? The moment you try to return this object anywhere or do literally anything with it, it will devolve to the result of getself, and the original object is gone.
ChrisA
-- Nate Lust, PhD. Astrophysics Dept. Princeton University
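[Editor's note: for readers trying to picture the history example Nate describes above, here is an approximation in today's Python, with explicit .get()/.set() calls standing in for the proposed __getself__/__setself__ hooks. The method names are modeled on his examples.py; the bodies are a guess at its behavior, not his actual code.]

```python
# Approximation of the HistoricVar demo: every assignment is recorded,
# reads return the latest value, and rollback_history() undoes binds.
# .set()/.get() stand in for the proposed __setself__/__getself__.
class HistoricVar:
    def __init__(self, value):
        self.history = [value]  # every value ever bound to the name

    def set(self, value):            # would be __setself__
        self.history.append(value)

    def get(self):                   # would be __getself__
        return self.history[-1]

    def rollback_history(self, n):   # undo the last n assignments
        del self.history[-n:]

var = HistoricVar(1)
var.set(2)
var.set(3)
var.rollback_history(2)  # back before the last two assignments
print(var.get())         # 1
```

Under the proposal, the .set()/.get() calls would happen implicitly on plain `var = ...` and `var` once the name was "cloaked".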
On Wed, Jun 26, 2019 at 11:24 PM nate lust <natelust@linux.com> wrote: [...]
Yanhao, I do not think you are doing this proposal any good. I appreciate you trying to go to bat for it, but there are many good concerns here that would be good to hear out and address rather than trying to dismiss them.
I was trying to understand the concerns ... not dismissing them. Most of the concerns are of course valid, but many seem to be extremely general/high-level, which feels more like a matter of subjective opinion than objective technical discussion. I can back off if you think I am not helping here ... it is your battle; just remember you are not the only one who likes this feature.
On Jun 26, 2019, at 13:53, Chris Angelico <rosuav@gmail.com> wrote:
Then in what circumstances will getself NOT be called? What is the point of having an object, if literally every reference to it will result in something else being used?
I think Yanghao has misunderstood the proposal. What Nate actually proposed is not that every time the value is accessed it calls __getself__, only accesses that are loads from a variable. The infinite regress problem is no worse than, say, __getattribute__ usually needing to call a method on super or object. But if I’m right, then your initial refactoring point is still unanswered. You can return or otherwise use the value of self or the return of the uncloak function or whatever without triggering another call to __getself__, but if you refactor the return statement by storing an intermediate value in a temporary local variable, accessing that variable will trigger another call to __getself__. I assume Nate is working on an answer to that along with all the other points that have been raised.
Hello all,

I am sorry to send this out to the list and then be so silent; work and children and all. I am going to try to address the issues raised in this thread, with apologies if I miss someone's issue. Some of these are addressed in the documentation I wrote, but I know I can't expect people to take their time to go off and read a lot of material; I am thankful enough for the time that people put even into reading and responding. This message might get long; again, my apologies. I have put names next to the response for each person, so you can read your section. A response may refer to other responses, but this was long enough without duplication. Again, I might not have been completely thorough with each response in light of the length, but I tried to hit the high points and conceptual discussions, if not the details.

For these responses I may refer to some of the demos here: https://github.com/natelust/CloakingVarWriteup/blob/master/examples.py

Ben Rudiak-Gould: Your point about iteration is well taken. Depending on the variable bound, it may end up transparent to you without you needing to care, as in the case of something like my HistoricVar demo. However, it is a fair point that it might not be transparent and you could end up with an exception thrown at some point. I would argue this is not much different from using any library code where you would need to understand the API, but documentation is not always up to par, and it is indeed one more thing to think about or check for. Your idea of marking a variable with something like metavar is interesting, and definitely something I am going to think more on. If this were required for the new behavior in each function where it is used, I don't think that would be compatible with how I was envisioning this to work; it would require more thought.
At this point, one of the benefits of this type of metavar is that I can declare it with whatever side effects I want (such as a callback on assignment) and pass it into existing third-party code without that code needing to be altered; that code would not care that it is special. This is a case where the sense of responsibility is flipped from what you are worried about, though: I would be responsible for creating an appropriate variable (as I am for types now) when using your library. I am not sure how that would play well with a keyword at declaration, except that someone could read my code (or a library using this type of variable) and see that it was defined in a special way.

Rhodri James: Your point seems similar to Ben's above, so perhaps that addresses some of it. I think the question is in what context this code will be used. If I were a library author, I might not want to expose this type of variable outside the API without some sort of annotation. That does not mean it could not be really useful inside an API, or used, for instance, as an instance-level property.

Anders Hovmoller: Apologies for not getting the accents in your name as I type this. I appreciate the devil's advocate stance. Whether this proposal went anywhere or not, I was hoping to learn a lot through the discussion and process. Continuing the discussion really helps to that end.

Chris Angelico: In my proposal you are correct that functions such as:

def f1():
    x = metavar()
    return x

def f2():
    return metavar()

would not have worked as expected. As soon as the variable became bound to the name, the LOAD operations that happened before a RETURN operation would call __getself__ and the result would be returned. A built-in method would have been required to get at the underlying object, as in:

def f3():
    x = metavar()
    return getcloaked('x')

This means that those two functions would have resulted in different returns. (A similar problem would exist for function calls.)
I was never entirely happy with this behavior either. A little while ago I pushed a patch up to my cpython fork that makes the behavior the same as existing Python: if you use a metavar in a return statement, it will be returned from the function to the calling scope, where it will again behave as a metavar. (I also made the same change for calling a function with a metavar as an argument.) I hope some of the examples I linked to in previous messages helped, but I am happy to talk more about what the call chain looks like, and how I intend things to work.

Greg Ewing: I think some of your points were also addressed above. I am happy to expand on any of them myself if the above has not addressed them.

Andrew Barnert: I thank you for also playing devil's advocate, and if not providing a "vote of confidence", then a willingness to make sure my ideas are heard, and that I have a chance to address them and possibly make a better proposal.

Brendan Barnwell: I can appreciate your position, and might well feel the same if I were a maintainer of the Python code base. However, I will confess I did come away a little stung by the tone of your message. This may just be because the subject is a bit dearer to me, as it is something I have been laboring on. I tried to write up documentation and do due diligence to have material ready before coming to the list, so as not to present a half-baked idea. That said, I certainly was hoping for feedback, either to create a better proposal or at least to learn something. Whoever is willing to discuss this with me, I am willing to listen and/or learn. When no one wants to discuss it, I will not belabor the point.

I feel like I have failed in communicating some of my ideas and use cases to you. In the case you bring up about tracking the history, I could indeed handle tracking in the manner you described, if I were in control of all the code in use.
If, however, I created a metavar in one of my own applications and passed it down into some preexisting image-processing pipeline, there would be no way for me to control the behavior. With my proposal, I get to define a variable that can meet my needs as well as those of the library I may be using.

Another example is template expressions from my examples.py (which may be a bad name; in C++ they are templates, and even though the concept is the same, the name doesn't carry over well). By creating a proxy object in an arithmetic method call, arithmetic operations can be deferred and tracked inside the proxies. This can be done any number of times, with the final proxy object being bound to the variable name, i.e.:

y = x1 + x2 + x3 + x4

where each + creates a proxy object that also has add defined. When y is accessed, __getself__ would be called, which could evaluate the stored expressions in a single loop and return the result. This saves creating temporary objects and potentially many expensive Python for loops. (The use case I mainly had in mind is something like an array, but any object could make use of this.)

Another use case is that of defining const. With suitable definitions of __setself__ and __getself__, a variable could be wrapped such that it could not be rebound. Many modules make use of constants, and it may be advantageous to protect these by enforcing const-ness. You could put them behind a class with __setattr__ overridden, but that only protects attributes of the class instance, leaving the name itself free to be re-bound. I know the Pybind11 people struggle with how best to track and account for the lack of const-ness in Python. It would be a fair point to say Python should not try to cater to other projects, but it would not hurt if it were a bonus. Even just sticking to pure Python, the software stack I work on has a sufficiently large number of moving pieces, and enough end-user consumers to worry about, that we would be very happy to have a little extra safety from const.
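[Editor's note: the template-expression idea above can be sketched in today's Python with an explicit evaluate() call standing in for the proposed __getself__. The class names follow Nate's examples.py, but the bodies here are an illustrative assumption, not his implementation.]

```python
# Sketch of deferred arithmetic: SimpleArray.__add__ returns a proxy
# (SimpleArrayExecutor) that records operands instead of computing.
# .evaluate() stands in for the proposed __getself__ and performs one
# fused loop over all operands, with no intermediate arrays.
class SimpleArray:
    def __init__(self, data):
        self.data = list(data)

    def __add__(self, other):
        return SimpleArrayExecutor([self]) + other

class SimpleArrayExecutor:
    def __init__(self, operands):
        self.operands = operands  # deferred: nothing computed yet

    def __add__(self, other):
        return SimpleArrayExecutor(self.operands + [other])

    def evaluate(self):
        # Single loop over all operands at once -- no temporaries.
        return SimpleArray(
            sum(vals) for vals in zip(*(op.data for op in self.operands))
        )

x1, x2, x3 = (SimpleArray([1, 2]) for _ in range(3))
y = x1 + x2 + x3          # builds a proxy chain; no loop has run yet
print(y.evaluate().data)  # [3, 6]
```

Under the proposal, the evaluate() call would instead happen implicitly whenever the name y is loaded, via __getself__.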
Overall I am happy to have had feedback from everyone, and am grateful for the time people have put in. If nothing else, I have learned about what it takes to communicate ideas to a large number of people who are not co-located. On Wed, Jun 26, 2019 at 6:15 PM Andrew Barnert via Python-ideas < python-ideas@python.org> wrote:
On Jun 26, 2019, at 13:53, Chris Angelico <rosuav@gmail.com> wrote:
Then in what circumstances will getself NOT be called? What is the point of having an object, if literally every reference to it will result in something else being used?
I think Yanghao has misunderstood the proposal. What Nate actually proposed is not that every time the value is accessed it calls __getself__, only accesses that are loads from a variable. The infinite regress problem is no worse than, say, __getattribute__ usually needing to call a method on super or object.
But if I’m right, then your initial refactoring point is still unanswered. You can return or otherwise use the value of self or the return of the uncloak function or whatever without triggering another call to __getself__, but if you refactor the return statement by storing an intermediate value in a temporary local variable, accessing that variable will trigger another call to __getself__. I assume Nate is working on an answer to that along with all the other points that have been raised.
-- Nate Lust, PhD. Astrophysics Dept. Princeton University
On 27/06/2019 04:57, nate lust wrote:
However it is a fair point, that it might not be transparent and you could end up with an exception thrown at some point. I would argue this is not much different than using any library code where you would need to understand the api, but documentation is not always up to par, and it is indeed one more thing to think about or check for.
I'm afraid I think it's considerably worse than needing to understand a library's API. When I write something like:

dastardly.plane.state = DO_SOMETHING_MUTTLEY

I can accept that things might happen behind the scenes (Muttley might do something). I'm manipulating an object that the library has handed me; it's allowed to be a bit strange as long as that doesn't impinge on me much. Personally I would prefer a function interface, as that's more explicit that magic may happen, but objects are (intuitively) allowed to be complex.

On the other hand, when I write:

meta_muttley = get_something_shiny_and_new()

I do not expect anything magic to happen. There is no contextual clue that what looks like a straightforward name rebinding is going to do something quite different, and not leave me with something shiny and new after all. Suddenly a name isn't a simple name any more, and it's not at all obvious.

-- Rhodri James *-* Kynesim Ltd
Sorry if this has already been answered, but if it has I missed it. From your demo:

class HistoricVar:
    def __getself__(self):
        return self.var

What stops the reference to 'self' here from invoking __getself__ again?

-- Greg
This is one of the exception cases in my proposal: variables used inside their own methods don't have __getself__ called when they are loaded. The proposal basically reads: class instances that are looked up via a named variable will have __getself__ invoked when the interpreter loads them, except when they are used within their own methods, or when used as an argument to or return value from a function.

That second part about functions was not in the original proposal, but is something I am currently exploring to address some questions others have raised. I have not fully thought through what the consequences would be, and it may or may not stick around in the long term. I feel quite strongly about the first exception, however, as it makes writing a class body manageable without overburdening the author to put special escape functions all over the place.

On Thu, Jun 27, 2019 at 7:42 PM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Sorry if this has already been answered, but if it has I missed it.
From your demo:
class HistoricVar:
def __getself__(self): return self.var
What stops the reference to 'self' here from invoking __getself__ again?
-- Greg
-- Nate Lust, PhD. Astrophysics Dept. Princeton University
Yanghao Hua wrote:
this one, on the other hand, is truly generic and universal.
The problem is that it's *too* universal. A __getself__ method would be a sorcerer's-apprentice kind of magic that you can't escape when you don't want it. Suppose you want to inspect an object for debugging purposes. If it has a __getself__ method that returns something other than itself, it would be impossible to even look at it without it morphing into something else.
Isn't the top level itself a namespace? Isn't every function's local scope itself a namespace?
Yes, but namespace-based magic is confined to the namespace you define it on. The proposed object-based magic would be completely inescapable. -- Greg
Yanghao Hua wrote:
when you do x.y = z, and y is a descriptor, you don't expect x.y to now be pointing/bound to z; you have to understand the object's behavior anyway. I do not see how this is different in the set/getself() case.
The difference is that the special behaviour is associated with 'x.y', not just 'y'. -- Greg
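[Editor's note: Greg's distinction can be shown with code that works today. Python already lets you hook reads and writes of the attribute x.y through a descriptor, because that hook lives in X's class namespace rather than on the value itself. A minimal sketch with invented class names:]

```python
# Existing namespace-based magic: read/write hooks tied to the
# *attribute* x.y via a descriptor in X's namespace -- not to the value.
class Traced:
    """Data descriptor that logs every get/set of one attribute."""
    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        print(f"read {self.name}")
        return obj.__dict__[self.name]

    def __set__(self, obj, value):
        print(f"wrote {self.name} = {value!r}")
        obj.__dict__[self.name] = value

class X:
    y = Traced()

x = X()
x.y = 1      # prints: wrote y = 1
value = x.y  # prints: read y
```

The hook fires only for accesses spelled x.y; copy the value to a plain local and the magic is left behind, which is exactly the confinement Greg describes.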
On Thu, Jun 27, 2019 at 12:46:58AM +1000, Chris Angelico wrote:
There are many things that can be implemented with dunders, yes, but in Python, I would expect these two functions to behave identically:
def f1(x):
    return frob(x).spam
def f2(x):
    f = frob(x)
    s = f.spam
    return s
This correlation is critical to sane refactoring.
I'm not convinced that this is going to change under the proposal. Since neither f nor s already exist, they cannot overload assignment. Unless something in Nate's proposal is quite different from earlier concrete proposals, I don't think this is a slam-dunk criticism. I think that it is a problem in theory but not in practice.

The only risk here is if your refactoring does something silly, such as reusing a variable which overrides assignment:

# Earlier:
f = something_that_overloads_assignment()

# Later on, refactor and re-use f:
f = frob(x)  # calls overloaded assignment
s = f.spam

# And then go back to using f
process(f)  # Oops, shoot yourself in the foot

But the loading and pointing of the gun happened earlier, when you re-used an existing variable, not because of assignment overloading.

As further evidence that this is not a problem in practice, I give you C++ as a data point. C++ is criticised on many, many grounds, e.g.:

http://cshandley.co.uk/CppConsideredHarmful.html

but it's quite hard (at least for me) to find any serious criticism of specifically assignment overloading. See for example:

https://duckduckgo.com/?q=assignment+overloading+considered+harmful
https://duckduckgo.com/?q=assignment+overloading+criticism

If anyone wants to do their own searches, please let us know if you find any criticism grounded in *actual experience* with assignment overloading rather than theoretical fears. The closest to a practical criticism I have found is this one:

http://www.finagle.org/blog/ruby-gotcha-chained-assignment

but you'll notice that in this case Ruby's behaviour is exactly the same as Python's, and it involved assignment to an attribute.

-- Steven
On Thu, Jun 27, 2019 at 11:11 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Thu, Jun 27, 2019 at 12:46:58AM +1000, Chris Angelico wrote:
There are many things that can be implemented with dunders, yes, but in Python, I would expect these two functions to behave identically:
def f1(x):
    return frob(x).spam
def f2(x):
    f = frob(x)
    s = f.spam
    return s
This correlation is critical to sane refactoring.
I'm not convinced that this is going to change under the proposal. Since neither f nor s already exist, they cannot overload assignment. Unless something in Nate's proposal is quite different from earlier concrete proposals, I don't think this is a slam-dunk criticism. I think that it is a problem in theory but not in practice.
That would be true if the proposal were only for __setself__, but the __getself__ part makes things more complicated. I'm not 100% sure because I don't fully understand the proposal, but if the object returned by frob(x) has a __getself__ method, I think it would be called in f2 but NOT in f1 (because referencing the name "f" triggers the get, whereas simply chaining ".spam" onto the end of the function call doesn't). Does that change your view of it? ChrisA
I addressed some of the concerns you were responding to in the long email I wrote last night. I introduced a change to address this, see previous email for more details. On Thu, Jun 27, 2019 at 9:41 AM Chris Angelico <rosuav@gmail.com> wrote:
On Thu, Jun 27, 2019 at 11:11 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Thu, Jun 27, 2019 at 12:46:58AM +1000, Chris Angelico wrote:
There are many things that can be implemented with dunders, yes, but in Python, I would expect these two functions to behave identically:
def f1(x):
    return frob(x).spam
def f2(x):
    f = frob(x)
    s = f.spam
    return s
This correlation is critical to sane refactoring.
I'm not convinced that this is going to change under the proposal. Since neither f nor s already exist, they cannot overload assignment. Unless something in Nate's proposal is quite different from earlier concrete proposals, I don't think this is a slam-dunk criticism. I think that it is a problem in theory but not in practice.
That would be true if the proposal were only for __setself__, but the __getself__ part makes things more complicated. I'm not 100% sure because I don't fully understand the proposal, but if the object returned by frob(x) has a __getself__ method, I think it would be called in f2 but NOT in f1 (because referencing the name "f" triggers the get, whereas simply chaining ".spam" onto the end of the function call doesn't).
Does that change your view of it?
ChrisA
-- Nate Lust, PhD. Astrophysics Dept. Princeton University
There seems to be some confusion about what is going on with the __getself__ method. This is almost certainly due to my lack of communicating things clearly, so I am going to attempt to walk through what will happen in the context of a code example. The important thing to keep in mind is that __getself__ is invoked only on named object access (i.e. when the interpreter executes a LOAD of a name), and not on values accessed from a container or fetched from the stack (in the case of the CPython interpreter).

Say we have the following "expression template" from my examples.py. I will be using the types SimpleArray and SimpleArrayExecutor, where the latter has __getself__ defined. a, b, and c are instances of SimpleArray.

d = a + b + c
print(d)

* a and b are named variables and are loaded from locals.
* The defined add method is called, which returns a SimpleArrayExecutor that is placed back on the stack.
* This anonymous object is passed along with c to the add operator defined on SimpleArrayExecutor.
* The result is returned and bound to the name d.
* print is loaded onto the stack.
* d is loaded, and since it is named, its __getself__ is triggered; a single loop over the arrays happens, and a new SimpleArray is constructed and placed on the stack.
* The call instruction is executed with print and the returned SimpleArray on the stack.

Now for a similar case:

tmp = a + b
d = tmp + c
print(d)

* a and b are loaded from locals.
* The call instruction happens with the defined add method.
* The result is a SimpleArrayExecutor and it is bound to the name tmp.
* tmp is accessed; because it defines __getself__, that is executed, a loop happens, and a SimpleArray is put onto the stack.
* c is loaded onto the stack and the call instruction is executed with the add method.
* The result is a SimpleArrayExecutor and is stored under the name d.
* d is loaded, and since it is named, its __getself__ is triggered; a single loop over the arrays happens, and a new SimpleArray is constructed and placed on the stack.
* The call instruction is executed with print and the returned SimpleArray on the stack.

This produces the same result, but with an extra loop. That can be avoided by using the built-in getcloaked (the function name is up in the air), similar to the object.__getattr__ escape hatch, i.e.:

tmp = a + b
d = getcloaked(tmp) + c
print(d)

Now the behavior is the same as in the first case, as getcloaked returns the metavariable, which has not been bound to a name and so is loaded right onto the stack.

There are two (three) important cases where I have exempted __getself__ from being called. First is when an object is used in methods defined within itself. This means that self can be used when defining methods without triggering recursive behavior. The other cases are calling or returning from a function. This is to ensure the following stay consistent:

def f1():
    x = MetaVar()
    return x

def f2():
    return MetaVar()

In f1, the return machinery evaluates whether its return argument is the result of a metavar __getself__ call and, if so, returns the metavar instead. This is all done behind what are essentially if (x != NULL) code blocks in C, so that the runtime hit on any code that does not use this feature is kept to a minimum.

On Thu, Jun 27, 2019 at 9:41 AM Chris Angelico <rosuav@gmail.com> wrote:
On Thu, Jun 27, 2019 at 11:11 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Thu, Jun 27, 2019 at 12:46:58AM +1000, Chris Angelico wrote:
There are many things that can be implemented with dunders, yes, but in Python, I would expect these two functions to behave identically:
def f1(x):
    return frob(x).spam
def f2(x):
    f = frob(x)
    s = f.spam
    return s
This correlation is critical to sane refactoring.
I'm not convinced that this is going to change under the proposal. Since neither f nor s already exist, they cannot overload assignment. Unless something in Nate's proposal is quite different from earlier concrete proposals, I don't think this is a slam-dunk criticism. I think that it is a problem in theory but not in practice.
That would be true if the proposal were only for __setself__, but the __getself__ part makes things more complicated. I'm not 100% sure because I don't fully understand the proposal, but if the object returned by frob(x) has a __getself__ method, I think it would be called in f2 but NOT in f1 (because referencing the name "f" triggers the get, whereas simply chaining ".spam" onto the end of the function call doesn't).
Does that change your view of it?
ChrisA _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/TK3EFW... Code of Conduct: http://python.org/psf/codeofconduct/
-- Nate Lust, PhD. Astrophysics Dept. Princeton University
On Fri, Jun 28, 2019 at 1:59 AM nate lust <natelust@linux.com> wrote:
d = a + b + c
print(d)
tmp = a + b
d = getcloaked(tmp) + c
print(d)
Now the behavior is the same as the first case, as getcloaked returns the metavariable that has not been bound to a name and so it is loaded right on the stack.
Or is it getcloaked("tmp"), which has to magically locate something *by name*? Because calling getcloaked(tmp) would have to call __getself__. Unless it's a magical construct. In any case, you make it so that ANY refactoring has to call getcloaked, just in case there's a __getself__ lurking in the wings. That's a pretty terrible cost.
There are two (three) important cases where I have exempted __getself__ from being called. The first is when an object is used in methods defined within itself. This means that self can be used when defining methods without triggering recursive behavior. The other cases are calling or returning from a function. This is to ensure the following stay consistent.
def f1():
    x = MetaVar()
    return x
def f2():
    return MetaVar()
In f1 the return machinery checks whether its return argument is the result of a metavar __getself__ call and, if so, returns the metavar instead.
Eww. Extremely magical. And, remind me, what problem(s) is __getself__ solving? ChrisA
On Jun 27, 2019, at 08:59, nate lust <natelust@linux.com> wrote:
There are two (three) important cases where I have exempted __getself__ from being called. The first is when an object is used in methods defined within itself. This means that self can be used when defining methods without triggering recursive behavior.
What happens in cases like the way operators are defined in Fraction (which the docs suggest is the best way to implement operators for your own custom numeric types)?

def _operator_fallbacks(monomorphic_operator, fallback_operator):
    def forward(a, b):
        # complicated stuff
        ...
    def reverse(b, a):
        # more…
        ...
    return forward, reverse

def _add(a, b):
    da, db = a.denominator, b.denominator
    return Fraction(a.numerator * db + b.numerator * da, da * db)

__add__, __radd__ = _operator_fallbacks(_add, operator.add)
# similar fallbacks calls for all the other ops

The forward and reverse functions are not defined at class scope, but as locals within a class-scope function (one which isn’t called as a method), and they don’t call their arguments self, but they do get bound to the class names __add__ and __radd__ and eventually called as methods. So, does some part of that trigger the magic so that the a argument in forward and the a (or is it b?) argument in reverse get exempted, or not? If so, does it also trigger for the other argument iff a is b? I’m not even sure which behavior I’d want, much less which I think your code will do based on your description. Presumably this would come up in many real-world expression-template libraries. At least it does in those that already exist, like SymPy, where this:

x = sympy.Symbol('x')
y = 2 + x

… calls x.__radd__(2), which returns a sympy.Add object using roughly similar code. (There’s no way to capture a “plain” value like 2 as part of an expression otherwise.)
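[Editor's note: for readers unfamiliar with expression-template libraries, here is a minimal runnable sketch of the SymPy-style capture described above. The Symbol and Add classes are toy stand-ins, not the real sympy implementation.]

```python
class Symbol:
    """Minimal stand-in for sympy.Symbol: builds expression trees."""
    def __init__(self, name):
        self.name = name
    def __add__(self, other):
        return Add(self, other)
    def __radd__(self, other):
        # Runs for 2 + x: int.__add__(2, x) returns NotImplemented,
        # so Python falls back to the reflected method on the right operand.
        return Add(other, self)
    def __repr__(self):
        return self.name

class Add:
    """Unevaluated addition node, roughly like sympy.Add."""
    def __init__(self, left, right):
        self.left, self.right = left, right
    def __repr__(self):
        return f"({self.left!r} + {self.right!r})"

x = Symbol("x")
y = 2 + x        # captured as an expression, not evaluated
print(y)         # (2 + x)
```

This is the only mechanism the language offers for capturing a plain value like 2 into an expression tree, which is why the exemption rules for __getself__ matter so much to these libraries.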
The other cases are calling or returning from a function.
If calling a function is magic, but using an operator isn’t, doesn’t that mean that operator.add(a, b) is no longer equivalent to a+b (the whole reason the operator module exists), and np.matmul(x, y) and x @ y, and so on?
This is to ensure the following stay consistent.
def f1():
    x = MetaVar()
    return x
def f2():
    return MetaVar()
But if it’s, say, a yield expression instead of a return statement, they’re not consistent anymore, right? So this fixes one common refactoring step, but only by making it work differently from other very similar refactoring steps.
In f1 the return machinery checks whether its return argument is the result of a metavar __getself__ call and, if so, returns the metavar instead.
What if your __getself__ is there for side effects rather than for replacing the value, as in your counter and logger examples? In that case, throwing away the return value (which is just x anyway) doesn’t make the refactoring idempotent; the extra side effects still happen. Also, what about these cases:

return x or None         # x is truthy
return y or x            # y is falsey
return x if spam else y  # spam is truthy
return fixup(x)
return x + y             # y is 0 or “” or () or similar

I’m not sure whether I expect, or want, the magic to trigger to throw away the result and return x instead. Or, if you meant that it checks the bytecode statically to see if it’s just returning the result of a load without doing anything in between, what if, say, x.__getself__() raises?
On Thu, Jun 27, 2019 at 11:39:04PM +1000, Chris Angelico wrote:
On Thu, Jun 27, 2019 at 11:11 PM Steven D'Aprano <steve@pearwood.info> wrote:
I'm not convinced that this is going to change under the proposal. Since neither f nor s already exist, they cannot overload assignment. Unless something in Nate's proposal is quite different from earlier concrete proposals, I don't think this is a slam-dunk criticism. I think that it is a problem in theory but not in practice.
That would be true if the proposal were only for __setself__, but the __getself__ part makes things more complicated. I'm not 100% sure because I don't fully understand the proposal, but if the object returned by frob(x) has a __getself__ method, I think it would be called in f2 but NOT in f1 (because referencing the name "f" triggers the get, whereas simply chaining ".spam" onto the end of the function call doesn't).
Without getting into any debate over whether we need both __getself__ and __setself__, I assume that if __getself__ is called at all, it is called consistently. Any lookup of the object ought to call it, not just a bare name. If Nate's proposal says differently, I would hope it is an oversight rather than an intended feature. -- Steven
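[Editor's note: for concreteness, the load-hook behavior being debated can be emulated in today's CPython by passing a custom mapping as the namespace to exec() — LOAD_NAME and STORE_NAME fall back to the mapping's __getitem__/__setitem__ when it is not an exact dict. Everything here (the __getself__ name, the Counter class, getcloaked) is a sketch of the proposal, not real Python semantics.]

```python
class GetselfDict(dict):
    """Namespace that runs a value's __getself__ hook on every name load."""
    def __getitem__(self, key):
        val = super().__getitem__(key)
        hook = getattr(type(val), "__getself__", None)
        return hook(val) if hook is not None else val

class Counter:
    """Toy metavariable: counts how many times its name is loaded."""
    def __init__(self):
        self.loads = 0
    def __getself__(self):
        self.loads += 1
        return self.loads

ns = GetselfDict()
dict.__setitem__(ns, "c", Counter())   # bind the raw object, no hook fired

def getcloaked(name):
    """Escape hatch: fetch the raw object, bypassing __getself__."""
    return dict.__getitem__(ns, name)

exec("a = c\nb = c", {}, ns)           # each load of c runs __getself__
print(dict.__getitem__(ns, "a"), dict.__getitem__(ns, "b"))  # 1 2
print(getcloaked("c").loads)           # 2
```

Note that here the hook fires on every name load, consistently, which is the semantics Steven assumes above; the proposal's exemptions (methods on self, calls, returns) are exactly the places where this emulation and the proposal would diverge.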
On Thu, Jun 27, 2019 at 9:13 AM Steven D'Aprano <steve@pearwood.info> wrote:
The only risk here is if your refactoring does something silly, such as reusing a variable which overrides assignment:
How would you propose to write this code without reusing a variable?

def frobnicate(data):
    stuff = []
    for datum in data:
        f = process_the(data)
        g = munge_result(f)
        stuff.append(g)
    return stuff

Under the proposal, we have no guarantee that each iteration through the loop will give us a new `f` to work with. Maybe yes, maybe no. Without looking at the source code in `process_the()` we have no way of knowing whether `f` is being bound to a new object each time or some completely different arbitrary action. -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.
I think in this case, it would depend on what the metavar is designed to do. f could be a metavar that takes in a value, has side effects, and then presents the variable back on load (that would be the __getself__ part of the proposal). I also included built-ins for working with this type of variable, such as iscloaked('f'), which would let you know if this is a metavar. There is a builtin called setcloaked(name, value) that will always trigger the existing Python assignment, regardless of the type of variable. This is similar to the escape hatch of object.__setattr__ for classes. I would argue that you are calling process_the() because you have looked up its function and found that it does what you want it to do. The process of seeing its result would be the same. It is currently possible that a function called process_the would return an array (what you need to call munge_result) 99% of the time, but have a random clause to return a string, which would result in an exception thrown in your code. The same is possible with my proposal: if a library author wrote something that was hard to consume, it would be hard to consume. There are utilities available to guard against unexpected code currently (isinstance), and I am proposing the same. On Thu, Jun 27, 2019 at 9:58 AM David Mertz <mertz@gnosis.cx> wrote:
On Thu, Jun 27, 2019 at 9:13 AM Steven D'Aprano <steve@pearwood.info> wrote:
The only risk here is if your refactoring does something silly, such as reusing a variable which overrides assignment:
How would you propose to write this code without reusing a variable?
def frobnicate(data):
    stuff = []
    for datum in data:
        f = process_the(data)
        g = munge_result(f)
        stuff.append(g)
    return stuff
Under the proposal, we have no guarantee that each iteration through the loop will give us a new `f` to work with. Maybe yes, maybe no. Without looking at the source code in `process_the()` we have no way of knowing whether `f` is being bound to a new object each time or some completely different arbitrary action.
-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.
-- Nate Lust, PhD. Astrophysics Dept. Princeton University
Obviously many kinds of errors are possible in existing Python. My quick sample made one by passing 'data' rather than my intended 'datum'. As you mention, maybe 'process_the()' will return the wrong kind of object some or all of the time. Mostly I would expect subsequent code to throw a ValueError or similar when trying to deal with that bad object, and the traceback would be fairly straightforward to diagnose. But allowing purported binding to instead perform mutation of the same object previously bound introduces a new category of errors, and ones that are generally more difficult to anticipate and debug. Moreover, this new magic is entirely needless. Properties already 100% cover the plausible need. It's really no harder to write 'f.new = mod_value()' than it is to write 'f = mod_value()' with magic behind the scenes. On Thu, Jun 27, 2019, 11:25 AM nate lust <natelust@linux.com> wrote:
I think in this case, it would depend on what the metavar is designed to do. f could be a metavar that takes in a value, has side effects, and then presents the variable back on load (that would be the __getself__ part of the proposal). I also included built-ins for working with this type of variable, such as iscloaked('f'), which would let you know if this is a metavar. There is a builtin called setcloaked(name, value) that will always trigger the existing Python assignment, regardless of the type of variable. This is similar to the escape hatch of object.__setattr__ for classes.
I would argue that you are calling process_the() because you have looked up its function and found that it does what you want it to do. The process of seeing its result would be the same. It is currently possible that a function called process_the would return an array (what you need to call munge_result) 99% of the time, but have a random clause to return a string, which would result in an exception thrown in your code. The same is possible with my proposal: if a library author wrote something that was hard to consume, it would be hard to consume. There are utilities available to guard against unexpected code currently (isinstance), and I am proposing the same.
On Thu, Jun 27, 2019 at 9:58 AM David Mertz <mertz@gnosis.cx> wrote:
On Thu, Jun 27, 2019 at 9:13 AM Steven D'Aprano <steve@pearwood.info> wrote:
The only risk here is if your refactoring does something silly, such as reusing a variable which overrides assignment:
How would you propose to write this code without reusing a variable?
def frobnicate(data):
    stuff = []
    for datum in data:
        f = process_the(data)
        g = munge_result(f)
        stuff.append(g)
    return stuff
Under the proposal, we have no guarantee that each iteration through the loop will give us a new `f` to work with. Maybe yes, maybe no. Without looking at the source code in `process_the()` we have no way of knowing whether `f` is being bound to a new object each time or some completely different arbitrary action.
-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.
-- Nate Lust, PhD. Astrophysics Dept. Princeton University
That is true when it is code that you control, but if you want to use some specialized metavar in the context of someone else's library, things are a bit different. This could be something like a debugger tracing through some execution, where you want to record how and where the value was changed. On Thu, Jun 27, 2019 at 12:02 PM David Mertz <mertz@gnosis.cx> wrote:
Obviously many kinds of errors are possible in existing Python. My quick sample made one by passing 'data' rather than my intended 'datum'.
As you mention, maybe 'process_the()' will return the wrong kind of object some or all of the time. Mostly I would expect subsequent code to throw a ValueError or similar when trying to deal with that bad object, and the traceback would be fairly straightforward to diagnose.
But allowing purported binding to instead perform mutation of the same object previously bound introduces a new category of errors, and ones that are generally more difficult to anticipate and debug.
Moreover, this new magic is entirely needless. Properties already 100% cover the plausible need. It's really no harder to write 'f.new = mod_value()' than it is to write 'f = mod_value()' with magic behind the scenes.
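[Editor's note: a minimal sketch of the property-based alternative David describes. The Tracked class and its attribute names are illustrative, not from the proposal.]

```python
class Tracked:
    """Wrapper whose .value property mimics 'assignment with side effects'."""
    def __init__(self, value):
        self._value = value
        self.history = []

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new):
        self.history.append(new)   # the side effect fires on every assignment
        self._value = new

f = Tracked(1)
f.value = 2     # explicit 'f.value = ...' instead of the proposed magic 'f = ...'
f.value = 3
print(f.value, f.history)   # 3 [2, 3]
```

The one extra attribute access makes the interception visible at the call site, which is exactly the explicitness David is arguing for.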
On Thu, Jun 27, 2019, 11:25 AM nate lust <natelust@linux.com> wrote:
I think in this case, it would depend on what the metavar is designed to do. f could be a metavar that takes in a value, has side effects, and then presents the variable back on load (that would be the __getself__ part of the proposal). I also included built-ins for working with this type of variable, such as iscloaked('f'), which would let you know if this is a metavar. There is a builtin called setcloaked(name, value) that will always trigger the existing Python assignment, regardless of the type of variable. This is similar to the escape hatch of object.__setattr__ for classes.
I would argue that you are calling process_the() because you have looked up its function and found that it does what you want it to do. The process of seeing its result would be the same. It is currently possible that a function called process_the would return an array (what you need to call munge_result) 99% of the time, but have a random clause to return a string, which would result in an exception thrown in your code. The same is possible with my proposal: if a library author wrote something that was hard to consume, it would be hard to consume. There are utilities available to guard against unexpected code currently (isinstance), and I am proposing the same.
On Thu, Jun 27, 2019 at 9:58 AM David Mertz <mertz@gnosis.cx> wrote:
On Thu, Jun 27, 2019 at 9:13 AM Steven D'Aprano <steve@pearwood.info> wrote:
The only risk here is if your refactoring does something silly, such as reusing a variable which overrides assignment:
How would you propose to write this code without reusing a variable?
def frobnicate(data):
    stuff = []
    for datum in data:
        f = process_the(data)
        g = munge_result(f)
        stuff.append(g)
    return stuff
Under the proposal, we have no guarantee that each iteration through the loop will give us a new `f` to work with. Maybe yes, maybe no. Without looking at the source code in `process_the()` we have no way of knowing whether `f` is being bound to a new object each time or some completely different arbitrary action.
-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.
-- Nate Lust, PhD. Astrophysics Dept. Princeton University
-- Nate Lust, PhD. Astrophysics Dept. Princeton University
It's always pretty easy to turn it into "code you control." Just take whatever the plain value/object is and wrap it in a class with a '._value' attribute that holds the original. From there, add as many properties as you like, each of which has whatever side effects you wish. That side effect might be mutating '._value', as you like. On Thu, Jun 27, 2019, 12:20 PM nate lust <natelust@linux.com> wrote:
That is true when it is code that you control, but if it is the case you want to use some specialized metavar in the context of someone else's library things are a bit different. This could be something such as using something like a debugger tracing through some execution where you want to record how the value was changed and where.
On Thu, Jun 27, 2019 at 12:02 PM David Mertz <mertz@gnosis.cx> wrote:
Obviously many kinds of errors are possible in existing Python. My quick sample made one by passing 'data' rather than my intended 'datum'.
As you mention, maybe 'process_the()' will return the wrong kind of object some or all of the time. Mostly I would expect subsequent code to throw a ValueError or similar when trying to deal with that bad object, and the traceback would be fairly straightforward to diagnose.
But allowing purported binding to instead perform mutation of the same object previously bound introduces a new category of errors, and ones that are generally more difficult to anticipate and debug.
Moreover, this new magic is entirely needless. Properties already 100% cover the plausible need. It's really no harder to write 'f.new = mod_value()' than it is to write 'f = mod_value()' with magic behind the scenes.
On Thu, Jun 27, 2019, 11:25 AM nate lust <natelust@linux.com> wrote:
I think in this case, it would depend on what the metavar is designed to do. f could be a metavar that takes in a value, has side effects, and then presents the variable back on load (that would be the __getself__ part of the proposal). I also included built-ins for working with this type of variable, such as iscloaked('f'), which would let you know if this is a metavar. There is a builtin called setcloaked(name, value) that will always trigger the existing Python assignment, regardless of the type of variable. This is similar to the escape hatch of object.__setattr__ for classes.
I would argue that you are calling process_the() because you have looked up its function and found that it does what you want it to do. The process of seeing its result would be the same. It is currently possible that a function called process_the would return an array (what you need to call munge_result) 99% of the time, but have a random clause to return a string, which would result in an exception thrown in your code. The same is possible with my proposal: if a library author wrote something that was hard to consume, it would be hard to consume. There are utilities available to guard against unexpected code currently (isinstance), and I am proposing the same.
On Thu, Jun 27, 2019 at 9:58 AM David Mertz <mertz@gnosis.cx> wrote:
On Thu, Jun 27, 2019 at 9:13 AM Steven D'Aprano <steve@pearwood.info> wrote:
The only risk here is if your refactoring does something silly, such as reusing a variable which overrides assignment:
How would you propose to write this code without reusing a variable?
def frobnicate(data):
    stuff = []
    for datum in data:
        f = process_the(data)
        g = munge_result(f)
        stuff.append(g)
    return stuff
Under the proposal, we have no guarantee that each iteration through the loop will give us a new `f` to work with. Maybe yes, maybe no. Without looking at the source code in `process_the()` we have no way of knowing whether `f` is being bound to a new object each time or some completely different arbitrary action.
-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.
-- Nate Lust, PhD. Astrophysics Dept. Princeton University
-- Nate Lust, PhD. Astrophysics Dept. Princeton University
On Thu, Jun 27, 2019 at 12:02:38PM -0400, David Mertz wrote:
Moreover, this new magic is entirely needless. Properties already 100% cover the plausible need.
I've been using Python since version 1.5 and I'm yet to learn a way to prevent re-binding of a simple (undotted) name:

x = 1      # okay
assert x == 1
x = 2      # raises an exception

I'm always happy to learn something new, so if we really can do this with ``property`` I look forward to learning how. But I'm pretty sure you can't do it, not without shifting the goal posts and telling me that I don't really want to do what I said I want to do.
It's really no harder to write 'f.new = mod_value()' than it is to write 'f = mod_value()' with magic behind the scenes.
Well, it's really four key presses harder. But who's counting? -- Steven
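[Editor's note: the closest the current language gets is intercepting attribute assignment on a module by swapping its __class__, which has been allowed since Python 3.5. That guards mod.x = 2 from outside the module, but a bare x = 2 in the module's own scope writes straight to the namespace dict and bypasses __setattr__ entirely, which is Steven's point. A sketch; FrozenModule is a made-up name.]

```python
import types

class FrozenModule(types.ModuleType):
    """Module subclass that refuses to rebind already-existing attributes."""
    def __setattr__(self, name, value):
        if name in self.__dict__:
            raise AttributeError(f"cannot rebind {name!r}")
        super().__setattr__(name, value)

mod = types.ModuleType("demo")
mod.x = 1                     # ordinary attribute set on a plain module
mod.__class__ = FrozenModule  # swap in the guarded class

err = None
try:
    mod.x = 2                 # external rebinding now raises
except AttributeError as e:
    err = e
print(mod.x, err)             # 1 cannot rebind 'x'
```

So properties and __setattr__ hooks cover dotted names, but the undotted-name case Steven describes really has no hook today.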
On Thu, Jun 27, 2019 at 09:55:53AM -0400, David Mertz wrote:
On Thu, Jun 27, 2019 at 9:13 AM Steven D'Aprano <steve@pearwood.info> wrote:
The only risk here is if your refactoring does something silly, such as reusing a variable which overrides assignment:
How would you propose to write this code without reusing a variable?
def frobnicate(data):
    stuff = []
    for datum in data:
        f = process_the(data)
        g = munge_result(f)
        stuff.append(g)
    return stuff
Rewrite it in what way? What problem are you trying to solve? If you just want to "flush" the local variables at the end of each loop and ensure that they have been deleted, you can explicitly delete them:

del f, g, datum

If you are talking about the kind of refactoring Chris was talking about, I don't see any reason, or opportunity, to do that. If anything, you could reverse the refactoring:

def frobnicate(data):
    stuff = []
    for datum in data:
        stuff.append(munge_result(process_the(datum)))
    return stuff

which leads to the obvious comprehension:

return [munge_result(process_the(datum)) for datum in data]

which allows us to eliminate the variable re-use:

return list(map(lambda x: munge_result(process_the(x)), data))

[...]
Without looking at the source code in `process_the()` we have no way of knowing whether `f` is being bound to a new object each time or some completely different arbitrary action.
That's exactly the situation right now. Functions can perform arbitrary actions and return the same object each time. def process_the(arg): print("do something arbitrary") return None -- Steven
The variables (names) 'f' and 'g' are reused every time the loop iterates. You are correct that doing an explicit 'del' within the loop would presumably prevent the magic mutation-not-binding behavior under discussion. I still don't want the behavior, but I admit that's a pretty easy way to be more explicit if it were added. On Thu, Jun 27, 2019, 11:56 AM Steven D'Aprano <steve@pearwood.info> wrote:
On Thu, Jun 27, 2019 at 09:55:53AM -0400, David Mertz wrote:
On Thu, Jun 27, 2019 at 9:13 AM Steven D'Aprano <steve@pearwood.info> wrote:
The only risk here is if your refactoring does something silly, such as reusing a variable which overrides assignment:
How would you propose to write this code without reusing a variable?
def frobnicate(data):
    stuff = []
    for datum in data:
        f = process_the(data)
        g = munge_result(f)
        stuff.append(g)
    return stuff
Rewrite it in what way? What problem are you trying to solve?
If you just want to "flush" the local variables at the end of each loop and ensure that they have been deleted, you can explicitly delete them:
del f, g, datum
If you are talking about the kind of refactoring Chris was talking about, I don't see any reason, or opportunity, to do that. If anything, you could reverse the refactoring:
def frobnicate(data):
    stuff = []
    for datum in data:
        stuff.append(munge_result(process_the(datum)))
    return stuff
which leads to the obvious comprehension:
return [munge_result(process_the(datum)) for datum in data]
which allows us to eliminate the variable re-use:
return list(map(lambda x: munge_result(process_the(x)), data))
[...]
Without looking at the source code in `process_the()` we have no way of knowing whether `f` is being bound to a new object each time or some completely different arbitrary action.
That's exactly the situation right now. Functions can perform arbitrary actions and return the same object each time.
def process_the(arg):
    print("do something arbitrary")
    return None
-- Steven
Steven D'Aprano wrote:
The only risk here is if your refactoring does something silly, such as reusing a variable which overrides assignment:
What if the variable is bound inside a loop? Does that count as silly? -- Greg
On 06/26/2019 07:34 AM, Anders Hovmöller wrote:
I 100% agree that this proposal is a bad idea. But I do have to play Devils advocate here.
The the-code-is-understandable-at-face-value ship has already sailed. + doesn't mean add, it means calling a dunder function that can do anything.
True, it can do anything -- but if the thing it does is not related to combining two things together and returning the result, people will be surprised and consider it a bad function. (Or they should. ;-)
Foo.bar = 1 doesn't mean set bar to 1 but calling a dunder method.
But the `Foo.` tells us Magic May Be Happening, and we still expect something reasonable to occur -- perhaps a boundary check, maybe a cache lookup, perhaps a type change to an equal-but-different representation (1.0 instead of 1, for example), or even setting other dependent variables. -- ~Ethan~
On Jun 26, 2019, at 07:34, Anders Hovmöller <boxed@killingar.net> wrote:
I 100% agree that this proposal is a bad idea. But I do have to play Devils advocate here.
The the-code-is-understandable-at-face-value ship has already sailed. + doesn't mean add, it means calling a dunder function that can do anything.
No, + does mean add. But Python doesn’t know what it means to add two Fraction or Decimal or ndarray objects, so if you’re the one writing that class, you have to tell it. It still means add—unless you lie to your readers. And you can always lie to your readers; dunder methods aren’t needed for that. Sure, you could define Fraction.__add__(self, other) to print self to a file named str(other) and return clock(), but you could just as easily store the numerator in an attribute named “denominator”, or name the class “EmployeeRecord” instead of “Fraction”, or store the Fraction 1/2 in a variable named “pathname”. It’s not up to Python to prevent you from lying to your readers.
Foo.bar = 1 doesn't mean set bar to 1 but calling a dunder method.
No, it does mean setting bar to 1. The only difference between __add__ and __setattr__ is that the latter has default behavior that works for many classes. If your type(Foo) class has disk-backed attributes or immutable attributes or attributes that appear in dir() in reverse order of assignment rather than arbitrary order, you have to tell Python how to do that. Unless you’re lying, you’re defining what it means to set the bar attribute to 1, not defining Foo.bar = 1 to mean something different from setting the bar attribute. The problem isn’t that __setself__ could be used to lie; the problem is that __setself__ can’t be used in a way that isn’t lying. None of the suggested examples are about providing a way to define what binding x to 1 means in the local/classdef/global namespace; they’re all about providing a way to make x = 1 not mean binding x to 1 in that namespace at all. In particular, the best example we’ve seen amounts to “Python doesn’t have a send operator like <- so instead of adding one, let’s allow people to misuse = to mean send instead of assign”. The obvious way to justify this is by appeal to descriptors: the __set__ method isn’t there because people want to use descriptors directly, it’s there because people do want to use classes with custom attributes like properties, classmethods, etc., and descriptors make defining those classes easier. Maybe in a better example, we’d see that __setself__ is similarly there to make it easier to define namespaces with custom bindings, and people do want those namespaces. If so, the reason not everyone is convinced is that we’ve only seen bad examples so far. But then someone needs to give a good example.
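[Editor's note: to make the descriptor analogy concrete, here is a minimal data descriptor. Its __set__ defines what setting the attribute means — validation plus a normal store — rather than making assignment mean something else entirely. The Positive/Account names are illustrative.]

```python
class Positive:
    """Descriptor: defines what 'setting this attribute' means for its owner."""
    def __set_name__(self, owner, name):
        self.name = name
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__[self.name]
    def __set__(self, obj, value):
        if value <= 0:
            raise ValueError(f"{self.name} must be positive")
        obj.__dict__[self.name] = value   # still an ordinary attribute store

class Account:
    balance = Positive()

a = Account()
a.balance = 10       # goes through Positive.__set__, then stores normally
print(a.balance)     # 10
```

The point of the analogy: a.balance = 10 still means "set balance to 10" — the descriptor only customizes how, which is the property a __setself__ hook would not preserve.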
In Python, code basically can't be understood at face value.
This is the Humpty Dumpty argument from Alice. English can’t be understood at face value if Humpty can use any word to mean anything he wanted, rather than what Alice expected that word to mean. And yet, among normal speakers—even with slightly different idiolects, even in discourses that explicitly redefine words (as with most math papers, which is probably what Lewis Carroll has in mind)—English actually can be understood, it just can’t prevent Humpty from misusing it.
Anders Hovmöller writes:
In Python, code basically can't be understood at face value.
Not really a problem. The #ToddlerInChief doesn't code; the rest of us are adults and use code by consent. Obfuscate your code by seriously violating the expectations for the meaning of "+" or ".denominator" and we'll withdraw consent. Really, it's as easy as that. "We" know unacceptable obfuscation "when we see it." There's no "red line", but this is good enough. The problem with the proposal is that it obfuscates "in a good cause" by confounding the Pythonic semantics of "= as name binding" with the natural[1] but un-Pythonic[2] "= as changing a variable in-place". Somebody (Rhodri?) recently said something like "I keep coming back to the Zen: explicit is better than implicit." I think that says the whole argument against the proposal. Footnotes: [1] In other languages. [2] IMO, I do not speak for Guido.
On Jun 20, 2019, at 13:25, nate lust <natelust@linux.com> wrote:
There is nothing more humbling than sending nonsense out to an entire mailing list.
You’re something like the 300th person to suggest overloading assignment, but the only one of that 300 willing to think it through, much less create a patch. That’s hardly something to feel bad about!
Thinking about things the right way around, I dug into the interpreter and hacked something up to add an assignment-overload dunder method. A diff against Python 3.7 can be found here: https://gist.github.com/natelust/063f60a4c2c8ad0d5293aa0eb1ce9514
There are other supporting changes, but the core idea is just that the type struct (typeobject.c) now has one more field (I called it tp_setself) that under normal circumstances is just 0. Then in the insertdict function (dictobject.c), which does just what it sounds like, after looking up the old value and before setting anything new, I added a block to check if tp_setself is defined on the type of the old value. If it is, that means the user defined a __setself__ method on a type; it is called with the value that was to be assigned, causing whatever side effects the user chose for that function, and the rest of the insertdict body is never run.
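Since the patch needs a rebuilt interpreter to run, here is a rough pure-Python emulation of the same mechanism (the class and hook names are illustrative, not from the patch): a dict subclass used as an exec namespace, whose __setitem__ consults the old value for a __setself__ hook before rebinding, much as the patched insertdict does. This works because STORE_NAME falls back to PyObject_SetItem when the local namespace is not exactly a dict:

```python
_MISSING = object()

class SetSelfNamespace(dict):
    """Emulates the proposed hook: if the name is already bound and the
    old value's type defines __setself__, delegate the assignment to it
    instead of rebinding the name (mirroring the patched insertdict)."""
    def __setitem__(self, name, value):
        old = self.get(name, _MISSING)
        hook = getattr(type(old), '__setself__', None)
        if old is not _MISSING and hook is not None:
            hook(old, value)  # side effects chosen by the old value
        else:
            super().__setitem__(name, value)

class Watched:
    """Toy type that captures every value 'assigned over' it."""
    def __init__(self):
        self.history = []
    def __setself__(self, value):
        self.history.append(value)

ns = SetSelfNamespace()
exec("x = Watched()\nx = 1\nx = 2", {'Watched': Watched}, ns)

assert isinstance(ns['x'], Watched)  # x was never rebound after the first store
assert ns['x'].history == [1, 2]     # the later assignments went to __setself__
```

The emulation also makes the questions below concrete: it only fires where this dict's __setitem__ is actually called, so fast locals, __slots__ attributes, and list item assignment would all bypass it.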
Intercepting this at the namespace’s dict rather than at the store op is a really clever idea, and it seems like it should avoid the major performance hit as well. But I have a few questions.

Does assigning a local variable actually call insertdict anywhere? There is a locals dict, but I believe it’s only created and kept up to date with the fast locals array if you ask for it.

What about other namespaces that aren’t dicts, like attributes of __slots__ classes?

What about dicts that aren’t namespaces? For example, will x[y] = z call __setself__ on x[y] if x is a dict, but not if it’s a list (and if it’s a SortedDict it may or may not, depending on internal details of how it’s implemented under the covers)?

Will x += y call __setself__(self) after the __iadd__(y)?

I’m not entirely sure what the right answer is for all of these situations. And that might depend on how you describe this protocol in the docs (the name “setself” doesn’t necessarily imply the same thing as “overloading assignment”).
You are right that there are many things to consider, and I too don't know what the right answer is. I was more interested in the challenge of doing it. Like I said, this is only a demo, and would need consideration of things like this. I had thought of the dict/list thing, and as you say it will work one way if the __setself__ value is in a dict, but with x = [Foo()]; x[0] = "hello world", it would just rebind the list element. I don't know what the least surprising thing would be, and it would strongly depend on how deeply someone thought about what is actually going on. What is not surprising at the surface might be more surprising to someone more experienced, or vice versa. It is true that fast locals get used in functions, so the hook is not respected there; I didn't look into what it would take to change that, as I was not ready to go down that rabbit hole, as it were. I myself am of two minds on this, as I can see some cool things I or others could do with it, but also new limitations and/or confusion. It would need to be very well publicized if it were a new feature, and probably come with some builtins to, for instance, check whether a type implements this.
-- Nate Lust, PhD. Astrophysics Dept. Princeton University
participants (14)
- Anders Hovmöller
- Andrew Barnert
- Ben Rudiak-Gould
- Brendan Barnwell
- Chris Angelico
- David Mertz
- Ethan Furman
- Greg Ewing
- Juancarlo Añez
- nate lust
- Rhodri James
- Stephen J. Turnbull
- Steven D'Aprano
- Yanghao Hua