Yeah, I'm insane, opening another theory while I'm busily championing a PEP. But it was while writing up the other PEP that I came up with a possible syntax for this.

In Python, as in most languages, method chaining requires the method to return its own object.

    class Count:
        def __init__(self):
            self.n = 0
        def inc(self):
            self.n += 1
            return self

    dracula = Count()
    dracula.inc().inc().inc()
    print(dracula.n)

It's common in languages like C++ to return *this by reference if there's nothing else useful to return. It's convenient, it doesn't cost anything much, and it allows method chaining. The Python convention, on the other hand, is to return self only if there's a very good reason to, and to return None any time there's mutation that could plausibly return a new object of the same type (compare list.sort() vs sorted()). Method chaining is therefore far less common than it could be, with the result that, often, intermediate objects need to be separately named and assigned to. I pulled up one file from Lib/tkinter (happened to pick filedialog) and saw what's fairly typical of Python GUI code:

    ...
    self.midframe = Frame(self.top)
    self.midframe.pack(expand=YES, fill=BOTH)

    self.filesbar = Scrollbar(self.midframe)
    self.filesbar.pack(side=RIGHT, fill=Y)
    self.files = Listbox(self.midframe, exportselection=0,
                         yscrollcommand=(self.filesbar, 'set'))
    self.files.pack(side=RIGHT, expand=YES, fill=BOTH)
    ...

Every frame has to be saved away somewhere (incidentally, I don't see why self.midframe rather than just midframe - it's not used outside of __init__). With Tkinter, that's probably necessary (since the parent is part of the construction of the children), but in GTK, widget parenting is done in a more method-chaining-friendly fashion.
Compare these examples of PyGTK and Pike GTK:

    # Cut down version of http://pygtk.org/pygtk2tutorial/examples/helloworld2.py
    import pygtk
    pygtk.require('2.0')
    import gtk

    def callback(widget, data):
        print "Hello again - %s was pressed" % data

    def delete_event(widget, event, data=None):
        gtk.main_quit()
        return False

    window = gtk.Window(gtk.WINDOW_TOPLEVEL)
    window.set_title("Hello Buttons!")
    window.connect("delete_event", delete_event)
    window.set_border_width(10)
    box1 = gtk.HBox(False, 0)
    window.add(box1)
    button1 = gtk.Button("Button 1")
    button1.connect("clicked", callback, "button 1")
    box1.pack_start(button1, True, True, 0)
    button2 = gtk.Button("Button 2")
    button2.connect("clicked", callback, "button 2")
    box1.pack_start(button2, True, True, 0)
    window.show_all()

    gtk.main()

    //Pike equivalent of the above:
    void callback(object widget, string data) {write("Hello again - %s was pressed\n", data);}
    void delete_event() {exit(0);}

    int main()
    {
        GTK2.setup_gtk();
        object button1, button2;
        GTK2.Window(GTK2.WINDOW_TOPLEVEL)
            ->set_title("Hello Buttons!")
            ->set_border_width(10)
            ->add(GTK2.Hbox(0,0)
                ->pack_start(button1 = GTK2.Button("Button 1"), 1, 1, 0)
                ->pack_start(button2 = GTK2.Button("Button 2"), 1, 1, 0)
            )
            ->show_all()
            ->signal_connect("delete_event", delete_event);
        button1->signal_connect("clicked", callback, "button 1");
        button2->signal_connect("clicked", callback, "button 2");
        return -1;
    }

Note that in the Pike version, I capture the button objects, but not the Hbox. There's no name ever given to that box. I have to capture the buttons, because signal_connect doesn't return the object (it returns a signal ID). The more complicated the window layout, the more noticeable this is: The structure of code using chained methods mirrors the structure of the window with its widgets containing widgets; but the structure of the Python equivalent is strictly linear.

So here's the proposal.
Introduce a new operator to Python, just like the dot operator but behaving differently when it returns a bound method. We can possibly use ->, or maybe create a new operator that currently makes no sense, like .. or .> or something. Its semantics would be:

1) Look up the attribute following it on the object, exactly as per the current . operator
2) If the result is not a function, return it, exactly as per current.
3) If it is a function, though, return a wrapper which, when called, calls the inner function and then returns self.

This can be done with an external wrapper, so it might be possible to do this with MacroPy. It absolutely must be a compact notation, though.

This probably wouldn't interact at all with __getattr__ (because the attribute has to already exist for this to work), and definitely not with __setattr__ or __delattr__ (mutations aren't affected). How it interacts with __getattribute__ I'm not sure; whether it adds the wrapper around any returned functions or applies only to something that's looked up "the normal way" can be decided by ease of implementation.

Supposing this were done, using the -> token that currently is used for annotations as part of 'def', here's how the PyGTK code would look:

    import pygtk
    pygtk.require('2.0')
    import gtk

    def callback(widget, data):
        print "Hello again - %s was pressed" % data

    def delete_event(widget, event, data=None):
        gtk.main_quit()
        return False

    window = (gtk.Window(gtk.WINDOW_TOPLEVEL)
        ->set_title("Hello Buttons!")
        ->connect("delete_event", delete_event)
        ->set_border_width(10)
        ->add(gtk.HBox(False, 0)
            ->pack_start(
                gtk.Button("Button 1")->connect("clicked", callback, "button 1"),
                True, True, 0)
            ->pack_start(
                gtk.Button("Button 2")->connect("clicked", callback, "button 2"),
                True, True, 0)
        )
        ->show_all()
    )

    gtk.main()

Again, the structure of the code would match the structure of the window. Unlike the Pike version, this one can even connect signals as part of the method chaining.
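The three-step semantics listed in the proposal can be approximated in current Python with an ordinary helper function. What follows is only a sketch under stated assumptions: `arrow` is a hypothetical name (nothing in the proposal defines it), and it treats any callable attribute as "a function" for the purposes of step 2, which may be broader than what an actual implementation would choose.

```python
def arrow(obj, name):
    """Emulate the proposed obj->name lookup (hypothetical helper)."""
    # Step 1: ordinary attribute lookup, exactly like obj.name.
    attr = getattr(obj, name)
    # Step 2: anything that is not callable is returned unchanged.
    if not callable(attr):
        return attr

    # Step 3: callables are wrapped so that calling them returns obj.
    def chainable(*args, **kwargs):
        attr(*args, **kwargs)  # the method's own return value is discarded
        return obj

    return chainable


# Spelled with the helper, []->append(2)->sort() style chaining looks like:
numbers = arrow(arrow([3, 1], "append")(2), "sort")()
print(numbers)  # [1, 2, 3]
```

The nested calls are clumsy to read, which illustrates why the proposal insists that the notation has to be compact to be worth having.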
Effectively, x->y would be equivalent to chain(x.y):

    def chain(func):
        def chainable(self, *args, **kwargs):
            func(self, *args, **kwargs)
            return self
        return chainable

Could be useful in a variety of contexts.

Thoughts?

ChrisA
I suggest you take a look at cascades in the Dart language, and at this article: http://en.wikipedia.org/wiki/Method_cascading

Yury
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
While method chaining may be useful in general, I found that for GUI code, it is possible (and helps readability, IMO) to define a context manager that maintains a "current" object and allows calling methods on the object (I'm more used to PyQt but I guess it can easily be adapted to PyGTK). Something that allows code like (reusing your example):

    build = builder(gtk.Window(gtk.WINDOW_TOPLEVEL))
    build.calls(("set_title", "Hello Buttons!"),
                ("connect", "delete_event", delete_event),
                ("set_border_width", 10))
    with build.enter("add", gtk.HBox(False, 0)):
        build.enter("pack_start", gtk.Button("Button 1"), True, True, 0).call(
            "connect", "clicked", callback, "button 1")
        build.enter("pack_start", gtk.Button("Button 2"), True, True, 0).call(
            "connect", "clicked", callback, "button 2")
    build.call("show_all")

The build object maintains a stack of current objects. "call{,s}" calls one or multiple methods on the topmost object; "enter" calls a method and returns the object added by this method (this requires hard-coding some knowledge about the API of the widget library) and also pushes the object on the stack if used in context manager form.

Antony
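One way such a builder could be put together is sketched below. This is not Antony's actual code: the class names `Builder` and `Entered` are hypothetical, and the sketch assumes that the object being added is always the first argument of the method passed to `enter` (the "hard-coded knowledge" mentioned above). A small stand-in `Node` class replaces the GTK widgets so the example runs without a GUI toolkit.

```python
class Entered:
    """Handle returned by Builder.enter(); optionally a context manager."""

    def __init__(self, builder, obj):
        self._builder = builder
        self.obj = obj

    def call(self, method, *args, **kwargs):
        # Call a method on the object that was just added.
        getattr(self.obj, method)(*args, **kwargs)
        return self

    def __enter__(self):
        # Inside the with block, the added object becomes the current one.
        self._builder.stack.append(self.obj)
        return self.obj

    def __exit__(self, exc_type, exc, tb):
        self._builder.stack.pop()
        return False


class Builder:
    def __init__(self, root):
        self.stack = [root]

    def call(self, method, *args, **kwargs):
        # Call one method on the topmost object.
        getattr(self.stack[-1], method)(*args, **kwargs)
        return self

    def calls(self, *specs):
        # Each spec is a tuple: (method_name, arg1, arg2, ...).
        for method, *args in specs:
            self.call(method, *args)
        return self

    def enter(self, method, child, *args, **kwargs):
        # Assumption: the object being added is the first argument.
        getattr(self.stack[-1], method)(child, *args, **kwargs)
        return Entered(self, child)


class Node:
    """Stand-in for a widget, used only to exercise the sketch."""

    def __init__(self, name):
        self.name = name
        self.title = None
        self.children = []

    def set_title(self, title):
        self.title = title

    def add(self, child):
        self.children.append(child)


build = Builder(Node("window"))
build.calls(("set_title", "Hello Buttons!"))
with build.enter("add", Node("box")):
    build.enter("add", Node("button 1")).call("set_title", "One")
    build.enter("add", Node("button 2"))
```

After this runs, the window node holds one box, and the box holds both buttons, mirroring the indentation of the `with` block.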
On 2/21/2014 12:30 PM, Chris Angelico wrote:
> It's common in languages like C++ to return *this by reference if there's nothing else useful to return. It's convenient, it doesn't cost anything much, and it allows method chaining. The Python convention, on the other hand, is to return self only if there's a very good reason to,
Off the top of my head, I cannot think of any methods that return self. (If there are some, someone please remind or inform me.)
> and to return None any time there's mutation that could plausibly return a new object of the same type (compare list.sort() vs sorted()).
The rule for mutation methods is to return None unless there is something *other than* self to return, like list.pop and similar remove (mutate) and return methods. List.sort and list.reverse returned None from the beginning, long before sorted and reversed were added.

--
Terry Jan Reedy
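The rule described above is easy to check interactively. The short example below uses only built-in list methods and the built-in `sorted` function; the variable names are arbitrary.

```python
numbers = [3, 1, 2]

# In-place mutators have nothing other than self to return, so they return None.
sort_result = numbers.sort()
reverse_result = numbers.reverse()
print(sort_result, reverse_result)  # None None
print(numbers)                      # [3, 2, 1]

# pop mutates too, but it has something else to return: the removed item.
last = numbers.pop()
print(last, numbers)                # 1 [3, 2]

# sorted() leaves its argument alone and returns a new list.
original = [3, 1, 2]
fresh = sorted(original)
print(original, fresh)              # [3, 1, 2] [1, 2, 3]
```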
On Fri, Feb 21, 2014 at 1:37 PM, Terry Reedy <tjreedy@udel.edu> wrote:
> Off the top of my head, I cannot think of any methods that return self. (If there are some, someone please remind or inform me.)
That's because I don't like this pattern. :-) But there's some code in the stdlib that uses it (IIRC Eric Raymond was a fan), and I'm sure there's lots of 3rd party Python code too.

--
--Guido van Rossum (python.org/~guido)
On 02/21/2014 10:37 PM, Terry Reedy wrote:
> On 2/21/2014 12:30 PM, Chris Angelico wrote:
>> It's common in languages like C++ to return *this by reference if there's nothing else useful to return. It's convenient, it doesn't cost anything much, and it allows method chaining. The Python convention, on the other hand, is to return self only if there's a very good reason to,
>
> Off the top of my head, I cannot think of any methods that return self. (If there are some, someone please remind or inform me.)
It is (in my experience) a common practice, precisely to allow method chaining, in diverse libs. The most common cases happen at object construction, where chaining adds or changes diverse object properties. This is also at times combined with overloading of language features. In the following, from a parsing lib, the init method returns its object, and __call__ is overridden to a method that sets a match action on the pattern:

    symbol_def = Compose(id, ":=", expr)(put_symbol)

[This defines a pattern for symbol defs (read: assignment of a new var); when the pattern matches, the symbol is put in a symbol table.]

However, this is probably only ok for such libs that developers are forced to study intensely before being able to use them efficiently. Otherwise, in itself the code is pretty unreadable I guess.

Also, I don't find the idea of having a builtin construct for such hacks a good idea. Libs for which this may be practical can return self --end of the story.

d
On 2014-02-21, at 23:00 , spir <denis.spir@gmail.com> wrote:
> Also, I don't find the idea of having a builtin construct for such hacks a good idea. Libs for which this may be practical can return self --end of the story.
That has two issues though:

1. it makes chainability a decision of the library author, the library user gets to have no preference. This means e.g. you can't create a tree of elements in ElementTree in a single expression (AFAIK Element does not take children parameters). With cascading, the user can "chain" a library whose author did not choose to support chaining (in fact with cascading no author would ever need to support chaining again).

2. where a return value can make sense (and be useful) the author *must* make a choice. No way to chain `dict.pop()` since it returns the popped value, even if `pop` was only used for its removal-with-shut-up properties. With cascading the user can have his cake and eat it: he gets the return value if he wants it, and can keep "chaining" if he does not care.
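To make the ElementTree point concrete, here is a rough emulation of cascading using a proxy object. The `Cascade` class and its `done` method are hypothetical names invented for this sketch; only `xml.etree.ElementTree.Element`, `Element.set` and `Element.append` come from the standard library, and both of those methods return None, which is what prevents ordinary chaining.

```python
import xml.etree.ElementTree as ET


class Cascade:
    """Hypothetical proxy: method calls return the proxy, not their result."""

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Only reached for names not defined on Cascade itself.
        attr = getattr(self._target, name)
        if not callable(attr):
            return attr

        def call(*args, **kwargs):
            attr(*args, **kwargs)  # discard the method's own return value
            return self

        return call

    def done(self):
        # Hand back the wrapped object at the end of the cascade.
        return self._target


# A small tree built in a single expression, without naming the children.
root = (
    Cascade(ET.Element("ul"))
    .set("class", "menu")
    .append(Cascade(ET.Element("li")).set("id", "first").done())
    .append(ET.Element("li"))
    .done()
)
print(ET.tostring(root))
```

The choice to chain is made entirely at the call site here; the library itself is unchanged, which is the first point being argued above.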
From: Masklinn <masklinn@masklinn.net>
Sent: Friday, February 21, 2014 2:43 PM

> On 2014-02-21, at 23:00 , spir <denis.spir@gmail.com> wrote:
>> Also, I don't find the idea of having a builtin construct for such hacks a good idea. Libs for which this may be practical can return self --end of the story.
>
> That has two issues though:
>
> 1. it makes chainability a decision of the library author, the library user gets to have no preference. This means e.g. you can't create a tree of elements in ElementTree in a single expression (AFAIK Element does not take children parameters). With cascading, the user can "chain" a library whose author did not choose to support chaining (in fact with cascading no author would ever need to support chaining again).
>
> 2. where a return value can make sense (and be useful) the author *must* make a choice. No way to chain `dict.pop()` since it returns the popped value, even if `pop` was only used for its removal-with-shut-up properties. With cascading the user can have his cake and eat it: he gets the return value if he wants it, and can keep "chaining" if he does not care.
I think this is almost always a bad thing to do, and a feature that encourages/enables it is a bad feature just for that reason.

If you just want to remove an element from a list or a dict, you use a del statement. There's no reason to call pop unless you want the result. Misusing pop to allow you to wedge a statement into an expression is exactly the same as any other misuse, akin to the "os.remove(path) except IOError: None" from the other thread.

Look at it this way: If someone wrote this, would you congratulate him on his cleverness, or ask him to fix it before you waste time reviewing his code?

    [d.pop(key) for key in get_keys()]
On Sat, Feb 22, 2014 at 9:43 AM, Masklinn <masklinn@masklinn.net> wrote:
> On 2014-02-21, at 23:00 , spir <denis.spir@gmail.com> wrote:
>> Also, I don't find the idea of having a builtin construct for such hacks a good idea. Libs for which this may be practical can return self --end of the story.
>
> That has two issues though:
>
> 1. it makes chainability a decision of the library author, the library user gets to have no preference. This means e.g. you can't create a tree of elements in ElementTree in a single expression (AFAIK Element does not take children parameters). With cascading, the user can "chain" a library whose author did not choose to support chaining (in fact with cascading no author would ever need to support chaining again).
Right. That's the main point behind this: it gives the *caller* the choice of whether to chain or not. That's really the whole benefit, right there. ChrisA
On 02/21/2014 11:43 PM, Masklinn wrote:
> On 2014-02-21, at 23:00 , spir <denis.spir@gmail.com> wrote:
>> Also, I don't find the idea of having a builtin construct for such hacks a good idea. Libs for which this may be practical can return self --end of the story.
>
> That has two issues though:
>
> 1. it makes chainability a decision of the library author, the library user gets to have no preference. This means e.g. you can't create a tree of elements in ElementTree in a single expression (AFAIK Element does not take children parameters). With cascading, the user can "chain" a library whose author did not choose to support chaining (in fact with cascading no author would ever need to support chaining again).
I agree with you, here...
> 2. where a return value can make sense (and be useful) the author *must* make a choice. No way to chain `dict.pop()` since it returns the popped value, even if `pop` was only used for its removal-with-shut-up properties. With cascading the user can have his cake and eat it: he gets the return value if he wants it, and can keep "chaining" if he does not care.
... not there (if I understand you well; not quite 100% sure). In fact, I find this point rather a counter-argument, something to avoid (again, if I understand). What I mean is that executing given methods should have consistent effect; also, you should use the right method for the right task: if you don't want a stack's top item _and_ have it removed, then don't use 'pop', otherwise you are misleading readers (including yourself maybe, later) (note 'pop' is just a convenience utility for this very case; we could just read and remove in 2 steps). d
On 2014-02-22, at 11:23 , spir <denis.spir@gmail.com> wrote:
>> 2. where a return value can make sense (and be useful) the author *must* make a choice. No way to chain `dict.pop()` since it returns the popped value, even if `pop` was only used for its removal-with-shut-up properties. With cascading the user can have his cake and eat it: he gets the return value if he wants it, and can keep "chaining" if he does not care.
>
> ... not there (if I understand you well; not quite 100% sure). In fact, I find this point rather a counter-argument, something to avoid (again, if I understand). What I mean is that executing given methods should have consistent effect
Executing the method always has the same effect on its subject. That it may not be used for the same purpose is a different issue and common: a[k] = v can be used to add a new (k, v) pair or to update a key to a new value (in fact Erlang's new map construct makes the difference and provides for an update-only version). Even more so for values returned by mutating methods: as far as I know there is no rule that they must be used if the method is only called for its side effects.
> also, you should use the right method for the right task: if you don't want a stack's top item _and_ have it removed, then don't use 'pop', otherwise you are misleading readers (including yourself maybe, later) (note 'pop' is just a convenience utility for this very case; we could just read and remove in 2 steps).
Not only is `dict.pop` an expression (which del is not, and thus del can't be used in some contexts where pop would be useable, e.g. in a lambda) but dict.pop can also handle a non-existent value at the key; doing so with `del` requires adding a supplementary conditional. dict.pop's convenience oft makes it the right method for the right task, even if the case is "remove the key a from the dict if it's there".

But if you don't like the pop example there are others. The sibling thread "Joining dicts again" for instance: with cascading it's a non-problem, you can just cascade `update` calls on a base dict and you get a fully correct (as opposed to dict(a, **b) which is only correct if all of b's keys are strings) single-expression union.
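The behaviours referred to here can be checked with a few lines. The dictionaries and key names below are made up for illustration, and the `TypeError` shown for `dict(a, **b)` is what Python 3 raises when a keyword key is not a string.

```python
settings = {"debug": True}

# pop with a default is an expression and tolerates a missing key.
settings.pop("verbose", None)

# The del statement needs an explicit membership test to do the same.
if "verbose" in settings:
    del settings["verbose"]

a = {1: "one"}
b = {2: "two"}

# dict(a, **b) only works when every key of b is a string.
keyword_union_failed = False
try:
    dict(a, **b)
except TypeError as exc:
    keyword_union_failed = True
    print("dict(a, **b) failed:", exc)

# Without cascading syntax, a correct union needs a copy, then update().
union = dict(a)
union.update(b)
print(union)  # {1: 'one', 2: 'two'}
```

With the proposed cascading operator the last two statements could presumably be collapsed into a single expression, since `update` would hand back the dict instead of None.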
On Sun, Feb 23, 2014 at 2:44 AM, Alan Cristhian Ruiz <alan.cristh@gmail.com> wrote:
> What is wrong with the current syntax?:
>
>     'abcd'\
>         .upper()\
>         .lower()\
>         .title()
It doesn't have each method operate on the original object. It's not easy to see with strings, but try this:

    list_of_numbers = [1,2]
    list_of_numbers.append(3)
    list_of_numbers.append(4)
    list_of_numbers.append(5)

Now write that without repeating list_of_numbers all the way down the line.

ChrisA
On 23 February 2014 01:54, Chris Angelico <rosuav@gmail.com> wrote:
> On Sun, Feb 23, 2014 at 2:44 AM, Alan Cristhian Ruiz <alan.cristh@gmail.com> wrote:
>> What is wrong with the current syntax?:
>>
>>     'abcd'\
>>         .upper()\
>>         .lower()\
>>         .title()
>
> It doesn't have each method operate on the original object. It's not easy to see with strings, but try this:
>
>     list_of_numbers = [1,2]
>     list_of_numbers.append(3)
>     list_of_numbers.append(4)
>     list_of_numbers.append(5)
>
> Now write that without repeating list_of_numbers all the way down the line.
The thing is, this *isn't an accident*, it's a deliberate choice in the library design to distinguish between data flow pipelines that transform data without side effects, and repeated mutation of a single object (although reading Guido's earlier reply, that may be me retrofitting an explicit rationale onto Guido's personal preference).

Mutation and transformation are critically different operations, and most requests for method chaining amount to "I want to use something that looks like a chained transformation to apply multiple mutating operations to the same object".

The response from the core developers to that request is almost certainly always going to be "No", because it's a fundamentally bad idea to blur that distinction: you should be able to tell *at a glance* whether an operation is mutating an existing object or creating a new one (this is actually one of the problems with the iterator model: for iterators, rather than iterables, the "__iter__ returns self" implementation means that iteration becomes an operation with side effects, which can be surprising at times, usually because the iterator shows up as unexpectedly empty later on).

Compare:

    seq = get_data()
    seq.sort()

    seq = sorted(get_data())

Now, compare that with the proposed syntax as applied to the first operation:

    seq = []->extend(get_data())->sort()

That *looks* like it should be a data transformation pipeline, but it's not - each step in the chain is mutating the original object, rather than creating a new one. That's a critical *problem* with the idea, not a desirable feature.

There are a few good responses to this:

1. Design your APIs as transformation APIs that avoid in-place operations with side effects. This is a really good choice, as stateless transformations are one of the key virtues of functional programming, and if a problem can be handled that way without breaking the reader's brain, *do it*. Profiling later on may reveal the need to use more efficient in-place operations, but externally stateless APIs are still a great starting point that are less likely to degenerate into an unmaintainable stateful mess over time (you can maintain temporary state *internally*, but from the API users' perspective, things should look like they're stateless).

2. Provide a clean "specification" API, that allows a complex object structure to be built from something simpler (see, for example, logging.dictConfig(), or the various declarative approaches to defining user interfaces, or the Python 3 open(), which can create multilayered IO stacks on behalf of the user)

3. If the core API is based on mutation, but there's a clean and fast copying mechanism, consider adding a transformation API around it that trades speed (due to the extra object copies) for clarity (due to the lack of reliance on side effects).

There's also a somewhat hacky workaround that can be surprisingly effective in improving readability when working with tree structures: abuse context managers to make the indentation structure match the data manipulation structure.
    from contextlib import contextmanager

    @contextmanager
    def make(obj):
        yield obj

    with make(gtk.Window(gtk.WINDOW_TOPLEVEL)) as window:
        window.set_title("Hello Buttons!")
        window.connect("delete_event", delete_event)
        window.set_border_width(10)
        with make(gtk.HBox(False, 0)) as box1:
            window.add(box1)
            with make(gtk.Button("Button 1")) as button1:
                button1.connect("clicked", callback, "button 1")
                box1.pack_start(button1, True, True, 0)
            with make(gtk.Button("Button 2")) as button2:
                button2.connect("clicked", callback, "button 2")
                box1.pack_start(button2, True, True, 0)
        window.show_all()

Although even judicious use of vertical whitespace and comments can often be enough to provide a significant improvement:

    # Make the main window
    window = gtk.Window(gtk.WINDOW_TOPLEVEL)
    window.set_title("Hello Buttons!")
    window.connect("delete_event", delete_event)
    window.set_border_width(10)

    # Make the box and add the buttons
    box1 = gtk.HBox(False, 0)
    window.add(box1)

    # Add Button 1
    button1 = gtk.Button("Button 1")
    button1.connect("clicked", callback, "button 1")
    box1.pack_start(button1, True, True, 0)

    # Add Button 2
    button2 = gtk.Button("Button 2")
    button2.connect("clicked", callback, "button 2")
    box1.pack_start(button2, True, True, 0)

    # And now we're done
    window.show_all()

And adding a short internal helper function makes it even clearer:

    # Make the main window
    window = gtk.Window(gtk.WINDOW_TOPLEVEL)
    window.set_title("Hello Buttons!")
    window.connect("delete_event", delete_event)
    window.set_border_width(10)

    # Make the box and add the buttons
    box1 = gtk.HBox(False, 0)
    window.add(box1)

    def add_button(box, label, callback_arg):
        button = gtk.Button(label)
        button.connect("clicked", callback, callback_arg)
        box.pack_start(button, True, True, 0)

    add_button(box1, "Button 1", "button 1")
    add_button(box1, "Button 2", "button 2")

    # And now we're done
    window.show_all()

It's easy to write code that looks terrible - but to make the case for a syntax change, you can't use code that looks terrible as a rationale, when there are existing ways to refactor that code that make it substantially easier to read. It's only when the code is *still* hard to read after it has been refactored to be as beautiful as is currently possible that a case for new syntactic sugar can be made.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
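Nick's third response (a transformation API layered over a mutation-based one) can be sketched roughly as follows. This is only an illustration: the helper name `transformed` is made up for this example, and it assumes a shallow copy is a cheap and adequate copying mechanism for the object in question.

```python
import copy

def transformed(obj, method_name, *args, **kwargs):
    """Apply a mutating method to a shallow copy and return the copy.

    Hypothetical helper: the original object is left untouched, at the
    cost of one extra copy per step.
    """
    new_obj = copy.copy(obj)
    getattr(new_obj, method_name)(*args, **kwargs)
    return new_obj

data = [3, 1, 2]
result = transformed(transformed(data, "extend", [0]), "sort")

print(data)    # [3, 1, 2]
print(result)  # [0, 1, 2, 3]
```

Each step here really does produce a new object, so the nested calls read as a pipeline and behave like one.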
On Sun, Feb 23, 2014 at 1:25 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
it's a fundamentally bad idea to blur that distinction: you should be able to tell *at a glance* whether an operation is mutating an existing object or creating a new one...
Compare:
    seq = get_data()
    seq.sort()

    seq = sorted(get_data())
Now, compare that with the proposed syntax as applied to the first operation:
seq = []->extend(get_data())->sort()
That *looks* like it should be a data transformation pipeline, but it's not - each step in the chain is mutating the original object, rather than creating a new one. That's a critical *problem* with the idea, not a desirable feature.
Except that it doesn't. The idea of using a different operator is that it should clearly be mutating the original object. It really IS obvious, at a glance, that it's going to be returning the existing object, because that operator means it will always be.

I believe that naming things that don't matter is a bad idea. We don't write code like this:

    five = 5
    two = 2
    print("ten is",five*two)

because the intermediate values are completely insignificant. It's much better to leave them unnamed. (Okay, they're trivial here, but suppose those were function calls.)

In a GTK-based layout, you'll end up creating a whole lot of invisible widgets whose sole purpose is to control the layout of other widgets. In a complex window, you might easily have dozens of those. (Same happens in Tkinter, from what I gather, but I haven't much looked into that.) Naming those widgets doesn't improve readability - in fact, it damages it, because you're left wondering which insignificant box is which. Leaving them unnamed and just part of a single expression emphasizes their insignificance.

ChrisA
On Feb 22, 2014, at 18:48, Chris Angelico <rosuav@gmail.com> wrote:
On Sun, Feb 23, 2014 at 1:25 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
That *looks* like it should be a data transformation pipeline, but it's not - each step in the chain is mutating the original object, rather than creating a new one. That's a critical *problem* with the idea, not a desirable feature.
Except that it doesn't. The idea of using a different operator is that it should clearly be mutating the original object. It really IS obvious, at a glance, that it's going to be returning the existing object, because that operator means it will always be.
The difference between the look of nested statements and giant expressions in Python is much larger than the difference between the look of . and ->. One structure means you're doing imperative stuff, mutating one value on each line. The other means you're doing declarative stuff, transforming objects into new temporary objects. That distinction is huge, and the fact that it's immediately visible in Python is one of the strengths of Python over most other "multi-paradigm" languages.
I believe that naming things that don't matter is a bad idea. We don't write code like this:
    five = 5
    two = 2
    print("ten is",five*two)
But in real-life code, this would be something like rows * columns, and even if rows and columns are constant, they're constants you might want to change in a future version of the code, so you _would_ name them. And if, as you say, they're actually function calls, not constants, I think most people would write:

    rows = consoleobj.getparam('ROWS')
    cols = consoleobj.getparam('COLS')
    cells = rows * cols

... rather than try to cram it all in one line.
because the intermediate values are completely insignificant. It's much better to leave them unnamed. (Okay, they're trivial here, but suppose those were function calls.) In a GTK-based layout, you'll end up creating a whole lot of invisible widgets whose sole purpose is to control the layout of other widgets. In a complex window, you might easily have dozens of those. (Same happens in Tkinter, from what I gather, but I haven't much looked into that.) Naming those widgets doesn't improve readability - in fact, it damages it, because you're left wondering which insignificant box is which. Leaving them unnamed and just part of a single expression emphasizes their insignificance.
All you're arguing here is that PyGtk is badly designed, or that Gtk is not a good match for Python, so you have to write wrappers. There's no reason the wrapper has to be fluent instead of declarative.
The other means you're doing declarative stuff, transforming objects into new temporary objects. That distinction is huge
I guess that's where people disagree. I think the distinction is not huge. Whether imperatively constructing something or "declaratively" (???) doing transformations, the *meaning* of the code is the same: start from some *foo* and do *some stuff* on *foo* in sequence until the *foo* is what I want.

Whether it's implemented using mutation or allocation/garbage-collection is an implementation detail that clouds our view of a higher-level semantic: initializing an object with some stuff. In fact, this distinction is so meaningless that many languages/runtimes will turn one into the other as an optimization, because the semantics are exactly the same.
On 02/23/2014 06:27 AM, Haoyi Li wrote:
The other means you're doing declarative stuff, transforming objects into new temporary objects. That distinction is huge
I guess that's where people disagree. I think the distinction is not huge. Whether imperatively constructing something or "declaratively" (???) doing transformations, the *meaning* of the code is the same: start from some *foo* and do *some stuff* on *foo* in sequence until the *foo* is what I want.
Whether it's implemented using mutation or allocation/garbage-collection is an implementation detail that clouds our view of a higher-level semantic: initializing an object with some stuff. In fact, this distinction is so meaningless that many languages/runtimes will turn one into the other as an optimization, because the semantics are exactly the same.
It's not the same if you have references to some of the objects involved; in Python, these come from "symbolic assignments", that is, assignments whose right side is a symbol, as in "b=a". This is the whole point, I guess, and one core difficulty of modelling complex systems. Such references exist in the model being expressed (as in, Bob is Ann's boyfriend, and the football team's goalkeeper, and the manager of the shop around the corner, and...); or else your code is wrong (you're making up nonexistent refs to things, as an artifact of your coding style).

The 2 methods above are not semantically equivalent. In the mutative case, you're talking about one single *thing* (the representation of one thing in the model, correctly "reified" in code). This thing may be multiply ref'ed, because it is a thing and as such has multiple *aspects* or *roles* (or competences?). In the functional case, you are making up new objects at every step, thus if there were refs they would be broken. We should only use such a style for non-things, meaning for plain data (information) *about* things.

(This style is not appropriate for GUI widgets, which conceptually are things, and pretty often ref'ed. Instead widgets should be created in one go.)

d
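The point about references can be made concrete with plain lists. The following is a minimal illustration using only built-in list behaviour; the variable names are arbitrary.

```python
# Mutation: every name bound to the list sees the change.
a = [3, 1, 2]
b = a              # "symbolic assignment": b refers to the same list as a
a.sort()

# Transformation: the name is rebound to a new object,
# and any other reference keeps pointing at the old one.
c = [3, 1, 2]
d = c
c = sorted(c)

print(b)                 # [1, 2, 3]
print(d)                 # [3, 1, 2]
print(a is b, c is d)    # True False
```

So the two styles are interchangeable only when nothing else holds a reference to the object being worked on.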
On Feb 23, 2014, at 3:51, spir <denis.spir@gmail.com> wrote:
The 2 methods above are not semantically equivalent. In the mutative case, you're talking about one single *thing* (the representation of one thing in the model, correctly "reified" in code). This thing may be multiply ref'ed, because it is a thing and as such has multiple *aspects* or *roles* (or competences?). In the functional case, you are making up new objects at every step, thus if there were refs they would be broken. We should only use such a style for non-things, meaning for plain data (information) *about* things.
(This style is not appropriate for GUI widgets, which conceptually are things, and pretty often ref'ed. Instead widgets should be created in one go.)
But put that together with your other reply:
That's what I was about to argue. I don't understand why the python wrapper does not let you construct widgets in one go, with all their "equipment", since it's trivial and standard style in python (except, certainly, for adding sub-widgets to containers, as the sub-widgets exist and need to be defined by themselves).
The cleanest way to do this would then be to build the initializer(s) declaratively, then initialize the (mutable) object in one go, right?

As for why PyGtk doesn't work that way, there are two reasons.

First, it's deliberately intended to make Python Gtk code, C Gtk code, Vala Gtk code, etc. look as similar as possible. Gtk has a language-agnostic idiom that takes precedence over each language's idioms. This means you only have to write examples, detailed docs, etc. once, instead of once for every language with bindings.

Second, from a practical point of view, it allows PyGtk to be a very thin wrapper around Gtk.

And if you want to know why Gtk itself wasn't designed to be more Pythonic... Well, it wasn't designed for Python. It was originally designed for C, and then updated for C and Vala. So it has C/Vala-focused idioms. That's the same reason Qt and Wx have C++ idioms, Tk has Tcl idioms, WinForms has C#/VB idioms, Cocoa has ObjC idioms, etc. Unfortunately, none of the major GUI libraries was designed primarily with Python in mind, so we have to adapt.
Andrew Barnert wrote:
Unfortunately, none of the major GUI libraries was designed primarily with Python in mind, so we have to adapt.
Yes, but the adaptation can be in the form of wrappers that make the API more Pythonic. It shouldn't mean warping Python to make it fit the ways of other languages. -- Greg
On Feb 23, 2014, at 16:36, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Andrew Barnert wrote:
Unfortunately, none of the major GUI libraries was designed primarily with Python in mind, so we have to adapt.
Yes, but the adaptation can be in the form of wrappers that make the API more Pythonic. It shouldn't mean warping Python to make it fit the ways of other languages.
But remember that there is an advantage to Gtk, Qt, etc. having their own language-agnostic idioms. They have a hard enough time documenting the whole thing as it is; if they had to write completely different documentation for C, C++, Vala, Python, .NET, etc., we just wouldn't get any documentation.

Of course there's also a disadvantage. PyGtk code doesn't look very Pythonic.

I think the suggestions in this thread for a language change that allows people to write code that looks like _neither_ Python _nor_ Gtk are a bad solution to the problem. But there is a real problem, and I understand why people are trying to solve it.

So what is the solution? Maybe the best thing people can put their effort into is a higher-level, more Pythonic wrapper around the most painful parts of the PyGtk wrapper (like initialization)?
On Mon, Feb 24, 2014 at 5:49 PM, Andrew Barnert <abarnert@yahoo.com> wrote:
But remember that there is an advantage to Gtk, Qt, etc. having their own language-agnostic idioms. They have a hard enough time documenting the whole thing as it is; if they had to write completely different documentation for C, C++, Vala, Python, .NET, etc., we just wouldn't get any documentation.
Point to note: When I'm trying to pin down an issue that relates to GTK on Windows, I'll sometimes switch between Pike and Python, since my installations of them embed different GTK versions. (I discovered a GTK bug that way.) Being able to translate my code from one language to another is extremely useful. And when I'm looking for docs and (especially) examples, it's really common to get them for some completely different language - it's easier to translate back from (say) Perl than to hunt down an example in the language I'm actually using. A more Pythonic wrapper around object creation would pretty much look like what I was saying, except that it would take specific coding work. Actually, probably all it'd take is a module that imports all the PyGTK classes and wraps them in Steven's chain() function. But the proposal I make here would put the power directly in the hands of the programmer, rather than requiring that the module support it. Why should method chaining be in the hands of the module author? ChrisA
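Steven's chain() function is not shown in this part of the thread, so the following is only a guess at its general shape: a proxy whose method calls are forwarded to the wrapped object and which then returns itself. The class name, the `obj` attribute and the decision to discard return values are all assumptions made for this sketch.

```python
class chain:
    """Hypothetical proxy: forwards method calls, then returns itself."""

    def __init__(self, obj):
        self.obj = obj

    def __getattr__(self, name):
        # Only reached for names not found on the proxy itself.
        attr = getattr(self.obj, name)
        if not callable(attr):
            return attr

        def call(*args, **kwargs):
            attr(*args, **kwargs)   # the method's own result is discarded
            return self             # so the next call chains on the proxy
        return call

seq = chain([]).extend([3, 1, 2]).sort().obj
print(seq)  # [1, 2, 3]
```

A wrapper like this has to be applied by whoever creates the object, and it throws away useful return values such as signal IDs, which is part of the trade-off being argued over here.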
On Feb 23, 2014, at 23:00, Chris Angelico <rosuav@gmail.com> wrote:
On Mon, Feb 24, 2014 at 5:49 PM, Andrew Barnert <abarnert@yahoo.com> wrote:
But remember that there is an advantage to Gtk, Qt, etc. having their own language-agnostic idioms. They have a hard enough time documenting the whole thing as it is; if they had to write completely different documentation for C, C++, Vala, Python, .NET, etc., we just wouldn't get any documentation.
Point to note: When I'm trying to pin down an issue that relates to GTK on Windows, I'll sometimes switch between Pike and Python, since my installations of them embed different GTK versions. (I discovered a GTK bug that way.) Being able to translate my code from one language to another is extremely useful. And when I'm looking for docs and (especially) examples, it's really common to get them for some completely different language - it's easier to translate back from (say) Perl than to hunt down an example in the language I'm actually using.
A more Pythonic wrapper around object creation would pretty much look like what I was saying, except that it would take specific coding work. Actually, probably all it'd take is a module that imports all the PyGTK classes and wraps them in Steven's chain() function.
But that would be _less_ Pythonic, not more.

The fact that mutating methods return None--and, more generally, the strong divide between mutation and transformation and between statements and expressions--is one of the major ways in which Python is idiomatically different from languages that are specifically meant to be fluent (like Smalltalk or C#) or that just never considered the design issue (like Perl or JavaScript).

You don't have to agree with Guido that method chaining is bad. But he's designed his language and stdlib to discourage it, and therefore anything you do in the opposite direction is fighting against the grain of the language and its standard idioms.
On 02/23/2014 06:06 AM, Andrew Barnert wrote:
All you're arguing here is that PyGtk is badly designed, or that Gtk is not a good match for Python, so you have to write wrappers. There's no reason the wrapper has to be fluent instead of declarative.
That's what I was about to argue. I don't understand why the python wrapper does not let you construct widgets in one go, with all their "equipment", since it's trivial and standard style in python (except, certainly, for adding sub-widgets to containers, as the sub-widgets exist and need to be defined by themselves).

d
spir writes:
On 02/23/2014 06:06 AM, Andrew Barnert wrote:
All you're arguing here is that PyGtk is badly designed, or that Gtk is not a good match for Python, so you have to write wrappers. There's no reason the wrapper has to be fluent instead of declarative.
That's what I was about to argue. I don't understand why the python wrapper does not let you construct widgets in one go,
*Because* it's a *wrapper*, which leverages the gobject-introspection FFI. (gobject-introspection is an export-oriented FFI, rather than an import-oriented FFI like ctypes.)
with all their "equipment", since it's trivial and standard style in python (except, certainly, for adding sub-widgets to containers, as the sub-widgets exist and need to be defined by themselves).
I don't think you need to except subwidgets. They could be defined recursively by including calls to their constructors in the "description" of the parent. It might be tricky to find a pleasant way to express placement in containers with flexible placement disciplines, but for single-child, column, row, and grid widgets I don't see a problem at all.

Now, that would be nice for initialization, but the GTK+ v3 API is very dynamic and insanely complicated. I think it would take a huge amount of work to do this at all well, and I suspect that your program would benefit only from beautification of initialization -- everything else would still look the same. (Of course many programs don't need a dynamic UI; I suppose that would be a benefit.)

In any case, I don't find a "chained" API for something like GTK any more attractive than the repetition of the object being mutated. They're both quite ugly, an accurate reflection of the underlying library which tried to be a better Xt, but ended up equally messy, compounded by being a lot bigger (and more poorly documented).
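The recursive "description" idea above might look something like the sketch below. To keep it runnable without any GUI toolkit, `Widget` is a stand-in class invented for this example, not a GTK or PyGTK class, and the dict layout (`kind`, `props`, `children`) is likewise an assumption rather than an existing API.

```python
class Widget:
    """Stand-in for a toolkit widget; not a real GTK class."""

    def __init__(self, kind):
        self.kind = kind
        self.props = {}
        self.children = []

    def set(self, name, value):
        self.props[name] = value

    def add(self, child):
        self.children.append(child)


def build(spec):
    """Recursively build a widget tree from a nested dict description."""
    widget = Widget(spec["kind"])
    for name, value in spec.get("props", {}).items():
        widget.set(name, value)
    for child_spec in spec.get("children", []):
        widget.add(build(child_spec))
    return widget


window = build({
    "kind": "Window",
    "props": {"title": "Hello Buttons!", "border_width": 10},
    "children": [
        {"kind": "HBox", "children": [
            {"kind": "Button", "props": {"label": "Button 1"}},
            {"kind": "Button", "props": {"label": "Button 2"}},
        ]},
    ],
})

print(window.props["title"])              # Hello Buttons!
print(len(window.children[0].children))   # 2
```

The nesting of the description mirrors the nesting of the widgets and the box never needs a name. Per-child placement parameters, the tricky case mentioned above, are deliberately left out of this sketch.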
From: Stephen J. Turnbull <stephen@xemacs.org> Sent: Sunday, February 23, 2014 3:31 PM
spir writes:
On 02/23/2014 06:06 AM, Andrew Barnert wrote:
All you're arguing here is that PyGtk is badly designed, or that Gtk is not a good match for Python, so you have to write wrappers. There's no reason the wrapper has to be fluent instead of declarative.
That's what I was about to argue. I don't understand why the python wrapper does not let you construct widgets in one go,
*Because* it's a *wrapper*, which leverages the gobject-introspection FFI. (gobject-introspection is an export-oriented FFI, rather than an import-oriented FFI like ctypes.)
[snip]
I don't think you need to except subwidgets. They could be defined recursively by including calls to their constructors in the "description" of the parent.
I think the problem here is that, to everyone who doesn't like the method chaining idea (like all three of us), it's obvious that this could be done declaratively, and it's also obvious why that would be better than writing code which breaks both Python and Gtk+ idioms, and therefore none of us have explained those obvious facts very well to the people who like the idea. So I put together examples for both the PyGtk example that started this thread and the Java example that started the whole fluent-interface fad, along with an explanation of why method chaining is neither necessary in, nor a good fit for, Python even though it's very useful in languages like Java. See https://stupid-python-ideas.runkite.com/fluent-pythonic/ for the whole thing. That being said, I think Nick's examples using with statements should be enough to show that there are more Pythonic solutions (that already work today) than adding chaining.
On Tue, Feb 25, 2014 at 12:10 AM, Andrew Barnert <abarnert@yahoo.com> wrote:
So I put together examples for both the PyGtk example that started this thread and the Java example that started the whole fluent-interface fad, along with an explanation of why method chaining is neither necessary in, nor a good fit for, Python even though it's very useful in languages like Java. See https://stupid-python-ideas.runkite.com/fluent-pythonic/ for the whole thing.
While I do think a method chaining operator would solve the problem generically, rather than requiring every module to do it individually, I do rather like your proposed window creation syntax. The thing is, Pike GTK is *almost* there: instead of creating a window and setting its title and border, you can create a window and pass it a mapping (dict) specifying the title and border. (PyGTK doesn't have anything of the sort, it seems. I haven't looked into PyGObject, which is supposed to be the new great thing; it might have that.) But it's not quite all the way, because you can't stuff children into them.

Being able to do the whole job in the constructor is extremely tempting. It'd work beautifully for the objects where you basically just call add() with each thing (just provide a list of children to be added), but not so well when you want to specify parameters (eg specifying how spare space should be allocated). I'm not sure how that ought to be done. I could come up with something where you pass it a list of tuples, but I'm not sure that completely covers the issue either.

There's no perfect solution, which is why the search continues. I freely admit that the suggestion I made at the beginning of this thread is not ideal; it's un-Pythonic, it's a heavy language change, and it'd take a huge amount of justification to go anywhere. But there is still a problem that it's trying to solve.

ChrisA
On 02/22/2014 08:48 PM, Chris Angelico wrote:
On Sun, Feb 23, 2014 at 1:25 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
it's a fundamentally bad idea to blur that distinction: you should be able to tell *at a glance* whether an operation is mutating an existing object or creating a new one...
Compare:
    seq = get_data()
    seq.sort()

    seq = sorted(get_data())
Now, compare that with the proposed syntax as applied to the first operation:
    seq = []->extend(get_data())->sort()
That *looks* like it should be a data transformation pipeline, but it's not - each step in the chain is mutating the original object, rather than creating a new one. That's a critical *problem* with the idea, not a desirable feature.
Except that it doesn't. The idea of using a different operator is that it should clearly be mutating the original object. It really IS obvious, at a glance, that it's going to be returning the existing object, because that operator means it will always be.
I agree with Nick, this looks like a transformation chain. Each step transforming the "new" result of the previous step.

    seq = []->extend(get_data())->sort()

To make it pythonic ...

The operator you want is one for an in place method call. If we apply the "+=" pattern for the '__iadd__' method call syntax to the more general '.' method syntax, we get ".=", the in place method call syntax.

    seq = []
    seq .= extend(get_data())    # In place method call.

In place method calls seem quite reasonable to me.

And then to get the rest of the way there, allow chained "in place" method calls.

    seq = [] .= extend(get_data()) .= sort()

Which should be a separate pep from the ".=" enhancement.

BTW... allowing ".=" could mean a class could have one __iget_method__ attribute instead of multiple __ixxxx__ methods. (Or something like that.)

Cheers,
Ron
On Feb 23, 2014 10:25 AM, "Ron Adam" <ron3200@gmail.com> wrote:
I agree with Nick, this looks like a transformation chain. Each step transforming the "new" result of the previous step.
    seq = []->extend(get_data())->sort()
To make it pythonic ...
The operator you want is one for an in place method call. If we apply the "+=" pattern for the '__iadd__' method call syntax to the more general '.' method syntax, we get ".=", the in place method call syntax.
    seq = []
    seq .= extend(get_data())    # In place method call.
In place method calls seem quite reasonable to me.
And then to get the rest of the way there, allow chained "in place" method calls.
    seq = [] .= extend(get_data()) .= sort()
Which should be a separate pep from the ".=" enhancement.
BTW... allowing ".=" could mean a class could have one __iget_method__ attribute instead of multiple __ixxxx__ methods. (Or something like that.)
Cheers, Ron

I like this syntax. It's easy to tell exactly what is getting mutated.
Ron Adam writes:
The operator you want is one for an in place method call.
seq = [] .= extend(get_data()) .= sort()
That looks like anything but Python to me. If I really thought of that as a single operation, I'd do something like

    class ScarfNSort(list):
        def __init__(self):
            self.extend(get_data())
            self.sort()

    seq = ScarfNSort()

If it doesn't deserve a class definition, then the repeated references to 'seq' wouldn't bother me.

N.B. '.=' shouldn't be called "in-place": 'sort' and 'extend' are already in-place. The word would be "chain," "cascade," or similar.
On 02/23/2014 05:52 PM, Stephen J. Turnbull wrote:
Ron Adam writes:
The operator you want is one for an in place method call.
seq = [] .= extend(get_data()) .= sort()
That looks like anything but Python to me.
Is it really all that different from this?

    >>> "Py" . __add__("th") . __add__("on")
    'Python'

The '.=' just says more explicitly that self will be returned after the method call. It wouldn't alter the string example here, since self isn't returned. But for mutable objects, it's an explicit reminder that it mutates rather than returns a new object.

In the case that there is no __iget_method__ method, it would give an error. So it's not a "make everything into a chain" tool.
If I really thought of that as a single operation, I'd do something like
    class ScarfNSort(list):
        def __init__(self):
            self.extend(get_data())
            self.sort()

    seq = ScarfNSort()
If it doesn't deserve a class definition, then the repeated references to 'seq' wouldn't bother me.
N.B. '.=' shouldn't be called "in-place": 'sort' and 'extend' are already in-place. The word would be "chain," "cascade," or similar.
It's a chain only if you link more than one in sequence. Cheers, Ron
On Feb 23, 2014, at 23:14, Ron Adam <ron3200@gmail.com> wrote:
On 02/23/2014 05:52 PM, Stephen J. Turnbull wrote:
Ron Adam writes:
The operator you want is one for an in place method call.
seq = [] .= extend(get_data()) .= sort()
That looks like anything but Python to me.
Is it really all that different from this?
    >>> "Py" . __add__("th") . __add__("on")
    'Python'
Well, yes, it is. But, more importantly, who cares? That code is horribly unreadable and unpythonic. OK, I don't think PEP8 has a guideline saying "don't call __add__ when you can just use +", but only because it's so obvious it doesn't need to be stated. (And I'm pretty sure it _does_ have a guideline saying not to put spaces around the attribute dot.)

So, is your argument "my code looks kind of like some horribly unreadable and unpythonic, but legal, code, and therefore it should also be legal despite being unreadable and unpythonic"?
On 02/24/2014 01:34 AM, Andrew Barnert wrote:
On 02/23/2014 05:52 PM, Stephen J. Turnbull wrote:
Ron Adam writes:
The operator you want is one for an in place method call.
seq = [] .= extend(get_data()) .= sort()
That looks like anything but Python to me.
Is it really all that different from this?
>>> "Py" . __add__("th") . __add__("on")
'Python'

Well, yes, it is. But, more importantly, who cares? That code is horribly unreadable and unpythonic. OK, I don't think PEP8 has a guideline saying "don't call __add__ when you can just use +", but only because it's so obvious it doesn't need to be stated. (And I'm pretty sure it _does_ have a guideline saying not to put spaces around the attribute dot.)
So, is your argument "my code looks kind of like some horribly unreadable and unpythonic, but legal, code, and therefore it should also be legal despite being unreadable and unpythonic"?
Wow, tough crowd here.. :-)

Neither the separation of the '.' nor the use of the already special __add__ is important as far as the actual suggestion is concerned. Those are unrelated style issues.

You would probably see it used more often like this...

def names(defaults, pos_names, pos_args, kwds):
    return {}.=update(defaults) \
             .=update(zip(pos_names, pos_args)) \
             .=update(kwds)

Normally .update returns None. The reason for that is so that it's clear you are mutating an object instead of creating a new one. By using .=, it can return self, but still maintain the clarity between mutation and non-mutation.

This particular syntax is consistent with the use of OP+equal to mean mutate in place. But you might prefer "..", or something else.

The other alternative is to use a function. But it would be difficult to get the same behaviour along with the same efficiency.

Regards, Ron
On 24 February 2014 15:01, Ron Adam <ron3200@gmail.com> wrote:
You would probably see it used more often like this...
def names(defaults, pos_names, pos_args, kwds):
    return {}.=update(defaults) \
             .=update(zip(pos_names, pos_args)) \
             .=update(kwds)
How is this better than

def names(defaults, pos_names, pos_args, kwds):
    ret = {}
    ret.update(defaults)
    ret.update(zip(pos_names, pos_args))
    ret.update(kwds)
    return ret

(I originally named the return value _ to cater for the tendency to insist on punctuation rather than names in this thread, but honestly, why *not* name the thing "ret"?)

I get the idea of chained updates, I really do. But translating between mutation of a named value and chained updates is pretty trivial, so I don't see how this is anything but a case of "follow the preferred style for the language/API you're using". And Python uses updating named values, why is that so bad?

Paul
On 02/24/2014 09:12 AM, Paul Moore wrote:
On 24 February 2014 15:01, Ron Adam<ron3200@gmail.com> wrote:
You would probably see it used more often like this...
def names(defaults, pos_names, pos_args, kwds):
    return {}.=update(defaults) \
             .=update(zip(pos_names, pos_args)) \
             .=update(kwds)

How is this better than

def names(defaults, pos_names, pos_args, kwds):
    ret = {}
    ret.update(defaults)
    ret.update(zip(pos_names, pos_args))
    ret.update(kwds)
    return ret

(I originally named the return value _ to cater for the tendency to insist on punctuation rather than names in this thread, but honestly, why *not* name the thing "ret"?)
I get the idea of chained updates, I really do. But translating between mutation of a named value and chained updates is pretty trivial, so I don't see how this is anything but a case of "follow the preferred style for the language/API you're using". And Python uses updating named values, why is that so bad?
It's not bad, just not as good. The chained expression is more efficient and can be used in places where you can't use more than a single expression. The point is to maintain both a visual and computational separation of mutable and immutable expressions.

Compare the byte code from these. You can see how the chained version would be more efficient.

Cheers, Ron

def names(defaults, pos_names, pos_args, kwds):
    ret = {}
    ret.update(defaults)
    ret.update(zip(pos_names, pos_args))
    ret.update(kwds)
    return ret

>>> dis(names)
  2           0 BUILD_MAP                0
              3 STORE_FAST               4 (ret)

  3           6 LOAD_FAST                4 (ret)
              9 LOAD_ATTR                0 (update)
             12 LOAD_FAST                0 (defaults)
             15 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             18 POP_TOP

  4          19 LOAD_FAST                4 (ret)
             22 LOAD_ATTR                0 (update)
             25 LOAD_GLOBAL              1 (zip)
             28 LOAD_FAST                1 (pos_names)
             31 LOAD_FAST                2 (pos_args)
             34 CALL_FUNCTION            2 (2 positional, 0 keyword pair)
             37 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             40 POP_TOP

  5          41 LOAD_FAST                4 (ret)
             44 LOAD_ATTR                0 (update)
             47 LOAD_FAST                3 (kwds)
             50 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             53 POP_TOP

  6          54 LOAD_FAST                4 (ret)
             57 RETURN_VALUE

By using the '.' we can see the difference. The byte code should be very close to this, even though this function will give an error if you try to run it. (Can't update None.) The actual difference would probably be replacing LOAD_ATTR with LOAD_MUTATE_ATTR, which would call __getmutatemethod__ instead of __getmethod__. (Or something similar to that, depending on how it's implemented.)

def names(defaults, pos_names, pos_args, kwds):
    return {}.update(defaults) \
             .update(zip(pos_names, pos_args)) \
             .update(kwds)

>>> dis(names)
  2           0 BUILD_MAP                0
              3 LOAD_ATTR                0 (update)
              6 LOAD_FAST                0 (defaults)
              9 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             12 LOAD_ATTR                0 (update)

  3          15 LOAD_GLOBAL              1 (zip)
             18 LOAD_FAST                1 (pos_names)
             21 LOAD_FAST                2 (pos_args)
             24 CALL_FUNCTION            2 (2 positional, 0 keyword pair)
             27 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             30 LOAD_ATTR                0 (update)

  4          33 LOAD_FAST                3 (kwds)
             36 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             39 RETURN_VALUE
On 24 February 2014 16:08, Ron Adam <ron3200@gmail.com> wrote:
By using the '.' we can see the difference
But the difference is precisely those bytecodes that are needed to replicate the argument that's required because that's not what update does. Show me the bytecode you propose for your proposed operator, and how it's faster, and (assuming it's an improvement) explain why it can't be achieved via bytecode optimisation of the existing code that the compiler could be updated to do. Otherwise your "it's faster" argument doesn't hold water. As regards "it's a single expression", I still say that's purely a style issue - Python doesn't place any emphasis on being able to write things as a single expression (quite the opposite, in fact - complex expressions are generally a sign of bad style in Python). Paul
On 2014-02-24, at 17:08 , Ron Adam <ron3200@gmail.com> wrote:
On 02/24/2014 09:12 AM, Paul Moore wrote:
On 24 February 2014 15:01, Ron Adam<ron3200@gmail.com> wrote:
You would probably see it used more often like this...
def names(defaults, pos_names, pos_args, kwds):
    return {}.=update(defaults) \
             .=update(zip(pos_names, pos_args)) \
             .=update(kwds)

How is this better than

def names(defaults, pos_names, pos_args, kwds):
    ret = {}
    ret.update(defaults)
    ret.update(zip(pos_names, pos_args))
    ret.update(kwds)
    return ret

(I originally named the return value _ to cater for the tendency to insist on punctuation rather than names in this thread, but honestly, why *not* name the thing "ret"?)
I get the idea of chained updates, I really do. But translating between mutation of a named value and chained updates is pretty trivial, so I don't see how this is anything but a case of "follow the preferred style for the language/API you're using". And Python uses updating named values, why is that so bad?
It's not bad, just not as good. The chained expression is more efficient and can be used in places where you can't use more than a single expression.
The point is to maintain both a visual and computational separation of mutable and immutable expressions.
Compare the byte code from these. You can see how the chained version would be more efficient.
The chained version is not intrinsically more efficient, the Python compiler could be smart enough to not LOAD_FAST (ret) repeatedly (if that proves beneficial to execution speed, which I'm not even certain of, and either way it's going to be extremely minor compared to the actual cost of executing methods). AFAIK the peephole optimiser does not even bother eliding out pairs of STORE_FAST $name LOAD_FAST $name, e.g. as far as I know

a = foo()
a.bar()

compiles to:

 0 LOAD_*         0 (foo)
 3 CALL_FUNCTION  0 (0 positional, 0 keyword pair)
 6 STORE_FAST     0 (a)
 9 LOAD_FAST      0 (a)
12 LOAD_ATTR      1 (bar)
15 CALL_FUNCTION  0 (0 positional, 0 keyword pair)
18 POP_TOP

the pair (6, 9) is a noop and could trivially be removed (in the absence of jumps around). According to [0] a patch implementing this (although without taking care of jumps) was rejected:
because apparently the additional six lines of code didn’t buy enough of a speed improvement for an uncommon case.
(although no link to the patch so he might have been optimizing the triplet of (STORE_FAST, LOAD_FAST, RETURN_VALUE)). If removing 2 bytecode instructions once in a while does not sway the core team, I really can't see removing a single one even more rarely doing so.
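For anyone who wants to check these bytecode claims against their own interpreter, here is a minimal sketch using the standard dis module. The helper names (names_with_local, opnames) are made up for illustration, and opcode names and counts differ between CPython versions, so no particular output is assumed.

```python
import dis

def names_with_local(defaults, pos_names, pos_args, kwds):
    # The statement-per-update spelling discussed in the thread.
    ret = {}
    ret.update(defaults)
    ret.update(zip(pos_names, pos_args))
    ret.update(kwds)
    return ret

def opnames(func):
    # Opcode names the running interpreter emits for func.
    return [ins.opname for ins in dis.get_instructions(func)]

ops = opnames(names_with_local)
# Print the listing rather than asserting on it: the exact opcodes
# are an implementation detail and change from release to release.
print(len(ops), "instructions")
dis.dis(names_with_local)
```

Whether removing a load or store of the local makes a measurable difference is a separate question that only timing on a given build can answer.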
By using the '.' we can see the difference. The byte code should be very close to this, even though this function will give an error if you try to run it. (Can't update None.) The actual difference would probably be replacing LOAD_ATTR with LOAD_MUTATE_ATTR, Which would call __getmutatemethod__ instead of __getmethod__. (or something similar to that, depending on how it's implemented.)
Why? There's no need for LOAD_MUTATE_ATTR. And LOAD_ATTR calls __getattribute__ (and __getattr__ if necessary); a bound method is a form of callable attribute. The bytecode for a method call (assuming an object on the stack) is

LOAD_ATTR $attrname
CALL_FUNCTION

Whether the function mutates the original object (or not) has no relevance to attribute loading.
def names(defaults, pos_names, pos_args, kwds):
    return {}.update(defaults) \
             .update(zip(pos_names, pos_args)) \
             .update(kwds)

>>> dis(names)
  2           0 BUILD_MAP                0
              3 LOAD_ATTR                0 (update)
              6 LOAD_FAST                0 (defaults)
              9 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             12 LOAD_ATTR                0 (update)

  3          15 LOAD_GLOBAL              1 (zip)
             18 LOAD_FAST                1 (pos_names)
             21 LOAD_FAST                2 (pos_args)
             24 CALL_FUNCTION            2 (2 positional, 0 keyword pair)
             27 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             30 LOAD_ATTR                0 (update)

  4          33 LOAD_FAST                3 (kwds)
             36 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             39 RETURN_VALUE
That bytecode's not correct for the case:

* the return value of each method call needs to be discarded with a POP_TOP
* LOAD_ATTR needs an object on the stack so you need a DUP_TOP before each LOAD_ATTR (update)

(you can create the correct bytecode with something like byteplay, it'll work)

[0] http://www.coactivate.org/projects/topp-engineering/blog/2008/11/03/optimizi...
On Tue, Feb 25, 2014 at 6:59 AM, Masklinn <masklinn@masklinn.net> wrote:
a = foo() a.bar()
compiles to:
0 LOAD_* 0 (foo) 3 CALL_FUNCTION 0 (0 positional, 0 keyword pair) 6 STORE_FAST 0 (a)
9 LOAD_FAST 0 (a) 12 LOAD_ATTR 1 (bar) 15 CALL_FUNCTION 0 (0 positional, 0 keyword pair) 18 POP_TOP
the pair (6, 9) is a noop and could trivially be removed (in the absence of jumps around). According to [0] a patch implementing this (although without taking care of jumps) was rejected:
Possible reason for rejection: The optimizer would have to be sure that a wasn't used anywhere else.

a = foo()
a.bar()
a.spam()

  2           0 LOAD_GLOBAL              0 (foo)
              3 CALL_FUNCTION            0 (0 positional, 0 keyword pair)
              6 STORE_FAST               0 (a)

  3           9 LOAD_FAST                0 (a)
             12 LOAD_ATTR                1 (bar)
             15 CALL_FUNCTION            0 (0 positional, 0 keyword pair)
             18 POP_TOP

  4          19 LOAD_FAST                0 (a)
             22 LOAD_ATTR                2 (spam)
             25 CALL_FUNCTION            0 (0 positional, 0 keyword pair)
             28 POP_TOP

The subsequent LOAD_FAST of a depends on the STORE_FAST having been done.

In the specific case mentioned in your link, he was looking for a RETURN_VALUE opcode, so that would be safe. (But if there really is code like he's seeing, I'd look at tidying it up on the Python source level. Just return the value directly. No need for "single exit point" in Python code.)

ChrisA
On 02/24/2014 01:59 PM, Masklinn wrote:
By using the '.' we can see the difference. The byte code should be very close to this, even though this function will give an error if you try to run it. (Can't update None.) The actual difference would probably be replacing LOAD_ATTR with LOAD_MUTATE_ATTR, Which would call __getmutatemethod__ instead of __getmethod__. (or something similar to that, depending on how it's implemented.) Why? There's no need for LOAD_MUTATE_ATTR. And LOAD_ATTR calls __getattribute__ (and __getattr__ if necessary), a bound method is a form callable attribute, the bytecode for a method call (assuming an object on the stack) is
LOAD_ATTR $attrname CALL_FUNCTION
that the function mutates the original object (or not) has no relevance to attribute loading.
Turn it around... If an object doesn't have a mutate-attribute-loader, then you will get an error before the CALL_FUNCTION instead of during it or after it, without making any changes to existing method/function call code paths.
def names(defaults, pos_names, pos_args, kwds):
    return {}.update(defaults) \
             .update(zip(pos_names, pos_args)) \
             .update(kwds)

>>> dis(names)
  2           0 BUILD_MAP                0
              3 LOAD_ATTR                0 (update)
              6 LOAD_FAST                0 (defaults)
              9 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             12 LOAD_ATTR                0 (update)

  3          15 LOAD_GLOBAL              1 (zip)
             18 LOAD_FAST                1 (pos_names)
             21 LOAD_FAST                2 (pos_args)
             24 CALL_FUNCTION            2 (2 positional, 0 keyword pair)
             27 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             30 LOAD_ATTR                0 (update)

  4          33 LOAD_FAST                3 (kwds)
             36 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             39 RETURN_VALUE
That bytecode's not correct for the case:
* the return value of each method call needs to be discarded with a POP_TOP
* LOAD_ATTR needs an object on the stack so you need a DUP_TOP before each LOAD_ATTR (update) (you can create the correct bytecode with something like byteplay, it'll work)
Actually, for it to work the way I was thinking, it needs a matched pair to replace LOAD_ATTR and CALL_FUNCTION. The alternate call function bytecode would leave the function on the stack and give an error if it returned anything other than None. But I think that's too many new byte codes. Even one new byte code is a hard sell.

The idea is really just a more limited version of cascading, with an error for non-mutables used with that syntax, and an error if any value is returned when used with that syntax. So if you see this particular syntax you will know instantly the intent is to mutate the subject.

It might be a fun patch to play with and/or try to do. But I don't think it would ever get approved. The alternative is to use the existing byte codes as you describe, and that removes any (however small they might be) performance benefits.

Cheers, Ron
On Mon, Feb 24, 2014 at 09:01:07AM -0600, Ron Adam wrote:
You would probably see it used more often like this...
def names(defaults, pos_names, pos_args, kwds):
    return {}.=update(defaults) \
             .=update(zip(pos_names, pos_args)) \
             .=update(kwds)
Normally .update returns None. The reason for that is so that it's clear you are mutating an object instead of creating a new one.
By using .=, it can return self, but still maintain the clarity between mutation and non-mutation.
How does the update method know whether it is being called via . or via .= ? I'm trying to understand how you think this is supposed to work, and not having much success. Can you give a sketch of how this .= thingy is supposed to operate?
The other alternative is to use a function. But it would be difficult to get the same behaviour along with the same efficiency.
I don't see how you can compare the efficiency of code that can be written now with code that doesn't exist yet. How do you know how efficient your __iget_method__ suggestion will be? -- Steven
On Tue, Feb 25, 2014 at 2:16 AM, Steven D'Aprano <steve@pearwood.info> wrote:
On Mon, Feb 24, 2014 at 09:01:07AM -0600, Ron Adam wrote:
You would probably see it used more often like this...
def names(defaults, pos_names, pos_args, kwds):
    return {}.=update(defaults) \
             .=update(zip(pos_names, pos_args)) \
             .=update(kwds)
Normally .update returns None. The reason for that is so that it's clear you are mutating an object instead of creating a new one.
By using .=, it can return self, but still maintain the clarity between mutation and non-mutation.
How does the update method know whether it is being called via . or via .= ? I'm trying to understand how you think this is supposed to work, and not having much success. Can you give a sketch of how this .= thingy is supposed to operate?
I don't know what his plan is, but mine was for the function to continue to return None, or 42, or "spam", or whatever it likes, and for that to be ignored. The expression result would be the initial object, and the actual function return value is discarded. Function doesn't need any rewriting.

ChrisA
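The "discard the return value, yield the original object" semantics described here can be approximated today with a small wrapper. This is only a sketch of the idea in plain Python, not the proposed syntax or anyone's actual implementation; the class name Cascade and its result() method are invented for this example.

```python
class Cascade:
    """Wrap a target so that every method call returns the wrapper."""

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Only reached for names not defined on Cascade itself,
        # so _target and result() are looked up normally.
        method = getattr(self._target, name)

        def call(*args, **kwargs):
            method(*args, **kwargs)  # return value deliberately discarded
            return self

        return call

    def result(self):
        # Hand back the original object once the chain is finished.
        return self._target


seq = Cascade([]).extend([3, 1, 2]).sort().result()
print(seq)  # [1, 2, 3]
```

Because the method's own return value is thrown away, list.sort() and list.extend() need no changes, which matches the "function doesn't need any rewriting" point. The cost is an extra wrapper object and an extra Python-level call per step.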
On 02/24/2014 09:16 AM, Steven D'Aprano wrote:
On Mon, Feb 24, 2014 at 09:01:07AM -0600, Ron Adam wrote:
You would probably see it used more often like this...
def names(defaults, pos_names, pos_args, kwds):
    return {}.=update(defaults) \
             .=update(zip(pos_names, pos_args)) \
             .=update(kwds)
Normally .update returns None. The reason for that is so that it's clear you are mutating an object instead of creating a new one.
By using .=, it can return self, but still maintain the clarity between mutation and non-mutation. How does the update method know whether it is being called via . or via .= ? I'm trying to understand how you think this is supposed to work, and not having much success. Can you give a sketch of how this .= thingy is supposed to operate?
The other alternative is to use a function. But it would be difficult to get the same behaviour along with the same efficiency.
I don't see how you can compare the efficiency of code that can be written now with code that doesn't exist yet. How do you know how efficient your __iget_method__ suggestion will be?
First off, I need to be more consistent with names. Apologies for that added confusion. I've been just trying to get out the gist of the idea that feels right to me, but haven't worked through the finer details, so from now on, I'll try to be more precise.

The byte code for '.' and '.=' will be nearly identical. The "LOAD_ATTR" would be replaced by another byte code. That byte code would add a light wrapper (in C) to check for None, and return self. By doing that, the CALL_FUNCTION byte code doesn't need to change, and there's no need to add the check for None in the bytecode. (Although that's doable too.)

Normally a method access with a "." is done with the LOAD_ATTR bytecode, which in turn calls the object's __getattribute__ method. (Is this correct?)

For the .= examples, let's use LOAD_I_ATTR for the bytecode and __getiattribute__. (Or other names if you think they would be better.)

The .= would differ by using (from the AST) a "LOAD_I_ATTR" in the byte code, which would call a __getiattribute__. If you use ".=" on an object without a __getiattribute__ it would give an error saying you can't mutate that object.

When you use ".=" with a method on a mutable object, the call would expect None, and return self. (Giving an error if it gets something other than None.)

This is different, but not incompatible. This has no effect on using '.' with immutable or mutable objects.

Cheers, Ron
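The "expect None, return self, error otherwise" rule can be emulated with an ordinary function, which may help make the intended behaviour concrete. This is a rough sketch only: call_in_place is a hypothetical helper, nothing here corresponds to a real LOAD_I_ATTR or __getiattribute__, and the None check merely stands in for the proposed mutability check.

```python
def call_in_place(obj, method_name, *args, **kwargs):
    """Call obj.method_name(...), insist it returns None, and return obj."""
    result = getattr(obj, method_name)(*args, **kwargs)
    if result is not None:
        # Note the method has already run by the time we can complain.
        raise TypeError(
            f"{type(obj).__name__}.{method_name}() returned a value; "
            "it does not look like an in-place method"
        )
    return obj


def names(defaults, pos_names, pos_args, kwds):
    merged = call_in_place({}, "update", defaults)
    call_in_place(merged, "update", zip(pos_names, pos_args))
    return call_in_place(merged, "update", kwds)


print(names({"a": 0}, ["a", "b"], [1, 2], {"c": 3}))

try:
    call_in_place("Py", "upper")  # str.upper returns a new string
except TypeError as exc:
    print("rejected:", exc)
```

Unlike the proposal, this version can only detect a non-mutating method after calling it, which is one reason the thread keeps returning to whether new bytecode would be needed.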
On Feb 24, 2014, at 7:01, Ron Adam <ron3200@gmail.com> wrote:
On 02/24/2014 01:34 AM, Andrew Barnert wrote:
So, is your argument "my code looks kind of like some horribly unreadable and unpythonic, but legal, code, and therefore it should also be legal despite being unreadable and unpythonic"?
Wow, tough crowd here.. :-)
Both the separation of the '.', and the use of the already special __add__ isn't important as far as the actual suggestion is concerned. Those are unrelated style issues.
You would probably see it used more often like this...
def names(defaults, pos_names, pos_args, kwds):
    return {}.=update(defaults) \
             .=update(zip(pos_names, pos_args)) \
             .=update(kwds)
Normally .update returns None. The reason for that is so that it's clear you are mutating an object instead of creating a new one.
And so you can't chain it, so you can only mutate one thing in a statement. People keep assuming that's an accidental unwanted side effect of the rule, but with Guido explicitly saying that he doesn't like method chaining, can't you imagine it's at least possible that this is intentional, not a bug?
By using .=, it can return self, but still maintain the clarity between mutation and non-mutation.
This particular syntax is consistent with the use of OP+equal to mean mutate in place. But you might prefer, "..", or something else.
You're missing a key distinction here. OP+equal does not return self. In fact, it doesn't return _anything_, because it's not an expression at all, it's a statement. It would be very easy to make augmented assignment an expression, and even easier to make it return self (after all, it's implemented by calling dunder methods that _do_ return self!). But it wasn't designed that way. Intentionally.
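The distinction can be checked directly in the interpreter. The following is a small illustrative sketch; the variable names are arbitrary.

```python
# The in-place dunder method itself does return the object...
items = [1, 2]
returned = items.__iadd__([3])
print(returned is items)  # True

# ...but the augmented assignment that calls it is a statement,
# so it cannot appear where an expression is required.
try:
    compile("(x += 1)", "<demo>", "eval")
    is_expression = True
except SyntaxError:
    is_expression = False
print(is_expression)  # False

# Immutable types such as tuple define no __iadd__ at all;
# "t += (3,)" falls back to __add__ and rebinds the name.
print(hasattr(tuple, "__iadd__"))  # False
```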
The other alternative is to use a function. But it would be difficult to get the same behaviour along with the same efficiency.
How do you think a regular function would have less efficiency than an operator? Operators work by doing a complex chain of lookups to find a function to call. Ordinary calls do a simpler lookup. They both call the function the same way once they find it.

You're also missing the other alternative: write it Pythonically. For example:

return merge_dicts(
    defaults,
    zip(pos_names, pos_args),
    kwds)

It's shorter, it has less extraneous syntax, it doesn't need awkward backslash continuations, and it creates or modifies one value in one place. It's easier to read, and easier to reason about. Why would you want to write it the other way?

Of course that merge_dicts function isn't in the stdlib. Maybe it should be. But you can write it yourself trivially. For example:

def merge_dicts(*args):
    return {k: v for arg in args for (k, v) in dict(arg).items()}

That nested comprehension might be a little too complicated; if you think so, you can split it into two expressions, or even write an explicit loop around dict.update. Whatever; this is something you write once and use every time you want to merge a bunch of dicts.
On 02/24/2014 05:29 PM, Andrew Barnert wrote:
On Feb 24, 2014, at 7:01, Ron Adam<ron3200@gmail.com> wrote:
On 02/24/2014 01:34 AM, Andrew Barnert wrote:
So, is your argument "my code looks kind of like some horribly unreadable and unpythonic, but legal, code, and therefore it should also be legal despite being unreadable and unpythonic"?
Wow, tough crowd here..:-)
Both the separation of the '.', and the use of the already special __add__ isn't important as far as the actual suggestion is concerned. Those are unrelated style issues.
You would probably see it used more often like this...
def names(defaults, pos_names, pos_args, kwds):
    return {}.=update(defaults) \
             .=update(zip(pos_names, pos_args)) \
             .=update(kwds)
Normally .update returns None. The reason for that is so that it's clear you are mutating an object instead of creating a new one.
And so you can't chain it, so you can only mutate one thing in a statement. People keep assuming that's an accidental unwanted side effect of the rule, but with Guido explicitly saying that he doesn't like method chaining, can't you imagine it's at least possible that this is intentional, not a bug?
Yes, I know it's an intentional design choice. Did he ever say why he doesn't like chained methods? I tried to look it up, and all I found was a lot of other people saying he doesn't. But nothing that indicated what his reasoning for it was.
By using .=, it can return self, but still maintain the clarity between mutation and non-mutation.
This particular syntax is consistent with the use of OP+equal to mean mutate in place. But you might prefer, "..", or something else.
You're missing a key distinction here. OP+equal does not return self. In fact, it doesn't return _anything_, because it's not an expression at all, it's a statement.
I was referring to the dunder method that gets called as you noted below.
It would be very easy to make augmented assignment an expression, and even easier to make it return self (after all, it's implemented by calling dunder methods that _do_ return self!). But it wasn't designed that way. Intentionally.
How else would it be designed?
The other alternative is to use a function. But it would be difficult to get the same behaviour along with the same efficiency.
How do you think a regular function would have less efficiency than an operator? Operators work by doing a complex chain of lookups to find a function to call. Ordinary calls do a simpler lookup. They both call the function the same way once they find it.
Ok, you lost me with this.. what complex chain of lookups are you referring to? (Besides the normal name and method resolution.) Operators are one level above methods... And yes, so are direct method calls. But the alternative I was talking about was to write a function to get the same behaviour with chained methods as I was describing. That adds another layer, so obviously it wouldn't be quite as efficient. I wasn't saying functions aren't efficient.
You're also missing the other alternative: write it Pythonically. For example:
Never said writing functions was bad. I use functions all the time to make code nicer and cleaner. So no, I'm not missing that.
return merge_dicts( defaults, zip(pos_names, pos_args), kwds)
It's shorter, it has less extraneous syntax, it doesn't need awkward backslash continuations, and it creates or modifies one value in one place. It's easier to read, and easier to reason about. Why would you want to write it the other way?
Of course that merge_dicts function isn't in the stdlib. Maybe it should be. But you can write it yourself trivially. For example:
def merge_dicts(*args):
    return {k: v for arg in args for (k, v) in dict(arg).items()}
That nested comprehension might be a little too complicated; if you think so, you can split it into two expressions, or even write an explicit loop around dict.update. Whatever; this is something you write once and use every time you want to merge a bunch of dicts.
Much slower too. Yes, the dict.update is nicer and quicker than the comprehension. I'd probably do it this way...

def merge_dicts(*args):
    D = {}
    for a in args:
        D.update(a)
    return D

Chaining methods isn't a do everything everywhere kind of thing. There are times when it's handy and times when a function is better.

Cheers, Ron
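The two merge_dicts spellings from this exchange can be put side by side to confirm they agree and to time them locally. This is a hedged sketch: the function names merge_comprehension and merge_update and the sample data are invented here, and the timings depend entirely on the machine and Python version, so none are quoted.

```python
import timeit

def merge_comprehension(*args):
    # Nested comprehension version; dict(arg) builds a temporary per argument.
    return {k: v for arg in args for (k, v) in dict(arg).items()}

def merge_update(*args):
    # Explicit loop around dict.update.
    merged = {}
    for arg in args:
        merged.update(arg)
    return merged

# Lists of pairs rather than zip objects, so the sample can be reused.
sample = ({"a": 1, "b": 2}, [("c", 3), ("d", 4)], {"a": 9})

same = merge_comprehension(*sample) == merge_update(*sample)
print(same)  # True

for func in (merge_comprehension, merge_update):
    seconds = timeit.timeit(lambda: func(*sample), number=10_000)
    print(func.__name__, seconds)
```

Later arguments win on duplicate keys in both versions, which is the behaviour the names() example relies on.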
Ron Adam writes:
You would probably see it used more often like this...
def names(defaults, pos_names, pos_args, kwds):
    return {}.=update(defaults) \
             .=update(zip(pos_names, pos_args)) \
             .=update(kwds)

I actually have a bunch of code in one of my apps that implements the same thing for a different reason (cascading configs), but my implementation is

def names(defaults, pos_names, pos_args, kwds):
    for dct in pos_names, pos_args, kwds:
        defaults.update(dct)
    return defaults

The other obvious use for this (as several have posted) is accumulating a sequence. In which case most uses will be well-handled with a genexp, or if you need a concrete sequence, a listcomp, and the body becomes a one (logical) liner (although it will very likely be formatted in multiple lines).
On 02/25/2014 02:21 AM, Stephen J. Turnbull wrote:
Ron Adam writes:
You would probably see it used more often like this...
def names(defaults, pos_names, pos_args, kwds):
    return {}.=update(defaults) \
             .=update(zip(pos_names, pos_args)) \
             .=update(kwds)

I actually have a bunch of code in one of my apps that implements the same thing for a different reason (cascading configs), but my implementation is

def names(defaults, pos_names, pos_args, kwds):
    for dct in pos_names, pos_args, kwds:
        defaults.update(dct)
    return defaults

Not quite the same but close. I just tried to come up with a more realistic example without having to look up a lot of code. How does pos_args in your example get paired with names?

Cheers, Ron
The other obvious use for this (as several have posted) is accumulating a sequence. In which case most uses will be well-handled with a genexp, or if you need a concrete sequence, a listcomp, and the body becomes a one (logical) liner (although it will very likely be formatted in multiple lines).
Ron Adam writes:
def names(defaults, pos_names, pos_args, kwds):
    return {}.=update(defaults) \
             .=update(zip(pos_names, pos_args)) \
             .=update(kwds)

def names(defaults, pos_names, pos_args, kwds):
    for dct in pos_names, pos_args, kwds:
        defaults.update(dct)
    return defaults

Not quite the same but close. I just tried to come up with a more realistic example without having to look up a lot of code. How does pos_args in your example get paired with names?

Sorry, I knew that before dinner but forgot after dinner. Same way as in yours:

def names(defaults, pos_names, pos_args, kwds):
    for dct in zip(pos_names, pos_args), kwds:
        defaults.update(dct)
    return defaults

If that doesn't work in my version (I've never used zip that way), how does it work in yours?

BTW, I'd actually be more likely to write that now as

def names(defaults, *updates):
    for update in updates:
        defaults.update(update)
    return defaults

and call it with "names(zip(pos_names, pos_args), kwds)".
On 02/26/2014 01:19 AM, Stephen J. Turnbull wrote:
Ron Adam writes:
def names(defaults, pos_names, pos_args, kwds):
    return {}.=update(defaults) \
             .=update(zip(pos_names, pos_args)) \
             .=update(kwds)

def names(defaults, pos_names, pos_args, kwds):
    for dct in pos_names, pos_args, kwds:
        defaults.update(dct)
    return defaults

Not quite the same but close. I just tried to come up with a more realistic example without having to look up a lot of code. How does pos_args in your example get paired with names?

Sorry, I knew that before dinner but forgot after dinner. Same way as in yours:

def names(defaults, pos_names, pos_args, kwds):
    for dct in zip(pos_names, pos_args), kwds:
        defaults.update(dct)
    return defaults
Yes, ok.
If that doesn't work in my version (I've never used zip that way), how does it work in yours? BTW, I'd actually be more likely to write that now as
def names(defaults, *updates):
    for update in updates:
        defaults.update(update)
    return defaults
and call it with "names(zip(pos_names, pos_args), kwds)".
The main difference between this and the one I posted is that defaults is mutated in your version. I'd prefer it not be.

Dictionaries are pretty flexible on how they are initiated, so it's surprising we can't do this...

D = dict(keys=names, values=args)

The .fromkeys() method is almost that, but sets all the values to a single value. I think I would have written that a bit differently.

def fromkeys(self, keys, values=None, default=None):
    D = {}
    if values is None:
        values = []
    D.update(zip(keys, values))
    for k in keys[len(values):]:
        D[k] = default
    return D

And probably named it withkeys instead of fromkeys. <shrug> It's what I expected fromkeys to do.

cheers, Ron
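The mutation concern can be made concrete with a variant that copies defaults before updating. This is only an illustrative sketch; names_copying is an invented name, not something proposed in the thread.

```python
def names_copying(defaults, *updates):
    # Start from a shallow copy so the caller's defaults stay untouched.
    merged = dict(defaults)
    for update in updates:
        merged.update(update)
    return merged


defaults = {"a": 0, "b": 0}
result = names_copying(defaults, zip(["a"], [1]), {"c": 3})
print(result)    # {'a': 1, 'b': 0, 'c': 3}
print(defaults)  # {'a': 0, 'b': 0}
```

The copy is shallow, so mutable values stored inside defaults are still shared with the result.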
On 2014-02-26, at 16:28 , Ron Adam <ron3200@gmail.com> wrote:
On 02/26/2014 01:19 AM, Stephen J. Turnbull wrote:
Ron Adam writes:
def names(defaults, pos_names, pos_args, kwds):
    return {}.=update(defaults) \
             .=update(zip(pos_names, pos_args)) \
             .=update(kwds)

def names(defaults, pos_names, pos_args, kwds):
    for dct in pos_names, pos_args, kwds:
        defaults.update(dct)
    return defaults

Not quite the same but close. I just tried to come up with a more realistic example without having to look up a lot of code. How does pos_args in your example get paired with names?

Sorry, I knew that before dinner but forgot after dinner. Same way as in yours:

def names(defaults, pos_names, pos_args, kwds):
    for dct in zip(pos_names, pos_args), kwds:
        defaults.update(dct)
    return defaults
Yes, ok.
If that doesn't work in my version (I've never used zip that way), how does it work in yours? BTW, I'd actually be more likely to write that now as
def names(defaults, *updates):
    for update in updates:
        defaults.update(update)
    return defaults
and call it with "names(zip(pos_names, pos_args), kwds)".
The main difference between this and the one I posted is that defaults is mutated in your version. I'd prefer it not be.
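A non-mutating variant might look like the sketch below. It is only an illustration of the preference stated above, not code anyone in the thread posted; the name merged and the sample values are assumptions.

```python
def names(defaults, *updates):
    """Merge the updates over a copy of defaults, leaving defaults untouched."""
    merged = dict(defaults)  # shallow copy, so the caller's dict is not mutated
    for update in updates:
        # dict.update accepts a mapping or an iterable of (key, value) pairs,
        # which is why a zip() object works here.
        merged.update(update)
    return merged


# Sample data, assumed purely for illustration.
defaults = {"a": 0, "b": 0}
result = names(defaults, zip(["a"], [1]), {"c": 3})
```

Here result holds the merged mapping while defaults still has its original contents.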
Dictionaries are pretty flexible on how they are initiated, so it's surprising we can't do this...
D = dict(keys=names, values=args)
You can. It may not do what you want, but you definitely can do this:

    >>> dict(keys=names, values=args)
    {'keys': ['a', 'b', 'c', 'd'], 'values': [0, 1, 2]}

Although you're probably looking for:

    >>> dict(zip(names, args))
    {'a': 0, 'c': 2, 'b': 1}

and if you want to do a fill because you don't have enough args:

    >>> dict(izip_longest(names, args, fillvalue=None))
    {'a': 0, 'c': 2, 'b': 1, 'd': None}

(itertools is like friendship, it's bloody magic)
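For anyone following along on Python 3, the same idea can be sketched as below. The only change assumed here is the spelling itertools.zip_longest (izip_longest is the Python 2 name); names and args reuse the values visible in the session above.

```python
from itertools import zip_longest  # called izip_longest in Python 2

names = ["a", "b", "c", "d"]
args = [0, 1, 2]

# zip stops at the shorter input, so 'd' is dropped.
paired = dict(zip(names, args))

# zip_longest pads the shorter input with fillvalue, so 'd' maps to None.
filled = dict(zip_longest(names, args, fillvalue=None))
```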
The .fromkeys() method is almost that, but sets all the values to a single value. I think I would have written that a bit different.
    def fromkeys(self, keys, values=None, default=None):
        D = {}
        if values is None:
            values = []
        D.update(zip(keys, values))
        # Keys without a matching value get the default.
        for k in keys[len(values):]:
            D[k] = default
        return D
And probably named it withkeys instead of fromkeys. <shrug> It's what I expected fromkeys to do.
cheers, Ron
From: Ron Adam <ron3200@gmail.com> Sent: Wednesday, February 26, 2014 7:28 AM
Dictionaries are pretty flexible on how they are initiated, so it's surprising we can't do this...
D = dict(keys=names, values=args)
The .fromkeys() method is almost that, but sets all the values to a single value. I think I would have written that a bit different.
    def fromkeys(self, keys, values=None, default=None):
        D = {}
        if values is None:
            values = []
        D.update(zip(keys, values))
        # Keys without a matching value get the default.
        for k in keys[len(values):]:
            D[k] = default
        return D
And probably named it withkeys instead of fromkeys. <shrug> It's what I expected fromkeys to do.
Sounds like you're thinking in Smalltalk/ObjC terms, both the "with" name and the expecting two "withs":

    [NSDictionary dictionaryWithObjects:values forKeys:keys]

The reason we don't need this in Python is that the default construction method takes key-value pairs, and you can get that trivially from zip:

    dict(zip(keys, values))

Would it really be more readable your way?

    dict.withkeys(keys, values=values)

Yes, to novices who haven't internalized zip yet. I guess the question is whether requiring people to internalize zip early is a good thing about Python, or a problem to be solved. It's not like we're requiring hundreds of weird functional idioms to make everything as brief as possible, just a very small number, each an abstraction that works consistently across a broad range of uses.
On 02/26/2014 01:55 PM, Andrew Barnert wrote:
From: Ron Adam<ron3200@gmail.com>
Sent: Wednesday, February 26, 2014 7:28 AM
Dictionaries are pretty flexible on how they are initiated, so it's surprising we can't do this...
D = dict(keys=names, values=args)
The .fromkeys() method is almost that, but sets all the values to a single value. I think I would have written that a bit different.
    def fromkeys(self, keys, values=None, default=None):
        D = {}
        if values is None:
            values = []
        D.update(zip(keys, values))
        # Keys without a matching value get the default.
        for k in keys[len(values):]:
            D[k] = default
        return D
And probably named it withkeys instead of fromkeys. <shrug> It's what I expected fromkeys to do.
Sounds like you're thinking in Smalltalk/ObjC terms, both the "with" name and the expecting two "withs":
[NSDictionary dictionaryWithObjects:values forKeys:keys]
The reason we don't need this in Python is that the default construction method takes key-value pairs, and you can get that trivially from zip:
dict(zip(keys, values))
Would it really be more readable your way?
dict.withkeys(keys, values=values)
Yes, to novices who haven't internalized zip yet. I guess the question is whether requiring people to internalize zip early is a good thing about Python, or a problem to be solved. It's not like we're requiring hundreds of weird functional idioms to make everything as brief as possible, just a very small number, each an abstraction that works consistently across a broad range of uses.
The reason I expected to be able to do that is you can get just keys, or values from a dict... dict.keys(), dict.values(), and pairs... dict.items(). It just makes sense that it will take those directly too. I'm sure I'm not the only one who thought that.

I don't really buy the "because we can do... and get..." as a valid reason by itself not to do something. Not when it's clearly related to the object's type, as keys and values are. For unrelated things, yes, it is a valid reason.

For example you could say, we don't need dict.items because we can do...

    zip(dict.keys(), dict.values())

Or we don't need dict.keys() and dict.values() because we can do...

    [x for x, y in dict.items()]
    [y for x, y in dict.items()]

But dictionaries are used so often that having these methods really helps to make the code more readable and easy to use. (or beautiful to the eye of the programmer)

In any case.. It's just my opinion. Not trying to convince anyone we need it or to do it. If it was really needed, we'd have it already. (although that argument isn't very strong either. ;-)

Cheers, Ron
On Wed, Feb 26, 2014 at 12:54 PM, Ron Adam <ron3200@gmail.com> wrote:
For example you could say, we don't need dict.items because we can do...
zip(dict.keys(), dict.values())
Have we actually been promised that d.keys() and d.values() walk the (unordered) dictionary in the same order, for every Python implementation/version? While I think it is almost certainly true in practice, I haven't seen where this invariant is guaranteed:

    assert [d[k] for k in d] == d.values()

I could trivially subclass dict to make a pretty good dictionary that violated this invariant, e.g.:

    class SortedDict(dict):
        def keys(self):
            return sorted(dict.keys(self))
        def values(self):
            return sorted(dict.values(self))

I actually don't have great difficulty imagining purposes for which this would even be a useful data structure. Clearly it violates the invariant listed though.

--
Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.
On Thu, Feb 27, 2014 at 8:17 AM, David Mertz <mertz@gnosis.cx> wrote:
On Wed, Feb 26, 2014 at 12:54 PM, Ron Adam <ron3200@gmail.com> wrote:
For example you could say, we don't need dict.items because we can do...
zip(dict.keys(), dict.values())
Have we actually been promised that d.keys() and d.values() walk the (unordered) dictionary in the same order, for every Python implementation/version? While I think it is almost certainly true in practice, I haven't seen where this invariant is guaranteed:
Yes, it's promised. It's intrinsic to the definition of keys()/values().

http://docs.python.org/3.4/library/stdtypes.html#dict-views
"""
Keys and values are iterated over in an arbitrary order which is non-random, varies across Python implementations, and depends on the dictionary’s history of insertions and deletions. If keys, values and items views are iterated over with no intervening modifications to the dictionary, the order of items will directly correspond. This allows the creation of (value, key) pairs using zip(): pairs = zip(d.values(), d.keys()). Another way to create the same list is pairs = [(v, k) for (k, v) in d.items()].
"""

http://docs.python.org/2/library/stdtypes.html#mapping-types-dict
"""
If items(), keys(), values(), iteritems(), iterkeys(), and itervalues() are called with no intervening modifications to the dictionary, the lists will directly correspond. This allows the creation of (value, key) pairs using zip(): pairs = zip(d.values(), d.keys()). The same relationship holds for the iterkeys() and itervalues() methods: pairs = zip(d.itervalues(), d.iterkeys()) provides the same value for pairs. Another way to create the same list is pairs = [(v, k) for (k, v) in d.iteritems()].
"""

The latter has a "CPython implementation detail" box immediately above the piece I quoted, which in a way emphasizes the fact that the bit not in the box is not CPython-specific.

Note the criteria, by the way. You have to not modify the dictionary in any way in between. I believe CPython will maintain iteration order as long as the set of keys never changes, but it would be legal for a compliant Python implementation to assert-fail here:

    d = {1:2, 3:4, 5:6, 7:8, 9:10}
    items = list(d.items())
    d[3] = 4 # Breakage
    assert items == list(d.items())

However, retrieval is safe. A compliant Python will never assert-fail if the breakage line is changed to:

    spam = d[3] # No breakage

So a splay tree implementation (say) would have to have some other means of iterating, or it would have to not adjust itself on reads.

ChrisA
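The documented guarantee can be checked with a small sketch like this one (Python 3 spelling; the variable names are made up for illustration). It only reads from the dictionary between the calls, which is the condition the docs require.

```python
d = {1: 2, 3: 4, 5: 6, 7: 8, 9: 10}

# With no modification in between, keys() and values() line up with items().
pairs_from_zip = list(zip(d.keys(), d.values()))
pairs_from_items = list(d.items())

# A plain lookup is a read, not a modification, so the order must hold.
spam = d[3]
pairs_after_read = list(zip(d.keys(), d.values()))
```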
On Wed, Feb 26, 2014 at 01:17:59PM -0800, David Mertz wrote:
Have we actually been promised that d.keys() and d.values() walk the (unordered) dictionary in the same order, for every Python implementation/version?
Yes. Any implementation which breaks the invariant that keys and values will be given in the same order (so long as there are no intervening changes to the dict) is buggy. http://docs.python.org/3/library/stdtypes.html#dictionary-view-objects
I could trivially subclass dict to make a pretty good dictionary that violated this invariant, e.g.:
The invariant only applies to dict :-) -- Steven
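To make the point concrete, here is a sketch of how such a subclass breaks the correspondence. The class body is the SortedDict from David's message; the sample data is assumed for illustration.

```python
class SortedDict(dict):
    """dict subclass whose keys() and values() are sorted independently."""

    def keys(self):
        return sorted(dict.keys(self))

    def values(self):
        return sorted(dict.values(self))


d = SortedDict({"a": 3, "b": 1, "c": 2})

# Each list is sorted on its own, so the pairing no longer matches the mapping.
zipped = list(zip(d.keys(), d.values()))

# The real pairs, taken from the underlying dict and sorted by key.
actual = sorted(dict.items(d))
```

With this data, zipped pairs "a" with 1 even though d["a"] is 3, so the guarantee quoted from the docs really is a property of dict itself and not of arbitrary subclasses.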
Ron Adam wrote:
Dictionaries are pretty flexible on how they are initiated, so it's surprising we can't do this...
D = dict(keys=names, values=args)
All keywords are taken by the dict(name = value, ...) constructor, so this is not so surprising. But you can write that as

    D = dict(zip(names, values))

-- Greg
On 02/23/2014 03:25 AM, Nick Coghlan wrote:
    list_of_numbers = [1,2]
    list_of_numbers.append(3)
    list_of_numbers.append(4)
    list_of_numbers.append(5)
As a side-note: There is no need in python for such constructs as proposed. As noted previously, most of them happen at init time (or more generally at object construction time), as in your example, and python proposes better constructs for that.
Now write that without repeating list_of_numbers all the way down the line. The thing is, this *isn't an accident*, it's a deliberate choice in the library design to distinguish between data flow pipelines that transform data without side effects, and repeated mutation of a single object (although reading Guido's earlier reply, that may be me retrofitting an explicit rationale onto Guido's personal preference).
Mutation and transformation are critically different operations, [...]
I agree with everything you say (if I understand right), except that your choice of terms is rather misleading: mutation and transformation are just synonyms. What you mean (if I understand right) is mutation (of an existing piece of data) vs creation (of a new, and different, piece of data).

d
On 02/22/2014 03:42 PM, Masklinn wrote:
Executing the method always has the same effect on its subject. That it may not be used for the same purpose is a different issue and common: a[k] = v can be used to add a new (k, v) pair or to update a key to a new value (in fact Erlang's new map construct makes the difference and provides for an update-only version).
You are right! and this is a weakness of python in my view. But you could as well have chosen plain assignment, couldn't you? It works the same way:

    n = 1 # symbol creation / definition
    n = 2 # symbol change / redefinition

[Note, as a side-point, that if def / redef signs were different, there would be no ambiguity around local vs global scope. The ambiguity lies in fact in that the compiler cannot know if one wants to create a local symbol or change a global one. As for locally creating a global symbol, this should just not exist ;-); symbols live in their creation scope.]

d
On Fri, Feb 21, 2014 at 11:30 AM, Chris Angelico <rosuav@gmail.com> wrote:
Yeah, I'm insane, opening another theory while I'm busily championing a PEP. But it was while writing up the other PEP that I came up with a possible syntax for this.
In Python, as in most languages, method chaining requires the method to return its own object.
    class Count:
        def __init__(self):
            self.n = 0
        def inc(self):
            self.n += 1
            return self

    dracula = Count()
    dracula.inc().inc().inc()
    print(dracula.n)
It's common in languages like C++ to return *this by reference if there's nothing else useful to return. It's convenient, it doesn't cost anything much, and it allows method chaining. The Python convention, on the other hand, is to return self only if there's a very good reason to, and to return None any time there's mutation that could plausibly return a new object of the same type (compare list.sort() vs sorted()). Method chaining is therefore far less common than it could be, with the result that, often, intermediate objects need to be separately named and assigned to. I pulled up one file from Lib/tkinter (happened to pick filedialog) and saw what's fairly typical of Python GUI code:
    ...
    self.midframe = Frame(self.top)
    self.midframe.pack(expand=YES, fill=BOTH)

    self.filesbar = Scrollbar(self.midframe)
    self.filesbar.pack(side=RIGHT, fill=Y)
    self.files = Listbox(self.midframe, exportselection=0,
                         yscrollcommand=(self.filesbar, 'set'))
    self.files.pack(side=RIGHT, expand=YES, fill=BOTH)
    ...
Ugh!
Every frame has to be saved away somewhere (incidentally, I don't see why self.midframe rather than just midframe - it's not used outside of __init__). With Tkinter, that's probably necessary (since the parent is part of the construction of the children), but in GTK, widget parenting is done in a more method-chaining-friendly fashion. Compare these examples of PyGTK and Pike GTK:
    # Cut down version of http://pygtk.org/pygtk2tutorial/examples/helloworld2.py
    import pygtk
    pygtk.require('2.0')
    import gtk

    def callback(widget, data):
        print "Hello again - %s was pressed" % data

    def delete_event(widget, event, data=None):
        gtk.main_quit()
        return False

    window = gtk.Window(gtk.WINDOW_TOPLEVEL)
    window.set_title("Hello Buttons!")
    window.connect("delete_event", delete_event)
    window.set_border_width(10)
    box1 = gtk.HBox(False, 0)
    window.add(box1)
    button1 = gtk.Button("Button 1")
    button1.connect("clicked", callback, "button 1")
    box1.pack_start(button1, True, True, 0)
    button2 = gtk.Button("Button 2")
    button2.connect("clicked", callback, "button 2")
    box1.pack_start(button2, True, True, 0)
    window.show_all()

    gtk.main()
    //Pike equivalent of the above:
    void callback(object widget, string data)
    {
        write("Hello again - %s was pressed\n", data);
    }
    void delete_event() {exit(0);}

    int main()
    {
        GTK2.setup_gtk();
        object button1, button2;
        GTK2.Window(GTK2.WINDOW_TOPLEVEL)
            ->set_title("Hello Buttons!")
            ->set_border_width(10)
            ->add(GTK2.Hbox(0,0)
                ->pack_start(button1 = GTK2.Button("Button 1"), 1, 1, 0)
                ->pack_start(button2 = GTK2.Button("Button 2"), 1, 1, 0)
            )
            ->show_all()
            ->signal_connect("delete_event", delete_event);
        button1->signal_connect("clicked", callback, "button 1");
        button2->signal_connect("clicked", callback, "button 2");
        return -1;
    }
Note that in the Pike version, I capture the button objects, but not the Hbox. There's no name ever given to that box. I have to capture the buttons, because signal_connect doesn't return the object (it returns a signal ID). The more complicated the window layout, the more noticeable this is: The structure of code using chained methods mirrors the structure of the window with its widgets containing widgets; but the structure of the Python equivalent is strictly linear.
So here's the proposal. Introduce a new operator to Python, just like the dot operator but behaving differently when it returns a bound method. We can possibly use ->, or maybe create a new operator that currently makes no sense, like .. or .> or something. Its semantics would be:
1) Look up the attribute following it on the object, exactly as per the current . operator
2) If the result is not a function, return it, exactly as per current.
3) If it is a function, though, return a wrapper which, when called, calls the inner function and then returns self.
This can be done with an external wrapper, so it might be possible to do this with MacroPy. It absolutely must be a compact notation, though.
This probably wouldn't interact at all with __getattr__ (because the attribute has to already exist for this to work), and definitely not with __setattr__ or __delattr__ (mutations aren't affected). How it interacts with __getattribute__ I'm not sure; whether it adds the wrapper around any returned functions or applies only to something that's looked up "the normal way" can be decided by ease of implementation.
Supposing this were done, using the -> token that currently is used for annotations as part of 'def'. Here's how the PyGTK code would look:
    import pygtk
    pygtk.require('2.0')
    import gtk

    def callback(widget, data):
        print "Hello again - %s was pressed" % data

    def delete_event(widget, event, data=None):
        gtk.main_quit()
        return False

    window = (gtk.Window(gtk.WINDOW_TOPLEVEL)
        ->set_title("Hello Buttons!")
        ->connect("delete_event", delete_event)
        ->set_border_width(10)
        ->add(gtk.HBox(False, 0)
            ->pack_start(
                gtk.Button("Button 1")->connect("clicked", callback, "button 1"),
                True, True, 0)
            ->pack_start(
                gtk.Button("Button 2")->connect("clicked", callback, "button 2"),
                True, True, 0)
        )
        ->show_all()
    )

    gtk.main()
Again, the structure of the code would match the structure of the window. Unlike the Pike version, this one can even connect signals as part of the method chaining.
Effectively, x->y would be equivalent to chain(x.y):
    def chain(func):
        def chainable(self, *args, **kwargs):
            func(self, *args, **kwargs)
            return self
        return chainable
Could be useful in a variety of contexts.
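As a rough illustration of what is possible without new syntax, the chain helper can already be applied as a decorator, or to an unbound method with the subject passed explicitly. This is only a sketch: the functools.wraps call and the example values are additions, and the Count class is the one from the start of this message with its "return self" removed.

```python
import functools


def chain(func):
    """Wrap func so that calling it returns its first argument (the subject)."""
    @functools.wraps(func)
    def chainable(self, *args, **kwargs):
        func(self, *args, **kwargs)  # the real return value is discarded
        return self
    return chainable


class Count:
    def __init__(self):
        self.n = 0

    @chain
    def inc(self):
        self.n += 1  # no explicit "return self" needed


dracula = Count().inc().inc().inc()

# chain expects the subject as the first argument, so with an existing
# method it has to be given the unbound form, e.g. list.append.
numbers = chain(list.append)([1, 2], 3)
```

Note that chain as written takes the subject as its first positional argument, so it fits the unbound function (list.append) and not an already-bound method such as some_list.append.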
Thoughts?
ChrisA _______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Seems weird to me, since I'm used to -> being for C++ pointers. I prefer "..", because it gives the impression that it's something additional. Either that, or I've used Lua too much. -- Ryan If anybody ever asks me why I prefer C++ to C, my answer will be simple: "It's becauseslejfp23(@#Q*(E*EIdc-SEGFAULT. Wait, I don't think that was nul-terminated."
On 2014-02-21, at 18:30 , Chris Angelico <rosuav@gmail.com> wrote:
So here's the proposal. Introduce a new operator to Python, just like the dot operator but behaving differently when it returns a bound method. We can possibly use ->, or maybe create a new operator that currently makes no sense, like .. or .> or something. Its semantics would be:
As Yuri noted the concept exists, AFAIK it was introduced by Smalltalk as "message cascading". Basically, the ability to send a sequence of messages to the same subject without having to repeatedly specify the subject. I believe Dart is the first language to have resurrected this feature so far.

The cascading operator in Smalltalk was `;` (the message-send operator is the space), so e.g.

    foo message ; message2: aParameter ; message3.

would send all of message, message2:aParameter and message3 to `foo`, in that specific order. In Smalltalk, a cascade returns the result of the last message in the cascade. Smalltalk provides a `yourself` operator to return the subject itself. Cascading is very commonly used to initialise collections, as Smalltalk was born at a time when literal high-level collections were not exactly a thing:

    aCollection := (OrderedCollection new)
        add: 1 ;
        add: 2 ;
        add: 3 ;
        yourself.
1) Look up the attribute following it on the object, exactly as per the current . operator
2) If the result is not a function, return it, exactly as per current.
3) If it is a function, though, return a wrapper which, when called, calls the inner function and then returns self.
I could be wrong, but I'm pretty sure this is an over-complication when you look at it at the bytecode level: you can load the subject as many times as you've got attr accesses to do on it, or you could have an alternate attr access which puts TOS back. No need for a wrapper.
Effectively, x->y would be equivalent to chain(x.y):
    def chain(func):
        def chainable(self, *args, **kwargs):
            func(self, *args, **kwargs)
            return self
        return chainable
Could be useful in a variety of contexts.
Thoughts?
No need for a wrapper. Where `a.b` compiles to

    LOAD_FAST a
    LOAD_ATTR b
    POP_TOP

`a->b` would compile to

    LOAD_FAST a
    DUP_TOP
    LOAD_ATTR b
    POP_TOP

at this point you've got an a left on the stack and can reuse it: `a->b()->c()->d()` would be

    LOAD_FAST a
    DUP_TOP
    LOAD_ATTR b
    CALL_FUNCTION
    POP_TOP
    DUP_TOP
    LOAD_ATTR c
    CALL_FUNCTION
    POP_TOP
    DUP_TOP
    LOAD_ATTR d
    CALL_FUNCTION
    POP_TOP

The tail end of the cascade would be slightly more complex in that it would be:

    ROT_TWO
    POP_TOP

so that the subject is discarded and the value of the last attribute/method is available on the stack (it may get popped as well if it's unused).

Or maybe it would do nothing special and the cascade would yield (or pop) the subject unless closed by an attribute access or regular method call. That would avoid the requirement of a `yourself`-type method when initialising mutables, although the final irregularity may lack visibility.
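To compare against what an interpreter actually emits for a plain attribute access, the dis module can be used, as in the sketch below. The exact opcode names and surrounding instructions differ between CPython versions, so the listings above are best read as illustrative; the filename string passed to compile is arbitrary.

```python
import dis

# Compile a bare attribute access as an expression; nothing is executed,
# so the name 'a' does not need to exist.
code = compile("a.b", "<cascade-sketch>", "eval")

# Collect the opcode names in order; print them to see your version's output.
opnames = [ins.opname for ins in dis.get_instructions(code)]
```

Calling dis.dis(code) prints the same information in the familiar tabular form.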
On Sat, Feb 22, 2014 at 9:31 AM, Masklinn <masklinn@masklinn.net> wrote:
On 2014-02-21, at 18:30 , Chris Angelico <rosuav@gmail.com> wrote:
So here's the proposal. Introduce a new operator to Python, just like the dot operator but behaving differently when it returns a bound method. We can possibly use ->, or maybe create a new operator that currently makes no sense, like .. or .> or something. Its semantics would be:
As Yuri noted the concept exists, AFAIK it was introduced by smalltalk as "message cascading". Basically, the ability to send a sequence of messages to the same subject without having to repeatedly specify the subject. I believe Dart is the first language to have resurrected this feature so far.
Cascading is what I'm looking for here, yes. As noted in the Wiki page Yuri linked to, chaining-with-return-self enables cascading. Consider this to be a proposal to add method cascading to Python. Also, since Dart uses .., that's a good reason to use .. here too.
1) Look up the attribute following it on the object, exactly as per the current . operator
2) If the result is not a function, return it, exactly as per current.
3) If it is a function, though, return a wrapper which, when called, calls the inner function and then returns self.
I could be wrong, but I'm pretty sure this is an over-complication when you look at it at the bytecode level: you can load the subject as many times as you've got attr accesses to do on it, or you could have an alternate attr access which puts TOS back. No need for a wrapper.
That would be a job for the peephole optimizer. What happens if you do this:

    func = x..y
    # more code
    func().z

It can't just leave x on the stack, but it has to have the same semantics. But I agree, the DUP_TOP form would be excellent for the common case:
No need for a wrapper. Where `a.b` compiles to
    LOAD_FAST a
    LOAD_ATTR b
    POP_TOP
`a->b` would compile to
    LOAD_FAST a
    DUP_TOP
    LOAD_ATTR b
    POP_TOP
at this point you've got an a left on the stack and can reuse it:
`a->b()->c()->d()` would be
    LOAD_FAST a
    DUP_TOP
    LOAD_ATTR b
    CALL_FUNCTION
    POP_TOP
    DUP_TOP
    LOAD_ATTR c
    CALL_FUNCTION
    POP_TOP
    DUP_TOP
    LOAD_ATTR d
    CALL_FUNCTION
    POP_TOP
Or maybe it would do nothing special and the cascade would yield (or pop) the subject unless closed by an attribute access or regular method call. That would avoid the requirement of a `yourself`-type method when initialising mutables, although the final irregularity may lack visibility.
Yes, it would do that. If you use .. everywhere, then the end result of the whole expression should be the original object. In Pike GTK, where most methods return themselves, I can do this:

    object window = GTK2.Window(0)
        ->set_title("Title!")
        ->add(some_object)
        ->show_all();

The return value from show_all() is the original window. With explicit method cascading, I could either capture the return value of the last function call by choosing _not_ to use cascading there, or I could capture the original object by continuing the cascade. (In the case of GUI work like this, I'd default to cascading, if I were not using the result of the expression. It'd mean that adding or deleting lines of code wouldn't risk changing anything - it's like permitting a trailing comma in a tuple/list.)

ChrisA
From: Chris Angelico <rosuav@gmail.com> Sent: Friday, February 21, 2014 9:30 AM
Yeah, I'm insane, opening another theory while I'm busily championing a PEP. But it was while writing up the other PEP that I came up with a possible syntax for this.
In Python, as in most languages, method chaining requires the method to return its own object.
    class Count:
        def __init__(self):
            self.n = 0
        def inc(self):
            self.n += 1
            return self

    dracula = Count()
    dracula.inc().inc().inc()
    print(dracula.n)
It's common in languages like C++ to return *this by reference if there's nothing else useful to return. It's convenient, it doesn't cost anything much, and it allows method chaining. The Python convention, on the other hand, is to return self only if there's a very good reason to, and to return None any time there's mutation that could plausibly return a new object of the same type (compare list.sort() vs sorted()). Method chaining is therefore far less common than it could be, with the result that, often, intermediate objects need to be separately named and assigned to.
I think that's intentional, as a way of discouraging (mutable) method chaining and similar idioms—and that Python code ultimately benefits from it. In Python, each statement generally mutates one thing one time. That makes it simpler to skim Python code and get an idea of what it does than code in languages like C++ or JavaScript. On top of that, it's the lack of method chaining that means lines of Python code tend to be just about the right length, and don't need to be continued very often. If you break that, you lose most of the readability benefits of Python's whitespace-driven syntax. In JavaScript or Ruby, a function call is often a dozen lines long. Readable programs use indentation conventions for expressions, just as they do for block statements, but those expression indentation conventions do not map to indent tokens in Python (and indentation rules in Python-friendly editors) the same way the block statement conventions do.
I pulled up one file from Lib/tkinter (happened to pick filedialog) and saw what's fairly typical of Python GUI code:
Tkinter has its own weird idioms that aren't necessarily representative of Python in general. And PyQt/PySide and PyGTK/GObject have their own _different_ weird idioms. Partly this is because they're mappings to Python of idioms from Tcl, C++, and C (and/or Vala), respectively. But whatever the reason, I'm not sure it's reasonable to call any of them typical.
So here's the proposal. Introduce a new operator to Python, just like the dot operator but behaving differently when it returns a bound method. We can possibly use ->, or maybe create a new operator that currently makes no sense, like .. or .> or something. Its semantics would be:
1) Look up the attribute following it on the object, exactly as per the current . operator
2) If the result is not a function, return it, exactly as per current.
Why? Why not just use x.y for those cases, and make it a TypeError if you use x->y for a data attribute? It seems pretty misleading to "chain" through something that isn't a function call—especially since it doesn't actually chain in that case.
3) If it is a function, though, return a wrapper which, when called, calls the inner function and then returns self.
For normal methods, the result will _not_ be a function, it will be a bound method. It will only be a function for classmethods, staticmethods, functions you've explicitly added to self after construction, and functions returned by __getattr__ or custom __getattribute__.

And this isn't just nit-picking; it's something you can take advantage of: bound methods have a __self__, so your wrapper can just be:

    from functools import wraps

    def wrap_method(method):
        @wraps(method)
        def wrapper(*args, **kwargs):
            method(*args, **kwargs)
            return method.__self__
        return wrapper

Or, alternatively, you've already got the self from the lookup, so you could just use that, in which case you can even make it work on static and class methods if you want, although you don't have to if you don't want.

And, depending on where you hook in to attribute lookup, you may be able to distinguish methods from data attributes before calling the descriptor's __get__, as explained toward the bottom, making this even simpler.
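A quick way to try the __self__-based wrapper is sketched below. The list and the variable names are assumptions made for the demonstration; wrap_method follows the definition given above.

```python
from functools import wraps


def wrap_method(method):
    """Return a callable that runs the bound method, then returns its subject."""
    @wraps(method)
    def wrapper(*args, **kwargs):
        method(*args, **kwargs)  # result discarded (list.sort returns None anyway)
        return method.__self__   # the object the method is bound to
    return wrapper


numbers = [3, 1, 2]

# numbers.sort is a bound method, so its __self__ is the list itself.
result = wrap_method(numbers.sort)()
```

After this runs, result is the very same list object as numbers, now sorted in place.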
This can be done with an external wrapper, so it might be possible to
do this with MacroPy. It absolutely must be a compact notation, though.
This probably wouldn't interact at all with __getattr__ (because the attribute has to already exist for this to work),
Why? See below for details.
and definitely not with __setattr__ or __delattr__ (mutations aren't affected). How it interacts with __getattribute__ I'm not sure; whether it adds the wrapper around any returned functions or applies only to something that's looked up "the normal way" can be decided by ease of implementation.
Supposing this were done, using the -> token that currently is used for annotations as part of 'def'. Here's how the PyGTK code would look:
    import pygtk
    pygtk.require('2.0')
    import gtk

    def callback(widget, data):
        print "Hello again - %s was pressed" % data

    def delete_event(widget, event, data=None):
        gtk.main_quit()
        return False

    window = (gtk.Window(gtk.WINDOW_TOPLEVEL)
        ->set_title("Hello Buttons!")
        ->connect("delete_event", delete_event)
        ->set_border_width(10)
        ->add(gtk.HBox(False, 0)
            ->pack_start(
                gtk.Button("Button 1")->connect("clicked", callback, "button 1"),
                True, True, 0)
            ->pack_start(
                gtk.Button("Button 2")->connect("clicked", callback, "button 2"),
                True, True, 0)
        )
        ->show_all()
    )

    gtk.main()
I personally think this looks terrible, and unpythonic, for exactly the reasons I suspected I would. I do not want to write—or, more importantly, read—15-line expressions in Python. Maybe that's just me.
Again, the structure of the code would match the structure of the window. Unlike the Pike version, this one can even connect signals as part of the method chaining.
Effectively, x->y would be equivalent to chain(x.y):
    def chain(func):
        def chainable(self, *args, **kwargs):
            func(self, *args, **kwargs)
            return self
        return chainable
With this definition, it definitely works with __getattr__, __getattribute__, instance attributes, etc., not just normal methods.

The value of x.y is the result of x.__getattribute__('y'), which, unless you've overridden it, does something similar to this Python code (slightly simplified):

    try:
        return x.__dict__['y']
    except KeyError:
        pass
    for cls in type(x).__mro__:
        try:
            return cls.__dict__['y'].__get__(x)
        except KeyError:
            pass
    return x.__getattr__('y')

By the time you get back x.y, you have no way of knowing whether it came from a normal method, a method in the instance dict, a __getattr__ call, or some funky custom stuff from __getattribute__. And I don't see why you have any reason to care, either.

However, if you want the logic you suggested, as I mentioned earlier, you could implement x->y from scratch, which means you can hook just normal methods and nothing else, instead of switching on type or callability or something. For example:

    for cls in type(x).__mro__:
        try:
            descr = cls.__dict__['y']
        except KeyError:
            pass
        else:
            if hasattr(descr, '__set__'):
                return descr.__get__(x) # data descriptor
            else:
                return wrap_method(descr.__get__(x)) # non-data descriptor
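Packaged as an ordinary function, the "hook just normal methods" logic could be sketched roughly like this. cascade_getattr is a hypothetical name standing in for x->name, the Widget class is invented for the demonstration, and the lookup is simplified: it consults the class hierarchy first and falls back to getattr, so it does not reproduce the exact precedence rules of real attribute lookup.

```python
from functools import wraps


def wrap_method(method, subject):
    """Call method, discard its result, and hand back the subject instead."""
    @wraps(method)
    def wrapper(*args, **kwargs):
        method(*args, **kwargs)
        return subject
    return wrapper


def cascade_getattr(x, name):
    """Hypothetical stand-in for x->name (simplified lookup)."""
    for cls in type(x).__mro__:
        if name not in cls.__dict__:
            continue
        descr = cls.__dict__[name]
        if not hasattr(descr, "__get__"):
            return descr                    # plain class attribute
        bound = descr.__get__(x, type(x))
        if hasattr(descr, "__set__"):
            return bound                    # data descriptor, e.g. a property
        return wrap_method(bound, x)        # non-data descriptor, e.g. a method
    return getattr(x, name)                 # instance attributes, __getattr__


class Widget:
    def __init__(self):
        self.title = None

    def set_title(self, title):
        self.title = title

    @property
    def shown(self):
        return False


w = Widget()
same = cascade_getattr(w, "set_title")("Hello")
```

Here the method call hands back w itself, while the shown property and the title instance attribute come through unwrapped.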
Could be useful in a variety of contexts.
Thoughts?
ChrisA
On Sat, Feb 22, 2014 at 04:30:03AM +1100, Chris Angelico wrote:
Yeah, I'm insane, opening another theory while I'm busily championing a PEP.
Completely raving bonkers :-)
In Python, as in most languages, method chaining requires the method to return its own object. [...]
Rather than add syntactic support for this, I'd prefer a wrapper in the standard library. Syntax implies to me that chaining operators is, in some sense, a preferred idiom of the language, and although method chaining is sometimes useful, I don't think that it ought to be preferred. There's a difference between "Python supports this, use this syntax" and "You can do this in Python, import this library and call this function to make it happen". [...]
So here's the proposal. Introduce a new operator to Python, just like the dot operator but behaving differently when it returns a bound method. We can possibly use ->, or maybe create a new operator that currently makes no sense, like .. or .> or something. Its semantics would be:
1) Look up the attribute following it on the object, exactly as per the current . operator
2) If the result is not a function, return it, exactly as per current.
3) If it is a function, though, return a wrapper which, when called, calls the inner function and then returns self.
When you say "a function", do you actually mean a function? What about other callables, such as methods (instance, class or static) or types? How does this interact with the descriptor protocol?
This can be done with an external wrapper, so it might be possible to do this with MacroPy. It absolutely must be a compact notation, though.
Here's an alternative: add a `chained` wrapper to, oh, let's say functools (I don't think it needs to be a built-in). In my hierarchy of language preferredness, this suggests that while Python *supports* method chaining, it isn't *preferred*. (If it were preferred, it would happen by default, or there would be syntax for it.) If you want to chain methods, you can, but it's a slightly unusual thing to do, and the fact that you have to import a module and call a wrapper class should be sufficient discouragement for casual (ab)use.

Inspired by the Ruby "tap" method, I wrote this proof-of-concept last year:

https://mail.python.org/pipermail/python-list/2013-November/660892.html

http://code.activestate.com/recipes/578770-method-chaining/

Of course, if the class author wants to support method chaining, they can always write the methods to return self :-)

-- Steven
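To make the wrapper idea concrete, here is a minimal sketch of what such a `chained` class might look like. It is not the code from the linked recipe, and it is not in functools; the unwrap() method name is an assumption made up for this example.

```python
class chained:
    """Proxy around obj whose method calls always return the proxy itself."""

    def __init__(self, obj):
        self._obj = obj

    def __getattr__(self, name):
        # Only reached for names not found on the proxy itself.
        attr = getattr(self._obj, name)
        if not callable(attr):
            return attr

        def call(*args, **kwargs):
            attr(*args, **kwargs)  # result discarded, as with list.sort()
            return self

        return call

    def unwrap(self):
        """Hypothetical helper: hand back the wrapped object."""
        return self._obj


items = chained([3, 1, 2]).append(0).sort().reverse().unwrap()
print(items)  # [3, 2, 1, 0]
```

Because the wrapping happens once, at the start of the chain, every later step is an ordinary dot lookup.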
This can be done with an external wrapper, so it might be possible to do this with MacroPy. It absolutely must be a compact notation, though.
Something like:

    merged = c[ my_dict.update(dict_a).update(dict_b) ]

Desugaring into

    # top-level somewhere else
    def fresh_name(x, *args):
        y = x
        for op, a, kw in args:
            y = getattr(y, op)(*a, **kw)
            y = x if y is None else y
        return y

    merged = fresh_name(my_dict, ("update", [dict_a], {}), ("update", [dict_b], {}))

could be doable pretty easily, and is flexible enough to make old-school chaining work while also letting things that return None do the right thing.

If you don't want to use macros, an alternative could be doing something like:

    chain(my_dict).update(dict_a).update(dict_b).get

Using a wrapper as Steven mentioned.
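The fresh_name helper from the macro desugaring runs fine on its own, without any macro machinery. Below is a small sketch of it with sample data; the dictionaries and the string example are made up for illustration, and the c[...] macro itself is not implemented here.

```python
def fresh_name(x, *calls):
    """Apply each (method_name, args, kwargs) triple in turn.

    A step that returns None falls back to the original object x,
    so both "return self" style and "return None" style methods chain.
    """
    y = x
    for op, args, kwargs in calls:
        y = getattr(y, op)(*args, **kwargs)
        y = x if y is None else y
    return y


my_dict = {"a": 1}
dict_a = {"b": 2}
dict_b = {"c": 3}

# dict.update() returns None, so each step falls back to my_dict.
merged = fresh_name(my_dict, ("update", [dict_a], {}), ("update", [dict_b], {}))

# str methods return new objects, so old-school chaining still works.
text = fresh_name("  spam ", ("strip", [], {}), ("upper", [], {}))
```

One thing worth noticing: a None result resets the chain to the original x, not to the most recent intermediate object, which matters if a chain mixes the two styles.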
On Sat, Feb 22, 2014 at 12:13 PM, Steven D'Aprano <steve@pearwood.info> wrote:
On Sat, Feb 22, 2014 at 04:30:03AM +1100, Chris Angelico wrote:
So here's the proposal. Introduce a new operator to Python, just like the dot operator but behaving differently when it returns a bound method. We can possibly use ->, or maybe create a new operator that currently makes no sense, like .. or .> or something. Its semantics would be:
1) Look up the attribute following it on the object, exactly as per the current . operator
2) If the result is not a function, return it, exactly as per current.
3) If it is a function, though, return a wrapper which, when called, calls the inner function and then returns self.
When you say "a function", do you actually mean a function? What about other callables, such as methods (instance, class or static) or types? How does this interact with the descriptor protocol?
Well, I mean anything that can become a bound method. At some point, a lookup is done and it returns a function (in the normal case), but I guess presumably any callable will do at that point. I'm looking at hooking in at the point where it becomes bound. A bound method is a function wrapper that provides a 'self' argument. A chaining bound method would be a function wrapper that provides a 'self' argument, and then returns it.
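As a rough illustration of the "chaining bound method" idea, here is a sketch of a class that mirrors what a bound method does, except that it returns the bound object. ChainingMethod is a hypothetical name; nothing like it exists in the standard library. types.MethodType is used only for comparison.

```python
import types


class ChainingMethod:
    """Like a bound method: supplies 'self' to func, then returns that self."""

    def __init__(self, func, obj):
        self.__func__ = func
        self.__self__ = obj

    def __call__(self, *args, **kwargs):
        self.__func__(self.__self__, *args, **kwargs)
        return self.__self__


numbers = [3, 1, 2]

# An ordinary bound method hands back whatever the function returns (None here).
plain = types.MethodType(list.sort, numbers)()

# The chaining version hands back the list itself, so calls can be strung together.
same_list = ChainingMethod(list.sort, numbers)()
ChainingMethod(list.append, same_list)(4)
```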
Here's an alternative: add a `chained` wrapper to, oh, let's say functools (I don't think it needs to be a built-in). In my hierarchy of language preferredness, this suggests that while Python *supports* method chaining, it isn't *preferred*. (If it were preferred, it would happen by default, or there would be syntax for it.) If you want to chain methods, you can, but it's a slightly unusual thing to do, and the fact that you have to import a module and call a wrapper class should be sufficient discouragement for casual (ab)use.
Downside of that is that it's really verbose. This sort of thing is useful only if it's short. It's like doing a filtered iteration:

    for spam in can if tasty: eat(spam)

    for spam in filter(lambda x: tasty, can): eat(spam)

Using filter() is so verbose that there has been and will continue to be a stream of requests for a "filtered for loop" syntax. Having to pass everything through a named wrapper is too wordy to be useful.

ChrisA
On Sat, Feb 22, 2014 at 03:31:35PM +1100, Chris Angelico wrote:
Here's an alternative: add a `chained` wrapper to, oh, let's say functools (I don't think it needs to be a built-in). In my hierarchy of language preferredness, this suggests that while Python *supports* method chaining, it isn't *preferred*. (If it were preferred, it would happen by default, or there would be syntax for it.) If you want to chain methods, you can, but it's a slightly unusual thing to do, and the fact that you have to import a module and call a wrapper class should be sufficient discouragement for casual (ab)use.
Downside of that is that it's really verbose. This sort of thing is useful only if it's short. It's like doing a filtered iteration:
It's not that long: chained() only adds nine characters, and if you're worried about that, call it chain() instead (seven). With your syntax, every dot lookup takes two chars instead of one, so it only takes eight method calls before my syntax is shorter than yours.

    # unrealistically short method names just so the example fits on one line
    obj->f()->g()->h()->i()->j()->k()->l()->m()
    chain(obj).f().g().h().i().j().k().l().m()

In practice, I would expect that you would want to split the chain across multiple lines, just for readability, and if you're approaching a chain ten methods long, your code probably needs a rethink.
for spam in can if tasty: eat(spam)
for spam in filter(lambda x: tasty, can): eat(spam)
The number of characters isn't really relevant there. If that was written:

    for spam in f(len, can): eat(spam)

which is shorter than "for spam in can if len(spam)", people would still want the filtered syntax. The problem is to think in terms of higher-order functions, and that doesn't come easily to most people.
Using filter() is so verbose that there has been and will continue to be a stream of requests for a "filtered for loop" syntax. Having to pass everything through a named wrapper is too wordy to be useful.
Not so -- if the alternative is:

    obj.f()
    obj.g()
    obj.h()
    obj.i()
    obj.j()
    obj.k()
    obj.l()
    obj.m()

my version with chain() or chained() or even enable_chained_methods() wins hands down.

Given that Guido dislikes chaining methods, I don't think he'll want to make it too easy to chain methods :-)

-- Steven
On Sat, Feb 22, 2014 at 4:16 PM, Steven D'Aprano <steve@pearwood.info> wrote:
It's not that long: chained() only adds nine characters, and if you're worried about that, call it chain() instead (seven). With your syntax, every dot lookup takes two chars instead of one, so it only takes eight method calls before my syntax is shorter than yours.
    # unrealistically short method names just so the example fits on one line
    obj->f()->g()->h()->i()->j()->k()->l()->m()
    chain(obj).f().g().h().i().j().k().l().m()
In practice, I would expect that you would want to split the chain across multiple lines, just for readability, and if you're approaching a chain ten methods long, your code probably needs a rethink.
Oh, I see. I thought I'd need to call chain() at each step along the way, in which case it really would be too long.
Not so -- if the alternative is:
    obj.f()
    obj.g()
    obj.h()
    obj.i()
    obj.j()
    obj.k()
    obj.l()
    obj.m()
my version with chain() or chained() or even enable_chained_methods() wins hands down.
It certainly beats that version, yeah! ChrisA
Chris Angelico wrote:
In Python, as in most languages, method chaining requires the method to return its own object.
I don't think Python has much need for method chaining. Most uses of it I've seen in other languages are better addressed in other ways in Python.

E.g. a pattern often used in Java for initialising objects:

    b = new BlockStone().setHardness(1.5F)
        .setResistance(10.0F).setStepSound(soundTypePiston)
        .setBlockName("stone").setBlockTextureName("stone")

is expressed much more Pythonically as

    b = BlockStone(hardness = 1.5, resistance = 10.0,
        step_sound = sound_type_piston, name = "stone",
        texture_name = "stone")
    window = gtk.Window(gtk.WINDOW_TOPLEVEL)
    window.set_title("Hello Buttons!")
    window.connect("delete_event", delete_event)
    window.set_border_width(10)
    box1 = gtk.HBox(False, 0)
    window.add(box1)
    button1 = gtk.Button("Button 1")
    button1.connect("clicked", callback, "button 1")
    box1.pack_start(button1, True, True, 0)
    button2 = gtk.Button("Button 2")
    button2.connect("clicked", callback, "button 2")
    box1.pack_start(button2, True, True, 0)
    window.show_all()
I think this is a symptom of bad API design. A more Pythonic way to write that would be

    window = Window(style = TOPLEVEL, title = "Hello Buttons!",
        on_delete = delete_event, border_width = 10,
        HBox(False, 0, [
            Button("Button 1", on_clicked = callback),
            Button("Button 2", on_clicked = callback),
        ])
    )

-- Greg
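A declarative API along these lines can be sketched in a few lines of plain Python. Everything below is hypothetical: Widget, Window, HBox and Button are toy classes invented for this example, not GTK bindings. One practical wrinkle is that Python requires positional arguments to come before keyword arguments in a call, so in this sketch the child container is passed first.

```python
class Widget:
    """Toy widget: positional arguments are children, keywords are options."""

    def __init__(self, *children, **options):
        self.children = list(children)
        self.options = options


class Window(Widget):
    pass


class HBox(Widget):
    pass


class Button(Widget):
    def __init__(self, label, **options):
        super().__init__(label=label, **options)


def callback(widget, data):
    print("Hello again - %s was pressed" % data)


def delete_event(widget, event, data=None):
    return False


# The nesting of the expression mirrors the nesting of the widgets.
window = Window(
    HBox(
        Button("Button 1", on_clicked=callback),
        Button("Button 2", on_clicked=callback),
    ),
    title="Hello Buttons!",
    on_delete=delete_event,
    border_width=10,
)
```

As with the chained version, no name is ever given to the box, and signal handlers are attached as part of construction.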
participants (19)
Alan Cristhian Ruiz
Andrew Barnert
Antony Lee
Chris Angelico
David Mertz
Greg Ewing
Guido van Rossum
Haoyi Li
Masklinn
Nick Coghlan
Paul Moore
Ron Adam
Ryan Gonzalez
spir
Stephen J. Turnbull
Steven D'Aprano
Terry Reedy
Westley Martínez
Yury Selivanov