[Python-ideas] Method chaining notation
Nick Coghlan
ncoghlan at gmail.com
Sun Feb 23 03:25:09 CET 2014
On 23 February 2014 01:54, Chris Angelico <rosuav at gmail.com> wrote:
> On Sun, Feb 23, 2014 at 2:44 AM, Alan Cristhian Ruiz
> <alan.cristh at gmail.com> wrote:
>> What is wrong with the current syntax?:
>>
>> 'abcd'\
>> .upper()\
>> .lower()\
>> .title()
>
> It doesn't have each method operate on the original object. It's not
> easy to see with strings, but try this:
>
> list_of_numbers = [1,2]
> list_of_numbers.append(3)
> list_of_numbers.append(4)
> list_of_numbers.append(5)
>
> Now write that without repeating list_of_numbers all the way down the line.
The thing is, this *isn't an accident*, it's a deliberate choice in
the library design to distinguish between data flow pipelines that
transform data without side effects, and repeated mutation of a single
object (although reading Guido's earlier reply, that may be me
retrofitting an explicit rationale onto Guido's personal preference).
Mutation and transformation are critically different operations, and
most requests for method chaining amount to "I want to use something
that looks like a chained transformation to apply multiple mutating
operations to the same object". The response from the core developers
to that request is almost certainly always going to be "No", because
it's a fundamentally bad idea to blur that distinction: you should be
able to tell *at a glance* whether an operation is mutating an
existing object or creating a new one (this is actually one of the
problems with the iterator model: for iterators, rather than
iterables, the "__iter__ returns self" implementation means that
iteration becomes an operation with side effects, which can be
surprising at times, usually because the iterator shows up as
unexpectedly empty later on).
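The iterator surprise is easy to reproduce. A minimal sketch (the variable names are only for illustration):

```python
numbers = [1, 2, 3]

# A list is an iterable: every iter() call creates a fresh iterator,
# so consuming it twice gives the same answer both times.
assert sum(numbers) == 6
assert sum(numbers) == 6

# An iterator returns itself from __iter__, so iterating over it
# advances (mutates) the one and only underlying object.
it = iter(numbers)
assert iter(it) is it

first_pass = list(it)
second_pass = list(it)  # empty: the first pass already consumed it

print(first_pass)
print(second_pass)
```

Nothing in the spelling of list(it) hints that the second call will see different state from the first, which is the "at a glance" problem described above.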
Compare:

    seq = get_data()
    seq.sort()

    seq = sorted(get_data())

Now, compare that with the proposed syntax as applied to the first operation:

    seq = []->extend(get_data())->sort()

That *looks* like it should be a data transformation pipeline, but
it's not - each step in the chain is mutating the original object,
rather than creating a new one. That's a critical *problem* with the
idea, not a desirable feature.
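The existing API already marks the difference through return values: list.sort() returns None, so it cannot be chained by accident, while sorted() hands back a new list. A small sketch, with get_data() as a hypothetical stand-in for the call used above:

```python
def get_data():
    # Hypothetical stand-in for the get_data() call in the examples above.
    return [3, 1, 2]

# Mutation: list.sort() reorders the existing list in place and returns None.
seq = get_data()
result = seq.sort()

# Transformation: sorted() leaves its argument untouched and returns a new list.
original = get_data()
new_seq = sorted(original)

print(result, seq)
print(original, new_seq)
```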
There are a few good responses to this:
1. Design your APIs as transformation APIs that avoid in-place
operations with side effects. This is a really good choice, as
stateless transformations are one of the key virtues of functional
programming, and if a problem can be handled that way without breaking
the reader's brain, *do it*. Profiling later on may reveal the need to
use more efficient in-place operations, but externally stateless APIs
are still a great starting point that are less likely to degenerate
into an unmaintainable stateful mess over time (you can maintain
temporary state *internally*, but from the API users' perspective,
things should look like they're stateless).
2. Provide a clean "specification" API that allows a complex object
structure to be built from something simpler (see, for example,
logging.config.dictConfig(), or the various declarative approaches to
defining user interfaces, or the Python 3 open(), which can create
multilayered IO stacks on behalf of the user).
3. If the core API is based on mutation, but there's a clean and fast
copying mechanism, consider adding a transformation API around it that
trades speed (due to the extra object copies) for clarity (due to the
lack of reliance on side effects).
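As a rough sketch of option 3, assuming the object supports copy.copy(): the transformed() helper below is hypothetical (it is not an existing API), and simply applies a mutating method to a shallow copy so the caller never sees a side effect.

```python
import copy

def transformed(obj, method_name, *args, **kwargs):
    """Return a modified shallow copy of obj, leaving obj itself untouched.

    Hypothetical helper: copies obj, then calls the named mutating
    method on the copy rather than on the original.
    """
    new_obj = copy.copy(obj)
    getattr(new_obj, method_name)(*args, **kwargs)
    return new_obj

base = [3, 1, 2]
extended = transformed(base, "extend", [5, 4])
ordered = transformed(extended, "sort")

print(base)
print(extended)
print(ordered)
```

Each step costs an extra copy, and a shallow copy still shares any nested mutable objects, so this is only as clean as the copying mechanism underneath it.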
There's also a somewhat hacky workaround that can be surprisingly
effective in improving readability when working with tree structures:
abuse context managers to make the indentation structure match the
data manipulation structure.

    from contextlib import contextmanager

    @contextmanager
    def make(obj):
        yield obj

    with make(gtk.Window(gtk.WINDOW_TOPLEVEL)) as window:
        window.set_title("Hello Buttons!")
        window.connect("delete_event", delete_event)
        window.set_border_width(10)
        with make(gtk.HBox(False, 0)) as box1:
            window.add(box1)
            with make(gtk.Button("Button 1")) as button1:
                button1.connect("clicked", callback, "button 1")
                box1.pack_start(button1, True, True, 0)
            with make(gtk.Button("Button 2")) as button2:
                button2.connect("clicked", callback, "button 2")
                box1.pack_start(button2, True, True, 0)
        window.show_all()

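For a version that runs without PyGTK installed, here is a minimal sketch of the same pattern using xml.etree.ElementTree from the standard library as a stand-in tree API. The element and attribute names are made up for illustration.

```python
from contextlib import contextmanager
import xml.etree.ElementTree as ET

@contextmanager
def make(obj):
    # No setup or cleanup: the with block exists purely for its indentation.
    yield obj

with make(ET.Element("window", title="Hello Buttons!")) as window:
    with make(ET.SubElement(window, "hbox")) as box1:
        with make(ET.SubElement(box1, "button")) as button1:
            button1.text = "Button 1"
        with make(ET.SubElement(box1, "button")) as button2:
            button2.text = "Button 2"

markup = ET.tostring(window, encoding="unicode")
print(markup)
```

Note that make() does nothing on exit and the names stay bound after each block ends, so the nesting is purely visual. That is what makes it a hack rather than a real scoping tool.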
Although even judicious use of vertical whitespace and comments can
often be enough to provide a significant improvement:

    # Make the main window
    window = gtk.Window(gtk.WINDOW_TOPLEVEL)
    window.set_title("Hello Buttons!")
    window.connect("delete_event", delete_event)
    window.set_border_width(10)

    # Make the box and add the buttons
    box1 = gtk.HBox(False, 0)
    window.add(box1)

    # Add Button 1
    button1 = gtk.Button("Button 1")
    button1.connect("clicked", callback, "button 1")
    box1.pack_start(button1, True, True, 0)

    # Add Button 2
    button2 = gtk.Button("Button 2")
    button2.connect("clicked", callback, "button 2")
    box1.pack_start(button2, True, True, 0)

    # And now we're done
    window.show_all()

And adding a short internal helper function makes it even clearer:

    # Make the main window
    window = gtk.Window(gtk.WINDOW_TOPLEVEL)
    window.set_title("Hello Buttons!")
    window.connect("delete_event", delete_event)
    window.set_border_width(10)

    # Make the box and add the buttons
    box1 = gtk.HBox(False, 0)
    window.add(box1)

    def add_button(box, label, callback_arg):
        button = gtk.Button(label)
        button.connect("clicked", callback, callback_arg)
        box.pack_start(button, True, True, 0)

    add_button(box1, "Button 1", "button 1")
    add_button(box1, "Button 2", "button 2")

    # And now we're done
    window.show_all()

It's easy to write code that looks terrible - but to make the case for
a syntax change, you can't use code that looks terrible as a
rationale, when there are existing ways to refactor that code that
make it substantially easier to read. It's only when the code is
*still* hard to read after it has been refactored to be as beautiful
as is currently possible that a case for new syntactic sugar can be
made.
Cheers,
Nick.
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia