[Python-ideas] How do you think about these language extensions?

Steven D'Aprano steve at pearwood.info
Fri Aug 18 23:57:10 EDT 2017


On Fri, Aug 18, 2017 at 11:47:40AM -0700, Chris Barker wrote:
> >> # arrow transform (to avoid endless parentheses and try to be more
> readable.
> 
> > >>
> > >>   >> range(5) -> map(.x->x+2, _) -> list(_)
> > >>   >> [2,3,4,5,6]
> > >
> > > I like the idea of chained function calls
> 
> 
> parentheses aren't that bad, and as far as I can tell, this is just another
> way to call a function on the results of a function.

I wouldn't say that parens are evil, but they're pretty noisy and 
distracting. I remember an old joke that claimed to prove that the US 
Defence Department was using Lisp for the SDI ("Star Wars") software: 
somebody had found a page covered completely edge to edge in nothing but 
closing brackets:

    ))))))))))))))))))))))))))))))))))))))))
    ))))))))))))))))))))))))))))))))))))))))
    ))))))))))))))))))))) ... etc


Your example has a fairly short pipeline of calls:

> list(map(lambda x: x+2, range(5)))

But even this has two clear problems:

- the trailing brackets ))) are just noise, like the SDI joke above;

- you have to read it backwards, right to left, to make sense of it.

Imagine if you had a chain of ten or twenty calls:

    )))))))))) ... you get the picture

But ultimately that's a relatively minor nuisance rather than a major 
problem. The thing that makes long chains of function calls painful is 
that you have to read them backwards:

- first range() is called;
- then map;
- finally list

even though we write them in the opposite order. When we reason about 
the code, say to write it in the first place, or to read the expression 
and understand it, I would guess that most people reason something like 
this:

- start with our input data, range()
- call map on it to generate new values;
- call list to generate a list.

When writing code like this, I frequently find myself working 
backwards, typing the calls in the reverse of the order I think of them:

range(5)
# move editor insertion point backwards
map(...)
# move editor insertion point backwards
list(...)

Half of my key presses are moving backwards over code I've just written 
to insert a function call which is executed *after* what I wrote, but 
needs to be written *before* what I just wrote.

For a short example like this, where we can easily keep the three 
function calls in short-term memory, it isn't so bad, but short-term 
memory is very limited ("magic number seven, plus or minus two") and if 
you're already thinking about a couple of previous operations on earlier 
lines of code, you don't have a lot of stack space left for a long chain 
of operations.

And that's why we often fall back to temporary variables and an 
imperative style:

data = range(5)
data = map(..., data)
data = list(data)

Perhaps not in such a short example, but for longer ones, very 
frequently.

We can write the code in the same order that it is executed with a 
pipeline and avoid needing to push functions into our short-term memory 
when either reading or writing:

range(5) -> map(lambda...) -> list

This way of thinking combines the strengths of postfix notation and 
function call notation, without the disadvantages of either.
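
The arrow syntax is only a proposal, but the left-to-right idea can be 
approximated in today's Python. The following is a rough sketch, not part 
of any proposal: the helper name pipe is made up for illustration, and it 
assumes every stage is a callable taking exactly one argument.

```python
from functools import reduce


def pipe(value, *stages):
    """Feed value through each stage in turn, left to right.

    Hypothetical helper: each stage must accept a single argument.
    """
    return reduce(lambda acc, stage: stage(acc), stages, value)


# Same steps as list(map(lambda x: x+2, range(5))), written in the
# order they are executed.
result = pipe(
    range(5),
    lambda it: map(lambda x: x + 2, it),
    list,
)
print(result)  # [2, 3, 4, 5, 6]
```

The cost is the extra lambda needed to put the data into map's second 
argument, which is exactly the wart that currying or a placeholder like 
_ would remove.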

This is very successful in shell scripting languages like bash. I don't 
want to oversell it as a panacea that solves everything, but it really 
is a powerful (and underused) software paradigm.
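
For a flavour of the shell style in current Python, one possible sketch 
borrows the | operator. The class name Stage is invented here; it relies 
only on the standard reflected-operator hook __ror__, and it assumes the 
value on the left does not itself handle | with a Stage operand.

```python
class Stage:
    """Wrap a one-argument callable so it can sit on the right of `|`."""

    def __init__(self, func):
        self.func = func

    def __ror__(self, value):
        # Python calls this for `value | stage` when value's own type
        # does not know how to combine itself with a Stage.
        return self.func(value)


add_two = Stage(lambda it: map(lambda x: x + 2, it))
to_list = Stage(list)

result = range(5) | add_two | to_list
print(result)  # [2, 3, 4, 5, 6]
```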

> which seems fine with me -- the only improvement I see is a more compact
> way to spell lambda. (though really, a list comp is considered more
> "pythonic" these days, yes?
> 
> [x+2 for x in  range(5)]

Aye, for such a short example. But consider a longer one: find the 
earliest date in a bunch of lines of text:

result = (myfile.readlines() 
                 -> map(str.strip) 
                 -> filter( lambda s: not s.startswith('#') )
                 -> sorted
                 -> collapse  # collapse runs of identical lines
                 -> extract_dates
                 -> map(date_to_seconds)
                 -> min
                 )

(I've assumed that the functions map and filter have some sort of 
automatic currying, like in Haskell; if you don't like that, then just 
pretend I spelled them Map and Filter instead :-)
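
To make that concrete, here is a hedged sketch of the same chain in 
current Python. Everything in it is an assumption made for illustration: 
pipe, Map and Filter are invented helpers, and collapse, extract_dates 
and date_to_seconds are guesses at what such functions might do (here, 
dates are assumed to appear as YYYY-MM-DD and are treated as UTC). A 
plain list stands in for myfile.readlines().

```python
import re
from datetime import datetime, timezone
from functools import reduce
from itertools import groupby


def pipe(value, *stages):
    """Pass value through each one-argument stage, left to right."""
    return reduce(lambda acc, stage: stage(acc), stages, value)


def Map(func):
    """Curried map: Map(f) returns a stage awaiting the iterable."""
    return lambda iterable: map(func, iterable)


def Filter(pred):
    """Curried filter: Filter(p) returns a stage awaiting the iterable."""
    return lambda iterable: filter(pred, iterable)


def collapse(lines):
    """Collapse runs of identical adjacent lines into one."""
    return (key for key, _run in groupby(lines))


DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")  # assumed date format


def extract_dates(lines):
    """Yield the first date found on each line, skipping lines without one."""
    for line in lines:
        match = DATE_RE.search(line)
        if match:
            parsed = datetime.strptime(match.group(), "%Y-%m-%d")
            yield parsed.replace(tzinfo=timezone.utc)


def date_to_seconds(moment):
    """Seconds since the Unix epoch."""
    return moment.timestamp()


lines = [
    "# comment\n",
    "2017-08-18 posted\n",
    "2017-08-18 posted\n",
    "  2016-01-02 older entry\n",
    "no date here\n",
]

result = pipe(
    lines,
    Map(str.strip),
    Filter(lambda s: not s.startswith("#")),
    sorted,
    collapse,  # collapse runs of identical lines
    extract_dates,
    Map(date_to_seconds),
    min,
)
print(result)  # 1451692800.0, i.e. 2016-01-02 00:00 UTC
```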

That's nice and easy to read and write: I wrote down exactly the steps I 
would have taken to solve the problem, in the same order that they need 
to be taken. Formatting is a breeze: the hardest decision was how far to 
indent subsequent lines. Compare it to this:

result = min(map(date_to_seconds, extract_dates(collapse(sorted(
         filter(lambda s: not s.startswith('#'), map(str.strip, 
         myfile.readlines())))))))

You have to read all the way to the end to find out the most important 
part, namely what data you are operating on! And then you have to read
backwards to understand what is done to the data. And finally you have 
to be prepared for a whole lot of arguments from your co-workers about 
how to format it :-)

# Either the ugliest thing ever, or the One True Way
result = min(
             map(
                 date_to_seconds, 
                 extract_dates(
                               collapse(
                                        sorted(
                                               filter(
                                                      lambda s: not s.startswith('#'), 
                                                      map(
                                                          str.strip, 
                                                          myfile.readlines()
                                                         )
                                                     )
                                              )
                                       )
                              )
                )
            )



[...]
> Also,  we need to remember that functions can take *args, **kwargs, etc,
> and can return a tuple of just about anything -- not sure how well that
> maps to the "pipe" model.

Not everything maps well to the function pipeline model. But enough 
things do that I believe it is a powerful tool in the programmer's 
toolkit.
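
For functions that return a tuple, one possible workaround is a small 
adapter that unpacks the tuple into the next stage's arguments. This is 
only a sketch: pipe and star are hypothetical names, and keyword 
arguments would need a similar adapter of their own.

```python
from functools import reduce


def pipe(value, *stages):
    """Pass value through each one-argument stage, left to right."""
    return reduce(lambda acc, stage: stage(acc), stages, value)


def star(func):
    """Adapt func to take one tuple and unpack it as positional args."""
    return lambda args: func(*args)


# divmod returns a (quotient, remainder) tuple; star() spreads it over
# the two parameters of the next stage.
result = pipe(
    17,
    lambda n: divmod(n, 5),
    star(lambda q, r: f"{q} remainder {r}"),
)
print(result)  # 3 remainder 2
```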



-- 
Steve

