[Python-ideas] How do you think about these language extensions?
Steven D'Aprano
steve at pearwood.info
Fri Aug 18 23:57:10 EDT 2017
On Fri, Aug 18, 2017 at 11:47:40AM -0700, Chris Barker wrote:
> >> # arrow transform (to avoid endless parentheses and try to be more
> readable.
>
> > >>
> > >> >> range(5) -> map(.x->x+2, _) -> list(_)
> > >> >> [2,3,4,5,6]
> > >
> > > I like the idea of chained function calls
>
>
> parentheses aren't that bad, and as far as I can tell, this is just another
> way to call a function on the results of a function.
I wouldn't say that parens are evil, but they're pretty noisy and
distracting. I remember an old joke that claimed to prove that the US
Defence Department was using Lisp for the SDI ("Star Wars") software:
somebody had found a page covered completely edge to edge in nothing but
closing brackets:
))))))))))))))))))))))))))))))))))))))))
))))))))))))))))))))))))))))))))))))))))
))))))))))))))))))))) ... etc
Your example has a fairly short pipeline of calls:
> list(map(lambda x: x+2, range(5)))
But even this has two clear problems:
- the trailing brackets ))) are just noise, like the SDI joke above;
- you have to read it backwards, right to left, to make sense of it.
Imagine if you had a chain of ten or twenty calls:
)))))))))) ... you get the picture
But ultimately that's a relatively minor nuisance rather than a major
problem. The thing that makes long chains of function calls painful is
that you have to read them backwards:
- first range() is called;
- then map;
- finally list
even though we write them in the opposite order. When we reason about
the code, say to write it in the first place, or to read the expression
and understand it, I would guess that most people reason something like
this:
- start with our input data, range()
- call map on it to generate new values;
- call list to generate a list.
When writing code like this, I frequently find myself having to work
backwards compared to how we write the order of function calls:
range(5)
# move editor insertion point backwards
map(...)
# move editor insertion point backwards
list(...)
Half of my key presses are moving backwards over code I've just written
to insert a function call which is executed *after* what I wrote, but
needs to be written *before* what I just wrote.
For a short example like this, where we can easily keep the three
function calls in short-term memory, it isn't so bad, but short-term
memory is very limited ("magic number seven, plus or minus two") and if
you're already thinking about a couple of previous operations on earlier
lines of code, you don't have a lot of stack space left for a long chain
of operations.
And that's why we often fall back to temporary variables and an
imperative style:
data = range(5)
data = map(..., data)
data = list(data)
Perhaps not in such a short example, but for longer ones, very
frequently.
We can write the code in the same order that it is executed with a
pipeline and avoid needing to push functions into our short-term memory
when either reading or writing:
range(5) -> map(lambda...) -> list
This way of thinking combines the strengths of postfix notation and
function call notation, without the disadvantages of either.
This is very successful in shell scripting languages like bash. I don't
want to oversell it as a panacea that solves everything, but it really
is a powerful (and underused) software paradigm.
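For what it's worth, the left-to-right ordering can be approximated in
today's Python without any new syntax. A minimal sketch, assuming a
hypothetical helper named pipe (it is not a builtin, and it is not the
proposed -> operator, just an illustration of the ordering):

```python
from functools import reduce


def pipe(value, *funcs):
    """Hypothetical helper: pass value through funcs, left to right."""
    return reduce(lambda acc, func: func(acc), funcs, value)


# Steps are written in the same order that they are executed.
result = pipe(
    range(5),
    lambda xs: map(lambda x: x + 2, xs),
    list,
)
print(result)  # [2, 3, 4, 5, 6]
```

This keeps the data first and the final step last, at the cost of an
extra lambda wherever a step needs more than one argument.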
> which seems fine with me -- the only improvement I see is a more compact
> way to spell lambda. (though really, a list comp is considered more
> "pythonic" these days, yes?
>
> [x+2 for x in range(5)]
Aye, for such a short example. But consider a longer one: find the
earliest date in a bunch of lines of text:
result = (myfile.readlines()
          -> map(str.strip)
          -> filter(lambda s: not s.startswith('#'))
          -> sorted
          -> collapse  # collapse runs of identical lines
          -> extract_dates
          -> map(date_to_seconds)
          -> min
          )
(I've assumed that the functions map and filter have some sort of
automatic currying, like in Haskell; if you don't like that, then just
pretend I spelled them Map and Filter instead :-)
That's nice and easy to read and write: I wrote down exactly the steps I
would have taken to solve the problem, in the same order that they need
to be taken. Formatting is a breeze: the hardest decision was how far to
indent subsequent lines. Compare it to this:
result = min(map(date_to_seconds, extract_dates(collapse(sorted(
    filter(lambda s: not s.startswith('#'), map(str.strip,
    myfile.readlines())))))))
You have to read all the way to the end to find out the most important
part, namely what data you are operating on! And then you have to read
backwards to understand what is done to the data. And finally you have
to be prepared for a whole lot of arguments from your co-workers about
how to format it :-)
# Either the ugliest thing ever, or the One True Way
result = min(
    map(
        date_to_seconds,
        extract_dates(
            collapse(
                sorted(
                    filter(
                        lambda s: not s.startswith('#'),
                        map(
                            str.strip,
                            myfile.readlines()
                        )
                    )
                )
            )
        )
    )
)
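For comparison, here is a rough sketch of how the pipelined version
might be emulated in current Python. Everything in it is an assumption
made for illustration: pipe, Map and Filter are hypothetical helpers
(Map and Filter stand in for the curried map and filter mentioned
above), and collapse, extract_dates and date_to_seconds are stand-ins
which assume each remaining line is a bare YYYY-MM-DD date.

```python
from datetime import datetime, timezone
from functools import partial, reduce
from itertools import groupby


def pipe(value, *funcs):
    """Hypothetical helper: pass value through funcs, left to right."""
    return reduce(lambda acc, func: func(acc), funcs, value)


def Map(func):
    """Hypothetical curried map: Map(f) waits for the iterable."""
    return partial(map, func)


def Filter(pred):
    """Hypothetical curried filter: Filter(p) waits for the iterable."""
    return partial(filter, pred)


def collapse(lines):
    """Stand-in: collapse runs of identical lines into one line each."""
    return [key for key, _group in groupby(lines)]


def extract_dates(lines):
    """Stand-in: assume every line is a date written as YYYY-MM-DD."""
    return [
        datetime.strptime(line, "%Y-%m-%d").replace(tzinfo=timezone.utc)
        for line in lines
    ]


def date_to_seconds(date):
    """Stand-in: seconds since the Unix epoch."""
    return date.timestamp()


# Sample input standing in for myfile.readlines()
lines = ["# log\n", "2017-08-18\n", "  2017-08-18\n", "2016-01-02\n"]

result = pipe(
    lines,
    Map(str.strip),
    Filter(lambda s: not s.startswith('#')),
    sorted,
    collapse,
    extract_dates,
    Map(date_to_seconds),
    min,
)
```

The steps read top to bottom in execution order, and there is only one
closing bracket to argue about.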
[...]
> Also, we need to remember that functions can take *args, **kwargs, etc,
> and can return a tuple of just about anything -- not sure how well that
> maps to the "pipe" model.
Not everything maps well to the function pipeline model. But enough
things do that I believe it is a powerful tool in the programmer's
toolkit.
--
Steve