
I like the "obj -> func1 -> func2" idiom if func1 and func2 take only one argument. If func1 takes two arguments (arg1, arg2), it would look like this:

    obj -> func1(arg2) -> func2

Suppose you want to insert the returned object somewhere other than the first argument. We could do something like the following:

    obj -> func1(arg1, ?) -> func2

The returned object is captured by ?, which here happens to be the second argument. If ? is the first positional argument and it is the only ? we pass, it can be omitted. That is,

    obj -> func1(arg2) -> func2

is equivalent to

    obj -> func1(?, arg2) -> func2(?)

If you don't see a "?", you can always assume the returned object is used as the first argument, because each chained function needs at least one "?", explicitly or implicitly. This implies that chained functions need to take at least one argument, which makes sense because we want them to transform the data we pass in. If func1 takes zero arguments, then

    obj -> func1 -> func2

will throw the following error, where the one argument given is the implicit ?:

    TypeError: func1() takes 0 positional arguments but 1 was given

We can use chaining/piping with keyword arguments:

    obj -> func1(arg1, arg2=?) -> func2

Suppose our only_kw_func signature looks like this:

    def only_kw_func(*, arg1, arg2): ...

Then the following will work:

    obj -> only_kw_func(arg1=?, arg2=arg2) -> func2

We can probably pass only part of the returned object with pattern matching. If obj is a tuple of three elements, then its third element is what gets passed as arg1 in the following expression:

    obj -> only_kw_func(arg1=(_, _, ?), arg2=arg2) -> func2

The "?" captures what we want to pass. If the pattern does not match, a missing argument error will occur.

You can pass the returned object, or part of it using pattern matching, in multiple arguments, because each chained function needs at least one "?", explicitly or implicitly. But in that case you cannot omit the ? for the first positional argument; otherwise a missing argument error will occur.

    obj -> func1(?, ?) -> func2

Here obj is captured twice in func1: once for arg1 and once for arg2.

Let's do a contrived example. Let's say our obj is (1, 2, [1, 2, 3]).

    from typing import Sequence

    def func1(x: int, y: int, seq: Sequence[int]) -> bool:
        return (x in seq) and (y in seq)

    def func2(contains: bool) -> str:
        if contains:
            return "Success"
        else:
            return "Failure"

    obj -> func1((?, _, _), (_, ?, _), (_, _, ?)) -> func2

    (1, 2, [1, 2, 3]) => func1(1, 2, [1, 2, 3]) => func2(True) => "Success"

Abdulla
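The "->" and "?" syntax above is only a proposal and does not run in current Python. For anyone who wants to experiment with the semantics, here is a minimal sketch that imitates them with ordinary functions. The names Q, step and pipe are hypothetical helpers invented for this illustration, and the tuple patterns such as (_, _, ?) are approximated with small picker functions rather than real pattern matching.

```python
from typing import Any, Callable, Sequence


class _Placeholder:
    """Marks where the piped value goes; `pick` may extract part of it."""

    def __init__(self, pick: Callable[[Any], Any] = lambda value: value) -> None:
        self.pick = pick

    def __call__(self, pick: Callable[[Any], Any]) -> "_Placeholder":
        # Q(picker) yields a placeholder that passes only part of the value.
        return _Placeholder(pick)


Q = _Placeholder()  # hypothetical stand-in for the proposed "?"


def step(func: Callable[..., Any], *args: Any, **kwargs: Any) -> Callable[[Any], Any]:
    """Build one pipeline stage from func and arguments that may contain Q."""
    explicit = any(isinstance(a, _Placeholder) for a in (*args, *kwargs.values()))

    def stage(value: Any) -> Any:
        if not explicit:
            # Implicit "?": the piped value becomes the first positional argument.
            return func(value, *args, **kwargs)
        new_args = [a.pick(value) if isinstance(a, _Placeholder) else a for a in args]
        new_kwargs = {
            k: v.pick(value) if isinstance(v, _Placeholder) else v
            for k, v in kwargs.items()
        }
        return func(*new_args, **new_kwargs)

    return stage


def pipe(value: Any, *stages: Callable[[Any], Any]) -> Any:
    """Feed value through each stage from left to right."""
    for stage in stages:
        value = stage(value)
    return value


def func1(x: int, y: int, seq: Sequence[int]) -> bool:
    return (x in seq) and (y in seq)


def func2(contains: bool) -> str:
    return "Success" if contains else "Failure"


obj = (1, 2, [1, 2, 3])

# Rough equivalent of: obj -> func1((?, _, _), (_, ?, _), (_, _, ?)) -> func2
result = pipe(
    obj,
    step(func1, Q(lambda t: t[0]), Q(lambda t: t[1]), Q(lambda t: t[2])),
    step(func2),
)
print(result)  # Success
```

In this sketch a stage with no Q behaves like the implicit first-argument case, so piping into a zero-argument function raises the same kind of TypeError described above.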
On 27 Nov 2021, at 7:34 AM, Steven D'Aprano <steve@pearwood.info> wrote:
On Sat, Nov 27, 2021 at 02:58:07AM -0000, Raimi bin Karim wrote:
This syntactic sugar imo is powerful because it's not limited to iterables but generalises to possibly any object.
Indeed. There is no reason to limit pipelines to only collections, any object which can be transformed in any way will work.
But I guess since method chaining (for collection pipeline) is more commonplace across many languages, it might be easier to catch on.
We should be careful about the terminology. Method chaining and pipelining are related, but independent, design patterns or idioms:
Method chaining or cascading, also called fluent interfaces, relies on calling a chain of methods, usually of the same object:
obj.method().foo().bar().baz().foobar()
This is very common in Python libraries like pandas, and in immutable classes like str, but not for mutable builtins like list.
So it is very simple to implement chaining in your own classes, by having your methods either return a new instance, or by returning self. Just don't return None and you'll probably be fine :-)
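As a small illustration of the two options just mentioned, here is a sketch with two made-up classes (Basket and Tags are hypothetical names, not taken from any library). One mutates in place and returns self, the other returns a new instance each time.

```python
class Basket:
    """Mutable style: each method changes the object and returns self."""

    def __init__(self) -> None:
        self.items = []

    def add(self, item: str) -> "Basket":
        self.items.append(item)
        return self  # returning self is what makes chaining possible


class Tags:
    """Immutable style: each method returns a new instance."""

    def __init__(self, names: tuple = ()) -> None:
        self.names = names

    def add(self, name: str) -> "Tags":
        return Tags(self.names + (name,))

    def upper(self) -> "Tags":
        return Tags(tuple(n.upper() for n in self.names))


basket = Basket().add("spam").add("eggs")
tags = Tags().add("spam").add("eggs").upper()

print(basket.items)  # ['spam', 'eggs']
print(tags.names)    # ('SPAM', 'EGGS')
```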
Pipelining involves calling a sequence of independent functions, not necessarily methods:
obj | func | spam | eggs | cheese | aardvark
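Using the ring operator for function composition, read here from left to right so that func is applied first (conventional mathematical notation lists the functions in the opposite order), that is the same as: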
(func∘spam∘eggs∘cheese∘aardvark)(obj)
In concatenative languages like Factor or Forth, you would write it in Reverse Polish notation (no operator required):
obj func spam eggs cheese aardvark
compared to regular function notation, where functions are written in prefix order, which puts them in the reverse of execution order:
aardvark(cheese(eggs(spam(func(obj)))))
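Python has no built-in pipe operator for plain functions, but the left-to-right reading can be approximated today by overloading "|" on a wrapper. The Pipe class below is a hypothetical helper written for this illustration, and the functions being piped (str.strip, int, abs) are arbitrary stand-ins for func, spam and eggs.

```python
class Pipe:
    """Wrap a value so that `Pipe(x) | f` applies f to x and stays pipeable."""

    def __init__(self, value):
        self.value = value

    def __or__(self, func):
        # Apply the next function and wrap the result for further piping.
        return Pipe(func(self.value))


# Pipeline order: strip the string, convert to int, take the absolute value.
piped = (Pipe(" -42 ") | str.strip | int | abs).value

# The same computation in prefix notation, written in reverse of execution order.
nested = abs(int(str.strip(" -42 ")))

print(piped, nested)  # 42 42
```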
Even though they logically go together in some ways, let's be careful to not confuse these two idioms.
Note that chaining, pipelining and function composition go together very well:
(obj.method().foo() | func∘spam | eggs).bar().baz()
executes from left-to-right, exactly as it is written.
(Assuming that the ring operator has a higher precedence than the pipe operator, otherwise you can use parentheses.) Contrast how well that reads from left to right compared to:
eggs(spam(func(obj.method().foo()))).bar().baz()
where the order of execution starts in the middle, runs left to right for a bit, then back to the middle, runs right to left, then jumps to the end and runs left to right again.
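The left-to-right composition used in the func∘spam example above can be imitated with a small helper. This is only a sketch: compose is a hypothetical function, not a standard library one, and it applies its arguments in the order written, which is the opposite of the usual mathematical convention.

```python
from functools import reduce


def compose(*funcs):
    """Left-to-right composition: compose(f, g)(x) == g(f(x))."""

    def composed(value):
        # Thread the value through each function in the order given.
        return reduce(lambda acc, f: f(acc), funcs, value)

    return composed


# Arbitrary stand-ins for func, spam and eggs.
to_magnitude = compose(str.strip, int, abs)

print(to_magnitude(" -42 "))                          # 42
print(to_magnitude(" -42 ") == abs(int(" -42 ".strip())))  # True
```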
-- Steve