[Python-ideas] The pipe protocol, a convention for extensible method chaining

Joao S. O. Bueno jsbueno at python.org.br
Tue May 26 15:43:58 CEST 2015


> Unfortunately, method chaining isn't very extensible -- short of monkey patching,
> every method we want to use has to exist on the original object.

(Link for repo on which the examples here are implemented:
https://github.com/jsbueno/chillicurry )

Actually, the last time this subject showed up (and it was not that
long ago), I could think of something "short of monkey patching
everything":

It is possible to fashion a special object with a custom
`__getattr__` (say that you call it "curry") that then proceeds to
retrieve references to functions (and methods) with the same names as
the attributes you try to get from it, and wraps those function calls
in order to create your pipeline.

Say:

>>> curry.len.list.range(5,10)
5

The trick is to pick the names "len", "list" and "range" from the
calling stack frame.
You can then evolve this idea and pass a special sentinel
parameter to calls in the chain, so that the function call gets
delayed and the sentinel is replaced by the piped object when it is
actually executed - say:

>>> curry.mul(DELAY, 2).mul(DELAY, 3).complex.int(5)
(30+0j)
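
To make the two ideas above concrete, here is a minimal sketch of how such
an object could work. It is not the code from the chillicurry repo: the
class name Chain, the helper _resolve and the step layout are hypothetical
and only illustrate the technique (names looked up in the caller's frame,
calls applied right to left, DELAY replaced by the piped value).

```python
import builtins
import sys
from operator import mul  # only needed for the usage lines at the bottom

DELAY = object()  # sentinel meaning "put the piped value here"


def _resolve(name, frame):
    """Look `name` up as the caller would: locals, globals, builtins."""
    for namespace in (frame.f_locals, frame.f_globals, vars(builtins)):
        if name in namespace:
            return namespace[name]
    raise AttributeError(f"no object named {name!r} in the calling frame")


class Chain:
    def __init__(self, steps=()):
        # Each step is (func, args, kwargs); args stays None until a
        # call containing DELAY supplies the arguments.
        self._steps = steps

    def __getattr__(self, name):
        if name.startswith("_"):
            raise AttributeError(name)
        # Frame 1 is the code that wrote `chain.<name>`.
        func = _resolve(name, sys._getframe(1))
        return Chain(self._steps + ((func, None, None),))

    def __call__(self, *args, **kwargs):
        if not self._steps:
            raise TypeError("nothing to call yet")
        *earlier, (last_func, _, _) = self._steps
        supplied = list(args) + list(kwargs.values())
        if any(v is DELAY for v in supplied):
            # Deferred call: remember the arguments and keep chaining.
            return Chain(tuple(earlier) + ((last_func, args, kwargs),))
        # Seed call: run it now, then feed the result leftwards.
        value = last_func(*args, **kwargs)
        for func, a, kw in reversed(earlier):
            if a is None:
                value = func(value)
            else:
                a = [value if x is DELAY else x for x in a]
                kw = {k: value if x is DELAY else x for k, x in kw.items()}
                value = func(*a, **kw)
        return value


curry = Chain()
print(curry.len.list.range(5, 10))
print(curry.mul(DELAY, 2).mul(DELAY, 3).complex.int(5))
```

In this sketch the first call that carries no DELAY is treated as the seed
and triggers the whole pipeline; that rule is an assumption made to keep
the example short.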

So I did put this together - but, lacking a concrete use case myself,
it is somewhat "amorphous", lacking specifications on what it should
do. It can, for example, retrieve names from the piped object's
attributes instead of the calling namespace:

>>> curry.split.upper.str("good morning Vietnam")
['GOOD', 'MORNING', 'VIETNAM']
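
A rough sketch of that attribute-based lookup, again with hypothetical
names (MethodChain is not taken from the repo). To keep it short it
resolves the seed function from builtins only instead of inspecting the
calling frame, and it calls methods found on the piped object with no
arguments; both are simplifying assumptions.

```python
import builtins


class MethodChain:
    def __init__(self, names=()):
        # Names are stored as written and only resolved at call time,
        # because the piped object does not exist before then.
        self._names = names

    def __getattr__(self, name):
        if name.startswith("_"):
            raise AttributeError(name)
        return MethodChain(self._names + (name,))

    def __call__(self, *args, **kwargs):
        if not self._names:
            raise TypeError("nothing to call yet")
        *pending, seed_name = self._names
        # The rightmost name builds the seed value.
        value = getattr(builtins, seed_name)(*args, **kwargs)
        for name in reversed(pending):
            method = getattr(value, name, None)
            if callable(method):
                # Prefer a method of the piped object ...
                value = method()
            else:
                # ... and fall back to a builtin taking it as argument.
                value = getattr(builtins, name)(value)
        return value


curry = MethodChain()
print(curry.split.upper.str("good morning Vietnam"))
```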

And the "|" operator is overridden as well, so that, with some
parentheses, lambdas and other things can be added to the chain.

Just throwing in what could give you more ideas for the approach you
have in mind. This one works by applying the calls on the right side
first and traversing the object to the left - but it should be easy to
do the opposite: starting with a call with the "seed" object on the
left, and chaining calls on the right.
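
For illustration, the left-to-right variant could look roughly like the
sketch below. The wrapper name Seed and its value attribute are made up
for this example, and using "|" as the chaining operator here is an
assumption, not a description of what the repo does.

```python
class Seed:
    """Wrap a starting value and pipe it through callables with `|`."""

    def __init__(self, value):
        self.value = value

    def __or__(self, func):
        # Apply the callable on the right to the value on the left and
        # wrap the result so that the chain can continue.
        return Seed(func(self.value))


print((Seed(range(5, 10)) | list | len).value)
print((Seed(5) | int | complex | (lambda x: x * 3) | (lambda x: x * 2)).value)
```

Lambdas need their own parentheses in such a chain, because a lambda body
extends as far to the right as it can and would swallow the rest of the
expression.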

If you find the idea interesting enough to be of use, I'd be happy to
evolve what is already in place there so it could be useful.

regards,

   js
  -><-


On 26 May 2015 at 06:46, Wes Turner <wes.turner at gmail.com> wrote:
>
> On May 25, 2015 6:45 PM, "Stephan Hoyer" <shoyer at gmail.com> wrote:
>>
>> In the PyData community, we really like method chaining for data analysis
>> pipelines:
>>
>> (iris.query('SepalLength > 5')
>>  .assign(SepalRatio = lambda x: x.SepalWidth / x.SepalLength,
>>          PetalRatio = lambda x: x.PetalWidth / x.PetalLength)
>>  .plot(kind='scatter', x='SepalRatio', y='PetalRatio'))
>>
>>
>> Unfortunately, method chaining isn't very extensible -- short of monkey
>> patching, every method we want to use has to exist on the original object. If a
>> user wants to supply their own plotting function, they can't use method
>> chaining anymore.
>
>>
>> You may recall that we brought this up a few months ago on python-ideas as
>> an example of why we would like macros.
>>
>> To get around this issue, we are contemplating adding a pipe method to
>> pandas DataFrames. It looks like this:
>>
>> def pipe(self, func, *args, **kwargs):
>>     pipe_func = getattr(func, '__pipe_func__', func)
>>     return pipe_func(self, *args, **kwargs)
>>
>>
>> We would encourage third party libraries with objects on which method
>> chaining is useful to define a pipe method in the same way.
>>
>> The main idea here is to create an easy way for users to do method
>> chaining with their own functions and with functions from third party
>> libraries.
>>
>> The business with __pipe_func__ is more magical, and frankly we aren't
>> sure it's worth the complexity. The idea is to create a "pipe protocol" that
>> allows functions to decide how they are called when piped. This is useful in
>> some cases, because it doesn't always make sense for functions that act on
>> piped data to accept that data as their first argument.
>>
>> For more motivation and examples, please read the opening post in this
>> GitHub issue: https://github.com/pydata/pandas/issues/10129
>>
>> Obviously, this sort of protocol would not be an official part of the
>> Python language. But because we are considering creating a de-facto
>> standard, we would love to get feedback from other Python communities that
>> use method chaining:
>> 1. Have you encountered or addressed the problem of extensible method
>> chaining?
>
> * https://pythonhosted.org/pyquery/api.html
> * SQLAlchemy
>
>> 2. Would this pipe protocol be useful to you?
>
> What are the advantages over just returning 'self'? (Which use cases are not
> possible with current syntax?)
>
> In terms of documenting functional composition, I find it easier to test and
> add comment strings to multiple statements.
>
> Months ago, when I looked at creating pandasrdf (pandas #3402), there is
> need for a (...).meta.columns w/ columnar URIs, units, (metadata: who, what,
> when, how). Said metadata is not storable with e.g. CSV; but is with
> JSON-LD, RDF, RDFa, CSVW.
>
> It would be neat to be able to track provenance metadata through [chained]
> transformations.
>
>> 3. Is it worth allowing piped functions to override how they are called by
>> defining something like __pipe_func__?
>
> "There should be one-- and preferably only one --obvious way to do it."
>
>> Note that I'm not particularly interested in feedback about how we
>> shouldn't be defining double underscore methods. There are other ways we
>> could spell __pipe_func__, but double underscores seems to be pretty
>> standard for ad-hoc protocols.
>> Thanks for your attention.
>> Best,
>> Stephan
>>
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>

