<p dir="ltr"><br>
On May 25, 2015 6:45 PM, "Stephan Hoyer" <<a href="mailto:shoyer@gmail.com">shoyer@gmail.com</a>> wrote:<br>
><br>
> In the PyData community, we really like method chaining for data analysis pipelines:<br>
><br>
> (iris.query('SepalLength > 5')<br>
> .assign(SepalRatio = lambda x: x.SepalWidth / x.SepalLength,<br>
> PetalRatio = lambda x: x.PetalWidth / x.PetalLength)<br>
> .plot(kind='scatter', x='SepalRatio', y='PetalRatio'))<br>
><br>
><br>
> Unfortunately, method chaining isn't very extensible -- short of monkey patching, every method we want to use has to exist on the original object. If a user wants to supply their own plotting function, they can't use method chaining anymore.<br><br></p>
<p dir="ltr">><br>
> You may recall that we brought this up a few months ago on python-ideas as an example of why we would like macros.<br>
><br>
> To get around this issue, we are contemplating adding a pipe method to pandas DataFrames. It looks like this:<br>
><br>
> def pipe(self, func, *args, **kwargs):<br>
> &nbsp;&nbsp;&nbsp;&nbsp;pipe_func = getattr(func, '__pipe_func__', func)<br>
> &nbsp;&nbsp;&nbsp;&nbsp;return pipe_func(self, *args, **kwargs)<br>
><br>
><br>
> We would encourage third party libraries with objects on which method chaining is useful to define a pipe method in the same way.<br>
><br>
> The main idea here is to create an easy way for users to do method chaining with their own functions and with functions from third party libraries.<br>
><br>
> The business with __pipe_func__ is more magical, and frankly we aren't sure it's worth the complexity. The idea is to create a "pipe protocol" that allows functions to decide how they are called when piped. This is useful in some cases, because it doesn't always make sense for functions that act on piped data to accept that data as their first argument. <br>
><br>
> For more motivation and examples, please read the opening post in this GitHub issue: <a href="https://github.com/pydata/pandas/issues/10129">https://github.com/pydata/pandas/issues/10129</a><br>
><br>
> Obviously, this sort of protocol would not be an official part of the Python language. But because we are considering creating a de-facto standard, we would love to get feedback from other Python communities that use method chaining:<br>
> 1. Have you encountered or addressed the problem of extensible method chaining?</p>
<p dir="ltr">* <a href="https://pythonhosted.org/pyquery/api.html">https://pythonhosted.org/pyquery/api.html</a><br>
* SQLAlchemy </p>
<p dir="ltr">> 2. Would this pipe protocol be useful to you?</p>
<p dir="ltr">What are the advantages over just returning 'self'? (Which use cases are not possible with current syntax?)</p>
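<p dir="ltr">To make the question concrete, here is a minimal sketch of the difference as I understand it. Table, keep, and total are hypothetical names of mine, not pandas API, and pipe is a simplified form of the quoted method without the __pipe_func__ lookup:</p>

```python
class Table:
    """Hypothetical stand-in for a DataFrame-like object (not pandas API)."""

    def __init__(self, rows):
        self.rows = list(rows)

    def keep(self, predicate):
        # Built-in method: returns a new Table, so calls can be chained.
        return Table(r for r in self.rows if predicate(r))

    def pipe(self, func, *args, **kwargs):
        # Simplified form of the quoted proposal (no __pipe_func__ lookup).
        return func(self, *args, **kwargs)


def total(table, key):
    # User-defined function; Table has no method with this behavior.
    return sum(r[key] for r in table.rows)


data = Table([{"n": 1}, {"n": 5}, {"n": 7}])

# Without pipe, the user function has to wrap the whole chain.
wrapped = total(data.keep(lambda r: r["n"] != 1), "n")

# With pipe, it reads left to right like the built-in methods.
chained = data.keep(lambda r: r["n"] != 1).pipe(total, "n")
```

<p dir="ltr">Both spellings compute the same value; the only difference is where the user-supplied function appears in the expression.</p>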
<p dir="ltr">In terms of documenting functional composition, I find it easier to test and comment multiple separate statements than a single long chain.</p>
<p dir="ltr">Months ago, when I looked at creating pandasrdf (pandas #3402), I found a need for something like (...).meta.columns with per-column URIs, units, and other metadata (who, what, when, how). That metadata cannot be stored in e.g. CSV, but it can be stored in JSON-LD, RDF, RDFa, or CSVW.</p>
<p dir="ltr">It would be neat to be able to track provenance metadata through [chained] transformations.</p>
<p dir="ltr">> 3. Is it worth allowing piped functions to override how they are called by defining something like __pipe_func__?</p>
<p dir="ltr">"There should be one-- and preferably only one --obvious way to do it."</p>
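<p dir="ltr">For reference, a rough sketch of how I read the override working. The pipe function below is the quoted method rewritten as a free function so the example stands alone, and scale is a hypothetical function of mine whose data argument is not first:</p>

```python
def pipe(obj, func, *args, **kwargs):
    # Free-function version of the quoted method, so the sketch is self-contained.
    pipe_func = getattr(func, "__pipe_func__", func)
    return pipe_func(obj, *args, **kwargs)


def scale(factor, values):
    # Hypothetical function whose data argument comes second, not first.
    return [factor * v for v in values]


# Opt in to the protocol: tell pipe() how to call scale with piped data.
scale.__pipe_func__ = lambda values, factor: scale(factor, values)

piped = pipe([1, 2, 3], scale, 10)  # dispatched through __pipe_func__
direct = scale(10, [1, 2, 3])       # the ordinary call is unchanged
```

<p dir="ltr">A function without the attribute is simply called with the piped object as its first argument, so the protocol adds a second way to be called only for functions that opt in.</p>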
<p dir="ltr">> Note that I'm not particularly interested in feedback about how we shouldn't be defining double underscore methods. There are other ways we could spell __pipe_func__, but double underscores seem to be pretty standard for ad-hoc protocols.<br>
> Thanks for your attention.<br>
> Best,<br>
> Stephan<br>
</p>