[scikit-learn] Pipegraph is on its way!

Joel Nothman joel.nothman at gmail.com
Wed Feb 7 15:46:43 EST 2018


Cool! We have been talking for a while about how to pass other things
around grid search and other meta-analysis estimators. This injection
approach looks pretty neat as a way to express it. Will need to mull on it.

On 8 Feb 2018 2:51 am, "Manuel Castejón Limas" <manuel.castejon at gmail.com>
wrote:

> Dear all,
>
> After some experimenting with the concept, we have developed a module that
> implements the functionality of Pipeline in more general contexts, as
> first introduced in an earlier thread
> ( https://mail.python.org/pipermail/scikit-learn/2018-January/002158.html )
>
> In order to extend Pipeline to workflows that are not linearly sequential,
> a graph-like structure has been adopted, while keeping as much as possible
> of the familiar syntax we all love and honor:
>
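> import pandas as pd
> from sklearn.linear_model import LinearRegression
> from sklearn.preprocessing import MinMaxScaler
> # PipeGraph comes from the module announced in this message.
>
> X = pd.DataFrame(dict(X=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]))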
> y = 2 * X
> sc = MinMaxScaler()
> lm = LinearRegression()
> steps = [('scaler', sc),
>          ('linear_model', lm)]
> connections = {'scaler': dict(X='X'),
>                'linear_model': dict(X=('scaler', 'predict'),
>                                     y='y')}
> pgraph = PipeGraph(steps=steps,
>                    connections=connections,
>                    use_for_fit='all',
>                    use_for_predict='all')
>
> As you can see, the biggest difference for the end user is the dictionary
> describing the connections.
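
To illustrate how a connections dictionary of this shape can describe a graph, here is a minimal sketch that derives an execution order from it. It assumes that a plain string names an external input and that a tuple names (upstream step, output). The helper execution_order is hypothetical and is not part of PipeGraph; the module's real resolution logic may well differ.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Same shape as the connections dictionary in the message above.
connections = {
    "scaler": {"X": "X"},
    "linear_model": {"X": ("scaler", "predict"), "y": "y"},
}


def execution_order(connections):
    """Hypothetical helper: order steps so each runs after its upstream steps."""
    dependencies = {}
    for step, inputs in connections.items():
        # Assumption: a tuple source is (upstream_step, output_name);
        # a plain string is an external input and adds no dependency.
        dependencies[step] = {
            source[0] for source in inputs.values() if isinstance(source, tuple)
        }
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(connections)
print(order)
```

With only two steps the result is the usual linear pipeline order, but the same approach would extend to steps with several upstream nodes.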
>
> Another major contribution, aimed at developers who want to extend
> scikit-learn, is a collection of adapters that give scikit-learn models a
> common API, irrespective of whether they originally implemented predict,
> transform, or fit_predict as an atomic operation without predict.
> These adapters accept any number of positional or keyword parameters in
> their fit and predict methods through *pargs and **kwargs.
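
The adapter idea described above could look roughly like the following sketch. The names UniformAdapter, ToyScaler, and ToyLabeler are hypothetical stand-ins invented for this illustration, and the toy models use plain lists rather than real scikit-learn estimators; the adapters shipped with the module may be organized quite differently.

```python
class UniformAdapter:
    """Hypothetical wrapper exposing fit/predict whatever the wrapped API is."""

    def __init__(self, model):
        self.model = model

    def fit(self, *pargs, **kwargs):
        # Forward any positional or keyword inputs to the wrapped model.
        self.model.fit(*pargs, **kwargs)
        return self

    def predict(self, *pargs, **kwargs):
        # Use the first operation the wrapped model actually provides.
        for name in ("predict", "transform", "fit_predict"):
            method = getattr(self.model, name, None)
            if callable(method):
                return method(*pargs, **kwargs)
        raise AttributeError("wrapped model offers no usable output method")


class ToyScaler:
    """Stand-in transformer: rescales values to the range [0, 1]."""

    def fit(self, X):
        self.min_, self.max_ = min(X), max(X)
        return self

    def transform(self, X):
        span = (self.max_ - self.min_) or 1
        return [(x - self.min_) / span for x in X]


class ToyLabeler:
    """Stand-in model that offers fit_predict but no predict."""

    def fit(self, X):
        return self

    def fit_predict(self, X):
        mean = sum(X) / len(X)
        return [int(x > mean) for x in X]


data = [0, 5, 10]
scaled = UniformAdapter(ToyScaler()).fit(data).predict(data)
labels = UniformAdapter(ToyLabeler()).fit(data).predict(data)
print(scaled, labels)
```

Both wrapped objects are then driven through the same fit and predict calls, which is what lets a graph node treat them interchangeably.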
>
> General as PipeGraph is, it cannot work under the restrictions that
> GridSearchCV imposes on the input parameters, namely X and y, since
> PipeGraph can accept as many input signals as needed. Thus, an ad hoc
> GridSearchCV version is also needed, and we will provide a basic initial
> implementation in a later release.
>
> We need to write the documentation and we will propose it as a
> contrib-project in a few days.
>
> Best wishes,
> Manuel Castejón-Limas
>