[scikit-learn] Terminating a Pipeline with a NearestNeighbors search

Andreas Mueller t3kcit at gmail.com
Wed Sep 13 14:45:41 EDT 2017


Hi Ryan.

I don't think there's a good solution. Feel free to open an issue in the 
issue tracker (I'm not aware of one for this).
You can reach the final estimator, and therefore its kneighbors method,
via the pipeline's "steps" attribute. Calling it directly would bypass
all of the previous steps, though, so you would have to apply those
transforms yourself and you lose all the benefits of the pipeline.
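
To make the workaround concrete, here is a minimal sketch, assuming a
pipeline like the one you describe. The toy data and the step names
("vect", "tfidf", "nn") are made up for illustration.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.neighbors import NearestNeighbors
from sklearn.pipeline import Pipeline

# Toy data, invented for this example: token counts per document.
docs = [
    {"apple": 2, "banana": 1},
    {"apple": 1, "cherry": 3},
    {"banana": 2, "cherry": 1},
    {"apple": 3},
]

pipe = Pipeline([
    ("vect", DictVectorizer()),
    ("tfidf", TfidfTransformer()),
    ("nn", NearestNeighbors(n_neighbors=2)),
])
pipe.fit(docs)

# The final estimator is reachable through the "steps" attribute...
nn = pipe.steps[-1][1]

# ...but nn.kneighbors(docs) would skip the vectorizer and TF-IDF steps,
# so the earlier transforms have to be replayed by hand first.
Xt = docs
for _name, transformer in pipe.steps[:-1]:
    Xt = transformer.transform(Xt)

# One row per query document, one column per requested neighbor.
distances, indices = nn.kneighbors(Xt)
```

This works, but the replay loop is exactly the bookkeeping that Pipeline
is supposed to do for you.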

We could add a way to call non-standard methods (something like
pipeline.custom_method(X, method="kneighbors")), but I'm not sure that
is the right way to go. It assumes that the method signature is X or
(X, y), so I'm not sure it would be generally useful.
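
Until something like that exists, the same idea can be approximated in
user code. The following is only a rough sketch: call_final_method is a
hypothetical helper, not part of scikit-learn, and it makes the same
assumption that the target method takes the transformed X as its first
argument. The toy data and step names are also made up.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.neighbors import NearestNeighbors
from sklearn.pipeline import Pipeline


def call_final_method(pipeline, X, method, **kwargs):
    """Hypothetical helper (not a scikit-learn API).

    Runs X through every step except the last, then calls the named
    method on the final estimator with the transformed data.
    """
    Xt = X
    for _name, transformer in pipeline.steps[:-1]:
        # Skip placeholder steps that do no work.
        if transformer is None or transformer == "passthrough":
            continue
        Xt = transformer.transform(Xt)
    final_estimator = pipeline.steps[-1][1]
    return getattr(final_estimator, method)(Xt, **kwargs)


# Toy data, invented for this example: token counts per document.
docs = [
    {"apple": 2, "banana": 1},
    {"apple": 1, "cherry": 3},
    {"banana": 2, "cherry": 1},
    {"apple": 3},
]

pipe = Pipeline([
    ("vect", DictVectorizer()),
    ("tfidf", TfidfTransformer()),
    ("nn", NearestNeighbors()),
])
pipe.fit(docs)

# Query with an unseen document; extra keyword arguments are forwarded
# to NearestNeighbors.kneighbors.
query = [{"apple": 1}]
distances, indices = call_final_method(
    pipe, query, "kneighbors", n_neighbors=1
)
```

A method whose first argument is not the data would not fit this shape,
which is the limitation mentioned above.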

Andy


On 09/12/2017 04:01 PM, Ryan Conway wrote:
> Hello,
>
> I'm wondering if sklearn provides a means of terminating pipelines 
> with a NearestNeighbors search.
>
> For example, my workflow is DictVectorizer -> TfidfTransformer -> 
> NearestNeighbors. I'd like to capture this in an sklearn Pipeline. 
> Unfortunately, Pipeline does not expose a kneighbors() method that 
> would run all intermediate transforms and then return the result of 
> NearestNeighbors.kneighbors().
>
> I went through Pipeline's source and noticed its decision_function(), 
> predict() etc. all capture this functionality with different 
> terminating operation names. Maybe there is some way to specify the 
> terminating operation method name rather than relying on these 
> Pipeline methods?
>
> Thank you,
> Ryan
