[scikit-learn] Terminating a Pipeline with a NearestNeighbors search

Ryan Conway ryanmackenzieconway at gmail.com
Tue Sep 12 16:01:33 EDT 2017


I'm wondering if sklearn provides a means of terminating pipelines with a
NearestNeighbors search.

For example, my workflow is DictVectorizer -> TfidfTransformer ->
NearestNeighbors. I'd like to capture this in an sklearn Pipeline.
Unfortunately, Pipeline does not expose a kneighbors() method that would
run all intermediate transforms and then return the result of calling
kneighbors() on the final NearestNeighbors step.
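
For concreteness, here is a minimal sketch of how the workflow looks when the
neighbor search is kept outside the Pipeline. The toy feature dicts, the
variable names, and n_neighbors=2 are made up for illustration; only the three
estimator classes come from my actual setup.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.neighbors import NearestNeighbors
from sklearn.pipeline import make_pipeline

# Toy feature dicts, invented for this example.
docs = [
    {"apple": 2, "banana": 1},
    {"apple": 1, "cherry": 3},
    {"banana": 2, "cherry": 1},
]

# The two transformers fit into a Pipeline without trouble...
features = make_pipeline(DictVectorizer(), TfidfTransformer())
X = features.fit_transform(docs)

# ...but the neighbor search has to be fitted and queried separately.
nn = NearestNeighbors(n_neighbors=2).fit(X)
query = features.transform([{"apple": 1, "banana": 1}])
distances, indices = nn.kneighbors(query)
```

This works, but the fitted model is split across two objects that have to be
kept in sync.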

I went through Pipeline's source and noticed that its decision_function(),
predict(), etc. each implement this pattern for one specific method of the
final step. Is there some way to specify the name of the terminating
method, rather than relying on the methods Pipeline already defines?
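
One possible workaround, as a rough sketch rather than anything sklearn
provides: subclass Pipeline and add the missing method by hand. The class name
KNeighborsPipeline, the step names, and the toy data are hypothetical. The
sketch assumes every intermediate step is a fitted transformer (no
'passthrough' or None steps) and relies on Pipeline storing its fitted steps
in the `steps` attribute.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.neighbors import NearestNeighbors
from sklearn.pipeline import Pipeline


class KNeighborsPipeline(Pipeline):
    """Hypothetical Pipeline subclass that forwards kneighbors() to the last step."""

    def kneighbors(self, X, **kwargs):
        Xt = X
        # Apply every intermediate transform in order (assumes none are 'passthrough').
        for _, transformer in self.steps[:-1]:
            Xt = transformer.transform(Xt)
        # Delegate to the final estimator, here a fitted NearestNeighbors.
        return self.steps[-1][1].kneighbors(Xt, **kwargs)


# Toy feature dicts, invented for this example.
docs = [
    {"apple": 2, "banana": 1},
    {"apple": 1, "cherry": 3},
    {"banana": 2, "cherry": 1},
]

pipe = KNeighborsPipeline([
    ("vec", DictVectorizer()),
    ("tfidf", TfidfTransformer()),
    ("nn", NearestNeighbors(n_neighbors=1)),
])
pipe.fit(docs)

# Query with the first training document; its nearest neighbor should be itself.
distances, indices = pipe.kneighbors([{"apple": 2, "banana": 1}])
```

This mirrors what predict() and friends do internally, but it hardcodes one
method name again, so it does not answer the more general question of naming
the terminating method.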

Thank you,
