[scikit-learn] Problems with running GridSearchCV on a pipeline with a custom transformer

Sam Barnett sambarnett95 at gmail.com
Wed Aug 2 15:08:07 EDT 2017


Hi Andy,

The purpose of the transformer is to take an ordinary kernel (in this case
I have taken 'rbf' as a default) and return a 'sequentialised' kernel using
a few extra parameters. Hence, the transformer takes an ordinary
data-target pair X, y as its input, and the fit_transform(X, y) method will
output the Gram matrix for X that is associated with this sequentialised
kernel. In the pipeline, this Gram matrix is passed into an SVC classifier
with the kernel parameter set to 'precomputed'.
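
A minimal sketch of such a setup follows, under some loud assumptions: a plain RBF kernel stands in for the sequentialised one, and `GramTransformer` and its `gamma` parameter are hypothetical names, not the code from the notebook. In this sketch, fit() stores the training samples and transform() returns the kernel between whatever it is given and those stored samples, so the output has shape (n_samples, n_training_samples) and the input may have any number of rows.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.datasets import make_classification
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC


class GramTransformer(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in: plain RBF instead of the sequentialised kernel."""

    def __init__(self, gamma=1.0):
        self.gamma = gamma

    def fit(self, X, y=None):
        # Remember the training samples so later inputs can be compared to them.
        self.X_fit_ = np.asarray(X)
        return self

    def transform(self, X):
        # Rows: samples in X; columns: samples seen in fit.
        # No shape check against the training data is needed here.
        return rbf_kernel(np.asarray(X), self.X_fit_, gamma=self.gamma)


# Synthetic data, for illustration only.
X, y = make_classification(n_samples=60, n_features=5, random_state=0)

pipe = Pipeline([
    ("gram", GramTransformer()),
    ("svc", SVC(kernel="precomputed")),
])

param_grid = {"gram__gamma": [0.01, 0.1, 1.0], "svc__C": [0.1, 1.0, 10.0]}
search = GridSearchCV(pipe, param_grid, cv=3)
search.fit(X, y)

# Kernel between 7 "new" samples and the 60 samples used in the final refit.
K_test = search.best_estimator_.named_steps["gram"].transform(X[:7])
```

During cross-validation the training folds yield a square Gram matrix, while each held-out fold yields a rectangular one against the training fold, which is the form SVC expects at prediction time when kernel='precomputed'.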

Therefore, I do not think your hacky solution would be possible. However, I
am still unsure how to implement your first solution: won't the Gram matrix
from the transformer contain all the necessary kernel values? Could you
elaborate further?


Best,
Sam

On Wed, Aug 2, 2017 at 5:05 PM, Andreas Mueller <t3kcit at gmail.com> wrote:

> Hi Sam.
> GridSearchCV will do cross-validation, which requires "transforming" the
> test data.
> The shape of the test data will be different from the shape of the
> training data.
> You need to be able to compute the kernel between the training
> data and new test data.
>
> A more hacky solution would be to compute the full kernel matrix in
> advance and pass that to GridSearchCV.
>
> You probably don't need it here, but you should also check out what the
> _pairwise attribute does in cross-validation,
> because that is likely to come up when playing with kernels.
>
> Hth,
> Andy
>
>
> On 08/02/2017 08:38 AM, Sam Barnett wrote:
>
> Dear all,
>
> I have created a 2-step pipeline with a custom transformer followed by a
> simple SVC classifier, and I wish to run a grid-search over it. I am able
> to successfully create the transformer and the pipeline, and each of these
> elements works fine. However, when I try to use the fit() method on my
> GridSearchCV object, I get the following error:
>
>      57         # during fit.
>      58         if X.shape != self.input_shape_:
> ---> 59             raise ValueError('Shape of input is different from what was seen '
>      60                              'in `fit`')
>      61
>
> ValueError: Shape of input is different from what was seen in `fit`
>
> For a full breakdown of the problem, I have written a Jupyter notebook
> showing exactly how the error occurs (this also contains all .py files
> necessary to run the notebook). Can anybody see how to work through this?
>
> Many thanks,
> Sam Barnett
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>
>