[scikit-learn] Imblearn: SMOTENC

S Hamidizade hamidizade.s at gmail.com
Thu Jan 24 10:17:46 EST 2019


Thanks. Unfortunately, now the error is:
ValueError: Some of the categorical indices are out of range. Indices
should be between 0 and 160.
Best regards,
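
One possible reading of this error, offered as a guess rather than a confirmed diagnosis: cat_indices1 in the quoted message holds column labels (it is built from .columns.values), while SMOTENC's categorical_features accepts integer positions or a boolean mask. The sketch below keeps the same slices as integer positions and checks them against the range named in the error. The value n_features = 160 is an assumption read off the error message, and num_idx, cat_idx and cat_mask are illustrative names.

```python
import numpy as np

n_features = 160  # assumption taken from the error message: valid positions are 0..159

# The slices from the quoted message, kept as integer positions (not column labels)
num_idx = np.r_[0:94, 95, 97, 100:123]
cat_idx = np.r_[94, 96, 98, 99, 123:160]

# Every categorical index must be an integer position in [0, n_features)
in_range = bool(np.all((cat_idx >= 0) & (cat_idx < n_features)))

# The two groups should cover each column exactly once
covers_all = np.array_equal(np.sort(np.r_[num_idx, cat_idx]), np.arange(n_features))

# Equivalent boolean mask of shape (n_features,)
cat_mask = np.zeros(n_features, dtype=bool)
cat_mask[cat_idx] = True

print(len(num_idx), len(cat_idx), in_range, covers_all)  # 119 41 True True
```

These positions refer to the columns of the original X. If SMOTENC runs after a step that reorders the columns, the positions have to be recomputed for the reordered matrix.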

On Sun, Jan 20, 2019 at 8:31 PM S Hamidizade <hamidizade.s at gmail.com> wrote:

> Dear Scikit-learners
> Hi.
>
> I would greatly appreciate it if you could let me know how to use SMOTENC. I
> wrote:
>
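> import numpy as np
>
> # Column labels of the numeric and categorical features, selected by position
> num_indices1 = list(X.iloc[:, np.r_[0:94, 95, 97, 100:123]].columns.values)
> cat_indices1 = list(X.iloc[:, np.r_[94, 96, 98, 99, 123:160]].columns.values)
> print(len(num_indices1))
> print(len(cat_indices1))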
>
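> from sklearn.pipeline import FeatureUnion, Pipeline
> from sklearn.preprocessing import StandardScaler
>
> # MultiColumn (apparently a custom column selector) and rg (the estimator)
> # are not defined in this message
> pipeline = Pipeline(steps=[
>     ('feature_processing', FeatureUnion(transformer_list=[
>         # Categorical features
>         ('categorical', MultiColumn(cat_indices1)),
>
>         # Numeric features, selected and then standardized
>         ('numeric', Pipeline(steps=[
>             ('select', MultiColumn(num_indices1)),
>             ('scale', StandardScaler()),
>         ])),
>     ])),
>     ('clf', rg),
> ])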
>
> As the code indicates, I have 5 categorical features. Columns 123 to 159
> (the slice 123:160) all belong to one categorical feature with 37 possible
> values, which get_dummies converted into 37 columns.
> I think SMOTENC should be inserted before the classifier ('clf', rg), but I
> don't know how to define "categorical_features" in SMOTENC.
> Besides, could you please let me know where to use imblearn.pipeline?
>
> Thanks in advance.
> Best regards,
>