[scikit-learn] SMOTE-ENN in Imbalanced-learn package

Mamun Rashid mamunbabu2001 at gmail.com
Mon Dec 12 07:56:15 EST 2016


Hi All,
Not sure if questions regarding the contributory packages are answered
here. Just trying my luck.

I have a seriously imbalanced classification problem. I am trying to use
the SMOTE+ENN combined oversampling and undersampling method to oversample
my minority class and undersample my majority class.

========

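import pandas as pd
from sklearn.datasets import make_classification
from imblearn.combine import SMOTEENN

# Small two-class toy dataset with a 20/80 class split
X, y = make_classification(n_classes=2, class_sep=2, weights=[0.2, 0.8],
                           n_informative=1, n_redundant=1, flip_y=0,
                           n_features=3, n_clusters_per_class=1,
                           n_samples=50, random_state=10)
X_df = pd.DataFrame(X)

# SMOTE oversamples the minority class, then ENN removes noisy samples.
# Note: newer imbalanced-learn releases name this method fit_resample.
sm = SMOTEENN()
X_resampled, y_resampled = sm.fit_sample(X_df, y)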

=========

I understand that SMOTE returns a resampled data matrix, i.e. X_resampled. I
was wondering if there is a direct way to retrieve the indexes of the
original data observations?
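
To make the question concrete, here is a minimal sketch of the kind of lookup
I mean. It assumes that original observations kept by the resampler appear in
X_resampled with unchanged values, so they can be found by exact row matching.
The helper name original_row_indices and the toy arrays are hypothetical and
are not part of imbalanced-learn.

```python
import numpy as np


def original_row_indices(X_original, X_resampled):
    """For each row of X_resampled, return the index of the identical row
    in X_original, or -1 when no identical row exists (for example, a
    synthetic sample). Hypothetical helper, not an imbalanced-learn API."""
    X_original = np.asarray(X_original)
    X_resampled = np.asarray(X_resampled)

    # Map each original row to the first index at which it occurs.
    lookup = {}
    for i, row in enumerate(X_original):
        lookup.setdefault(tuple(row), i)

    # Exact-value matching: rows that are not in the lookup get -1.
    return np.array([lookup.get(tuple(row), -1) for row in X_resampled])


# Hypothetical toy data standing in for X and X_resampled.
X_toy = np.array([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])
X_res_toy = np.array([[4.0, 5.0], [0.0, 1.0], [1.0, 2.0]])

idx = original_row_indices(X_toy, X_res_toy)
```

Exact matching like this cannot tell duplicated original rows apart (the
first occurrence wins), which is why I am hoping for a more direct way.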

Thanks in advance.

Best Regards and Season's Greetings,
Mamun