[scikit-learn] Fwd: Proposing Encoder class to encode Ordinal attributes

prudhviraj nitjsr prudhvirajnitjsr at gmail.com
Mon May 13 14:58:34 EDT 2019


Hi,

Can someone please respond? Any response would be appreciated.

Thanks

---------- Forwarded message ---------
From: prudhviraj nitjsr <prudhvirajnitjsr at gmail.com>
Date: Sun, May 12, 2019 at 1:38 AM
Subject: Proposing Encoder class to encode Ordinal attributes
To: <scikit-learn at python.org>


Hi All,

Recently, while working on an ML problem, I came across an
attribute with ordinal values. E.g.:

Student ID    |    Subjects
========================================
1            |    ['Math']
2            |    ['Math','Python']
3            |    ['C']
4            |    ['Python','Statistics']
========================================

Here, the attribute Subjects is a list of the subjects the student is
interested in. We have sklearn.preprocessing.OneHotEncoder, which
encodes a single categorical variable by creating multiple columns.
Similarly, I want to propose a different encoder that encodes this
type of list and creates new columns, one column for each subject.
The allowed values are 1/0, indicating whether or not the student is
interested in that subject. I'm new to open source contribution. Can
someone tell me if there is an existing feature that handles this
type of data, or if I can start working on this feature? Any response
would be appreciated.
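
To make the idea concrete, here is a rough sketch of the encoding I
have in mind, in plain Python. The function name encode_subjects and
the sorted column order are only assumptions for illustration; this is
not an existing scikit-learn API.

```python
from typing import Iterable, List, Tuple


def encode_subjects(rows: Iterable[Iterable[str]]) -> Tuple[List[str], List[List[int]]]:
    """Hypothetical helper: return (column names, 0/1 matrix) for a list-valued attribute."""
    rows = [list(row) for row in rows]
    # One column per distinct subject; sorting is an assumed convention
    # that keeps the column order stable between runs.
    columns = sorted({subject for row in rows for subject in row})
    # 1 if the student listed the subject, 0 otherwise.
    matrix = [[1 if column in row else 0 for column in columns] for row in rows]
    return columns, matrix


# The Subjects column from the table above, one list per student.
subjects = [["Math"], ["Math", "Python"], ["C"], ["Python", "Statistics"]]
columns, encoded = encode_subjects(subjects)

print(columns)
for student_id, row in enumerate(encoded, start=1):
    print(student_id, row)
```

Each student becomes one row and each distinct subject becomes one
column, so the number of new columns equals the number of distinct
subjects seen in the data.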

Thanks
Prudvi RajKumar

