[scikit-learn] Can Scikit-learn decision tree (CART) have both continuous and categorical features?

C W tmrsg11 at gmail.com
Sat Sep 14 14:57:22 EDT 2019


Thanks, Guillaume.
ColumnTransformer looks pretty neat. I've also heard, though, that the
pipeline can be tedious to set up: specifying what you want for every
feature is a pain.
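
One way to avoid listing every column by hand might look like the sketch
below. It assumes the data sits in a pandas DataFrame and uses
make_column_selector to pick all non-numeric columns by dtype, so the
ColumnTransformer needs a single rule for them. The toy frame mirrors the
example from the original question; using Attendance as the target is purely
an illustrative assumption.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer, make_column_selector
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

# Toy data mirroring the example in the original question.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Male"],
    "Age": [30, 35, 50],
    "Income": [10000, 9000, 12000],
    "Car": ["BMW", "Toyota", "Audi"],
    "Attendance": ["Yes", "No", "Yes"],
})
X = df.drop(columns="Attendance")
y = df["Attendance"]  # assumption: target chosen only for illustration

# One rule for every non-numeric column; numeric columns pass through as-is.
preprocess = ColumnTransformer(
    transformers=[
        ("cat", OneHotEncoder(handle_unknown="ignore"),
         make_column_selector(dtype_exclude=np.number)),
    ],
    remainder="passthrough",
)

model = make_pipeline(preprocess, DecisionTreeClassifier(random_state=0))
model.fit(X, y)
pred = model.predict(X)
```

The selector is evaluated when the pipeline is fit, so newly added
categorical columns are picked up without editing the transformer.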

Javier,
Actually, you guessed right. My real data has only one numerical
variable and looks more like this:

Gender  Date       Income  Car     Attendance
Male    2019/3/01  10000   BMW     Yes
Female  2019/5/02   9000   Toyota  No
Male    2019/7/15  12000   Audi    Yes

I am predicting Income using all the other variables, which are categorical.
Maybe catboost is the way to go!
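
For comparison, here is a minimal sketch of the same task with scikit-learn
alone: one-hot encode the categorical columns and fit a regression tree on
Income. Reducing Date to a numeric Month column is an assumption made only
for illustration, not something settled in this thread.

```python
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeRegressor

# Toy data mirroring the table above.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Male"],
    "Date": ["2019/3/01", "2019/5/02", "2019/7/15"],
    "Income": [10000, 9000, 12000],
    "Car": ["BMW", "Toyota", "Audi"],
    "Attendance": ["Yes", "No", "Yes"],
})

# Assumption: represent each date by its month (second field of YYYY/M/DD).
df["Month"] = df["Date"].str.split("/").str[1].astype(int)

categorical = ["Gender", "Car", "Attendance"]
X = df[categorical + ["Month"]]
y = df["Income"]

# One-hot encode the categorical columns; Month passes through unchanged.
preprocess = make_column_transformer(
    (OneHotEncoder(handle_unknown="ignore"), categorical),
    remainder="passthrough",
)

reg = make_pipeline(preprocess, DecisionTreeRegressor(random_state=0))
reg.fit(X, y)
pred = reg.predict(X)
```

With many categories per column the one-hot matrix gets wide, which is the
limitation that makes a categorical-aware library attractive.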

Thanks,

M

On Sat, Sep 14, 2019 at 9:25 AM Javier López <jlopez at ende.cc> wrote:

> If you have datasets with many categorical features, and perhaps many
> categories, the tools in sklearn are quite limited,
> but there are alternative implementations of boosted trees that are
> designed with categorical features in mind. Take a look
> at catboost [1], which has an sklearn-compatible API.
>
> J
>
> [1] https://catboost.ai/
>
> On Sat, Sep 14, 2019 at 3:40 AM C W <tmrsg11 at gmail.com> wrote:
>
>> Hello all,
>> I'm very confused. Can the decision tree module handle both continuous
>> and categorical features in the dataset? In this case, it's just CART
>> (Classification and Regression Trees).
>>
>> For example,
>> Gender  Age  Income  Car     Attendance
>> Male    30   10000   BMW     Yes
>> Female  35    9000   Toyota  No
>> Male    50   12000   Audi    Yes
>>
>> According to the documentation
>> https://scikit-learn.org/stable/modules/tree.html#tree-algorithms-id3-c4-5-c5-0-and-cart,
>> it cannot!
>>
>> It says: "scikit-learn implementation does not support categorical
>> variables for now".
>>
>> Is this true? If not, can someone point me to an example? If yes, what do
>> people do?
>>
>> Thank you very much!
>>
>>
>>
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn at python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
>>