[scikit-learn] numpy integration with random forest implementation

Sat Jan 21 13:36:51 EST 2017

Thanks for the info!
How do you set it up?

There doesn’t seem to be an example available for regression purposes.
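
For reference, a minimal sketch of what a multi-output regression setup might look like, assuming the data is already held in two 2-D arrays with the column counts mentioned in this thread (2050 input columns, 13 output columns). The random data, the sample counts and the n_estimators value are placeholders chosen for illustration, not anything from the actual project:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.RandomState(0)

# Placeholder data (assumption): 100 samples, 2050 input columns, 13 output columns.
X = rng.rand(100, 2050)
Y = rng.rand(100, 13)

# A 2-D target of shape [n_samples, n_outputs] is accepted directly by fit().
model = RandomForestRegressor(n_estimators=10, random_state=0)
model.fit(X, Y)

# Predictions come back with one column per output.
X_new = rng.rand(5, 2050)
Y_pred = model.predict(X_new)
print(Y_pred.shape)  # (5, 13)
```

Whether a forest does better than the neural network on real MFCC targets is something only an experiment on the real data can show.
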
> On 21 Jan 2017, at 19:32, Sebastian Raschka <se.raschka at gmail.com> wrote:
> 
> Oh okay. But that shouldn’t be a problem: the RandomForestRegressor also supports multi-output regression, with the same expected target array shape: [n_samples, n_outputs].
> 
> Best,
> Sebastian
> 
>> On Jan 21, 2017, at 1:27 PM, Carlton Banks <noflaco at gmail.com> wrote:
>> 
>> Not classification, but regression.
>> And yes, both the input and the output should be stored like that.
>> 
>>> On 21 Jan 2017, at 19:24, Sebastian Raschka <se.raschka at gmail.com> wrote:
>>> 
>>> Hi, Carlton,
>>> Sounds like you are looking for multilabel classification and your target array has the shape [n_samples, n_outputs]? If the output shape is consistent (i.e. all output label arrays have 13 columns), you should be fine. Otherwise, you could use the MultiLabelBinarizer (http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MultiLabelBinarizer.html#sklearn.preprocessing.MultiLabelBinarizer).
>>> 
>>> Also, the RandomForestClassifier should support multilabel classification.
>>> 
>>> Best,
>>> Sebastian
>>> 
>>>> On Jan 21, 2017, at 12:59 PM, Carlton Banks <noflaco at gmail.com> wrote:
>>>> 
>>>> Most of the machine learning libraries I’ve tried have an option of just giving the dimensions…
>>>> In this case my input consists of a numpy.ndarray with shape (x, 2050) and the output is a numpy.ndarray with shape (x, 13).
>>>> x is different for each set…
>>>> But for each set the number of columns is consistent.
>>>> 
>>>> Column consistency is usually enough for most library tools I’ve worked with…
>>>> But is this not the case here?
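
A rough sketch of how per-file arrays with differing row counts but fixed column counts could be stacked into the single [n_samples, n_features] and [n_samples, n_outputs] arrays that scikit-learn estimators expect. The frame counts and the random data are made up for illustration. Treating every row as an independent sample is an assumption that may or may not suit the ASR setup described here:

```python
import numpy as np

rng = np.random.RandomState(0)

# Hypothetical per-file data: the row count "x" differs per file,
# the column counts (2050 in, 13 out) stay fixed.
frames_per_file = [120, 75, 240]
inputs = [rng.rand(n, 2050) for n in frames_per_file]
outputs = [rng.rand(n, 13) for n in frames_per_file]

# Stack along the row axis so every row becomes one training sample.
X = np.vstack(inputs)
Y = np.vstack(outputs)
print(X.shape, Y.shape)  # (435, 2050) (435, 13)
```

After stacking, X and Y have the same number of rows, so they can be passed together to an estimator's fit() method.
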
>>>>> On 21 Jan 2017, at 18:42, Jacob Schreiber <jmschreiber91 at gmail.com> wrote:
>>>>> 
>>>>> I don't understand what you mean. Does each sample have a fixed number of features or not?
>>>>> 
>>>>> On Sat, Jan 21, 2017 at 9:35 AM, Carlton Banks <noflaco at gmail.com> wrote:
>>>>> Thanks for the response!
>>>>> 
>>>>> If you see it in 1D then yes, it has variable length. In 2D the number of columns will always be constant, both for the input and the output.
>>>>> 
>>>>>> On 21 Jan 2017, at 18:25, Jacob Schreiber <jmschreiber91 at gmail.com> wrote:
>>>>>> 
>>>>>> If what you're saying is that you have a variable length input, then most sklearn classifiers won't work on this data. They expect a fixed feature set. Perhaps you could try extracting a set of informative features to be fed into the classifier?
>>>>>> 
>>>>>> On Sat, Jan 21, 2017 at 3:18 AM, Carlton Banks <noflaco at gmail.com> wrote:
>>>>>> Hi guys..
>>>>>> 
>>>>>> I am currently working on an ASR project in which the objective is to substitute part of the general ASR framework with some form of neural network, to see whether the tested part improves in any way.
>>>>>> 
>>>>>> I started working with the feature extraction and tried to make a neural network (NN) that could create MFCC features. I already know what the desired output is supposed to be, so the problem boils down to a simple input-output mapping. The problem is that my NN doesn’t seem to perform that well, and I seem to get a pretty large error for some reason.
>>>>>> 
>>>>>> I therefore wanted to give random forest a try, and see whether it could provide me with a better result.
>>>>>> 
>>>>>> I am currently storing my input and output in numpy.ndarrays, in which the number of input and output columns is consistent throughout all the examples, but the number of rows changes depending on the length of the audio file.
>>>>>> 
>>>>>> Is it possible with the random forest implementation in scikit-learn to train a random forest to map an input to an output, given that they are stored as numpy.ndarrays?
>>>>>> Or do I have to do it in a different way? And if so, how?
>>>>>> 
>>>>>> kind regards
>>>>>> 
>>>>>> Carl truz
>>>>>> _______________________________________________
>>>>>> scikit-learn mailing list
>>>>>> scikit-learn at python.org
>>>>>> https://mail.python.org/mailman/listinfo/scikit-learn