[scikit-learn] Need for multioutput multivariate algorithm for Random Forest in Python (using Mahalanobis distance)
Andreas Mueller
t3kcit at gmail.com
Thu Feb 13 23:13:37 EST 2020
On 2/9/20 12:21 PM, Paul Chike Ofoche via scikit-learn wrote:
>
> Hello all,
>
> My name is Paul and I am enthused about data science. I have been
> using Python and other programming languages for close to two years.
> There is an issue that I have been facing since I began applying
> Python to the analysis of my research work.
>
>
> My question has remained unanswered for months. Has nobody else run
> into the need to work with data where the regression has multiple
> outputs that are correlated with each other? This is called a
> multi-output multivariate problem. A version of random forest that
> handles multiple outputs is referred to as the multivariate random
> forest. It is implemented in the R programming language (see attached
> reference documentation below).
>
The scikit-learn random forest actually handles multiple outputs. It
doesn't use the Mahalanobis distance, but that seems like a simple
preprocessing step.
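
A minimal sketch of what such a preprocessing step could look like,
assuming it means whitening the outputs with one global covariance
estimate before fitting. Squared Euclidean distance between whitened
outputs equals squared Mahalanobis distance between the original ones,
so the default squared-error split criterion then works in Mahalanobis
terms. The synthetic data and all variable names are illustrative
assumptions, and this is not claimed to match the R package's
algorithm.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic data (illustrative only): 3 features, 2 correlated outputs.
X = rng.normal(size=(200, 3))
noise = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=200)
Y = np.column_stack([X[:, 0] + X[:, 1], X[:, 0] - X[:, 2]]) + 0.1 * noise

# Whiten the outputs: with cov = L @ L.T, map each row y to L^-1 (y - mean).
mean = Y.mean(axis=0)
cov = np.cov(Y, rowvar=False)
L = np.linalg.cholesky(cov)
Y_white = np.linalg.solve(L, (Y - mean).T).T

# RandomForestRegressor accepts a 2-D y, one column per output.
forest = RandomForestRegressor(n_estimators=50, random_state=0)
forest.fit(X, Y_white)

# Map predictions back to the original output space.
Y_pred = forest.predict(X) @ L.T + mean
```

The covariance here is estimated once from all training outputs; a
per-node covariance would require a custom split criterion and is not
covered by this sketch.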
>
>
> To date, there exists no such package in Python. My question is
> whether anybody knows how to go about implementing this. The random
> forest univariate regression case utilizes the Euclidean distance as
> the measurement criterion, whereas the multivariate regression case
> uses the Mahalanobis distance, which takes into account the
> inter-relationships between the multiple outputs. I have inquired
> about an equivalent capability in Python for many years, but it has
> still not been addressed. Such a multivariate random forest mode is
> very applicable to the type of research and analysis that I do. Could
> someone help, please?
>
> Thank you,
>
> Paul Ofoche
>
> PS: This is an important need for multivariate output analysis as a
> technique for solving practical research problems. Here are some
> questions posted by various other Python users concerning this same
> issue.
>
> https://datascience.stackexchange.com/questions/21637/code-for-multivariate-random-forest-in-python-r
>
> Multi-output regression
> <https://stackoverflow.com/questions/49391637/multi-output-regression>