Donating code for robust statistical estimators
Hi,
I wrote a Python package for the high-performance calculation of three robust statistical estimators: weighted median, medcouple and mode.
The repository with the code is at this link: https://github.com/FilippoBovo/robustats.
The estimators are implemented in C using linear or log-linear algorithms, thereby making the code run pretty fast.
I would like to donate the code for the above three estimators to SciPy.
I believe that `scipy.stats` is a good location where to add the weighted median, medcouple and mode.
Could you please let me know if you are happy with this?
Thank you.
Best wishes, Filippo
On Mon, Sep 2, 2019 at 1:45 PM Filippo Bovo <hi.hellbee@gmail.com> wrote:
Hi,
I wrote a Python package for the high-performance calculation of three robust statistical estimators: weighted median, medcouple and mode.
The repository with the code is at this link: https://github.com/FilippoBovo/robustats.
The estimators are implemented in C using linear or log-linear algorithms, thereby making the code run pretty fast.
I would like to donate the code for the above three estimators to SciPy.
Hi Filippo. Thanks for your interest in contributing!
I believe that `scipy.stats` is a good location where to add the weighted median, medcouple and mode.
There is a mode function in scipy.stats. It looks nontrivial to make that work in C in a backwards-compatible fashion, but please have a look at it and see if that can work and would be an improvement.
I've looked at your medcouple, but the function doesn't explain what it's for and I'm not familiar with the name. Can you explain?
A weighted median doesn't fit well. NumPy has a median function, perhaps better to consider whether to add weights to that, in the same fashion as for numpy.average.
Cheers, Ralf
On 3 Sep 2019, at 06:45, Ralf Gommers <ralf.gommers@gmail.com> wrote:
I've looked at your medcouple, but the function doesn't explain what it's for and I'm not familiar with the name. Can you explain?
The medcouple is a quantity used to determine the outliers of a skewed distribution. For example, say that we have a data sample that leans more towards the right than the left; in this case, we would use the medcouple to measure the skewness of the distribution and use the result to determine the left and right outliers. You may find more information on the Wikipedia page: https://en.wikipedia.org/wiki/Medcouple.
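To make the definition concrete, here is a minimal pure-Python sketch of the medcouple computed directly from its pairwise definition. It is a naive quadratic-time illustration, not the algorithm used in robustats; the function name `medcouple_naive` is hypothetical, and the sketch assumes that at most one data point equals the sample median (samples with more ties at the median are rejected rather than handled).

```python
from statistics import median


def medcouple_naive(sample):
    """Naive O(n^2) medcouple. Illustrative sketch only (hypothetical helper)."""
    x = sorted(float(v) for v in sample)
    if not x:
        raise ValueError("sample must not be empty")
    m = median(x)
    # Assumption of this sketch: at most one observation equals the median.
    if sum(1 for v in x if v == m) > 1:
        raise ValueError("ties at the median are not handled in this sketch")
    lower = [v for v in x if v <= m]
    upper = [v for v in x if v >= m]
    kernel = []
    for xi in lower:
        for xj in upper:
            if xi == xj:
                # Only reachable when both values are the single point equal
                # to the median; the kernel is 0 for that pair.
                kernel.append(0.0)
            else:
                # Scaled difference between the distances of xj and xi
                # from the median; always lies in [-1, 1].
                kernel.append(((xj - m) - (m - xi)) / (xj - xi))
    return median(kernel)
```

A positive result indicates a longer right tail, a negative result a longer left tail, and a value of zero a sample that is symmetric about its median. For instance, `medcouple_naive([1, 2, 3, 5, 8, 20])` is positive, while the mirrored sample gives the same value with the opposite sign.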
A weighted median doesn't fit well. NumPy has a median function, perhaps better to consider whether to add weights to that, in the same fashion as for numpy.average
Ok, thank you.
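For reference, a weighted median with a `weights` argument in the spirit of numpy.average could be sketched as below. This is only an illustration under stated assumptions: the name `weighted_median` and its signature are hypothetical, it is neither NumPy's API nor the robustats implementation, it sorts the data (log-linear time), and it returns the lower weighted median, that is, the smallest value whose cumulative weight reaches half of the total weight.

```python
def weighted_median(values, weights):
    """Lower weighted median via sorting. Illustrative sketch only."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if len(values) == 0:
        raise ValueError("values must not be empty")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    # Sort the (value, weight) pairs by value, then accumulate weights
    # until half of the total weight has been reached.
    pairs = sorted(zip(values, weights))
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= total / 2:
            return value
    # Fallback against floating-point rounding in the accumulation.
    return pairs[-1][0]
```

With equal weights and an odd number of points this reduces to the ordinary median; with an even number of points it returns the lower of the two middle values rather than their average.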
Cheers, Filippo