[Numpy-discussion] quantile() or percentile()

Chun-Wei Yuan chunwei.yuan at gmail.com
Fri Jul 21 19:42:17 EDT 2017


That would be great.  I just used np.argsort because it was familiar to
me.  I didn't know about the C code.

On Fri, Jul 21, 2017 at 3:43 PM, Joseph Fox-Rabinovitz <
jfoxrabinovitz at gmail.com> wrote:

> While #9211 is a good start, it is fairly inefficient because it performs
> an O(n log n) sort of the array. It is possible to reduce the time to O(n)
> by using a partitioning algorithm similar to the one in the C code of
> percentile. I will look into it as soon as I can.
>
>     -Joe
>
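
To illustrate the sort-versus-partition point above, here is a minimal
sketch for the unweighted, linearly interpolated case. It is not the code
from #9211 or the C implementation; the function names are hypothetical,
and np.partition stands in for the partitioning step so that only the two
neighbouring order statistics are placed instead of sorting everything.

```python
import numpy as np


def percentile_via_sort(a, q):
    """Hypothetical helper: percentile via a full O(n log n) sort."""
    a = np.sort(np.asarray(a, dtype=float).ravel())
    pos = q / 100.0 * (a.size - 1)      # fractional index into sorted data
    lo = int(np.floor(pos))
    hi = min(lo + 1, a.size - 1)
    frac = pos - lo
    return a[lo] + frac * (a[hi] - a[lo])


def percentile_via_partition(a, q):
    """Hypothetical helper: same result, but only a partial ordering."""
    a = np.asarray(a, dtype=float).ravel()
    pos = q / 100.0 * (a.size - 1)
    lo = int(np.floor(pos))
    hi = min(lo + 1, a.size - 1)
    frac = pos - lo
    # np.partition puts the lo-th and hi-th smallest elements in their
    # sorted positions without ordering the rest of the array.
    part = np.partition(a, sorted({lo, hi}))
    return part[lo] + frac * (part[hi] - part[lo])
```

Both helpers assume a non-empty 1-D input and q in [0, 100]. Extending the
partition idea to weighted inputs is less direct, since the cumulative
weights depend on the ordering.
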
> On Fri, Jul 21, 2017 at 5:34 PM, Chun-Wei Yuan <chunwei.yuan at gmail.com>
> wrote:
>
>> Just to provide some context, #9213 actually grew out of this PR:
>>
>> https://github.com/numpy/numpy/pull/9211
>>
>> which might address the weighted inputs issue Joe brought up.
>>
>> C
>>
>> On Fri, Jul 21, 2017 at 2:21 PM, Joseph Fox-Rabinovitz <
>> jfoxrabinovitz at gmail.com> wrote:
>>
>>> I think that there would be a very good reason to have a separate
>>> function if we were to introduce weights to the inputs, similarly to the
>>> way that we have mean and average. This would have some (positive)
>>> repercussions like making weighted histograms with the Freedman-Diaconis
>>> binwidth estimator a possibility. I have had this change on the back-burner
>>> for a long time, mainly because I was too lazy to figure out how to include
>>> it in the C code. However, I will take a closer look.
>>>
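
For concreteness, one simple way to define a weighted quantile is sketched
below. This is an assumption for illustration only, not the method of any
PR: it returns the smallest value whose cumulative weight reaches the
fraction q of the total weight, with no interpolation. Other definitions
exist and give different answers. The name weighted_quantile_sketch is
hypothetical.

```python
import numpy as np


def weighted_quantile_sketch(values, q, weights):
    """Hypothetical helper: non-interpolated weighted quantile, 0 <= q <= 1."""
    values = np.asarray(values, dtype=float).ravel()
    weights = np.asarray(weights, dtype=float).ravel()
    if values.shape != weights.shape:
        raise ValueError("values and weights must have the same length")
    if np.any(weights < 0) or weights.sum() <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")

    order = np.argsort(values)              # full sort, O(n log n)
    sorted_values = values[order]
    cum_weights = np.cumsum(weights[order])

    # First index whose cumulative weight is >= q * total weight.
    target = q * cum_weights[-1]
    idx = np.searchsorted(cum_weights, target, side="left")
    return sorted_values[min(idx, sorted_values.size - 1)]
```

With something like this, a weighted interquartile range (the ingredient the
Freedman-Diaconis estimator needs) would be the difference between the 0.75
and 0.25 results.
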
>>> Regards,
>>>
>>>     -Joe
>>>
>>>
>>>
>>> On Fri, Jul 21, 2017 at 5:11 PM, Chun-Wei Yuan <chunwei.yuan at gmail.com>
>>> wrote:
>>>
>>>> There's an ongoing effort to introduce quantile() into numpy.  You'd
>>>> use it just like percentile(), but would input your q value in probability
>>>> space (0.5 for 50%):
>>>>
>>>> https://github.com/numpy/numpy/pull/9213
>>>>
>>>> Since there's a great deal of overlap between these two functions, we'd
>>>> like to solicit opinions on how to move forward on this.
>>>>
>>>> The current thinking is to tolerate the redundancy and keep both, using
>>>> one as the engine for the other.  I'm partial to having quantile because
>>>> 1.) I prefer probability space, and 2.) I have a PR waiting on quantile().
>>>>
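
As a rough sketch of "one as the engine for the other", the wrapper below
rescales q from probability space and delegates to np.percentile. The name
quantile_sketch and the range check are assumptions for illustration, not
the code in #9213.

```python
import numpy as np


def quantile_sketch(a, q, axis=None):
    """Hypothetical wrapper: q in [0, 1] instead of [0, 100]."""
    q = np.asarray(q, dtype=float)
    if np.any(q < 0.0) or np.any(q > 1.0):
        raise ValueError("q must lie in [0, 1]")
    # Delegate to the existing implementation after rescaling to percent.
    return np.percentile(a, q * 100.0, axis=axis)
```

Here quantile_sketch(x, 0.5) and np.percentile(x, 50) give the same median;
the delegation could equally run the other way round.
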
>>>> Best,
>>>>
>>>> C
>>>>
>>>> _______________________________________________
>>>> NumPy-Discussion mailing list
>>>> NumPy-Discussion at python.org
>>>> https://mail.python.org/mailman/listinfo/numpy-discussion