[Numpy-discussion] quantile() or percentile()

Chun-Wei Yuan chunwei.yuan at gmail.com
Thu Aug 3 13:00:03 EDT 2017


Any way I can help expedite this?

On Fri, Jul 21, 2017 at 4:42 PM, Chun-Wei Yuan <chunwei.yuan at gmail.com>
wrote:

> That would be great.  I just used np.argsort because it was familiar to
> me.  Didn't know about the C code.
>
> On Fri, Jul 21, 2017 at 3:43 PM, Joseph Fox-Rabinovitz <
> jfoxrabinovitz at gmail.com> wrote:
>
>> While #9211 is a good start, it is inefficient because it performs an
>> O(n log n) sort of the array. It is possible to reduce the time to O(n)
>> by using a partitioning algorithm similar to the one in the C code of
>> percentile. I will look into it as soon as I can.
>>
>>     -Joe
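
For illustration, here is a minimal sketch of the partition idea described
above. It is not the code from #9211 or from NumPy's C sources. The
function name quantile_via_partition is hypothetical, and the sketch
assumes a 1-D, non-empty input and linear interpolation between the two
neighbouring order statistics. np.partition only places the requested
ranks in their sorted positions, which is what avoids the full sort.

```python
import numpy as np


def quantile_via_partition(a, q):
    """Hypothetical helper: q-th quantile (0 <= q <= 1) without a full sort."""
    a = np.asarray(a, dtype=float).ravel()
    if a.size == 0:
        raise ValueError("input must not be empty")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")

    # Fractional position of the quantile among the sorted values.
    pos = q * (a.size - 1)
    lo = int(np.floor(pos))
    hi = min(lo + 1, a.size - 1)

    # Only the elements at ranks lo and hi are guaranteed to be in their
    # sorted positions afterwards; the rest of the array stays unsorted.
    part = np.partition(a, sorted({lo, hi}))

    # Linear interpolation between the two neighbouring order statistics.
    frac = pos - lo
    return part[lo] + frac * (part[hi] - part[lo])


print(quantile_via_partition([7.0, 1.0, 5.0, 3.0], 0.5))  # 4.0
```

For unweighted data this should agree with np.percentile(a, 100 * q)
under its default linear interpolation.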
>>
>> On Fri, Jul 21, 2017 at 5:34 PM, Chun-Wei Yuan <chunwei.yuan at gmail.com>
>> wrote:
>>
>>> Just to provide some context, #9213 was actually spun off from this PR:
>>>
>>> https://github.com/numpy/numpy/pull/9211
>>>
>>> which might address the weighted inputs issue Joe brought up.
>>>
>>> C
>>>
>>> On Fri, Jul 21, 2017 at 2:21 PM, Joseph Fox-Rabinovitz <
>>> jfoxrabinovitz at gmail.com> wrote:
>>>
>>>> I think that there would be a very good reason to have a separate
>>>> function if we were to introduce weights to the inputs, similarly to the
>>>> way that we have mean and average. This would have some (positive)
>>>> repercussions like making weighted histograms with the Freedman-Diaconis
>>>> binwidth estimator a possibility. I have had this change on the back-burner
>>>> for a long time, mainly because I was too lazy to figure out how to include
>>>> it in the C code. However, I will take a closer look.
>>>>
>>>> Regards,
>>>>
>>>>     -Joe
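
To make the mean/average analogy concrete: np.mean takes no weights,
while np.average accepts a weights argument. A weighted quantile could
follow the same pattern. The sketch below is only one possible
definition (the inverse of the weighted empirical CDF, with no
interpolation) and is an assumption, not the method in #9211. The name
weighted_quantile_sketch is hypothetical, and it uses a plain argsort
rather than a partition, so it is O(n log n).

```python
import numpy as np


def weighted_quantile_sketch(a, q, weights):
    """Hypothetical helper: smallest value whose cumulative weight share >= q."""
    a = np.asarray(a, dtype=float).ravel()
    w = np.asarray(weights, dtype=float).ravel()
    if a.size == 0 or a.shape != w.shape:
        raise ValueError("a and weights must be non-empty and the same length")
    if np.any(w < 0) or w.sum() <= 0:
        raise ValueError("weights must be non-negative with a positive sum")

    # Sort the values and carry the weights along.
    order = np.argsort(a)
    a_sorted = a[order]

    # Weighted empirical CDF evaluated at each sorted value.
    cdf = np.cumsum(w[order])
    cdf /= cdf[-1]

    # First position where the CDF reaches q (clamped for round-off).
    idx = min(int(np.searchsorted(cdf, q, side="left")), a.size - 1)
    return a_sorted[idx]


# A heavy weight on 3 pulls the weighted median up to 3.
print(weighted_quantile_sketch([1, 2, 3], 0.5, weights=[1, 1, 8]))  # 3.0
```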
>>>>
>>>>
>>>>
>>>> On Fri, Jul 21, 2017 at 5:11 PM, Chun-Wei Yuan <chunwei.yuan at gmail.com>
>>>> wrote:
>>>>
>>>>> There's an ongoing effort to introduce quantile() into numpy.  You'd
>>>>> use it just like percentile(), but would input your q value in probability
>>>>> space (0.5 for 50%):
>>>>>
>>>>> https://github.com/numpy/numpy/pull/9213
>>>>>
>>>>> Since there's a great deal of overlap between these two functions,
>>>>> we'd like to solicit opinions on how to move forward on this.
>>>>>
>>>>> The current thinking is to tolerate the redundancy and keep both,
>>>>> using one as the engine for the other.  I'm partial to having quantile
>>>>> because 1.) I prefer probability space, and 2.) I have a PR waiting on
>>>>> quantile().
>>>>>
>>>>> Best,
>>>>>
>>>>> C
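
As a rough sketch of "using one as the engine for the other", a quantile
function could simply rescale q and delegate to np.percentile. The name
quantile_sketch is hypothetical and this is not the implementation in
#9213; it only illustrates the probability-space interface described
above.

```python
import numpy as np


def quantile_sketch(a, q, **kwargs):
    """Hypothetical wrapper: q in [0, 1], delegating to np.percentile."""
    q = np.asarray(q, dtype=float)
    if np.any((q < 0.0) | (q > 1.0)):
        raise ValueError("quantiles must lie in [0, 1]")
    # percentile expects q in [0, 100], so rescale and pass everything else on.
    return np.percentile(a, 100.0 * q, **kwargs)


data = [1.0, 2.0, 3.0, 4.0]
print(quantile_sketch(data, 0.5))      # 2.5
print(np.percentile(data, 50))         # 2.5
```

The delegation could just as well go the other way, with percentile
dividing q by 100 and calling quantile.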
>>>>>
>>>>> _______________________________________________
>>>>> NumPy-Discussion mailing list
>>>>> NumPy-Discussion at python.org
>>>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>>>>
>>>>>