[Numpy-discussion] quantile() or percentile()

Chun-Wei Yuan chunwei.yuan at gmail.com
Thu Aug 3 17:36:45 EDT 2017


Cool.  Just as a heads up, for my algorithm to work, I actually need the
indices, which is why argsort() is so important to me.  I use it to get
both ap_sorted and ws_sorted variables.  If your weighted-quantile algo is
faster and doesn't require those indices, please by all means change my
implementation.  Thanks.
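
For anyone following along, here is a minimal sketch of why the argsort()
indices matter for a weighted quantile: the same permutation has to be
applied to both the values and the weights. The names ap_sorted and
ws_sorted come from the message above. The function name
weighted_quantile_sketch and the inverse-CDF rule without interpolation
are assumptions made for illustration, and this is not necessarily what
the PR does.

```python
import numpy as np


def weighted_quantile_sketch(a, q, weights):
    """Hypothetical helper: weighted quantile of a 1-D array, q in [0, 1].

    Uses a plain inverse-CDF rule with no interpolation (an assumption).
    """
    a = np.asarray(a, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if a.ndim != 1 or a.shape != weights.shape:
        raise ValueError("a and weights must be 1-D and the same length")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    if np.any(weights < 0) or weights.sum() <= 0:
        raise ValueError("weights must be non-negative with a positive sum")

    # One argsort gives the permutation that is applied to both arrays.
    order = np.argsort(a)
    ap_sorted = a[order]
    ws_sorted = weights[order]

    # Cumulative weight acts as an unnormalized empirical CDF.
    cum_weights = np.cumsum(ws_sorted)
    target = q * cum_weights[-1]

    # First sorted value whose cumulative weight reaches the target.
    idx = np.searchsorted(cum_weights, target, side="left")
    return ap_sorted[min(idx, a.size - 1)]


result = weighted_quantile_sketch([3, 1, 2], 0.5, [1, 1, 2])
```

Sorting costs O(n log n), which is the cost the quoted reply below
proposes to avoid with a partition-based approach.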

On Thu, Aug 3, 2017 at 11:10 AM, Joseph Fox-Rabinovitz
<jfoxrabinovitz at gmail.com> wrote:

> Not that I know of. The algorithm is very simple, requiring a
> relatively small addition to the current introselect algorithm used
> for `np.partition`. My biggest hurdle is figuring out how the calling
> machinery really works so that I can figure out which input type
> permutations I need to generate, and how to get the right backend
> running for a given function call.
>
>     -Joe
>
> On Thu, Aug 3, 2017 at 1:00 PM, Chun-Wei Yuan <chunwei.yuan at gmail.com>
> wrote:
> > Any way I can help expedite this?
> >
> > On Fri, Jul 21, 2017 at 4:42 PM, Chun-Wei Yuan <chunwei.yuan at gmail.com>
> > wrote:
> >>
> >> That would be great.  I just used np.argsort because it was familiar to
> >> me.  Didn't know about the C code.
> >>
> >> On Fri, Jul 21, 2017 at 3:43 PM, Joseph Fox-Rabinovitz
> >> <jfoxrabinovitz at gmail.com> wrote:
> >>>
> >>> While #9211 is a good start, it is fairly inefficient because it
> >>> performs an O(n log n) sort of the array. It is possible to reduce the
> >>> time to O(n) by using a partitioning algorithm similar to the one in
> >>> the C code of percentile. I will look into it as soon as I can.
> >>>
> >>>     -Joe
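
As a rough illustration of the sort-versus-partition point above, the
sketch below picks the k-th smallest element of an unweighted array two
ways: with a full sort, and with np.partition, which only guarantees that
position k holds the value it would hold in sorted order. The helper names
are hypothetical, and the weighted extension described in the message is
not shown here.

```python
import numpy as np


def kth_smallest_by_sort(a, k):
    """Full sort, O(n log n): every element ends up in its final place."""
    return np.sort(a)[k]


def kth_smallest_by_partition(a, k):
    """Selection via np.partition: only position k is guaranteed final."""
    return np.partition(a, k)[k]


rng = np.random.default_rng(0)
data = rng.normal(size=1001)
k = data.size // 2  # middle element of an odd-length array

by_sort = kth_smallest_by_sort(data, k)
by_partition = kth_smallest_by_partition(data, k)

# After partitioning, everything left of k is <= the pivot value and
# everything right of k is >= it, but neither side is itself sorted.
partitioned = np.partition(data, k)
left_ok = bool(np.all(partitioned[:k] <= partitioned[k]))
right_ok = bool(np.all(partitioned[k + 1:] >= partitioned[k]))
```

Both helpers return the same value; the difference is only in how much
ordering work is done to find it.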
> >>>
> >>> On Fri, Jul 21, 2017 at 5:34 PM, Chun-Wei Yuan <chunwei.yuan at gmail.com>
> >>> wrote:
> >>>>
> >>>> Just to provide some context, #9213 was actually spun off from this PR:
> >>>>
> >>>> https://github.com/numpy/numpy/pull/9211
> >>>>
> >>>> which might address the weighted inputs issue Joe brought up.
> >>>>
> >>>> C
> >>>>
> >>>> On Fri, Jul 21, 2017 at 2:21 PM, Joseph Fox-Rabinovitz
> >>>> <jfoxrabinovitz at gmail.com> wrote:
> >>>>>
> >>>>> I think that there would be a very good reason to have a separate
> >>>>> function if we were to introduce weights to the inputs, similarly to
> >>>>> the way that we have mean and average. This would have some (positive)
> >>>>> repercussions like making weighted histograms with the Freedman-Diaconis
> >>>>> binwidth estimator a possibility. I have had this change on the
> >>>>> back-burner for a long time, mainly because I was too lazy to figure out
> >>>>> how to include it in the C code. However, I will take a closer look.
> >>>>>
> >>>>> Regards,
> >>>>>
> >>>>>     -Joe
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Fri, Jul 21, 2017 at 5:11 PM, Chun-Wei Yuan <chunwei.yuan at gmail.com>
> >>>>> wrote:
> >>>>>>
> >>>>>> There's an ongoing effort to introduce quantile() into numpy.  You'd
> >>>>>> use it just like percentile(), but would input your q value in
> >>>>>> probability space (0.5 for 50%):
> >>>>>>
> >>>>>> https://github.com/numpy/numpy/pull/9213
> >>>>>>
> >>>>>> Since there's a great deal of overlap between these two functions,
> >>>>>> we'd like to solicit opinions on how to move forward on this.
> >>>>>>
> >>>>>> The current thinking is to tolerate the redundancy and keep both,
> >>>>>> using one as the engine for the other.  I'm partial to having quantile
> >>>>>> because 1.) I prefer probability space, and 2.) I have a PR waiting on
> >>>>>> quantile().
> >>>>>>
> >>>>>> Best,
> >>>>>>
> >>>>>> C
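
To make the "probability space" point above concrete, a short sketch of
the relationship between the two functions: a q of 0.5 passed to quantile
corresponds to 50 passed to percentile. This assumes a NumPy release that
ships np.quantile (1.15 or later), which did not yet exist when this
thread was written; the sample data is made up for illustration.

```python
import numpy as np

data = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# Scalar case: q is a probability for quantile, a percentage for percentile.
q = 0.5
from_quantile = np.quantile(data, q)
from_percentile = np.percentile(data, 100 * q)

# The same scaling applies when q is a sequence.
qs = np.array([0.25, 0.5, 0.75])
many_from_quantile = np.quantile(data, qs)
many_from_percentile = np.percentile(data, 100 * qs)
```

The only difference a caller sees is the factor of 100 on q, which is why
one function can serve as the engine for the other.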
> >>>>>>
> >>>>>> _______________________________________________
> >>>>>> NumPy-Discussion mailing list
> >>>>>> NumPy-Discussion at python.org
> >>>>>> https://mail.python.org/mailman/listinfo/numpy-discussion