[Numpy-discussion] Make np.bincount output same dtype as weights
josef.pktd at gmail.com
Sat Mar 26 22:58:00 EDT 2016
On Sat, Mar 26, 2016 at 9:54 PM, Joseph Fox-Rabinovitz
<jfoxrabinovitz at gmail.com> wrote:
> Would it make sense to just make the output type large enough to hold the
> cumulative sum of the weights?
>
>
> - Joseph Fox-Rabinovitz
>
> ------ Original message ------
> From: Jaime Fernández del Río
> Date: Sat, Mar 26, 2016 16:16
> To: Discussion of Numerical Python
> Subject: [Numpy-discussion] Make np.bincount output same dtype as weights
>
> Hi all,
>
> I have just submitted a PR (#7464) that fixes an enhancement request
> (#6854), making np.bincount return an array of the same type as the weights
> parameter. This is an important deviation from current behavior, which
> always casts weights to double, and always returns a double array, so I
> would like to hear what others think about the worthiness of this. Main
> discussion points:
>
> * np.bincount now works with complex weights (yay!); I guess this should be
>   a pretty uncontroversial enhancement.
> * The return is of the same type as weights, which means that small integers
>   are very likely to overflow. This is exactly what #6854 requested, but
>   perhaps we should promote the output for integers to a long, as we do in
>   np.sum?
I always thought of bincount with weights just as a group-by sum. So
it would be easier to remember and have fewer surprises if it matches
the behavior of np.sum.
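
To make the comparison concrete, here is a minimal sketch (not code from the
PR) of how the accumulation choices differ for small-integer weights. The
array names are made up for illustration, and the np.add.at version is only
one assumed way to get a group-by sum with np.sum-style promotion.

```python
import numpy as np

groups = np.array([0, 1, 1, 2, 2, 2])
weights = np.full(6, 100, dtype=np.int8)

# Current behaviour: weights are cast to double, so the result is float64.
by_bincount = np.bincount(groups, weights=weights)

# np.sum promotes small integers to the platform default integer,
# so the total does not wrap around.
total = weights.sum()

# Accumulating in the weights' own dtype (what "same dtype as weights"
# implies) silently wraps for int8.
wrapped = weights.sum(dtype=np.int8)

# Assumed emulation of a group-by sum with np.sum-style promotion:
# accumulate into an output of at least the default integer width.
promoted = np.zeros(groups.max() + 1,
                    dtype=np.result_type(weights.dtype, np.int_))
np.add.at(promoted, groups, weights)

print(by_bincount.dtype, total, wrapped, promoted)
```

The wrapped int8 total is the kind of surprise that matching np.sum would
avoid.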
> * Boolean arrays stay boolean, and OR, rather than sum, the weights. Is this
>   what one would want? If we decide that integer promotion is the way to go,
>   perhaps booleans should get the same treatment?
Isn't this already calculating the sum, i.e. the count of True by group?
That is what a quick example with numpy 1.9.2 shows, although I don't think
I ever used bool weights before.
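
For reference, a minimal sketch of the two readings of boolean weights. The
first call is plain np.bincount, which casts the booleans to double and so
counts the True values per group. The second uses np.logical_or.at as an
assumed stand-in for the "stay boolean and OR" behaviour described in the
quoted text; it is not the PR's implementation, and the array names are made
up.

```python
import numpy as np

groups = np.array([0, 0, 1, 1, 1, 2])
flags = np.array([True, True, True, False, True, False])

# Sum reading: bool weights are cast to double, giving a float64
# count of True values in each group.
counts = np.bincount(groups, weights=flags)

# OR reading (illustrative emulation): True if any weight in the
# group is True, and the result stays boolean.
any_true = np.zeros(groups.max() + 1, dtype=bool)
np.logical_or.at(any_true, groups, flags)

print(counts, any_true)
```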
> * This new implementation currently supports all of the reasonable native
>   types, but has no fallback for user-defined types. I guess we should
>   attempt to cast the array to double, as before, if no native loop can be
>   found? It would be good to have a way of testing this, though; any
>   thoughts on how to go about it?
> * Does a behavior change like this require some deprecation period? What
>   would that look like?
> * I have also added broadcasting of weights to the full size of the input
>   list, so that one can do e.g. np.bincount([1, 2, 3], weights=2j) without
>   having to tile the single weight to the size of the bins list.
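
As a rough illustration of what the proposed call would compute, here is a
sketch that emulates it with existing NumPy functions. The helper name
bincount_like is hypothetical; it keeps the dtype of the weights and
broadcasts a scalar weight with np.broadcast_to, which is an assumption about
the intended semantics rather than the code in the PR.

```python
import numpy as np

def bincount_like(x, weights, minlength=0):
    """Hypothetical helper: group-by sum that keeps the weights' dtype."""
    x = np.asarray(x, dtype=np.intp)
    # Broadcast a scalar (or length-1) weight to the shape of x.
    w = np.broadcast_to(np.asarray(weights), x.shape)
    nbins = max(int(x.max()) + 1 if x.size else 0, minlength)
    out = np.zeros(nbins, dtype=w.dtype)
    # Unbuffered in-place add, so repeated indices accumulate.
    np.add.at(out, x, w)
    return out

result = bincount_like([1, 2, 3], weights=2j)
print(result.dtype, result)
```

With the inputs from the quoted example, each of bins 1 to 3 receives the
single complex weight once and bin 0 stays at zero.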
>
> Any other thoughts are very welcome as well!
(What about 2-D weights?)
Josef
>
> Jaime
>
> --
> (__/)
> ( O.o)
> ( > <) This is Bunny. Copy Bunny into your signature and help him with his
> plans for world domination.