
Hello, I would like to propose adding the `out` array as an optional parameter to `bincount`. This makes `bincount` very useful when iteratively tallying data with large indices. Consider this example tallying batches of values from some fictional source of data:
tally = np.zeros(10000**2) for indices, weights in read_sensor_data(): ... tally += np.bincount(indices, weights, 10000**2) # slow: repeatedly adding large arrays
This could be trivially sped up:
tally = np.zeros(10000**2) for indices, weights in read_sensor_data(): ... np.bincount(indices, weights, out=tally) # fast: plain sum-loop in C
As far as I can see, there is no equivalent numpy functionality. In fact, as far as I'm aware, there isn't any fast alternative outside of C/Cython/numba/... It also fits the purpose of `bincount` nicely, and does not change existing functionality. One might argue about the exact semantics if both `minlength` and `out` are given, but I think that a sensible answer exists in requiring `len(out) >= max(list.max(), minlength)`.

On Thu, 2022-10-20 at 23:26 +0000, ntessore@pm.me wrote:
Hello,
Hi, I don't have a strong opinion yet it does seem potentially useful, but I think there would be some details to hash out in the proposal. Some thoughts: * `np.add.at` should be able to do what you want (but of course is very slow right now, and maybe hard to get as fast as bincount even if improved). * `out=` by itself usually does not mean to use the values in `out`. So I think you would need either a different name or another flag to indicate use of `out` (rather than overwriting it). * bincount resizes its output dynamically, however, if you provide an output then resizing is not really feasible. Probably you can find a design that solves this. If you always add only a moderate amount of points `np.add.at(tally, indices, weights)` may just be a good solution? (It is "very" slow, but if the problem is having the giant `tally` array, then a factor of 10 "too slow" probably doesn't matter) - Sebastian
I would like to propose adding the `out` array as an optional parameter to `bincount`. This makes `bincount` very useful when iteratively tallying data with large indices.
Consider this example tallying batches of values from some fictional source of data:
tally = np.zeros(10000**2) for indices, weights in read_sensor_data(): ... tally += np.bincount(indices, weights, 10000**2) # slow: repeatedly adding large arrays
This could be trivially sped up:
tally = np.zeros(10000**2) for indices, weights in read_sensor_data(): ... np.bincount(indices, weights, out=tally) # fast: plain sum- loop in C
As far as I can see, there is no equivalent numpy functionality. In fact, as far as I'm aware, there isn't any fast alternative outside of C/Cython/numba/... It also fits the purpose of `bincount` nicely, and does not change existing functionality. One might argue about the exact semantics if both `minlength` and `out` are given, but I think that a sensible answer exists in requiring `len(out) >= max(list.max(), minlength)`. _______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3/lists/numpy-discussion.python.org/ Member address: sebastian@sipsolutions.net

Hi, thanks for your reply. Let me try and answer your points one by one.
* `np.add.at` should be able to do what you want (but of course is very slow right now, and maybe hard to get as fast as bincount even if improved).
It is indeed far too slow to be useful for my particular application.
* `out=` by itself usually does not mean to use the values in `out`. So I think you would need either a different name or another flag to indicate use of `out` (rather than overwriting it).
That makes a lot of sense. Are there existing functions with arguments of a similar nature, so that similar names can be used?
* bincount resizes its output dynamically, however, if you provide an output then resizing is not really feasible.
Yes, that's the moral issue with this proposal. But it would not make any difference to "normal" operation of the function. So you could say: if you pass a wrong input, you get an error. In fact, you could also go one step further and say: instead of `minlength`, why not also have an optional `length` parameter to get an output of exactly the given length? Because maybe the bins you are counting are fixed ahead of time, and if the indices don't fit, there has been an error. In that case, it would make perfect sense i) that `bincount` raises an error if the length is too small for the given indices, and ii) you can specify either `length` explicitly or implicitly via the output array, maybe via a suitably-named parameter that is not `length` but can be either integer or output array.
Probably you can find a design that solves this. If you always add only a moderate amount of points `np.add.at(tally, indices, weights)` may just be a good solution? (It is "very" slow, but if the problem is having the giant `tally` array, then a factor of 10 "too slow" probably doesn't matter)
I am really looking at situations where only a low-level loop for summation will do, because both the inputs and outputs are large, e.g. counting O(1e8) indices with weights into output lists of length O(1e7). And `bincount` already contains that exact loop, it is only missing the functionality to continue counting without creating a new list.

On Thu, 20 Oct 2022 23:26:37 -0000 ntessore@pm.me wrote:
As far as I can see, there is no equivalent numpy functionality. In fact, as far as I'm aware, there isn't any fast alternative outside of C/Cython/numba/..
We have cummulative histograms in silx ... and found it useful. Maybe it would worth having it in numpy. https://github.com/silx-kit/silx/blob/master/src/silx/math/chistogramnd.pyx#... Cheers, Jerome
participants (4)
-
Jerome Kieffer
-
Nicolas Tessore
-
ntessore@pm.me
-
Sebastian Berg