
Hi, thanks for your reply. Let me try to answer your points one by one.
* `np.add.at` should be able to do what you want (but of course is very slow right now, and maybe hard to get as fast as bincount even if improved).
It is indeed far too slow to be useful for my particular application.
* `out=` by itself usually does not mean to use the values in `out`. So I think you would need either a different name or another flag to indicate use of `out` (rather than overwriting it).
That makes a lot of sense. Are there existing functions with arguments of a similar nature, so that similar names can be used?
* bincount resizes its output dynamically, however, if you provide an output then resizing is not really feasible.
Yes, that is the main conceptual issue with this proposal. It would not change the "normal" operation of the function, though, so one could simply say: if you pass an output that is too small for the given indices, you get an error. In fact, you could go one step further and ask: besides `minlength`, why not also have an optional `length` parameter that gives an output of exactly the given length? The bins you are counting may well be fixed ahead of time, in which case indices that don't fit indicate an error. It would then make perfect sense that i) `bincount` raises an error if the length is too small for the given indices, and ii) the length can be specified either explicitly via `length` or implicitly via the output array, perhaps through a single suitably named parameter (not `length`) that accepts either an integer or an output array.
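To make the fixed-length semantics concrete, here is a rough sketch written as a plain Python wrapper around the existing function. The name `bincount_exact` and its `length` argument are hypothetical and only illustrate the behaviour I have in mind; they are not part of NumPy.

```python
import numpy as np

def bincount_exact(indices, weights=None, length=0):
    """Hypothetical wrapper: like np.bincount, but with exactly `length` bins.

    Raises ValueError instead of growing the output when an index
    does not fit into the requested number of bins.
    """
    indices = np.asarray(indices)
    if indices.size and indices.max() >= length:
        raise ValueError(
            f"index {indices.max()} does not fit into {length} bins"
        )
    # minlength pads up to `length`; the check above ensures the result
    # is never longer than that.
    return np.bincount(indices, weights=weights, minlength=length)
```

With this, `bincount_exact([0, 1, 1, 3], length=5)` would return five bins, while `length=3` would raise because index 3 does not fit.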
Probably you can find a design that solves this. If you always add only a moderate amount of points `np.add.at(tally, indices, weights)` may just be a good solution? (It is "very" slow, but if the problem is having the giant `tally` array, then a factor of 10 "too slow" probably doesn't matter)
I am really looking at situations where only a low-level summation loop will do, because both the inputs and outputs are large, e.g. counting O(1e8) indices with weights into output arrays of length O(1e7). And `bincount` already contains that exact loop; it is only missing the ability to continue counting into an existing array instead of allocating a new one.
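For reference, this is roughly what the two options available today look like when the data arrives in chunks. The helper names `accumulate_bincount` and `accumulate_add_at` are made up for illustration, and the array sizes are tiny placeholders rather than my real workload.

```python
import numpy as np

def accumulate_bincount(tally, indices, weights):
    # Workaround with the current API: bincount allocates a fresh
    # array of len(tally) on every call, which is then added to tally.
    # If an index is >= len(tally), the temporary is longer than tally
    # and the in-place addition raises a ValueError.
    tally += np.bincount(indices, weights=weights, minlength=len(tally))
    return tally

def accumulate_add_at(tally, indices, weights):
    # Unbuffered in-place addition: no temporary, but currently slow.
    np.add.at(tally, indices, weights)
    return tally

# Two small chunks accumulated into the same running total.
chunks = [
    (np.array([0, 1, 1]), np.array([1.0, 2.0, 3.0])),
    (np.array([3, 1]), np.array([0.5, 1.0])),
]

tally_a = np.zeros(4)
tally_b = np.zeros(4)
for idx, w in chunks:
    accumulate_bincount(tally_a, idx, w)
    accumulate_add_at(tally_b, idx, w)
```

Both variants give the same totals; the first pays for a temporary of the full output length per chunk, the second avoids the temporary but runs the slow `np.add.at` path. What I am asking for is the `bincount` loop without the temporary.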