Re: [Numpy-discussion] Rewrite np.histogram in c?

March 16, 2015

It might make sense to dispatch to different C implementations depending on
whether the bins are equally spaced (as created by passing an integer for the
np.histogram bins argument) or not equally spaced.

For equally spaced bins, getting a large speedup may be easier, and that
covers at least one common use case.
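
To make the dispatch idea concrete, here is a minimal sketch of how the choice
between the two paths could be made. It is only an illustration under stated
assumptions: pick_histogram_path and the labels "uniform" and "general" are
hypothetical names, not anything that exists in NumPy, and the tolerance used
to decide that edges are equally spaced is an arbitrary choice.

```python
import numpy as np

def pick_histogram_path(bins):
    """Return "uniform" or "general" for a np.histogram-style bins argument.

    Hypothetical helper: both the name and the labels are illustrative.
    """
    # An integer bin count always produces equal-width bins.
    if np.ndim(bins) == 0:
        return "uniform"
    # Otherwise bins is a sequence of edges; compare the widths.
    widths = np.diff(np.asarray(bins, dtype=float))
    # Assumed tolerance: widths equal up to a small relative error.
    if widths.size and np.allclose(widths, widths[0], rtol=1e-9, atol=0.0):
        return "uniform"
    return "general"
```

With this sketch, pick_histogram_path(10) and
pick_histogram_path(np.linspace(0, 1, 11)) both select the equal-width path,
while pick_histogram_path([0, 1, 2, 10]) selects the general one.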

-Robert

On Sun, Mar 15, 2015 at 11:00 PM, Jaime Fernández del Río <
jaime.frio@gmail.com> wrote:
...
On Sun, Mar 15, 2015 at 9:32 PM, Robert McGibbon <rmcgibbo@gmail.com>
wrote:
...
Hi,
numpy.histogram is implemented in Python, and is a little sluggish. This
has been discussed previously on the mailing list [1, 2]. It came up in a
project that I maintain, where a new feature is bottlenecked by
numpy.histogram, and one developer suggested a faster implementation in
Cython [3].
Would it make sense to reimplement this function in C, or in Cython? Is
moving functions like this from Python to C to improve performance within
the scope of the development roadmap for NumPy? I have started implementing
this in C [4], but I figured I should check in here first.
Where do you think the performance gains will come from? The PR in your
project that claims a 10x speed-up uses a method that is only fit for
equally spaced bins. I want to think that implementing that exact same
algorithm in Python with NumPy would be comparably fast, say within 2x.
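
For illustration, the equal-width approach can be sketched in pure NumPy
roughly as follows. This is a minimal sketch and not the code from the PR
mentioned above: histogram_uniform and value_range are hypothetical names, the
input is assumed to be non-empty when no range is given, and values lying
extremely close to a bin edge may land in a different bin than np.histogram
would choose because of floating-point rounding.

```python
import numpy as np

def histogram_uniform(a, bins=10, value_range=None):
    """Histogram with `bins` equal-width bins, without sorting or searching.

    Hypothetical helper for illustration only.
    """
    a = np.asarray(a, dtype=float).ravel()
    if value_range is None:
        lo, hi = a.min(), a.max()
    else:
        lo, hi = value_range
    if lo == hi:
        # Widen a degenerate range so the bin width is not zero.
        lo, hi = lo - 0.5, hi + 0.5
    # Drop values outside the range (NaN is dropped too).
    a = a[(a >= lo) & (a <= hi)]
    # Compute each value's bin index arithmetically.
    idx = ((a - lo) * (bins / (hi - lo))).astype(np.intp)
    # The last bin is closed on the right, so a == hi goes into bin bins - 1.
    idx = np.minimum(idx, bins - 1)
    counts = np.bincount(idx, minlength=bins)
    edges = np.linspace(lo, hi, bins + 1)
    return counts, edges
```

The only per-element work is one subtraction, one multiplication, a cast, and
the np.bincount pass, all of which run in compiled code.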
For the general case, NumPy is already doing most of the heavy lifting
(the sorting and the searching) in C: simply replicating the same
algorithmic approach entirely in C is unlikely to provide any major
speed-up. And if the change is to the algorithm, then we should first try
it out in Python.
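
As a rough picture of the sort-and-search approach for arbitrary bin edges,
here is a minimal sketch. It is not a copy of NumPy's source:
histogram_general is a hypothetical name, the edges are assumed to be
monotonically increasing, and weights, density, and blocking of large inputs
are all left out.

```python
import numpy as np

def histogram_general(a, edges):
    """Histogram for arbitrary increasing bin edges via sort + searchsorted.

    Hypothetical helper for illustration only.
    """
    sorted_a = np.sort(np.asarray(a, dtype=float).ravel())
    edges = np.asarray(edges, dtype=float)
    # For every edge but the last: how many samples lie strictly below it.
    below = np.searchsorted(sorted_a, edges[:-1], side="left")
    # For the last edge: how many samples are <= it (last bin is closed).
    upto_last = np.searchsorted(sorted_a, edges[-1], side="right")
    cumulative = np.concatenate([below, [upto_last]])
    # Differences of the cumulative counts are the per-bin counts.
    return np.diff(cumulative)
```

Both np.sort and np.searchsorted already run in C, which is why porting this
same approach wholesale is unlikely to change the running time much.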
That said, if you can speed things up 10x, I don't think there is going to
be much opposition to moving it to C!
Jaime
--
(\__/)
( O.o)
( > <) This is Bunny. Copy Bunny into your signature and help him with his
plans for world domination.