Hi everyone,
I’ve started work on getting image moments working in nD. I have two working alternative implementations: one with NumPy broadcasting and one with Numba JIT. Both are consistent with each other and with the existing Cython implementation, with to caveats:
1- my results are the *transpose* of the reference results. I suspected this might be because the reference uses x/y coordinates but that doesn’t seem to be the case. (see code below).
2- each of my implementations is ~10x slower than the Cython implementation.
Of these, problem 1 is the most pressing. (Problem 2 can be solved by special-casing 2D.) Can someone familiar with moments compare my implementations and the reference and tell me where I might have gone wrong? I’ve included all the code below for reference.
Ideas about speedups are also welcome! To that end, I’ve marked the slow lines in the broadcasting code. (I don’t know how to line profile in Numba; I think it’s currently impossible, in fact.)
Juan.
————————— Reference results
———————————— Values
(Notice the .T for `moments_central`, which is the existing Cython implementation)
In [189]: m.moments_central_nb(horse.astype(float), central_coords, 3)
Out[189]:
array([[ 8.779e+04, -1.621e-07, 1.301e+09, -6.370e+09],
[ -8.245e-07, 9.275e+07, -3.430e+09, 1.938e+12],
[ 9.873e+08, -1.363e+10, 1.349e+13, -3.604e+14],
[ -2.218e+10, 1.552e+12, -2.849e+14, 3.245e+16]])
In [190]: moments_central(horse.astype(float), *central_coords, 3).T
Out[190]:
array([[ 8.779e+04, -1.621e-07, 1.301e+09, -6.370e+09],
[ -8.245e-07, 9.275e+07, -3.430e+09, 1.938e+12],
[ 9.873e+08, -1.363e+10, 1.349e+13, -3.604e+14],
[ -2.218e+10, 1.552e+12, -2.849e+14, 3.245e+16]])
In [191]: m.moments_central(horse.astype(float), central_coords, 3)
Out[191]:
array([[ 8.779e+04, -1.037e-09, 1.301e+09, -6.370e+09],
[ -1.979e-09, 9.275e+07, -3.430e+09, 1.938e+12],
[ 9.873e+08, -1.363e+10, 1.349e+13, -3.604e+14],
[ -2.218e+10, 1.552e+12, -2.849e+14, 3.245e+16]])
———————————— Timings
In [185]: %timeit moments_central(horse.astype(float), *central_coords, 3)
100 loops, best of 3: 3.14 ms per loop
In [187]: %timeit m.moments_central_nb(horse.astype(float), central_coords, 3)
10 loops, best of 3: 35.7 ms per loop
In [188]: %timeit m.moments_central(horse.astype(float), central_coords, 3)
10 loops, best of 3: 39.4 ms per loop
————————— Implementations
———————————— Existing Cython
cpdef moments_central(image_t image, double cr, double cc, Py_ssize_t order):
cdef Py_ssize_t p, q, r, c
cdef double[:, ::1] mu = np.zeros((order + 1, order + 1), dtype=np.double)
cdef double val, dr, dc, dcp, drq
for r in range(image.shape[0]):
dr = r - cr
for c in range(image.shape[1]):
dc = c - cc
val = image[r, c]
dcp = 1
for p in range(order + 1):
drq = 1
for q in range(order + 1):
mu[p, q] += val * drq * dcp
drq *= dr
dcp *= dc
return np.asarray(mu)
———————————— Broadcasting
import numpy as np
import numba
from functools import reduce
def moments_central(image, center=None, order=3):
coords = np.nonzero(image)
values = image[coords]
coords_stack = np.column_stack(np.nonzero(image))
if center is None:
center = np.mean(coords_stack, axis=0)
ccoords = coords_stack - center
powers = np.arange(order + 1)
result_shape = (order + 1,) * image.ndim + (coords_stack.shape[0],)
coords_pow = []
for dim in range(image.ndim):
power_shape = [1,] * (image.ndim + 1)
power_shape[dim] = order + 1
powers = np.reshape(powers, power_shape)
ccoords_pow = ccoords[:, dim] ** powers # SLOW, 50% of time
bcast_coords_pow = np.broadcast_to(ccoords_pow, result_shape)
coords_pow.append(bcast_coords_pow)
result = np.sum(reduce(np.multiply, coords_pow + [values]), axis=-1) # SLOW, 30% of time
return result
———————————— Numba
def moments_central_nb(image, center=None, order=3):
mu = np.zeros((order + 1,) * image.ndim)
if center is None:
center = np.mean(np.column_stack(np.nonzero(image)), axis=0)
_moments_central_nb_loop(image, center, mu)
return mu
@numba.njit
def _moments_central_nb_loop(image, center, mu):
for coord, val in np.ndenumerate(image):
if val > 0:
for powers in np.ndindex(mu.shape):
curval = val
for c, ctr, p in zip(coord, center, powers):
dc = c - ctr
cr = 1
for pcount in range(p):
cr *= dc
curval *= cr
mu[powers] += curval