[Numpy-discussion] Broadcasting question

Thu Dec 4 10:26:37 EST 2008

Hi list,

Suppose I have an array a with dimensions (d1, d3) and an array b with
dimensions (d2, d3). I want to compute an array c with dimensions (d1,
d2) holding the squared Euclidean norms of the differences between each
row of a and each row of b, the rows being vectors of size d3.

My first take was to use a Python-level loop:

>>> from numpy import *
>>> c = array([sum((a_i - b) ** 2, axis=1) for a_i in a])

But this is too slow and allocates a useless temporary list of Python references.

To avoid the Python-level loop I then tried to use broadcasting as follows:

>>> c = sum((a[:,newaxis,:] - b) ** 2, axis=2)

But this builds a useless and huge (d1, d2, d3) temporary array that
does not fit in memory for large values of d1, d2 and d3...
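
To make the two attempts concrete, here is a minimal sketch with small,
made-up sizes. The sizes, the random data and the names c_loop, c_bcast
and diff are only there for illustration; the real arrays are much larger.
It checks that both versions agree and shows the shape and size of the
temporary that broadcasting creates.

```python
import numpy as np

# Small, made-up sizes chosen only for illustration.
d1, d2, d3 = 4, 5, 3
rng = np.random.default_rng(0)
a = rng.standard_normal((d1, d3))
b = rng.standard_normal((d2, d3))

# Python-level loop: one (d2, d3) temporary per row of a.
c_loop = np.array([np.sum((a_i - b) ** 2, axis=1) for a_i in a])

# Broadcasting: a single (d1, d2, d3) temporary holding every difference.
diff = a[:, np.newaxis, :] - b
c_bcast = np.sum(diff ** 2, axis=2)

print(diff.shape)                    # shape of the temporary: (d1, d2, d3)
print(c_bcast.shape)                 # shape of the result: (d1, d2)
print(np.allclose(c_loop, c_bcast))  # both versions give the same c
print(diff.nbytes)                   # d1 * d2 * d3 * 8 bytes for float64
```

The temporary grows as d1 * d2 * d3 while the result only needs d1 * d2
entries, which is the overhead I would like to avoid.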

Do you have any better idea? I would like to simulate a runtime
behavior similar to:

>>> c = dot(a, b.T)

but for squared Euclidean norms instead of dot products.

I can always write the code in C and wrap it with ctypes, but I
wondered whether this is possible with numpy alone.

-- 
Olivier