Numpy slow at vector cross product?

Tue Nov 22 09:45:06 EST 2016

On 22/11/2016 12:45, BartC wrote:
> On 22/11/2016 12:34, Skip Montanaro wrote:
>>> I'm simply suggesting there is plenty of room for improvement. I even
>> showed a version that did *exactly* what numpy does (AFAIK) that was
>> three
>> times the speed of numpy even executed by CPython. So there is some
>> mystery
>> there.
>>
>> As I indicated in my earlier response, your version doesn't pass all of
>> numpy's cross product unit tests. Fix that and submit a patch to the
>> numpy
>> maintainers. I suspect it would be accepted.
>
> I saw your response but didn't understand it. My code was based around
> what Peter Otten posted from numpy sources.
>
> I will have a look. Don't forget however that all someone is trying to
> do is to multiply two vectors. They're not interested in axes
> transformation or making them broadcastable, whatever that means.

It seems the posted code of numpy.cross wasn't complete. The full code 
is below (I've removed the doc string, but have had to add some 'np.' 
prefixes as it is now out of context).

If anyone is still wondering why numpy.cross is slow, then no further 
comments are necessary!

----------------------------------------------
def cross(a, b, axisa=-1, axisb=-1, axisc=-1, axis=None):

     if axis is not None:
         axisa, axisb, axisc = (axis,) * 3
     a = np.asarray(a)
     b = np.asarray(b)
     # Check axisa and axisb are within bounds
     axis_msg = "'axis{0}' out of bounds"
     if axisa < -a.ndim or axisa >= a.ndim:
         raise ValueError(axis_msg.format('a'))
     if axisb < -b.ndim or axisb >= b.ndim:
         raise ValueError(axis_msg.format('b'))
     # Move working axis to the end of the shape
     a = np.rollaxis(a, axisa, a.ndim)
     b = np.rollaxis(b, axisb, b.ndim)
     msg = ("incompatible dimensions for cross product\n"
            "(dimension must be 2 or 3)")
     if a.shape[-1] not in (2, 3) or b.shape[-1] not in (2, 3):
         raise ValueError(msg)

     # Create the output array
     shape = np.broadcast(a[..., 0], b[..., 0]).shape
     if a.shape[-1] == 3 or b.shape[-1] == 3:
         shape += (3,)
         # Check axisc is within bounds
         if axisc < -len(shape) or axisc >= len(shape):
             raise ValueError(axis_msg.format('c'))
     dtype = np.promote_types(a.dtype, b.dtype)
     cp = np.empty(shape, dtype)

     # create local aliases for readability
     a0 = a[..., 0]
     a1 = a[..., 1]
     if a.shape[-1] == 3:
         a2 = a[..., 2]
     b0 = b[..., 0]
     b1 = b[..., 1]
     if b.shape[-1] == 3:
         b2 = b[..., 2]
     if cp.ndim != 0 and cp.shape[-1] == 3:
         cp0 = cp[..., 0]
         cp1 = cp[..., 1]
         cp2 = cp[..., 2]

     if a.shape[-1] == 2:
         if b.shape[-1] == 2:
             # a0 * b1 - a1 * b0
             multiply(a0, b1, out=cp)
             cp -= a1 * b0
             return cp
         else:
             assert b.shape[-1] == 3
             # cp0 = a1 * b2 - 0  (a2 = 0)
             # cp1 = 0 - a0 * b2  (a2 = 0)
             # cp2 = a0 * b1 - a1 * b0
             np.multiply(a1, b2, out=cp0)
             np.multiply(a0, b2, out=cp1)
             np.negative(cp1, out=cp1)
             np.multiply(a0, b1, out=cp2)
             cp2 -= a1 * b0
     else:
         assert a.shape[-1] == 3
         if b.shape[-1] == 3:
             # cp0 = a1 * b2 - a2 * b1
             # cp1 = a2 * b0 - a0 * b2
             # cp2 = a0 * b1 - a1 * b0
             np.multiply(a1, b2, out=cp0)
             tmp = np.array(a2 * b1)
             cp0 -= tmp
             np.multiply(a2, b0, out=cp1)
             np.multiply(a0, b2, out=tmp)
             cp1 -= tmp
             np.multiply(a0, b1, out=cp2)
             np.multiply(a1, b0, out=tmp)
             cp2 -= tmp
         else:
             assert b.shape[-1] == 2
             # cp0 = 0 - a2 * b1  (b2 = 0)
             # cp1 = a2 * b0 - 0  (b2 = 0)
             # cp2 = a0 * b1 - a1 * b0
             np.multiply(a2, b1, out=cp0)
             np.negative(cp0, out=cp0)
             np.multiply(a2, b0, out=cp1)
             np.multiply(a0, b1, out=cp2)
             cp2 -= a1 * b0

     # This works because we are moving the last axis
     return np.rollaxis(cp, -1, axisc)

-- 
Bartc