2012/1/21 Ondřej Čertík <ondrej.certik@gmail.com>
<snip>

> Let me know if you figure out something. I think the "mask" thing is
> quite slow, but the problem is that it needs to be there, to catch
> overflows (and it is there in Fortran as well, see the
> "where" statement, which does the same thing). Maybe there is some
> other way to write the same thing in NumPy?

In the current master, you can replace

    z[mask] *= z[mask]
    z[mask] += c[mask]

with

    np.multiply(z, z, out=z, where=mask)
    np.add(z, c, out=z, where=mask)
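
As a quick sanity check that the two forms agree, here is a minimal sketch. The array shape, the random data, and the `abs(z) < 2` mask are assumptions chosen only for illustration, and it assumes a NumPy new enough to accept `where=` on ufuncs (1.7 or later):

```python
import numpy as np

# Hypothetical test data: complex arrays and a mask selecting "small" values.
rng = np.random.default_rng(0)
shape = (200, 200)
z = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
c = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
mask = np.abs(z) < 2.0

# Original form: boolean indexing copies the selected elements out and back.
z_index = z.copy()
z_index[mask] *= z_index[mask]
z_index[mask] += c[mask]

# Alternative form: the ufunc writes in place, only where mask is True.
z_where = z.copy()
np.multiply(z_where, z_where, out=z_where, where=mask)
np.add(z_where, c, out=z_where, where=mask)

# Both approaches should give the same values, and elements outside
# the mask should be left untouched.
same = np.allclose(z_index, z_where)
untouched = np.array_equal(z_where[~mask], z[~mask])
print(same, untouched)
```

Because `out=z` is passed, elements where the mask is False keep their previous values, which is what makes this a drop-in replacement for the indexed version.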

The performance of this alternative syntax is still not great, but it is significantly faster than what it replaces. For one particular choice of mask, I get roughly a 7x speedup:

    In [40]: timeit z[mask] *= z[mask]
    10 loops, best of 3: 29.1 ms per loop

    In [41]: timeit np.multiply(z, z, out=z, where=mask)
    100 loops, best of 3: 4.2 ms per loop
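
To repeat the comparison outside IPython, something along these lines should work. This is only a sketch: the array size, the roughly half-True random mask, and the helper names `fancy` and `masked_ufunc` are assumptions, not the setup used for the numbers above, so the absolute times and the ratio will differ by machine and by mask:

```python
import timeit
import numpy as np

# Hypothetical setup: a real-valued array and a mask that is about half True.
rng = np.random.default_rng(0)
z0 = rng.standard_normal(200_000)
mask = rng.random(z0.size) < 0.5

def fancy():
    # Boolean-index version; copy first so every run starts from the same data.
    z = z0.copy()
    z[mask] *= z[mask]
    return z

def masked_ufunc():
    # In-place ufunc version with where=mask.
    z = z0.copy()
    np.multiply(z, z, out=z, where=mask)
    return z

# Best of 3 repeats, 10 calls each, reported per call (includes the copy).
t_fancy = min(timeit.repeat(fancy, number=10, repeat=3)) / 10
t_ufunc = min(timeit.repeat(masked_ufunc, number=10, repeat=3)) / 10
print("z[mask] *= z[mask]:        %.3f ms" % (t_fancy * 1e3))
print("np.multiply(..., where=):  %.3f ms" % (t_ufunc * 1e3))
```

Both timings include the cost of `z0.copy()`, so the measured ratio understates the difference between the two operations themselves.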


-Mark



Ondrej