Thanks, Francesc and Robert, for giving me a broader picture of where this fits in. I believe numexpr does not handle slicing, so that might be another thing to look at.

On Wed, Oct 5, 2016 at 4:26 PM, Robert McLeod wrote:

As Francesc said, Numexpr is going to get most of its power through grouping a series of operations so it can send blocks to the CPU cache and run the entire series of operations on the cache before returning the block to system memory.  If it was just used to back-end NumPy, it would only gain from the multi-threading portion inside each function call.

Is that so?

I thought numexpr also cuts down on the number of temporary buffers that get filled (in other words, copy operations) compared with writing the same expression as a series of separate operations. My understanding may be wrong, and I would appreciate correction.
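
To make the question concrete, here is a rough NumPy-only sketch of what I mean. It is not how numexpr is implemented; `naive` and `blocked` are hypothetical helpers I made up for illustration, and I am assuming 1-D arrays of the same shape and dtype. The point is only that evaluating `a * b + c` block by block keeps the intermediate product in a small scratch buffer instead of a full-size temporary.

```python
import numpy as np

def naive(a, b, c):
    # a * b allocates a full-size temporary, then the addition
    # allocates a second full-size array for the result.
    return a * b + c

def blocked(a, b, c, block=4096):
    # Hypothetical helper: evaluate a * b + c one block at a time, so the
    # intermediate product only ever lives in a block-sized scratch buffer.
    out = np.empty_like(a)
    scratch = np.empty(block, dtype=a.dtype)
    for start in range(0, a.size, block):
        stop = min(start + block, a.size)
        tmp = scratch[: stop - start]          # view, no new allocation
        np.multiply(a[start:stop], b[start:stop], out=tmp)
        np.add(tmp, c[start:stop], out=out[start:stop])
    return out
```

Both functions should return the same values; the difference is only in how much intermediate memory is touched.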

The 'out' parameter in ufuncs can eliminate extra temporaries, but it's not composable. Right now I have to manually carry along the array in which the in-place operations take place. I think the goal here is to eliminate that.
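
As a small illustration of what "carrying the array along" looks like, here is a sketch for the expression `2*a + b*c`. The function names and the `out`/`scratch` arguments are my own, chosen for the example; I am assuming all arrays share one shape and a float dtype, and that `out` and `scratch` do not alias the inputs.

```python
import numpy as np

def expr_with_temporaries(a, b, c):
    # Natural spelling: 2 * a and b * c each allocate a temporary,
    # and the addition allocates the result.
    return 2 * a + b * c

def expr_with_out(a, b, c, out, scratch):
    # Same expression with ufunc `out=`: no new arrays are allocated,
    # but the caller has to supply and thread through both buffers.
    np.multiply(a, 2, out=out)        # out = 2 * a
    np.multiply(b, c, out=scratch)    # scratch = b * c
    np.add(out, scratch, out=out)     # out = out + scratch, in place
    return out
```

The second version avoids the allocations, but the expression is no longer readable as an expression, and every call site has to manage `out` and `scratch` itself.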