20 Mar
2010
12:26 p.m.
Anne Archibald wrote:
I'm not knocking numpy; it does (almost) the best it can. (I'm not sure of the optimality of the order in which ufuncs are executed; I think some optimizations there are possible.)
Ufuncs and reductions are not performed in a cache-optimal fashion; IIRC, dimensions are always traversed in order from left to right. Large speedups are possible in some cases, but in a quick try I didn't manage to come up with an algorithm that would always improve the speed (there was a thread about this last year or so, and there's a ticket). Results varied between computers, so this probably depends a lot on the actual cache arrangement. But perhaps numexpr has such heuristics, and we could steal them?

--
Pauli Virtanen
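
To make the traversal-order point concrete, here is a minimal sketch of one possible heuristic: order the loops by stride so that the innermost loop takes the smallest steps through memory. This is only an illustration under stated assumptions, not what NumPy or numexpr actually does; `contiguous_strides` and `loop_order` are hypothetical helper names, and the sketch looks at a single operand only.

```python
def contiguous_strides(shape, itemsize, order="C"):
    """Byte strides of a contiguous array with the given shape.

    order="C": last axis varies fastest; order="F": first axis varies fastest.
    """
    strides = [0] * len(shape)
    step = itemsize
    if order == "C":
        axes = range(len(shape) - 1, -1, -1)
    else:
        axes = range(len(shape))
    for ax in axes:
        strides[ax] = step
        step *= shape[ax]
    return tuple(strides)


def loop_order(shape, strides):
    """Hypothetical heuristic: axes listed from outermost to innermost loop.

    Axes with the largest absolute stride go outermost, so the innermost
    loop walks memory in the smallest steps. Ties keep left-to-right order
    because sorted() is stable.
    """
    return sorted(range(len(shape)), key=lambda ax: -abs(strides[ax]))


shape = (1000, 3)
c_strides = contiguous_strides(shape, 8, "C")  # (24, 8)
f_strides = contiguous_strides(shape, 8, "F")  # (8, 8000)

# C order: left-to-right traversal already has the small stride innermost.
print(loop_order(shape, c_strides))  # [0, 1]
# Fortran order: left-to-right puts the 8000-byte stride innermost,
# so the heuristic swaps the loops.
print(loop_order(shape, f_strides))  # [1, 0]
```

A rule like this ignores the case of several operands with conflicting layouts, and it says nothing about cache sizes, which may be part of why a single ordering did not win on every machine.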