
Chris Barker wrote:
X = transpose([x]+[y])
well, I learned a little bit more about Numeric today.
I've been skimming through a lot of messages today because I was getting behind on mailing list traffic, but I think one thing has been missing from the discussion so far (sorry if it was mentioned already): transpose doesn't actually do any work. It only sets the "strides" counts differently, and this is blazingly fast.

What is NOT fast is using the transposed array later! The problem is that many routines actually require a contiguous array, and will make a temporary local contiguous copy. This may happen multiple times if the lifetime of the transposed array is long. Even routines that do not require a contiguous array and can actually use the strides may run significantly slower, because the large strides thrash the CPU cache.

Moral: you can't test this code by just looping 1000 times through the transpose call; you should also take into account the time it takes to make a contiguous array immediately after the transpose call.

Regards,

Rob Hooft

-- 
Rob W.W. Hooft || rob@hooft.net || http://www.hooft.net/people/rob/
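
A minimal sketch of the strides point above. It assumes modern NumPy as a stand-in for Numeric (the original discussion predates NumPy), and the names base, X and Xc are only illustrative. It shows that the transpose is a view with swapped strides, and that the real cost only appears when a contiguous copy is made:

```python
import numpy as np

x = np.arange(5, dtype=np.float64)
y = x * 2.0

# Roughly what transpose([x]+[y]) builds: a (2, 5) array, then its transpose.
base = np.array([x, y])
X = np.transpose(base)

# The transpose is only a view: same memory, swapped shape and strides.
print(X.shape, X.strides)           # strides are in bytes
print(np.shares_memory(X, base))    # no data was copied
print(X.flags['C_CONTIGUOUS'])      # the view is not C-contiguous

# This is the step that actually moves data; routines that need a
# contiguous array do something like it internally, possibly repeatedly.
Xc = np.ascontiguousarray(X)
print(Xc.strides, Xc.flags['C_CONTIGUOUS'])
```

A fair benchmark would therefore time the transpose together with the contiguous copy (for example by wrapping both calls in one function passed to timeit), rather than the transpose call alone.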