On Friday 22 May 2009 11:55:31 Gregor Thalhammer wrote:
dmitrey wrote:
Hi all, I expected to get some speedup from using ldexp, or from multiplying an array by a power of 2 (doesn't that only require a simple shift of the mantissa?), but I don't see any.
# Let me also note:
# 1) using b = 2 * ones(N) or b = zeros(N) doesn't yield any speedup vs. b = rand()
# 2) using A * 2.0 (or just 2) instead of 2.1 doesn't yield any speedup, even though it is an exact integer power of 2.
On recent processors multiplication is very fast and takes 1.5 clock cycles (float, double precision), independent of the values. There is very little gain by using bit shift operators.
...unless you use the vectorization capabilities of modern Intel-compatible processors and shift data in bunches of up to 4 elements (i.e. the number of floats that fit in a 128-bit SSE2 register), in which case you can perform operations at a speed of up to 0.25 cycles/element. Admittedly, that requires dealing with SSE2 instructions in your code, but with the latest GCC, ICC or MSVC implementations, this is not that difficult.

Cheers,

-- 
Francesc Alted
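
A footnote on the original question, as a rough sketch rather than anything from the posters: for normal double-precision values (no overflow or underflow), scaling by a power of 2 leaves the mantissa bits alone and only changes the exponent field, which is also what ldexp does. The helper name `fields` and the sample value are made up for illustration; only the Python standard library is used.

```python
import math
import struct


def fields(x):
    """Split a float64 into its (sign, biased exponent, mantissa) bit fields.

    Hypothetical helper for illustration; assumes IEEE 754 double precision.
    """
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF
    mantissa = bits & ((1 << 52) - 1)
    return sign, exponent, mantissa


x = 2.1
s0, e0, m0 = fields(x)
s1, e1, m1 = fields(x * 8.0)            # multiply by 2**3
s2, e2, m2 = fields(math.ldexp(x, 3))   # same scaling via ldexp

print(e1 - e0, m1 == m0)                # 3 True: exponent moved, mantissa unchanged
print((s1, e1, m1) == (s2, e2, m2))     # True: ldexp gives the identical bit pattern
```

So there is no mantissa shift to exploit here. A power-of-2 factor goes through the same multiplier as any other value, which is consistent with the timings reported above.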