Hi,
I compared, for a 256x256 float32 normal-noise (x0=100, sigma=1) array, the times to do 1./(a*a) vs. a**-2:

U.timeIt('1./(a*a)', 1000)
(0.00090877471871, 0.00939644563778, 0.00120674694689, 0.000687777554628)
U.timeIt('a**-2', 1000)
(0.00876591857354, 0.0263829620803, 0.00952076311375, 0.00173311803255)

The numbers are min, max, mean, stddev over a thousand runs.
N.__version__ == 1.0.1

The slowdown is almost 10-fold. Similar tests for **-1 and **2 show that the corresponding times are identical, i.e. those cases are optimized to not call the pow routine.

Can this be fixed for the **-2 case?

Thanks,
Sebastian Haase
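For anyone wanting to reproduce the comparison, here is a rough sketch using the standard timeit module. U.timeIt is the poster's own helper and is not shown in the thread, so time_expr below is a hypothetical stand-in that reports the same four statistics; the random seed and the use of np.random.default_rng are also assumptions. Timings will differ by machine and NumPy version.

```python
import timeit

import numpy as np

# 256x256 float32 array of normal noise, mean 100, sigma 1 (as in the post).
rng = np.random.default_rng(0)  # seed chosen arbitrarily for repeatability
a = rng.normal(loc=100.0, scale=1.0, size=(256, 256)).astype(np.float32)


def time_expr(expr, runs=1000):
    """Hypothetical stand-in for U.timeIt: time `expr` once per run and
    return (min, max, mean, stddev) of the per-run times in seconds."""
    timer = timeit.Timer(expr, globals={"a": a})
    times = np.array(timer.repeat(repeat=runs, number=1))
    return times.min(), times.max(), times.mean(), times.std()


stats_mul = time_expr("1.0 / (a * a)")
stats_pow = time_expr("a ** -2")

print("1./(a*a):", stats_mul)
print("a**-2:   ", stats_pow)
print("ratio of mean times:", stats_pow[2] / stats_mul[2])
```

The ratio printed on the last line is the figure to compare against the roughly 10-fold slowdown reported above.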
On 7/15/07, Sebastian Haase <haase@msg.ucsf.edu> wrote:
> Hi,
> I compared, for a 256x256 float32 normal-noise (x0=100, sigma=1) array, the times to do 1./(a*a) vs. a**-2:
>
> U.timeIt('1./(a*a)', 1000)
> (0.00090877471871, 0.00939644563778, 0.00120674694689, 0.000687777554628)
> U.timeIt('a**-2', 1000)
> (0.00876591857354, 0.0263829620803, 0.00952076311375, 0.00173311803255)
>
> The numbers are min, max, mean, stddev over a thousand runs.
> N.__version__ == 1.0.1
>
> The slowdown is almost 10-fold. Similar tests for **-1 and **2 show that the corresponding times are identical, i.e. those cases are optimized to not call the pow routine.
>
> Can this be fixed for the **-2 case?
Not without some testing and discussion. If I recall correctly, we fixed all of the cases where the optimized case had the same accuracy as the unoptimized case. Optimizing 'x**-2', because it involves two operations (* and /), starts to lose accuracy relative to pow(x,-2). The accuracy loss is relatively minor, however: an additional ULP (unit in the last place) or so, I believe. It's been a while, though, so I may have the details scrambled.

So, while I'm not dead set against it, I think we would definitely need to come to a consensus on how much accuracy we are willing to forgo for these notational conveniences, and document it accordingly. The uncontroversial ones already got optimized.

Then again, we could just leave things as they are, and when you're hungry for speed you could use '1/x**2'.

-- 
tim.hochberg@ieee.org
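The accuracy trade-off described in the reply can be checked with a small sketch like the one below. It is an illustration only, under some loud assumptions: it uses Python doubles and math.pow rather than NumPy float32 arrays, and ulp_error is a hypothetical helper that measures the error against an exact rational reference. It requires Python 3.9 or later for math.ulp, and the numbers will depend on the platform's pow implementation.

```python
import math
import random
from fractions import Fraction


def ulp_error(approx, exact):
    """Hypothetical helper: distance between a float result and the exact
    rational value, measured in units of ulp(approx)."""
    return abs(Fraction(approx) - exact) / Fraction(math.ulp(approx))


random.seed(0)  # arbitrary seed, for repeatability only
worst_pow = Fraction(0)
worst_mul = Fraction(0)

for _ in range(2000):
    x = random.gauss(100.0, 1.0)
    exact = 1 / (Fraction(x) ** 2)  # exact value of x**-2 for this float x
    # One library call: a single, usually well-rounded, result.
    worst_pow = max(worst_pow, ulp_error(math.pow(x, -2.0), exact))
    # Two operations: a multiply and a divide, each with its own rounding.
    worst_mul = max(worst_mul, ulp_error(1.0 / (x * x), exact))

print("worst error of pow(x, -2):", float(worst_pow), "ULP")
print("worst error of 1/(x*x):   ", float(worst_mul), "ULP")
```

Because 1/(x*x) rounds twice, its worst-case error is expected to be somewhat larger than that of a single well-implemented pow call, which is the sort of difference a consensus on acceptable accuracy would have to weigh.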
Participants (2): Sebastian Haase, Timothy Hochberg