[Numpy-discussion] strange sin/cos performance
Andrew Friedley
afriedle at indiana.edu
Tue Aug 4 09:39:15 EDT 2009
Bruce Southey wrote:
> Hi,
> Can you try these from the command line:
> python -m timeit -n 100 -s "import numpy as np; a = np.arange(0.0, 1000,
> (2*3.14159) / 1000, dtype=np.float32)"
> python -m timeit -n 100 -s "import numpy as np; a = np.arange(0.0, 1000,
> (2*3.14159) / 1000, dtype=np.float32); b=np.sin(a)"
> python -m timeit -n 100 -s "import numpy as np; a = np.arange(0.0, 1000,
> (2*3.14159) / 1000, dtype=np.float32); np.sin(a)"
> python -m timeit -n 100 -s "import numpy as np; a = np.arange(0.0, 1000,
> (2*3.14159) / 1000, dtype=np.float32)" "np.sin(a)"
>
> The first should be similar for different dtypes because it is just
> array creation. The second extends that by storing the sin into another
> array. I am not sure how to interpret the third but in the Python prompt
> it would print it to screen. The last causes Python to handle two
> arguments which is slow using float32 but not for float64 and float128
> suggesting compiler issue such as not using SSE or similar.
Results:
$ python -m timeit -n 100 -s "import numpy as np; a = np.arange(0.0,
1000, (2*3.14159) / 1000, dtype=np.float32)"
100 loops, best of 3: 0.0811 usec per loop
$ python -m timeit -n 100 -s "import numpy as np; a = np.arange(0.0,
1000, (2*3.14159) / 1000, dtype=np.float32); b=np.sin(a)"
100 loops, best of 3: 0.11 usec per loop
$ python -m timeit -n 100 -s "import numpy as np; a = np.arange(0.0,
1000, (2*3.14159) / 1000, dtype=np.float32); np.sin(a)"
100 loops, best of 3: 0.11 usec per loop
$ python -m timeit -n 100 -s "import numpy as np; a = np.arange(0.0,
1000, (2*3.14159) / 1000, dtype=np.float32)" "np.sin(a)"
100 loops, best of 3: 112 msec per loop
$ python -m timeit -n 100 -s "import numpy as np; a = np.arange(0.0,
1000, (2*3.14159) / 1000, dtype=np.float64)" "np.sin(a)"
100 loops, best of 3: 13.2 msec per loop
I think the second and third are effectively the same; both create an
array containing the result. The second assigns that array to a variable,
while the third does not, so it should get garbage collected.
The fourth one is the only one that actually runs the sin in the timing
loop. I don't understand what you mean by causing Python to handle two
arguments?
The fifth run, which I added, uses float64 for comparison. float32 is
roughly 8.5 times slower (112 msec vs. 13.2 msec per loop), which
reproduces the problem.
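To make the setup/statement split concrete, here is a minimal sketch using
the timeit module directly instead of the command line. The helper name
best_time and the loop counts (number=10, repeat=3) are arbitrary choices
for illustration; only the array construction matches the commands above.
Code in the setup string runs once per repeat and is not timed; only the
statement is.

```python
import timeit

# Same array construction as in the command-line runs; {dtype} is filled in below.
SETUP = (
    "import numpy as np; "
    "a = np.arange(0.0, 1000, (2*3.14159) / 1000, dtype=np.{dtype})"
)


def best_time(stmt, dtype, number=10, repeat=3):
    """Return the best per-loop time in seconds for stmt (hypothetical helper)."""
    timer = timeit.Timer(stmt=stmt, setup=SETUP.format(dtype=dtype))
    # The setup string runs once per repeat and is excluded from the timing.
    return min(timer.repeat(repeat=repeat, number=number)) / number


# An empty statement: this is what runs 1-3 effectively measured.
setup_only = best_time("pass", "float32")

# sin inside the timed statement: this corresponds to runs 4 and 5.
sin_f32 = best_time("np.sin(a)", "float32")
sin_f64 = best_time("np.sin(a)", "float64")

print("pass only:       %.3g sec per loop" % setup_only)
print("np.sin, float32: %.3g sec per loop" % sin_f32)
print("np.sin, float64: %.3g sec per loop" % sin_f64)
```

When no statement is given, timeit times a bare "pass", which would explain
why the first three runs report a fraction of a microsecond. Whether float32
comes out slower than float64 will depend on the machine and the libm in use.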
Andrew