Hi all,
I'm trying to understand why viewing an array as bytes before clearing makes the whole operation faster. I imagine there is some kind of special treatment for byte arrays but I've no clue.
# Native float
Z_float = np.ones(1000000, float)
Z_int = np.ones(1000000, int)

%timeit Z_float[...] = 0
1000 loops, best of 3: 361 µs per loop
%timeit Z_int[...] = 0
1000 loops, best of 3: 366 µs per loop
%timeit Z_float.view(np.byte)[...] = 0
1000 loops, best of 3: 267 µs per loop
%timeit Z_int.view(np.byte)[...] = 0
1000 loops, best of 3: 266 µs per loop
Nicolas
On Mo, 2016-12-26 at 10:34 +0100, Nicolas P. Rougier wrote:
> Hi all,
> I'm trying to understand why viewing an array as bytes before clearing makes the whole operation faster. I imagine there is some kind of special treatment for byte arrays but I've no clue.
Sure, if it's a 1-byte width type, the code will end up calling `memset`. If it is not, it will end up running a loop like:
while (N > 0) {
    *dst = output;
    dst += 8;  /* or whatever element size/stride is */
    --N;
}
Now, why this gives such a difference, I don't really know, but I guess it is not too surprising and may depend on other things as well.
- Sebastian
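The two code paths described above can be imitated with the standard library alone. This is only a rough sketch, not NumPy's actual internals: `zero_with_loop` and `zero_with_memset` are hypothetical names, `array.array` stands in for an ndarray buffer, and `ctypes.memset` plays the role of the C call. It shows that both paths leave the same zeros behind for 8-byte floats; it makes no claim about timings.

```python
import ctypes
from array import array


def zero_with_loop(values):
    """Store 0.0 one element at a time, like the strided loop."""
    for i in range(len(values)):
        values[i] = 0.0


def zero_with_memset(values):
    """Overwrite the whole buffer with zero bytes in one memset call."""
    # buffer_info() returns (address, number of items), not bytes.
    address, n_items = values.buffer_info()
    ctypes.memset(address, 0, n_items * values.itemsize)


# Two identical arrays of 8-byte floats, initially all ones.
looped = array("d", [1.0] * 1000)
memset_ed = array("d", [1.0] * 1000)

zero_with_loop(looped)
zero_with_memset(memset_ed)

# Both approaches should leave identical contents (all 0.0).
print(looped == memset_ed)
```

The memset path never looks at the element type; it only needs the start address and the byte count, which is why a byte view can reach it.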
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion
Thanks for the explanation Sebastian, makes sense.
Nicolas
On 26 Dec 2016, at 11:48, Sebastian Berg sebastian@sipsolutions.net wrote:
> Sure, if it's a 1-byte width type, the code will end up calling `memset`.
On Mon, Dec 26, 2016 at 1:34 AM, Nicolas P. Rougier <Nicolas.Rougier@inria.fr> wrote:
> I'm trying to understand why viewing an array as bytes before clearing makes the whole operation faster. I imagine there is some kind of special treatment for byte arrays but I've no clue.
I notice that the code is simply setting a value using broadcasting -- I don't think there is anything special about zero in that case. But your subject refers to "clearing" an array.
So I wonder if you have a use case where the performance difference matters, in which case _maybe_ it would be worth having an ndarray.zero() method that efficiently zeros out an array.
Actually, there is ndarray.fill():
In [7]: %timeit Z_float[...] = 0
1000 loops, best of 3: 380 µs per loop
In [8]: %timeit Z_float.view(np.byte)[...] = 0
1000 loops, best of 3: 271 µs per loop
In [9]: %timeit Z_float.fill(0)
1000 loops, best of 3: 363 µs per loop
which takes an insignificantly shorter time than assignment, probably because it runs exactly the same loop, whereas a .zero() could use a memset, as the byte view does.
I can't say I have a use case that would justify this, though.
-CHB
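For illustration, the .zero() idea could be approximated today with a small helper built on the byte-view trick from this thread. This is a sketch under assumptions: `zero_inplace` is a hypothetical function, not part of NumPy, and the conditions it checks (C-contiguous, at least one dimension, plain numeric dtype) are a conservative guess at when the byte view is safe. Otherwise it falls back to fill(0).

```python
import numpy as np


def zero_inplace(arr):
    """Hypothetical helper (not a NumPy API): set every element to 0."""
    can_use_bytes = (
        arr.ndim >= 1                      # 0-d arrays cannot change itemsize
        and arr.flags.c_contiguous         # byte view needs a contiguous buffer
        and arr.dtype.kind in "biufc"      # all-zero bytes mean 0 for these kinds
    )
    if can_use_bytes:
        # Reinterpret the same memory as bytes and zero it.
        arr.view(np.uint8)[...] = 0
    else:
        # Strided or non-numeric data: use the regular element-wise fill.
        arr.fill(0)
    return arr


contiguous = np.ones((4, 5), dtype=float)
strided = np.ones((4, 6), dtype=np.int32)[:, ::2]  # non-contiguous view

zero_inplace(contiguous)
zero_inplace(strided)

print(contiguous.any(), strided.any())
```

Whether this is actually faster depends on the machine and NumPy version, so it is worth timing before relying on it.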
Yes, "clearing" is not the proper word, but the "trick" only works for 0 (where I get the same result in both cases).
Nicolas
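A quick way to see why only 0 works is to look at the raw bytes. The sketch below uses the standard `struct` module, assuming 8-byte little-endian IEEE 754 doubles and 64-bit integers as stand-ins for the float and int arrays above. Zero is all-zero bytes in both types, while any other value has a type-specific pattern that a single repeated byte cannot reproduce.

```python
import struct

# Byte patterns for zero: identical for double and 64-bit int.
zero_float = struct.pack("<d", 0.0)
zero_int = struct.pack("<q", 0)
print(zero_float == zero_int)  # True

# Byte patterns for one: they differ between the two types.
one_float = struct.pack("<d", 1.0)
one_int = struct.pack("<q", 1)
print(one_float.hex(), one_int.hex())

# Filling every byte with 0x01 (what a byte-view assignment of 1 would do)
# does not give the value 1 in either type.
all_ones_bytes = b"\x01" * 8
float_from_ones = struct.unpack("<d", all_ones_bytes)[0]
int_from_ones = struct.unpack("<q", all_ones_bytes)[0]
print(float_from_ones, int_from_ones)
```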
On 27 Dec 2016, at 20:52, Chris Barker chris.barker@noaa.gov wrote:
> I notice that the code is simply setting a value using broadcasting -- I don't think there is anything special about zero in that case. But your subject refers to "clearing" an array.
> --
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA 98115        (206) 526-6317   main reception
>
> Chris.Barker@noaa.gov