Casting to np.byte before clearing values
Hi all,

I'm trying to understand why viewing an array as bytes before clearing makes the whole operation faster. I imagine there is some kind of special treatment for byte arrays but I've no clue.

    # Native float
    Z_float = np.ones(1000000, float)
    Z_int = np.ones(1000000, int)

    %timeit Z_float[...] = 0
    1000 loops, best of 3: 361 µs per loop

    %timeit Z_int[...] = 0
    1000 loops, best of 3: 366 µs per loop

    %timeit Z_float.view(np.byte)[...] = 0
    1000 loops, best of 3: 267 µs per loop

    %timeit Z_int.view(np.byte)[...] = 0
    1000 loops, best of 3: 266 µs per loop

Nicolas
On Mo, 2016-12-26 at 10:34 +0100, Nicolas P. Rougier wrote:
Hi all,
I'm trying to understand why viewing an array as bytes before clearing makes the whole operation faster. I imagine there is some kind of special treatment for byte arrays but I've no clue.
Sure, if it's a 1-byte width type, the code will end up calling `memset`. If it is not, it will end up calling a loop like:

    while (N > 0) {
        *dst = output;
        dst += 8; /* or whatever element size/stride is */
        --N;
    }

Now, why this gives such a difference, I don't really know, but I guess it is not too surprising and may depend on other things as well.

- Sebastian
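To reproduce the comparison outside IPython, here is a minimal sketch using the standard `timeit` module. The function names `clear_native` and `clear_via_bytes` are made up for illustration, and the timings will vary with the machine and NumPy version, so none are shown. It also checks that the byte view shares memory with the original array, which is why writing zeros through the view clears the floats.

```python
import timeit

import numpy as np

n = 1_000_000
z_float = np.ones(n, dtype=float)


def clear_native():
    # Assign through the array's own dtype (the generic element loop).
    z_float[...] = 0


def clear_via_bytes():
    # Assign through a 1-byte view of the same buffer.
    z_float.view(np.byte)[...] = 0


# The view is not a copy: it exposes the same memory as n * itemsize bytes.
byte_view = z_float.view(np.byte)
print(byte_view.shape == (n * z_float.itemsize,))
print(np.shares_memory(z_float, byte_view))

# Zeroing every byte leaves every float equal to 0.0.
clear_via_bytes()
print(bool((z_float == 0).all()))

# Best-of-3 timing, reported per call.
for fn in (clear_native, clear_via_bytes):
    best = min(timeit.repeat(fn, number=100, repeat=3)) / 100
    print(f"{fn.__name__}: {best * 1e6:.0f} µs per call")
```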
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Thanks for the explanation Sebastian, makes sense. Nicolas
On 26 Dec 2016, at 11:48, Sebastian Berg <sebastian@sipsolutions.net> wrote:
Sure, if it's a 1-byte width type, the code will end up calling `memset`. If it is not, it will end up calling a loop like:
while (N > 0) { *dst = output; dst += 8; /* or whatever element size/stride is */ --N; }
now why this gives such a difference, I don't really know, but I guess it is not too surprising and may depend on other things as well.
- Sebastian
On Mon, Dec 26, 2016 at 1:34 AM, Nicolas P. Rougier <Nicolas.Rougier@inria.fr> wrote:
I'm trying to understand why viewing an array as bytes before clearing makes the whole operation faster. I imagine there is some kind of special treatment for byte arrays but I've no clue.
I notice that the code is simply setting a value using broadcasting -- I don't think there is anything special about zero in that case. But your subject refers to "clearing" an array.

So I wonder if you have a use case where the performance difference matters, in which case _maybe_ it would be worth having a ndarray.zero() method that efficiently zeros out an array.

Actually, there is ndarray.fill():

    In [7]: %timeit Z_float[...] = 0
    1000 loops, best of 3: 380 µs per loop

    In [8]: %timeit Z_float.view(np.byte)[...] = 0
    1000 loops, best of 3: 271 µs per loop

    In [9]: %timeit Z_float.fill(0)
    1000 loops, best of 3: 363 µs per loop

which seems to take an insignificantly shorter time than assignment. Probably because it's doing exactly the same loop.

whereas a .zero() could use a memset, like it does with bytes.

can't say I have a use-case that would justify this, though.

-CHB
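As a rough illustration of what such a zeroing routine might look like in user code, here is a sketch of a hypothetical helper, `zero_inplace`. This is not a NumPy API; the name and the safety checks are assumptions made for this example. It uses the byte view only for C-contiguous arrays with at least one dimension and no object fields, assumes a numeric dtype whose all-zero byte pattern equals 0, and otherwise falls back to `ndarray.fill(0)`.

```python
import numpy as np


def zero_inplace(a):
    """Hypothetical helper: zero `a` in place, via a byte view when safe.

    Assumes a numeric dtype for which all-zero bytes represent 0.
    """
    if a.ndim >= 1 and a.flags.c_contiguous and not a.dtype.hasobject:
        # Contiguous buffer: write zeros through a 1-byte view.
        a.view(np.byte)[...] = 0
    else:
        # Strided, 0-d, or object arrays: use the regular fill.
        a.fill(0)
    return a


# Contiguous case: takes the byte-view path.
a = np.ones((3, 4))
zero_inplace(a)

# Non-contiguous case: only every other column is zeroed, via fill(0).
base = np.ones((4, 6))
zero_inplace(base[:, ::2])
print(base[0])
```

Whether the extra branch is worth it depends on whether the timing difference reported above matters for the workload in question.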
-- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov
Yes, clearing is not the proper word, but the "trick" only works for 0 (I'll get the same result in both cases). Nicolas
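A small sketch of why the byte view gives the expected result for 0 but not for a value such as 1: the assignment writes the value into every byte, not into every element. The example uses `np.int64` explicitly (an assumption, so that the element width is fixed at 8 bytes).

```python
import numpy as np

z = np.ones(4, dtype=np.int64)

# Writing 0 into every byte gives 0 in every element.
z.view(np.byte)[...] = 0
print(bool((z == 0).all()))

# Writing 1 into every byte does NOT give 1 in every element:
# each 8-byte element now holds the pattern 0x0101010101010101.
z.view(np.byte)[...] = 1
print(bool((z == 1).any()))
print(hex(int(z[0])))

# Assigning through the array's own dtype sets each element to 1.
z.fill(1)
print(bool((z == 1).all()))
```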
On 27 Dec 2016, at 20:52, Chris Barker <chris.barker@noaa.gov> wrote:
I notice that the code is simply setting a value using broadcasting -- I don't think there is anything special about zero in that case. But your subject refers to "clearing" an array.
So I wonder if you have a use case where the performance difference matters, in which case _maybe_ it would be worth having a ndarray.zero() method that efficiently zeros out an array.
Actually, there is ndarray.fill():
In [7]: %timeit Z_float[...] = 0
1000 loops, best of 3: 380 µs per loop
In [8]: %timeit Z_float.view(np.byte)[...] = 0
1000 loops, best of 3: 271 µs per loop
In [9]: %timeit Z_float.fill(0)
1000 loops, best of 3: 363 µs per loop
which seems to take an insignificantly shorter time than assignment. Probably because it's doing exactly the same loop.
whereas a .zero() could use a memset, like it does with bytes.
can't say I have a use-case that would justify this, though.
-CHB
participants (3)
- Chris Barker
- Nicolas P. Rougier
- Sebastian Berg