Dear all, I've just put online a (kind of) book on Numpy, and more specifically about vectorization methods. It's not yet finished, has not been reviewed, and it's a bit rough around the edges, but I think there is some material that can be interesting. I'm specifically happy with the boids example, which shows a nice combination of numpy and matplotlib strengths.

Book is online at: http://www.labri.fr/perso/nrougier/from-python-to-numpy/
Sources are available at: https://github.com/rougier/from-python-to-numpy

Comments/questions/fixes/ideas are of course welcome.

Nicolas
Nicolas,
From a quick glance, this looks really wonderful! I intend to point my students who are interested in numpy to it.
-CHB

On Thu, Dec 22, 2016 at 8:44 AM, Nicolas P. Rougier <Nicolas.Rougier@inria.fr> wrote:

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R
7600 Sand Point Way NE
Seattle, WA 98115

(206) 526-6959 voice
(206) 526-6329 fax
(206) 526-6317 main reception

Chris.Barker@noaa.gov
Hi Nicolas, that's very nice work!
> Comments/questions/fixes/ideas are of course welcome.
The boids example caught my attention too; some comments on it:

- I find using complex numbers here very natural; this should speed things up and also shorten the code (rotating without einsum, etc.); see the sketch below.
- You probably can speed things up by going to sparse arrays.
- You can go to really large numbers of 'birds' if you combine it with a preliminary splitting of space into squares, thus analyzing only birds from close squares.

Also, I think it's worth adding some operations with HSV / HSL color spaces, as those can be visualized easily, e.g. on some photo.

Thanks, Alex.
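For illustration, a minimal sketch of the complex-number idea (not code from the thread or the book; the array sizes and rotation angle are arbitrary):

```python
import numpy as np

# Sketch of the complex-number idea (illustrative, not from the book):
# store 2D positions and velocities as complex numbers, so rotating by
# an angle theta is a single multiplication by exp(1j * theta).
rng = np.random.default_rng(0)
n = 50
position = rng.uniform(0, 1, n) + 1j * rng.uniform(0, 1, n)
velocity = 0.01 * np.exp(1j * rng.uniform(0, 2 * np.pi, n))

theta = 0.1
velocity *= np.exp(1j * theta)  # rotate every velocity at once, no einsum
position += velocity            # advance all boids in one step

# Pairwise distances fall out of a single subtraction and abs():
distance = np.abs(position[:, None] - position[None, :])
```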
Thanks. I'm not sure how to use complex numbers with this example. Could you elaborate? For the preliminary splitting, a quadtree (scipy KDTree) could also help a lot, but I wanted to stick to numpy only. A simpler square splitting, as you suggest, could make things faster but requires some work. I'm not sure yet I see how to restrict the analysis to close squares. Nicolas
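For illustration, one way the square splitting could work (a sketch, not code from the thread; `build_grid` and `cell_size` are hypothetical names):

```python
import numpy as np

# Sketch of the square-splitting idea (illustrative; at least one boid
# is assumed): bin positions into a coarse grid and group boid indices
# by cell, so a neighbour query only has to look at the 3x3 block of
# cells around a boid instead of all pairs.
def build_grid(position, cell_size):
    cells = np.floor(position / cell_size).astype(int)  # (n, 2) cell coords
    order = np.lexsort((cells[:, 1], cells[:, 0]))      # boids sorted by cell
    sorted_cells = cells[order]
    # Boundaries between runs of identical cells in the sorted order.
    change = np.any(np.diff(sorted_cells, axis=0) != 0, axis=1)
    starts = np.concatenate(([0], np.nonzero(change)[0] + 1))
    ends = np.append(starts[1:], len(order))
    return {tuple(sorted_cells[s]): order[s:e] for s, e in zip(starts, ends)}

# Neighbour candidates for a boid in cell (cx, cy) are the indices stored
# under the nine cells (cx + dx, cy + dy) for dx, dy in {-1, 0, 1}.
```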
> I’ve seen vectorisation taken to the extreme, with negative consequences in terms of both speed and readability, in both Python and MATLAB codebases, so I would suggest some discussion / wisdom about when not to vectorise.
I agree, and there is actually a warning in the introduction about readability vs speed, with an example showing a clever optimization (by Jaime Fernández del Río) that is hardly readable for non-experts (including myself). Nicolas
I also have been stalking this email thread. First, excellent book!

Regarding the vectorization example mentioned above, one thing to note is that it increases the order of the algorithm relative to the pure Python. The vectorized approach uses correlate, which requires ~(len(seq) * len(sub)) FLOPs. In the case where the first element in sub is not equal to the vast majority of elements in seq, the basic approach requires ~len(seq) comparisons. Note that this is the case in the SO answer.

One fairly common thing I have seen in vectorized approaches is that the memory or operations required scale worse than strictly necessary. It may or may not be an issue, largely depending on the specifics of how it's used, but it usually indicates a better approach exists. That may be worth mentioning here.

Given that, I tried to come up with an "ideal" approach. stride_tricks can be used to convert seq to a 2D array, and then ideally each row could be compared to sub. However, I can't think of how to do that with numpy function calls other than comparing each element in the 2D array, requiring O(n_sub * n_seq) operations again. [array_equal](https://docs.scipy.org/doc/numpy/reference/generated/numpy.array_equal.html) is an example of that. Python list equality scales better: for instance, if x = [0]*n and y = [1]*n, x == y is very fast, and the time is independent of the value of n.

It seems a generalized ufunc "all_equal" with signature (i),(i)->() and short-circuit logic once the first non-equal element is encountered would be an important performance improvement. In the ideal case it is dramatically faster, and even if every element must be compared it is still probably meaningfully faster, since the boolean intermediate array isn't created. Even better would be to get the axis argument in place for generalized ufuncs; then this problem could be vectorized in one line with far better performance. If others think this is a good idea I will post an issue and attempt a solution.

On Sat, Dec 31, 2016 at 5:23 AM, Nicolas P. Rougier <Nicolas.Rougier@inria.fr> wrote:
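For reference, a sketch of the stride_tricks approach described above (illustrative; `find_sub` is a hypothetical helper, and it deliberately exhibits the O(n_sub * n_seq) scaling being discussed):

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def find_sub(seq, sub):
    """Indices where `sub` occurs in `seq`, via a 2D strided view.

    Illustrative only: every window is compared against `sub`, so this
    does O(len(seq) * len(sub)) work and builds the boolean intermediate
    array that a short-circuiting gufunc would avoid.
    """
    seq, sub = np.asarray(seq), np.asarray(sub)
    n = len(seq) - len(sub) + 1
    if n <= 0:
        return np.array([], dtype=int)
    # Overlapping windows of seq, one row per candidate start position.
    windows = as_strided(seq, shape=(n, len(sub)),
                         strides=(seq.strides[0], seq.strides[0]))
    return np.nonzero((windows == sub).all(axis=1))[0]
```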
I coded up an all_equal gufunc [here](https://github.com/mattharrigan/numpy_logical_gufuncs). Benchmark results are also in that repo. For the specific problem in the book which started this, it's 40x faster than the optimized code in the book. For large arrays which have an early non-equal element, it's dramatically faster (1000x) than the current alternative. For large arrays which are all equal, it's ~10% faster, due to eliminating the intermediate boolean array. For tiny arrays it's much faster, due to a single function call instead of at least two, but it's debatable how relevant speed is for tiny problems. Disclaimer: this is the first ufunc I have ever written.

On Tue, Jan 10, 2017 at 8:27 PM, Chris Barker - NOAA Federal <chris.barker@noaa.gov> wrote:
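For context, the short-circuit benefit can be approximated in pure NumPy by comparing in chunks (a stand-in sketch, not the gufunc from the repo above; the chunk size is arbitrary):

```python
import numpy as np

def all_equal_chunked(x, y, chunk=4096):
    """Short-circuiting equality test for two 1D arrays.

    A pure-NumPy stand-in (not the gufunc from the linked repo): it stops
    at the first chunk containing a mismatch, approximating the early-exit
    behaviour discussed in the thread.
    """
    x, y = np.asarray(x), np.asarray(y)
    if x.shape != y.shape:
        return False
    for start in range(0, len(x), chunk):
        if not np.array_equal(x[start:start + chunk], y[start:start + chunk]):
            return False
    return True
```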
participants (7)
- Alex Rogozhnikov
- Chris Barker
- Chris Barker - NOAA Federal
- Kiko
- Marmaduke Woodman
- Matthew Harrigan
- Nicolas P. Rougier