Dear all,
I've just put online a (kind of) book on NumPy, more specifically about vectorization methods. It's not yet finished, has not been reviewed, and is a bit rough around the edges, but I think there is some material that can be interesting. I'm particularly happy with the boids example, which shows a nice combination of NumPy and matplotlib strengths.
Book is online at: http://www.labri.fr/perso/nrougier/from-python-to-numpy/
Sources are available at: https://github.com/rougier/from-python-to-numpy
Comments/questions/fixes/ideas are of course welcome.
Nicolas
Nicolas,
From a quick glance, this looks really wonderful! I intend to point my
students that are interested in numpy to it.
-CHB
On Thu, Dec 22, 2016 at 8:44 AM, Nicolas P. Rougier < Nicolas.Rougier@inria.fr> wrote:
Comments/questions/fixes/ideas are of course welcome.
Nicolas
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion
2016-12-22 17:44 GMT+01:00 Nicolas P. Rougier Nicolas.Rougier@inria.fr:
Comments/questions/fixes/ideas are of course welcome.
Wow!!! Beautiful.
Thanks for sharing.
Hi Nicolas, that's very nice work!
Comments/questions/fixes/ideas are of course welcome.
The boids example caught my attention too. Some comments on it:
- I find using complex numbers very natural here; this should speed things up and also shorten the code (rotating without einsum, etc.)
- you can probably speed things up by going to sparse arrays
- and you can go to really large numbers of 'birds' if you combine it with a preliminary splitting of space into squares, so that you only analyze birds from nearby squares
Also, I think it is worth adding some operations with HSV / HSL color spaces, as those can be visualized easily, e.g. on some photo.
Thanks, Alex.
On 23 Dec 2016, at 12:14, Kiko kikocorreoso@gmail.com wrote:
Wow!!! Beautiful.
Thanks for sharing.
On 30 Dec 2016, at 20:36, Alex Rogozhnikov alex.rogozhnikov@yandex.ru wrote:
The boids example caught my attention too. Some comments on it:
- I find using complex numbers very natural here; this should speed things up and also shorten the code (rotating without einsum, etc.)
- you can probably speed things up by going to sparse arrays
- and you can go to really large numbers of 'birds' if you combine it with a preliminary splitting of space into squares, so that you only analyze birds from nearby squares
Thanks.
I'm not sure I know how to use complex numbers with this example. Could you elaborate?
For the preliminary splitting, a space-partitioning tree (a quadtree, or SciPy's KDTree) could also help a lot, but I wanted to stick to NumPy only. A simpler square splitting as you suggest could make things faster but would require some work. I don't yet see how to restrict the analysis to close squares.
Nicolas
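One possible way to restrict the analysis to close squares, sketched with NumPy only: if the squares have a side equal to the interaction radius, any neighbour within that radius has to sit in the same square or in one of the 8 adjacent ones. The function name `close_pairs_grid`, the `radius` parameter and the (n, 2) position layout are assumptions made for this illustration and are not taken from the book; the loop over occupied squares is still plain Python.

```python
import numpy as np


def close_pairs_grid(positions, radius):
    """Return index pairs (i, j), i < j, whose distance is below `radius`.

    `positions` is assumed to be a non-empty float array of shape (n, 2).
    """
    # Square (cell) coordinates of every bird; cells have side `radius`.
    cells = np.floor(positions / radius).astype(np.int64)
    cells -= cells.min(axis=0)               # shift so that indices start at 0
    n_x, n_y = cells.max(axis=0) + 1
    cell_id = cells[:, 0] * n_y + cells[:, 1]

    # Sort birds by cell so that each cell is a contiguous slice of `order`.
    order = np.argsort(cell_id, kind="stable")
    sorted_id = cell_id[order]

    def members(cx, cy):
        """Indices of the birds located in cell (cx, cy)."""
        if not (0 <= cx < n_x and 0 <= cy < n_y):
            return order[:0]
        cid = cx * n_y + cy
        lo = np.searchsorted(sorted_id, cid, side="left")
        hi = np.searchsorted(sorted_id, cid, side="right")
        return order[lo:hi]

    pairs = []
    for cid in np.unique(cell_id):
        cx, cy = divmod(int(cid), int(n_y))
        a = members(cx, cy)
        # Only the 3 x 3 block of cells around (cx, cy) can hold neighbours.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                b = members(cx + dx, cy + dy)
                if b.size == 0:
                    continue
                diff = positions[a][:, None] - positions[b][None, :]
                dist = np.linalg.norm(diff, axis=-1)
                ia, ib = np.nonzero(dist < radius)
                keep = a[ia] < b[ib]         # count every pair exactly once
                pairs.append(np.column_stack([a[ia][keep], b[ib][keep]]))

    pairs = np.concatenate(pairs)
    # Sort by first index, then by second index.
    return pairs[np.lexsort((pairs[:, 1], pairs[:, 0]))]
```

With roughly uniform positions, each bird is only compared with the birds of 9 squares instead of with the whole flock, at the price of one sort per time step.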
On 31 Dec 2016, at 2:09, Nicolas P. Rougier Nicolas.Rougier@inria.fr wrote:
Thanks.
I'm not sure I know how to use complex numbers with this example. Could you elaborate?
Position and velocity are encoded as complex numbers. Rotation is multiplication by exp(i \phi), translation is addition of a complex number. Distance = abs(x - y).
I think those are all the operations you need, but maybe I'm missing something.
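A minimal sketch of this complex-number encoding, assuming 2D boids stored in 1D complex arrays. The variable names, the number of boids and the random initial state are made up for the illustration and do not come from the book's code.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5  # number of boids (arbitrary, for illustration)

# Real part = x coordinate, imaginary part = y coordinate.
position = rng.uniform(0, 1, n) + 1j * rng.uniform(0, 1, n)
velocity = 0.01 * np.exp(1j * rng.uniform(0, 2 * np.pi, n))

# Rotation by an angle phi is a multiplication by exp(i * phi).
phi = np.pi / 2
rotated = velocity * np.exp(1j * phi)

# Translation is the addition of a complex number.
moved = position + velocity

# Pairwise distances abs(x - y), broadcast to an (n, n) matrix.
distance = np.abs(position[:, np.newaxis] - position[np.newaxis, :])
```

Whether this is actually faster than the (n, 2) float layout would have to be measured; the sketch only shows that the three operations listed above are enough to express rotation, translation and distance.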
On 22 Dec 2016, at 17:44, Nicolas P. Rougier Nicolas.Rougier@inria.fr wrote:
Comments/questions/fixes/ideas are of course welcome.
I’ve seen vectorisation taken to the extreme, with negative consequences in terms of both speed and readability, in both Python and MATLAB codebases, so I would suggest some discussion / wisdom about when not to vectorise.
I agree, and there is actually a warning in the introduction about readability vs. speed, with an example showing a clever optimization (by Jaime Fernández del Río) that is hardly readable for non-experts (including myself).
Nicolas
I also have been stalking this email thread. First, excellent book!
Regarding the vectorization example mentioned above, one thing to note is that it increases the order of the algorithm relative to the pure Python version. The vectorized approach uses correlate, which requires ~(len(seq) * len(sub)) FLOPs. In the case where the first element of sub differs from the vast majority of elements in seq, the basic approach requires only ~len(seq) comparisons. Note that this is the case in the SO answer. One fairly common thing I have seen in vectorized approaches is that the memory or the number of operations scales worse than strictly required. It may or may not be an issue, depending largely on the specifics of how it's used, but it usually indicates that a better approach exists. That may be worth mentioning here.
Given that, I tried to come up with an "ideal" approach. stride_tricks can be used to convert seq to a 2D array, and then ideally each row could be compared to sub. However, I can't think of how to do that with numpy function calls other than comparing every element in the 2D array, which requires O(n_sub*n_seq) operations again. array_equal https://docs.scipy.org/doc/numpy/reference/generated/numpy.array_equal.html is an example of that. Python list equality scales better: for instance, if x = [0]*n and y = [1]*n, x == y is very fast and the time is independent of the value of n.
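The two cost regimes described above can be sketched as follows. `find_windows` compares every element of the 2D strided view, so it needs about n_seq * n_sub comparisons; `find_prefiltered` first keeps only the start positions whose first element matches, which is close to len(seq) comparisons when such matches are rare. Both function names are hypothetical, the pre-filter is only one way to illustrate the point and is not taken from the book or the SO answer, and `sliding_window_view` assumes NumPy 1.20 or later (with older versions, `as_strided` would be needed instead).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def find_windows(seq, sub):
    """Start indices of `sub` in `seq`, comparing every window in full."""
    seq, sub = np.asarray(seq), np.asarray(sub)
    if sub.size == 0 or sub.size > seq.size:
        return np.empty(0, dtype=np.intp)
    # View of shape (n_seq - n_sub + 1, n_sub); no data is copied here,
    # but the comparison below creates a boolean array of that shape.
    windows = sliding_window_view(seq, sub.size)
    return np.flatnonzero((windows == sub).all(axis=1))


def find_prefiltered(seq, sub):
    """Same result, but only windows whose first element matches are built."""
    seq, sub = np.asarray(seq), np.asarray(sub)
    if sub.size == 0 or sub.size > seq.size:
        return np.empty(0, dtype=np.intp)
    n_starts = seq.size - sub.size + 1
    # One pass over seq: candidate start positions.
    candidates = np.flatnonzero(seq[:n_starts] == sub[0])
    # Gather only the candidate windows, shape (n_candidates, n_sub).
    windows = seq[candidates[:, None] + np.arange(sub.size)]
    return candidates[(windows == sub).all(axis=1)]
```

Neither version short-circuits inside a window, which is the gap the "all_equal" idea is meant to close.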
It seems a generalized ufunc "all_equal" with signature (i),(i)->() and short-circuit logic once the first non-equal element is encountered would be an important performance improvement. In the ideal case it is dramatically faster, and even if every element must be compared it's still probably meaningfully faster, since the intermediate boolean array isn't created. Even better would be to get the axis argument in place for generalized ufuncs. Then this problem could be vectorized in one line with far better performance. If others think this is a good idea I will post an issue and attempt a solution.
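To make the intended semantics concrete, here is a slow pure-Python model of what a gufunc with signature (i),(i)->() and short-circuit logic would compute. `all_equal_reference` is a hypothetical name used only for this sketch; it is not a NumPy function and it is far too slow for real use. It only shows where the inner loop can stop early and that no intermediate boolean array of the input shape is needed.

```python
import numpy as np


def all_equal_reference(a, b):
    """Model of an (i),(i)->() gufunc: compare along the last axis."""
    a, b = np.asarray(a), np.asarray(b)
    if a.ndim == 0 or b.ndim == 0 or a.shape[-1] != b.shape[-1]:
        raise ValueError("core dimension (last axis) must match")
    # Leading (loop) dimensions broadcast, as they would for a gufunc.
    a, b = np.broadcast_arrays(a, b)
    out = np.empty(a.shape[:-1], dtype=bool)
    for idx in np.ndindex(*a.shape[:-1]):
        equal = True
        for x, y in zip(a[idx], b[idx]):
            if x != y:
                equal = False
                break  # short circuit: skip the rest of this row
        out[idx] = equal
    return out
```

For inputs without an early mismatch this returns the same values as `(a == b).all(axis=-1)`; the difference is only in how much work is done per row.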
On Sat, Dec 31, 2016 at 5:23 AM, Nicolas P. Rougier < Nicolas.Rougier@inria.fr> wrote:
I agree and there is actually a warning in the introduction about readability vs speed with an example showing a clever optimization (by Jaime Fernández del Río) that is hardly readable for the non-experts (including myself).
I coded up an all_equal gufunc here https://github.com/mattharrigan/numpy_logical_gufuncs. Benchmark results are also in that repo. For the specific problem in the book which started this, it's 40x faster than the optimized code in the book. For large arrays which have an early non-equal element, it's dramatically faster (1000x) than the current alternative. For large arrays which are all equal, it's ~10% faster due to eliminating the intermediate boolean array. For tiny arrays it's much faster due to a single function call instead of at least two, but it's debatable how relevant speed is for tiny problems. Disclaimer: this is the first ufunc I have ever written.
On Tue, Jan 10, 2017 at 8:27 PM, Chris Barker - NOAA Federal < chris.barker@noaa.gov> wrote:
It seems a generalized ufunc "all_equal" with signature (i),(i)->() and short circuit logic once the first non equal element is encountered would be an important performance improvement.
How does array_equal() perform?
-CHB