I also have been stalking this email thread.  First, excellent book!

Regarding the vectorization example mentioned above, one thing to note is that it increases the order of the algorithm relative to the pure python.  The vectorized approach uses correlate, which requires ~(len(seq) * len(sub)) FLOPs.  In the case where the first element in sub is not equal to the vast majority of elements in seq, the basic approach requires ~len(seq) comparisons.  Note that is the case in the SO answer.  One fairly common thing I have seen in vectorized approaches is that the memory or operations required scales worse than strictly required.  It may or may not be an issue, largely depends on the specifics of how its used, but it usually indicates a better approach exists.  That may be worth mentioning here.

Given that, I tried to come up with an "ideal" approach.  stride_tricks can be used to convert seq to a 2D array, and then ideally each row could be compared to sub.  However I can't think of how to do that with numpy function calls other than compare each element in the 2D array, requiring O(n_sub*n_seq) operations again.  array_equal is an example of that.  Python list equality scales better, for instance if x = [0]*n and y = [1]*n, x == y is very fast and the time is independent of the value of n.

It seems a generalized ufunc "all_equal" with signature (i),(i)->() and short circuit logic once the first non equal element is encountered would be an important performance improvement.  In the ideal case it is dramatically faster, and even if every element must be compared then its still probably meaningfully faster since the boolean intermediate array isn't created.  Even better would be to get the axis argument in place for generalized ufuncs.  Then this problem could be vectorized in one line with far better performance.  If others think this is a good idea I will post an issue and attempt a solution.


On Sat, Dec 31, 2016 at 5:23 AM, Nicolas P. Rougier <Nicolas.Rougier@inria.fr> wrote:

> I’ve seen vectorisation taken to the extreme, with negative consequences in terms of both speed and readability, in both Python and MATLAB codebases, so I would suggest some discussion / wisdom about when not to vectorise.


I agree and there is actually a warning in the introduction about readability vs speed with an example showing a clever optimization (by Jaime Fernández del Río) that is hardly readable for the non-experts (including myself).


Nicolas
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion