[Numpy-discussion] Looking for a difference between Numpy 1.19.5 and 1.20 explaining a perf regression with Pythran

Sebastian Berg sebastian at sipsolutions.net
Fri Mar 12 16:50:24 EST 2021


On Fri, 2021-03-12 at 21:36 +0100, PIERRE AUGIER wrote:
> Hi,
> 
> I'm looking for a difference between Numpy 1.19.5 and 1.20 which
> could explain a performance regression (~15 %) with Pythran.
> 
> I observe this regression with the script 
> https://github.com/paugier/nbabel/blob/master/py/bench.py
> 
> Pythran reimplements Numpy so it is not about Numpy code for
> computation. However, Pythran of course uses the native array
> contained in a Numpy array. I'm quite sure that something has changed
> between Numpy 1.19.5 and 1.20 (or between the corresponding wheels?)
> since I don't get the same performance with Numpy 1.20. I checked
> that the values in the arrays are the same and that the flags
> characterizing the arrays are also the same.
> 
> Good news, I'm now able to obtain the performance difference just
> with Numpy 1.19.5. In this code, I load the data with Pandas and need
> to prepare contiguous Numpy arrays to give them to Pythran. With
> Numpy 1.19.5, if I use np.copy I get better performance than with
> np.ascontiguousarray. With Numpy 1.20, both functions create arrays
> giving the same performance with Pythran (again, less good than with
> Numpy 1.19.5).
> 
> Note that this code is very efficient (more than 100 times faster
> than using Numpy), so I guess that things like alignment or memory
> location can lead to such a difference.
> 
> More details in this issue 
> https://github.com/serge-sans-paille/pythran/issues/1735
> 
> Any help to understand what has changed would be greatly appreciated!
> 

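For concreteness, the two preparation paths being compared look
roughly like this. This is a minimal sketch: the random data and
column names stand in for what bench.py actually loads, and the
alignment check at the end is only one of the ways the resulting
buffers can differ.

    import numpy as np
    import pandas as pd

    # Stand-in for the data loaded with Pandas in bench.py.
    df = pd.DataFrame(np.random.rand(1024, 3), columns=["x", "y", "z"])

    # Path 1: np.copy always allocates a fresh buffer.
    positions_copy = np.copy(df.values)

    # Path 2: np.ascontiguousarray may return the input unchanged
    # when it is already C-contiguous.
    positions_contig = np.ascontiguousarray(df.values)

    # Same flags and values, but possibly different buffer addresses
    # and alignment -- which is all the Pythran kernel can see.
    for arr in (positions_copy, positions_contig):
        print(arr.flags["C_CONTIGUOUS"], arr.ctypes.data % 64)
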
If you want to really dig into this, it would be good to do profiling
to find out where the differences are.
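
Even a coarse timing harness helps localize which call got slower. A
minimal sketch along these lines (the compute import is hypothetical;
the real entry point lives in bench.py, and for the compiled Pythran
kernel itself you would need a native profiler such as Linux perf,
since Python-level profilers cannot see inside the extension):

    import timeit

    import numpy as np

    from bench import compute  # hypothetical import of the kernel

    positions = np.ascontiguousarray(np.random.rand(1024, 3))

    # Best-of-five timing of repeated kernel calls; run once per
    # NumPy version / array-preparation path and compare.
    best = min(timeit.repeat(lambda: compute(positions),
                             number=10, repeat=5))
    print(f"best of 5 x 10 calls: {best:.4f} s")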

Without that, I don't have much appetite to investigate personally. The
reason is that fluctuations of ~30% (or even much more) when running
the NumPy benchmarks are very common.

I am not aware of an immediate change in NumPy, especially since you
are talking about Pythran, where only the memory space or the
interface code should matter.
As to the interface code, I would expect it to be quite a bit faster,
not slower.
There was no change around data allocation, so at best what you are
seeing is a different pattern in how the "small array cache" ends up
being used.
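
Whether that cache kicks in can be observed from Python. A minimal
sketch (this may or may not print True, depending on the NumPy build
and on what else has been allocated in between):

    import numpy as np

    # Allocate a small array, record its buffer address, free it,
    # then allocate again. With NumPy's small-array data cache the
    # second allocation often reuses the same buffer.
    a = np.empty(64)
    addr = a.ctypes.data
    del a
    b = np.empty(64)
    print("buffer reused:", b.ctypes.data == addr)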


Unfortunately, getting stable benchmarks that reflect code changes
exactly is tough...  Here is a nice blog post from Victor Stinner where
he had to go as far as using "profile guided compilation" to avoid
fluctuations:

https://vstinner.github.io/journey-to-stable-benchmark-deadcode.html
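
His pyperf tool packages many of those stabilization tricks
(calibration, worker processes, system tuning). A minimal sketch,
assuming pyperf is installed and using np.ascontiguousarray as a toy
target:

    import numpy as np
    import pyperf

    runner = pyperf.Runner()
    x = np.random.rand(1024, 3)

    # pyperf spawns worker processes and reports mean +/- std dev,
    # which is far more robust than a single ad-hoc timing loop.
    runner.bench_func("ascontiguousarray", np.ascontiguousarray, x)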

I somewhat hope that this is also the reason for the huge fluctuations
we see in the NumPy benchmarks due to absolutely unrelated code
changes.
But I did not have the energy to try it (and a probably-fixed bug in
gcc makes it a bit harder right now).

Cheers,

Sebastian




> Cheers,
> Pierre


