[Numpy-discussion] performance of numpy.array()

Sebastian Berg sebastian at sipsolutions.net
Wed Apr 29 11:47:30 EDT 2015


There was a major performance improvement to np.array for some kinds of
input, which would fit the difference you see between 1.8.2 and 1.9.0.

You can probably work around this by using np.concatenate instead of
np.array (this depends on the use case, but I would guess you have code
doing:

np.array([arr1, arr2, arr3])

or similar). If your use case is different, you may be out of luck, and
only an upgrade would help.
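
A minimal sketch of that workaround, assuming the inputs are arrays of
identical shape and that the goal is the same stacked result that
np.array([arr1, arr2, arr3]) gives. The helper name
stack_with_concatenate is made up for illustration. np.concatenate joins
along an existing axis, so each input first gets a new leading axis.
Whether this is actually faster on 1.8.2 is something to time on the
cluster itself; the sketch only shows that the result is the same.

```python
import numpy as np


def stack_with_concatenate(arrays):
    """Stack equal-shaped arrays along a new leading axis (hypothetical helper)."""
    # a[np.newaxis] adds a leading axis of length 1 as a view, without copying.
    # np.concatenate then joins those views along that new axis.
    return np.concatenate([np.asarray(a)[np.newaxis] for a in arrays], axis=0)


# Illustrative data only; the arrays in the original code are not shown.
arr1 = np.arange(3.0)
arr2 = np.arange(3.0) + 10
arr3 = np.arange(3.0) + 20

stacked = stack_with_concatenate([arr1, arr2, arr3])
reference = np.array([arr1, arr2, arr3])
```

Timing both variants with the standard timeit module on each cluster
should show whether the change helps for the real data.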


On Wed, 2015-04-29 at 17:41 +0200, Nick Papior Andersen wrote:
> You could try and install your own numpy to check whether that
> resolves the problem.
> 
> 2015-04-29 17:40 GMT+02:00 simona bellavista <afylot at gmail.com>:
>         on cluster A 1.9.0 and on cluster B 1.8.2
>         
>         2015-04-29 17:18 GMT+02:00 Nick Papior Andersen
>         <nickpapior at gmail.com>:
>                 Compile it yourself to know the limitations/benefits
>                 of the dependency libraries.
>                 
>                 
>                 Otherwise, have you checked which versions of numpy
>                 they are, i.e. are they the same version?
>                 
>                 
>                 2015-04-29 17:05 GMT+02:00 simona bellavista
>                 <afylot at gmail.com>:
>                 
>                         I work on two distinct scientific clusters. I
>                         have run the same Python code on both
>                         clusters and noticed that one is faster by an
>                         order of magnitude than the other (1 min vs
>                         10 min; this matters because I run this
>                         function many times).
>                         
>                         
>                         I have investigated with a profiler and found
>                         that the cause (same code and same data) is the
>                         function numpy.array, which is called 10^5
>                         times. On cluster A it takes 2 s in total,
>                         whereas on cluster B it takes ~6 min. As for
>                         the other functions, they are generally faster
>                         on cluster A. I understand that the clusters
>                         are quite different, both in hardware and in
>                         installed libraries, but it strikes me that
>                         the performance of this particular function
>                         differs so much. I would have thought that
>                         this is due to a difference in the available
>                         memory, but according to `top` only 0.1% of
>                         the memory seems to be used on cluster B. In
>                         theory numpy is compiled with ATLAS on cluster
>                         B; on cluster A it is not clear, because
>                         numpy.__config__.show() returns NOT AVAILABLE
>                         for everything.
>                         
>                         
>                         Does anybody have any insight into this, and
>                         into whether I can improve the performance on
>                         cluster B?
>                         
>                         
>                         _______________________________________________
>                         NumPy-Discussion mailing list
>                         NumPy-Discussion at scipy.org
>                         http://mail.scipy.org/mailman/listinfo/numpy-discussion
>                         
>                 
>                 
>                 
>                 
>                 -- 
>                 Kind regards Nick
>                 
> 
> 
> 
> 
> -- 
> Kind regards Nick
