[Numpy-discussion] performance of numpy.array()

Wed Apr 29 11:05:49 EDT 2015

I work on two distinct scientific clusters. I have run the same Python code
on both clusters and noticed that one is faster than the other by an order
of magnitude (1 min vs 10 min; this matters because I run this function
many times).

I have investigated with a profiler and found that the cause (same code and
same data) is the function numpy.array, which is called 10^5 times. On
cluster A it takes 2 s in total, whereas on cluster B it takes ~6 min. As
for the other functions, they are generally faster on cluster A. I
understand that the clusters are quite different, both in hardware and in
installed libraries, but it strikes me that the performance of this
particular function is so different. I would have thought that this is due
to a difference in available memory, but according to `top` only 0.1% of
the memory is in use on cluster B. In theory numpy is compiled with atlas
on cluster B; on cluster A it is not clear, because
numpy.__config__.show() returns NOT AVAILABLE for everything.

Does anybody have any insight into this, and into whether I can improve the
performance on cluster B?
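
To compare the two installations independently of the rest of the code, the
numpy.array call can be timed in isolation. The following is only a minimal
sketch: the helper name time_array_calls and the three sample inputs are
hypothetical, since the actual data passed to numpy.array is not shown
here. It also prints the numpy version and install path, which may differ
between the clusters.

```python
import timeit

import numpy as np


def time_array_calls(data, n_calls=10**5):
    """Return the total time in seconds for n_calls calls of np.array(data)."""
    return timeit.timeit(lambda: np.array(data), number=n_calls)


if __name__ == "__main__":
    # Which numpy is actually being imported on this cluster?
    print("numpy", np.__version__, "from", np.__file__)

    # Hypothetical inputs; replace them with the objects the real code
    # passes to numpy.array.
    samples = [
        ("flat list of floats", [0.0] * 100),
        ("nested list", [[0.0] * 10 for _ in range(10)]),
        ("list of arrays", [np.zeros(10) for _ in range(10)]),
    ]
    for label, data in samples:
        print(label, time_array_calls(data), "s")
```

Running the same script on both clusters should show whether the slowdown
depends on the kind of input or affects every call to numpy.array.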