I have had good luck with Continuum's Miniconda Python distributions on
Linux.
http://conda.pydata.org/miniconda.html
The `conda` command makes it very easy to create specific testing
environments for Python 2 and 3 with many different packages. Everything is
precompiled, so you won't have to worry about system library differences
between the two clusters.
Hope that helps.
Ryan
On Thu, Apr 30, 2015 at 10:03 AM, simona bellavista wrote:
I have seen a big improvement in performance with numpy 1.9.2 on python 2.7.8: numpy.array takes 5 s instead of 300 s.
On the other hand, I have also tried numpy 1.9.2 and 1.9.0 with python 3.4 and the results are terrible: numpy.array takes 20 s, and other routines are slowed down as well, for example concatenate, astype, copy and uniform. Most of all, the sort function of numpy.ndarray is slowed down by a factor of at least 10.
On the other cluster I am using python 3.3 with numpy 1.9.0 and it is working very well (though I think that is partly because of the hardware). I was trying to install python 3.3 on this cluster, but because of other issues (a compile-time error in the h5py library and a runtime bug in the dill library) I cannot test it right now.
2015-04-29 17:47 GMT+02:00 Sebastian Berg:
There was a major improvement to np.array in some cases.
You can probably work around this by using np.concatenate instead of np.array in your case. That depends on the use case, but I would guess you have code doing:
np.array([arr1, arr2, arr3])
or similar. If your use case is different, you may be out of luck and only an upgrade would help.
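A minimal sketch of the workaround suggested above, assuming the inputs are equal-length 1-D arrays (the names arr1, arr2, arr3 follow the snippet above; their contents are made up for illustration). np.array on a list of arrays stacks them along a new leading axis, whereas np.concatenate joins along an existing one, so each input is given a length-1 leading axis first to get the same result:

```python
import numpy as np

# Hypothetical inputs: three equal-length 1-D arrays.
arr1 = np.arange(4.0)
arr2 = np.arange(4.0) + 10
arr3 = np.arange(4.0) + 20

# Original pattern: stacks the inputs along a new leading axis, shape (3, 4).
stacked = np.array([arr1, arr2, arr3])

# Workaround: add a length-1 leading axis to each input, then join along it.
joined = np.concatenate([a[np.newaxis, :] for a in (arr1, arr2, arr3)])

# np.vstack promotes 1-D inputs to 2-D and concatenates them the same way.
vstacked = np.vstack([arr1, arr2, arr3])
```

Whether this is actually faster on a given numpy version is something to measure rather than assume.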
On Wed, 2015-04-29 at 17:41 +0200, Nick Papior Andersen wrote:
You could try and install your own numpy to check whether that resolves the problem.
2015-04-29 17:40 GMT+02:00 simona bellavista:
On cluster A 1.9.0 and on cluster B 1.8.2.
2015-04-29 17:18 GMT+02:00 Nick Papior Andersen:
Compile it yourself to know the limitations/benefits of the dependency libraries. Otherwise, have you checked which versions of numpy they are, i.e. are they the same version?
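One way to answer the version question is to run the same short script on both clusters and compare the output. This is only a sketch; it reports what the running interpreter actually imports, which may differ from what the module system claims is installed:

```python
import sys
import numpy as np

# Interpreter and NumPy versions; these should match across clusters
# before any timings are compared.
print("python:", sys.version.split()[0])
print("numpy: ", np.__version__)

# Location of the imported package, to confirm which installation is in use.
print("path:  ", np.__file__)

# Build configuration (BLAS/LAPACK libraries detected at build time).
np.show_config()
```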
2015-04-29 17:05 GMT+02:00 simona bellavista:
I work on two distinct scientific clusters. I have run the same Python code on both and noticed that one is an order of magnitude faster than the other (1 min vs 10 min; this matters because I run this function many times).
I investigated with a profiler and found that, with the same code and the same data, the cause is the function numpy.array, which is called 10^5 times. On cluster A it takes 2 s in total, whereas on cluster B it takes ~6 min. As for the other functions, they are generally faster on cluster A. I understand that the clusters are quite different, both in hardware and in installed libraries, but it strikes me that the performance of this particular function differs so much. I would have thought that this was due to a difference in available memory, but according to `top` only about 0.1% of memory is in use on cluster B. In theory numpy is compiled with ATLAS on cluster B; on cluster A it is not clear, because numpy.__config__.show() returns NOT AVAILABLE for everything.
Does anybody have any insight into this, and into whether I can improve the performance on cluster B?
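To compare the two clusters independently of the rest of the code, the numpy.array call can be timed in isolation. The sketch below is an assumption about the workload (a short list of small arrays turned into one array; the sizes and the name `parts` are invented), so the absolute numbers only matter relative to the same script run on the other cluster:

```python
import timeit
import numpy as np

# Hypothetical stand-in for the real workload: build one array from a
# short list of small arrays.
parts = [np.random.uniform(size=100) for _ in range(3)]

# The post mentions about 10^5 calls; scaled down here to keep the run short.
n_calls = 1000
elapsed = timeit.timeit(lambda: np.array(parts), number=n_calls)

per_call_us = elapsed / n_calls * 1e6
print("np.array: %.1f us per call (%d calls)" % (per_call_us, n_calls))
```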
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
-- Kind regards Nick