I hope I won't get identified as a spam bot :-). While I have not resolved the problem itself, this is an issue that I cannot reproduce on our cluster. I wanted to get back with some actual timings from the real hardware we are going to be using and some details about the matrices, so as not to chase ghosts, but this proved to be a headache saver.
It's still baffling, because on the cluster I have also used stock packages (albeit from Fedora, which is what our system administrator insists on using) rather than my hand-compiled and optimized GotoBLAS and UMFPACK. It didn't even occur to me to try to reproduce this on another system in the last 4 hours I've been struggling with this, because I assumed that using stock packages was giving me the uniformity I required. It seems I was wrong.
Nonetheless, I think it's safe to assume in this case that the problem is not in NumPy or my code, and it would be wiser to bring this up on Ubuntu's Launchpad.
Thanks for your patience,
Alexandru
On Thu, July 22, 2010 4:10 am, Ioan-Alexandru Lazar wrote:
Hello everyone,
First of all, let me apologize for my earlier message; I made the mistake of trying to indent my code using SquirrelMail's horrible interface -- and pressing Tab and Space resulted in sending my (incomplete) e-mail to the list. Cursed be Opera's keyboard shortcuts now :-).
I'm currently planning to use a Python-based infrastructure for our HPC project. I've previously used NumPy and SciPy for basic scientific computing tasks, so performance hasn't really been an issue for me until now. At the moment I'm not too sure what to do next, though, and I was hoping that someone with more experience in performance-related issues could point me to a way out of this.
The trouble lies in the following piece of code:
===
w = 2 * math.pi * f
M = A - (1j*w*E)
n = M.shape[1]
B1 = numpy.zeros(n)
B2 = numpy.zeros(n)
B1[n-2] = 1.0
B2[n-1] = 1.0
# -> slow part starts here
umfpack.numeric(M)
x1 = umfpack.solve(um.UMFPACK_A, M, B1, autoTranspose=False)
x2 = umfpack.solve(um.UMFPACK_A, M, B2, autoTranspose=False)
solution = scipy.array([[x1[n-2], x2[n-2]],
                        [x1[n-1], x2[n-1]]])
return solution
====
This isn't really too much -- it's generating a system matrix via operations that take little time, as I was expecting. Trouble is, the solve part takes significantly more time than it does in Octave -- about four times as long.
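One way to narrow this down is to time the factorization and the two solves separately. The following is only a sketch under stated assumptions: it uses SciPy's SuperLU wrapper (`scipy.sparse.linalg.splu`) as a stand-in for the UMFPACK calls above, and the helper name `solve_last_two`, the tridiagonal `A`, the identity `E` and the frequency in the demo are all made up for illustration, since the real matrices are not shown here.

```python
import time

import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu


def solve_last_two(A, E, f):
    """Factor M = A - 1j*w*E once and solve for the last two unit vectors.

    Hypothetical helper: uses SuperLU (splu), not UMFPACK.
    Returns the 2x2 block of the solution and the wall-clock timings.
    """
    w = 2 * np.pi * f
    M = sp.csc_matrix(A - 1j * w * E)  # splu expects CSC storage
    n = M.shape[1]

    # Both right-hand sides as columns of one complex array.
    B = np.zeros((n, 2), dtype=complex)
    B[n - 2, 0] = 1.0
    B[n - 1, 1] = 1.0

    t0 = time.perf_counter()
    lu = splu(M)        # factorization, done once
    t1 = time.perf_counter()
    X = lu.solve(B)     # both solves reuse the same factors
    t2 = time.perf_counter()

    timings = {"factor_s": t1 - t0, "solve_s": t2 - t1}
    # Rows n-2 and n-1 of x1 (column 0) and x2 (column 1).
    return X[n - 2:, :], timings


# Made-up test problem: tridiagonal A, identity E.
n = 2000
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")
E = sp.identity(n, format="csc")
solution, timings = solve_last_two(A, E, f=50.0)
print(solution.shape, timings)
```

If `factor_s` dominates, the time is probably going into the sparse LU factorization itself rather than into the Python code around it.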
I'm using the stock version of UMFPACK in Ubuntu's repository; it's compiled against standard BLAS, so it's fairly slow, but so is Octave -- so the problem isn't there.
I'm obviously doing something wrong related to memory management here, because the memory consumption is also rocketing, but I'm not sure what exactly it is that I'm doing wrong. Could you point me towards some relevant documentation describing what I could do in order to improve the performance, or give me some hint related to that?
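To put a number on the memory growth, the peak resident set size of the process can be read before and after the call. This is a minimal sketch using only the standard library's `resource` module (Unix only). `peak_rss_mib` and `peak_growth` are hypothetical helper names, and the workload in the demo is a placeholder. Because it is a process-wide figure, it also covers memory allocated inside C libraries, which Python-level tracers do not see.

```python
import resource
import sys


def peak_rss_mib():
    """Peak resident set size of this process so far, in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    if sys.platform == "darwin":
        return peak / (1024 * 1024)
    return peak / 1024


def peak_growth(func, *args, **kwargs):
    """Call func and report how much the process-wide peak RSS rose."""
    before = peak_rss_mib()
    result = func(*args, **kwargs)
    after = peak_rss_mib()
    return result, after - before


# Placeholder workload; replace with the call being investigated.
data, growth = peak_growth(lambda: [0.0] * 1_000_000)
print("peak RSS grew by %.1f MiB" % growth)
```

The peak never decreases, so a reported growth of zero only means the call stayed below an earlier high-water mark.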
Best regards,
Alexandru Lazar