Splitting the sparsetools_wrap source file?
Hi, I think we already had a discussion about this a few months ago, and I would like to reiterate my wish of splitting the sparsetools_wrap source file. C++ compilers are already dead slow, and with such a big source file, compiling this file alone makes up a big part of the whole scipy build time. Is it possible to split source files? Or to have something smaller? cheers, David
On Sat, Mar 8, 2008 at 10:14 PM, David Cournapeau <david@ar.media.kyoto-u.ac.jp> wrote:
Hi,
I think we already had a discussion about this a few months ago, and I would like to reiterate my wish of splitting the sparsetools_wrap source file. C++ compilers are already dead slow, and with such a big source file, compiling this file alone makes up a big part of the whole scipy build time. Is it possible to split source files? Or to have something smaller?
Basically, no. You would have to split up the extension module itself, and that could be difficult. sparsetools_wrap.cxx is generated by SWIG. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
Robert Kern wrote:
Basically, no. You would have to split up the extension module itself, and that could be difficult. sparsetools_wrap.cxx is generated by SWIG.
I know it is generated by swig. But from a quick look at the swig interface file, it looks like sparsetools is just a bunch of utility functions, which could be split up, no? For example, one swig-generated file per function should be possible without too much trouble? Sorry if this is stupid; I have not been using swig for a while, so I do not remember how it works. cheers, David
On Sat, Mar 8, 2008 at 10:52 PM, David Cournapeau <david@ar.media.kyoto-u.ac.jp> wrote:
Robert Kern wrote:
Basically, no. You would have to split up the extension module itself, and that could be difficult. sparsetools_wrap.cxx is generated by SWIG.
I know it is generated by swig. But from a quick look at the swig interface file, it looks like sparsetools is just a bunch of utility functions, which could be split up, no? For example, one swig-generated file per function should be possible without too much trouble? Sorry if this is stupid; I have not been using swig for a while, so I do not remember how it works.
If anything, that would just exacerbate the problem. All of the SWIG utility functions would be repeated for each module. In any case, I split the file into 2 and the total time for compiling both is about the same as compiling the current sparsetools_wrap.cxx. This would be a win for parallel builds, but until numscons is the official build system, I'm against making changes as invasive as splitting up an extension module just to optimize for that. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
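For illustration of the duplication issue: every SWIG-generated wrapper carries a fixed block of runtime support (type tables, pointer-conversion helpers, module initialization) in addition to the actual function wrappers. Below is a rough, hypothetical skeleton of one such module, assuming the Python 2 C API of that era; the module name is invented, and real SWIG output also emits several thousand lines of runtime support code that would be repeated in every split module.

    // Hypothetical skeleton of one SWIG-generated wrapper module.
    // Splitting sparsetools into N modules would repeat this fixed
    // overhead (plus the full SWIG runtime support code) N times.
    #include <Python.h>

    static PyMethodDef SwigMethods[] = {
        // one entry per wrapped function, times ~15 dtype instantiations
        { NULL, NULL, 0, NULL }  // sentinel
    };

    PyMODINIT_FUNC init_sparsetools_part(void)  // invented module name
    {
        (void) Py_InitModule("_sparsetools_part", SwigMethods);
        // ...per-module SWIG type registration would go here...
    }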
Robert Kern <robert.kern@gmail.com> writes:
If anything, that would just exacerbate the problem. All of the SWIG utility functions would be repeated for each module. In any case, I split the file into 2 and the total time for compiling both is about the same as compiling the current sparsetools_wrap.cxx. This would be a win for parallel builds, but until numscons is the official build system, I'm against making changes as invasive as splitting up an extension module just to optimize for that.
It is not so much the build time as the memory consumption that I find problematic. It is becoming difficult to build scipy on my laptop with 512 MB (compiling sparsetools_wrap.cxx with gcc -O2 takes around 400 MB, and some compilers are even worse), or on virtual machines. Is the patch really that invasive? Splitting it in 2 or 3 would already be good if memory consumption is reduced by a similar factor. cheers, David
On Sun, Mar 9, 2008 at 10:12 PM, David Cournapeau <cournape@gmail.com> wrote:
It is not so much the build time as the memory consumption that I find problematic. It is becoming difficult to build scipy on my laptop with 512 MB (compiling sparsetools_wrap.cxx with gcc -O2 takes around 400 MB, and some compilers are even worse), or on virtual machines.
Is the patch really that invasive? Splitting it in 2 or 3 would already be good if memory consumption is reduced by a similar factor.
Splitting the file into multiple parts does reduce the memory usage, but not by the expected fraction. Aside from manually splitting the SWIG output into multiple files (which would be tedious, time-consuming, and error-prone), I'm not sure how to remedy the situation. In the era of $25/GB RAM, is it not more expedient to simply increase your memory capacity? Using SWIG and C++ templates is a major convenience in sparsetools since adding new dtypes becomes trivial. However, this implies that each function is instantiated ~15 times (once for each dtype), which results in high memory usage. If there's a simple solution that addresses your concerns, I'd be happy to make the necessary changes. Otherwise, I don't think the problem merits complicating sparsetools. -- Nathan Bell wnbell@gmail.com http://graphics.cs.uiuc.edu/~wnbell/
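To make the instantiation cost concrete, here is a minimal sketch of the pattern (illustrative only, not the actual sparsetools source): a single templated kernel is compiled once per supported dtype, and that multiplication applies to every function in the module.

    #include <complex>

    // One templated kernel: scale row i of a CSR matrix by scale[i].
    template <class I, class T>
    void csr_scale_rows(const I n_row, const I Ap[], T Ax[], const T scale[])
    {
        for (I i = 0; i < n_row; i++)
            for (I jj = Ap[i]; jj < Ap[i + 1]; jj++)
                Ax[jj] *= scale[i];
    }

    // The generated wrapper effectively forces one instantiation per dtype:
    template void csr_scale_rows<int, float>(const int, const int[], float[], const float[]);
    template void csr_scale_rows<int, double>(const int, const int[], double[], const double[]);
    template void csr_scale_rows<int, std::complex<double> >(const int, const int[], std::complex<double>[], const std::complex<double>[]);
    // ...and so on for all ~15 dtypes, multiplying code size, compile
    // time, and compiler memory accordingly.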
Nathan Bell wrote:
Splitting the file into multiple parts does reduce the memory usage, but not by the expected fraction. Aside from manually splitting the SWIG output into multiple files (which would be tedious, time-consuming, and error-prone), I'm not sure how to remedy the situation.
This is obviously a bad solution (splitting the generated swig file), and a nightmare to get right. I was thinking more about splitting the interface file, so that only a couple of functions are generated by each: this should be doable, no? I can do it, if there is a chance for a patch to be included. There would be, say, N swig interface files (one for _diagonal, one for _scale, etc...), and sparsetools.py itself would be written manually, but would just import the python functions from each generated python module; that is, it would be only a few lines (I bet this python module could easily be generated, too, if wanted).
In the era of $25/GB RAM, is it not more expedient to simply increase your memory capacity? Using SWIG and C++ templates is a major convenience in sparsetools since adding new dtypes becomes trivial.
I am not suggesting giving up swig or C++ templates. But the problem is not the cost of memory: once virtual machines come into the game, you hit the 32-bit limits really quickly (or more exactly, the fact that most computers cannot physically handle more than 4 GB of memory). For example, when I test numscons on solaris, I use Indiana, which is a binary distribution of OpenSolaris available for free, and the VM takes more than 1 GB of RAM when compiling sparsetools. Even on my recent macbook with 2 GB of RAM, I am at the limit. And virtual machines are the only way for me to test many platforms (and build bots too often run on vmware). cheers, David
On Mon, Mar 10, 2008 at 8:14 AM, David Cournapeau <david@ar.media.kyoto-u.ac.jp> wrote:
and a nightmare to get right. I was thinking more about splitting the interface file, so that only a couple of functions are generated by each: this should be doable, no? I can do it, if there is a chance for a patch to be included. There would be, say, N swig interface files (one for _diagonal, one for _scale, etc...), and sparsetools.py itself would be written manually, but would just import the python functions from each generated python module; that is, it would be only a few lines (I bet this python module could easily be generated, too, if wanted).
While better than manually splitting the _wrap file, this approach is still cumbersome. There are ~35 functions in sparsetools, so a 1 function : 1 file policy is not really scalable. I tried lumping all the CSR functions together and found only modest savings. Disabling the templates that unroll dense loops for the BSR matrix (see bsr_matvec) produced a measurable improvement in memory usage, so I've committed this version of sparsetools.h to SVN.
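For context, here is a hypothetical sketch of the trade-off behind that change (not the actual bsr_matvec code): with compile-time block dimensions the compiler can fully unroll the dense block product, but each (dtype, R, C) combination becomes a separate instantiation, whereas runtime dimensions compile once per dtype.

    // Unrolled variant: R and C are template parameters, so both loops
    // can be fully unrolled -- fast, but one instantiation per
    // (T, R, C) combination.
    template <class T, int R, int C>
    inline void block_gemv_fixed(const T A[], const T x[], T y[])
    {
        for (int r = 0; r < R; r++)
            for (int c = 0; c < C; c++)
                y[r] += A[r * C + c] * x[c];
    }

    // Runtime variant: block dimensions are ordinary arguments, so only
    // one instantiation per dtype is generated -- far less code for the
    // compiler to hold in memory.
    template <class T>
    inline void block_gemv(const int R, const int C,
                           const T A[], const T x[], T y[])
    {
        for (int r = 0; r < R; r++)
            for (int c = 0; c < C; c++)
                y[r] += A[r * C + c] * x[c];
    }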
I am not suggesting giving up swig or C++ templates. But the problem is not the cost of memory: once virtual machines come into the game, you hit the 32-bit limits really quickly (or more exactly, the fact that most computers cannot physically handle more than 4 GB of memory). For example, when I test numscons on solaris, I use Indiana, which is a binary distribution of OpenSolaris available for free, and the VM takes more than 1 GB of RAM when compiling sparsetools. Even on my recent macbook with 2 GB of RAM, I am at the limit. And virtual machines are the only way for me to test many platforms (and build bots too often run on vmware).
Are you saying that g++ fails to compile on the VM, or that it starts swapping to disk? -- Nathan Bell wnbell@gmail.com http://graphics.cs.uiuc.edu/~wnbell/
On Thu, Mar 13, 2008 at 10:15 AM, Nathan Bell <wnbell@gmail.com> wrote:
I am not suggesting giving up swig or C++ templates. But the problem is not the cost of memory: once virtual machines come into the game, you hit the 32-bit limits really quickly (or more exactly, the fact that most computers cannot physically handle more than 4 GB of memory). For example, when I test numscons on solaris, I use Indiana, which is a binary distribution of OpenSolaris available for free, and the VM takes more than 1 GB of RAM when compiling sparsetools. Even on my recent macbook with 2 GB of RAM, I am at the limit. And virtual machines are the only way for me to test many platforms (and build bots too often run on vmware).
Are you saying that g++ fails to compile on the VM, or that it starts swapping to disk?
David, I came across this blog post which should address your problem: http://hostingfu.com/article/compiling-with-gcc-on-low-memory-vps I compiled sparsetools_wrap.cxx with "g++ --param ggc-min-expand=10 --param ggc-min-heapsize=8192" and memory usage peaked at 225 MB (on a 32-bit machine). Setting ggc-min-expand=0 takes ages (I threw in the towel at 20 minutes), so I don't recommend that setting :) Please let me know if the problem persists. -- Nathan Bell wnbell@gmail.com http://graphics.cs.uiuc.edu/~wnbell/
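(For context, and as I understand GCC's documentation: ggc-min-expand sets how much, as a percentage, the compiler's internal garbage-collected heap must grow before a collection is triggered, and ggc-min-heapsize sets the heap size, in kB, below which no collection happens. Lowering both makes the compiler collect more aggressively, trading compile time for a smaller peak memory footprint.)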
Nathan Bell wrote:
David, I came across this blog post which should address your problem: http://hostingfu.com/article/compiling-with-gcc-on-low-memory-vps
I compiled sparsetools_wrap.cxx with "g++ --param ggc-min-expand=10 --param ggc-min-heapsize=8192" and memory usage peaked at 225 MB (on a 32-bit machine). Setting ggc-min-expand=0 takes ages (I threw in the towel at 20 minutes), so I don't recommend that setting :)
Please let me know if the problem persists.
Thanks for that info, I was not aware it was possible to control the memory manager of gcc; that's pretty interesting. Unfortunately, it won't work with other C++ compilers, so I am afraid it cannot be used in our case. cheers, David
On Sat, Mar 15, 2008 at 4:24 AM, David Cournapeau <david@ar.media.kyoto-u.ac.jp> wrote:
Thanks for that info, I was not aware it was possible to control the memory manager of gcc; that's pretty interesting. Unfortunately, it won't work with other C++ compilers, so I am afraid it cannot be used in our case.
David, I've split sparsetools into five separate modules, one for each matrix type. Previously sparsetools compiled to sparse/_sparsetools.so, whereas now there are sparse/sparsetools/_csr.so and _csc.so etc. I was able to preserve the old interface by combining the new modules in sparse/sparsetools/__init__.py (which just imports * from each of the submodules). It may be necessary to do a clean build to test the new code, since the old sparsetools.py will be competing with the new sparsetools directory. -- Nathan Bell wnbell@gmail.com http://graphics.cs.uiuc.edu/~wnbell/
Nathan Bell wrote:
While better than manually splitting the _wrap file, this approach is still cumbersome. There are ~35 functions in sparsetools, so a 1 function : 1 file policy is not really scalable.
Well, having one file per function is overkill. I just think that 500-600 MB of RAM to build one source file is really not good, for various reasons; a source file of several MB is fundamentally a bad thing. Splitting it in half would already be pretty good, and that can be done easily: you just have to be careful about which functions to put in which file to get the memory reduction, though.
I tried lumping all the CSR functions together and found only modest savings.
With two files, I could get from 500 MB to around 260 MB, which is already pretty good, I think.
Disabling the templates that unroll dense loops for the BSR matrix (see bsr_matvec) produced a measurable improvement in memory usage, so I've committed this version of sparsetools.h to SVN.
the *matvec functions are indeed the ones which take a lot of memory to compile, which is not surprising if they use expression templates. Is it really useful? I certainly do not want this to cause any performance penalty or anything. Putting those functions in a different file is what gave the most advantage in my experiments.
Are you saying that g++ fails to compile on the VM, or that it starts swapping to disk?
Yes, that's what I am saying. I need to allocate 1 GB of memory for solaris when compiling sparsetools with the Sun compilers right now. Even by today's standards, that's an awful lot of memory. cheers, David
participants (4)
- David Cournapeau
- David Cournapeau
- Nathan Bell
- Robert Kern