Hi, group,

I have a big "Float64" matrix (42x22300) and I want to get its correlation coefficient matrix, but I got the error as the following:

data.shape
(42, 22300)
mlab.corrcoef(data)
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
  File "C:\Python23\Lib\site-packages\numarray\linear_algebra\mlab.py", line 300, in corrcoef
    c = cov(x, y)
  File "C:\Python23\Lib\site-packages\numarray\linear_algebra\mlab.py", line 294, in cov
    val = squeeze(dot(transpose(m),conjugate(y)) / fact)
  File "C:\Python23\Lib\site-packages\numarray\numarraycore.py", line 1150, in dot
    return ufunc.innerproduct(array1, _gen.swapaxes(array2, -1, -2))
  File "C:\Python23\Lib\site-packages\numarray\ufunc.py", line 2047, in innerproduct
    r = a.__class__(shape=adots+bdots, type=rtype)
ValueError: new_memory: invalid region size: -633294592.

I suspect corrcoef function can not handle such a big matrix. If so, what is the upper limit for array size? How can I get around this problem in numarray?

BTW, I am using numarray 0.9/python 2.3.3 on win2kSP4

Thanks.
Chunlei
On Tue, 2004-03-16 at 10:41, CL WU wrote:
I suspect corrcoef function can not handle such a big matrix. If so, what is the upper limit for array size?
The memory limit appears to be driven by numarray.memory and is 2G. Your function call winds up creating a dot product output array of 22300**2 elements. That is ~500M elements * 8 bytes per float just for the dot product output, which is about 4G, hence the exception. I think 16384**2 is the ideal limit of what you can achieve with numarray, and in practice I think you'll get considerably less, depending on how many arrays are needed at once to complete your computation.
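The arithmetic behind that estimate can be checked with a few lines of plain Python. This is only a sketch: it assumes a dense square Float64 result at 8 bytes per element and a limit of exactly 2**31 bytes, and the helper name square_array_bytes is made up for illustration, not part of numarray.

```python
import math

LIMIT = 2 ** 31  # assumed value of the 2G limit, in bytes


def square_array_bytes(n, bytes_per_element=8):
    """Return the byte count of a dense n x n array (hypothetical helper)."""
    return n * n * bytes_per_element


# Output of the dot product for a (42, 22300) input: 22300 x 22300 Float64.
needed = square_array_bytes(22300)
print(needed, needed > LIMIT)

# Largest n for which an n x n Float64 array is no bigger than the limit.
largest = math.isqrt(LIMIT // 8)
print(largest, square_array_bytes(largest) == LIMIT)
```

Under these assumptions the single output array alone is well past the limit, before counting any temporaries.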
How can I get around this problem in numarray?
One possibility is to consider using Float32 to stretch out your memory. I don't know whether that's numerically viable or not.

Another way is 64-bit computing. That is largely unexplored territory, and Python itself has issues there. It will likely take some work because we haven't done it yet ourselves.

I hope this at least sheds some light on the problem, if not the actual solution.

Regards,
Todd
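As a rough check on the Float32 suggestion, the same kind of size arithmetic can be repeated per element type. This sketch assumes 4 bytes for Float32, 8 bytes for Float64, and a limit of exactly 2**31 bytes; the names ELEMENT_BYTES and footprint are illustrative only.

```python
LIMIT = 2 ** 31  # assumed value of the 2G limit, in bytes

# Assumed element sizes in bytes for the two types discussed.
ELEMENT_BYTES = {"Float32": 4, "Float64": 8}


def footprint(n, type_name):
    """Byte count of a dense n x n array of the given type (hypothetical helper)."""
    return n * n * ELEMENT_BYTES[type_name]


for name in ("Float64", "Float32"):
    size = footprint(22300, name)
    verdict = "under the limit" if size < LIMIT else "over the limit"
    print(name, size, verdict)
```

With these numbers a single 22300 x 22300 Float32 array comes in just under 2G, which leaves very little room for the other arrays a computation needs at the same time, so this alone may not be enough.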
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/numpy-discussion

-- Todd Miller
I'm puzzled that anyone would want to explore the correlation among 42 variables. I would suggest (1) having a go with the more significant of the 42 variables and (2) working with a much smaller sample, say < 4,000.

Todd Miller wrote:
[snip]
On Tue, 2004-03-16 at 10:41, CL WU wrote:
One possibility is to consider using Float32 to stretch out your memory. I don't know whether that's numerically viable or not.
Another way is 64-bit computing. That is largely unexplored territory, and Python itself has issues there. It will likely take some work because we haven't done it yet ourselves.
Todd, I'd be grateful if you could clarify this. Do we not have _nt.Float64 now?

Colin W.
participants (3): CL WU, Colin J. Williams, Todd Miller