[Numpy-discussion] Large symmetrical matrix

Wed Jun 11 16:14:38 EDT 2008

Actually that is very close to what I currently do, which is why I want to
throw it away.

The factor of two memory hit is not really the problem, it is the scaling
O(N^2) which is the limitation

I suspect I may end up using the 1-d array plus arithmetic as it is at least
efficient for retrieval.  I have just started working on a stack which sits
between the vector array and a lookup for a minimum, I think that could
improve performance a great deal.  Next up is a disk based memory map of
some kind which might free me from the constraints of physical memory.
Anyone know of an implementation of array() which automatically reads/writes
to disk?

To answer another suggestion, I'm reluctant to sample not because it is not
a good approach but rather because the commercial application at the end of
this is much harder to explain, justify and sell if I have to include
reasoning for not using all the data.  As scientists this is obvious, as a
lay person, and particularly a business person, it appears to be madness.
The fewer leaps of faith necessary the better.

On Wed, Jun 11, 2008 at 7:34 PM, Anne Archibald <peridot.faceted at gmail.com>
wrote:

> 2008/6/10 Simon Palmer <simon.palmer at gmail.com>:
> > Hi I have a problem which involves the creation of a large square matrix
> > which is zero across its diagonal and symmetrical about the diagonal i.e.
> > m[i,j] = m[j,i] and m[i,i] = 0.  So, in fact, it is a large triangular
> > matrix.  I was wondering whether there is any way of easily handling a
> > matrix of this shape without either incurring a memory penalty or a whole
> > whack of proprietary code?
> >
> > To get through this I have implemented a 1D array which has ((n-1)^2)/2
> > elements inside a wrapper class which manpulates the arguments of array
> > accessors with some arithmetic to return the approriate value.  To be
> honest
> > I'd love to throw this away, but I haven't yet come across a feasible
> > alternative.
> >
> > Any ideas?
>
> Well, there's a risk of shooting yourself in the foot, but here's a
> way to get at the elements quickly and conveniently: store the
> coefficients in a 1D array (as you do already). Then if your matrix is
> M by M, element (i,j) is at position (M-2)*j+i-1, assuming i<j. Of
> course, you can simply use this expression every time you want to look
> an element up, as you do now, but that's no fun. So now create a
> "view" A of this array having offset -1, shape (M,M), and strides
> (1,(M-2)). (This kind of weird view can be created with the ndarray
> constructor.) Then A[i,j] addresses the right element. A[j,i] is of
> course totally wrong. But as long as you avoid i<j, you can use all
> numpy's magical indexing tricks. Unfortunately you can't really use
> ufuncs, since they'll waste time operating on the useless lower half.
>
> Personally, I would live with the factor-of-two memory hit.
>
> Anne
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20080611/600f9f4f/attachment.html>