[Numpy-discussion] sparse array data
francesc at continuum.io
Wed May 2 18:26:53 EDT 2012
On 5/2/12 4:20 PM, Nathaniel Smith wrote:
> On Wed, May 2, 2012 at 9:53 PM, Francesc Alted<francesc at continuum.io> wrote:
>> On 5/2/12 11:16 AM, Wolfgang Kerzendorf wrote:
>>> Hi all,
>>> I'm currently writing a code that needs three dimensional data (for the physicists it's dimensions are atom, ion, level). The problem is that not all combinations do exist (a sparse array). Sparse matrices in scipy only deal with two dimensions. The operations that I need to do on those are running functions like exp(item/constant) on all of the items. I also want to sum them up in the last dimension. What's the best way to make a class that takes this kind of data and does the required operations fast. Maybe some phycisists have implemented these things already. Any thoughts?
>> Curiously enough, I have recently been discussing with Travis O. about
>> how to represent sparse matrices with complete generality. One of the
>> possibilities is to use what Travis call "synthetic dimensions". The
>> idea behind it is easy: use a table with as many columns as dimensions,
>> and add another one for the actual values of the array. For a 3-D
>> sparse array, this looks like:
>> dim0 | dim1 | dim2 | value
>> 0 | 0 | 0 | val0
>> 0 | 10 | 100 | val1
>> 20 | 5 | 202 | val2
> This coordinate format is also what's used by the MATLAB Tensor
> Toolbox. They have a paper justifying this choice and describing some
> tricks for how to work with them:
> (Spoiler: you use a lot of sort operations. Conveniently, timsort
> appears to be perfectly adapted for their algorithmic requirements.)
Uh, I do not have access to the article, but yes, sorting makes a lot of
sense for these scenarios (this is why I was suggesting using indexes,
which is sort of sorting too).
> I'm not sure why one would make up a new term like "synthetic
> dimensions" though, it's just standard coordinate format...
Yeah, this seems a bit weird. Perhaps Travis was referring to other
things and I mixed concepts. Anyways.
More information about the NumPy-Discussion