scipy.sparse spdiags/dia_matrix additional calling mode
Dear all, Given that Python is not MATLAB, would it be possible to add an additional, in some cases more intuitive, way of calling the spdiags function? This function creates a sparse matrix given its non-zero diagonals. Currently, the first input has to be an array 'data' where every row corresponds to a diagonal - which one is specified in the next input, 'diags'. However, as the diagonals have different length, the actual values have to be zero-padded on either the left or right end to fit the array. This is not a big problem if all the elements in a diagonal are the same, for example spdiags([ones(4), -2*ones(4), ones(4)], (-1,0,1), 4, 4) works nicely as some values are just ignored. But when that is not the case, and all the values are already stored in some arrays, it does not make sense that the user should have to pad them with a number of zeros (remembering on which end to put them!) and then also construct one more temporary array to hold them. I suggest that spdiags accepts a list/tuple as data input, where every list element is supposed to be an array holding a diagonal of the correct size. That is, instead of doing something like spdiags(row_stack(r_[zeros(2), a], b, r_[c, zeros(3)]), (-2, 0, 3), n, n ) one would write spdiags([a,b,c], (-2, 0, 3), n, n) Does that make sense? What it boils down to is that you should not have to know how the matrix type is implemented to use it - right? The diagonals could just as well be stored in different ways than they are currently. When I had a look at the source code, it seems that all the work is done in the dia_matrix __init__ method, but the same argument applies there. Currently, you -can- send in a list [a,b,c] where a, b and c are arrays of different length, but the function atleast_2D() is applied to the list which turns it into an array of objects. This obviously causes problems. Kind regards, Tony Stillfjord
Hi, 06.10.2011 10:59, Tony Stillfjord kirjoitti:
Given that Python is not MATLAB, would it be possible to add an additional, in some cases more intuitive, way of calling the spdiags function? This function creates a sparse matrix given its non-zero diagonals.
Currently, the first input has to be an array 'data' where every row corresponds to a diagonal - which one is specified in the next input, 'diags'. However, as the diagonals have different length, the actual values have to be zero-padded on either the left or right end to fit the array. [clip]
I was just thinking about the same thing: the way it works now is a bit of a pain. Currently, I'm using this: def spdiags2(data, diags, M, N, **kw): n = min(M, N) data_mat = np.zeros((len(data), n), dtype=np.common_type(*data)) for j, q in enumerate(data): m = n - abs(diags[j]) if diags[j] >= 0: data_mat[j,-m:] = q else: data_mat[j,:m] = q return sparse.spdiags(data_mat, diags, M, N, **kw) But I agree the `spdiags` function itself should be fixed. -- Pauli Virtanen
06.10.2011 11:33, Pauli Virtanen kirjoitti: [clip]
Currently, the first input has to be an array 'data' where every row corresponds to a diagonal - which one is specified in the next input, 'diags'. However, as the diagonals have different length, the actual values have to be zero-padded on either the left or right end to fit the array. [clip]
I was just thinking about the same thing: the way it works now is a bit of a pain.
After a moment of thought, I believe the semantics of `spdiags` cannot be changed without also affecting the meaning of: sparse.spdiags([[1,2,3]], [1], 4, 4) The remaining options are either to add a keyword flag to the function, or to add a separate function (maybe called `diags` or something) to toggle the more convenient offset semantics. -- Pauli Virtanen
At first I thought that example would fail since the row(s) did not have length 4, but then I realised that there are actually no checks at all on the size of the diagonals, just that the number of them correspond to the number of offsets and that there are no duplicates... I tried it in MATLAB too just to check if this behaviour really is that strange. There, the equivalent expression spdiags([[1,2,3]'], 1, 4, 4) gives an index error (also strange kind of error). With the offset -1 it works fine, though, and also if I add elements to the diagonal so that it has length 4. So somewhat strange, but the scipy implementation does not seem to be fully equivalent to the MATLAB one. Still, the change I suggested would probably have to raise an exception if the list has diagonals of the wrong length, and this would still break more 'normal' constructions like my original example, spdiags([ones(4), -2*ones(4), ones(4)], (-1,0,1), 4, 4), so I agree that this shouldn't be done. I'm more in favour of a keyword flag rather than a new function. spdiags feels like the accepted name for what is to be done, and then how to do it more specifically should be an option within this framework (function). Maybe 'padding = False/True' or 'exact_length = False/True'? Regards, Tony Stillfjord On Thu, Oct 6, 2011 at 2:30 PM, Pauli Virtanen <pav@iki.fi> wrote:
06.10.2011 11:33, Pauli Virtanen kirjoitti: [clip]
Currently, the first input has to be an array 'data' where every row corresponds to a diagonal - which one is specified in the next input, 'diags'. However, as the diagonals have different length, the actual values have to be zero-padded on either the left or right end to fit the array. [clip]
I was just thinking about the same thing: the way it works now is a bit of a pain.
After a moment of thought, I believe the semantics of `spdiags` cannot be changed without also affecting the meaning of:
sparse.spdiags([[1,2,3]], [1], 4, 4)
The remaining options are either to add a keyword flag to the function, or to add a separate function (maybe called `diags` or something) to toggle the more convenient offset semantics.
-- Pauli Virtanen
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@scipy.org http://mail.scipy.org/mailman/listinfo/scipy-dev
06.10.2011 15:10, Tony Stillfjord kirjoitti: [clip]
I'm more in favour of a keyword flag rather than a new function. spdiags feels like the accepted name for what is to be done, and then how to do it more specifically should be an option within this framework (function). Maybe 'padding = False/True' or 'exact_length = False/True'?
Adding a new function for new semantics, rather than using a keyword argument, feels to me like a clearer design choice here. I think adding a new function called `diags` could be justified here. Numpy already has a function called `diag`, and with the new semantics one would have diags([a, b, c], [i, j, k]) == diag(a, i) + diag(b, j) + diag(c, k) which hangs together nicely. -- Pauli Virtanen
On Oct 6, 2011, at 9:26 AM, Pauli Virtanen wrote:
I think adding a new function called `diags` could be justified here. Numpy already has a function called `diag`, and with the new semantics one would have
diags([a, b, c], [i, j, k]) == diag(a, i) + diag(b, j) + diag(c, k)
which hangs together nicely.
+1
Well, when you put it like that, I guess I agree as well. :) On Thu, Oct 6, 2011 at 3:31 PM, Jonathan Guyer <guyer@nist.gov> wrote:
On Oct 6, 2011, at 9:26 AM, Pauli Virtanen wrote:
I think adding a new function called `diags` could be justified here. Numpy already has a function called `diag`, and with the new semantics one would have
diags([a, b, c], [i, j, k]) == diag(a, i) + diag(b, j) + diag(c, k)
which hangs together nicely.
+1
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@scipy.org http://mail.scipy.org/mailman/listinfo/scipy-dev
06.10.2011 15:34, Tony Stillfjord kirjoitti: [clip] There goes: https://github.com/scipy/scipy/pull/91 -- Pauli Virtanen
Wow, that was fast. Now I'm going to come up with a list of other changes/improvements to the sparse module, and maybe they will be implemented on Monday. :) Nah, seriously, thanks for the quick work. I had my own work-around function for the last few days too, but having it officially (soon, I guess) in SciPy is so much more convenient. Kind regards, Tony Stillfjord On Thu, Oct 6, 2011 at 10:38 PM, Pauli Virtanen <pav@iki.fi> wrote:
06.10.2011 15:34, Tony Stillfjord kirjoitti: [clip]
There goes:
https://github.com/scipy/scipy/pull/91
-- Pauli Virtanen
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@scipy.org http://mail.scipy.org/mailman/listinfo/scipy-dev
On Thu, 06 Oct 2011 15:26:14 +0200, Pauli Virtanen wrote:
Adding a new function for new semantics, rather than using a keyword argument, feels to me like a clearer design choice here.
I think adding a new function called `diags` could be justified here. Numpy already has a function called `diag`, and with the new semantics one would have
diags([a, b, c], [i, j, k]) == diag(a, i) + diag(b, j) + diag(c, k)
which hangs together nicely.
However this trailing 's' is not so consistent with other sparse matrix constructors (e.g. eye, kron, identity) which follow NumPy names. -- Denis Laxalde
06.10.2011 16:41, Denis Laxalde kirjoitti: [clip]
However this trailing 's' is not so consistent with other sparse matrix constructors (e.g. eye, kron, identity) which follow NumPy names.
Those, however, are just the sparse version of their Numpy counterparts. The Numpy `diag` function only takes a single diagonal, so one cannot use that as the name. -- Pauli Virtanen
On Thu, 06 Oct 2011 16:57:37 +0200, Pauli Virtanen wrote:
However this trailing 's' is not so consistent with other sparse matrix constructors (e.g. eye, kron, identity) which follow NumPy names.
Those, however, are just the sparse version of their Numpy counterparts. The Numpy `diag` function only takes a single diagonal, so one cannot use that as the name.
That's a good point, indeed.
participants (4)
-
Denis Laxalde -
Jonathan Guyer -
Pauli Virtanen -
Tony Stillfjord