Sparse matrix objects now support adding a sparse matrix to a dense matrix of the same dimensions with the syntax:
    newdense = sparse + dense
I'd like to add support for the opposite order too:
    newdense = dense + sparse
but this doesn't seem possible currently. This operation calls the __radd__() method of the sparse matrix object multiple times, each time passing a single element from the dense matrix. The problem is that the sparse matrix's __radd__ function also needs to know the shape of the matrix and the position of the element it's receiving.
Could we change the default behaviour of ndarray objects to invoke the right-hand object's __radd__ function just once, passing the dense array? I think this is the default behaviour for Python objects. What code would break if we made this change?
-- Ed
I'm wondering if the current flags interface couldn't be made a bit easier to use by allowing dictionary keys to be mapped into attributes. For example, instead of
    arr.flags["CONTIGUOUS"]
one could write
    arr.flags.contiguous
If using a dictionary is considered useful (e.g., for getting the whole state), it is still possible to allow arr.flags to return a dictionary-derived object that can be used wherever dictionaries are accepted, and likewise, assigning to arr.flags should be able to take a dictionary of flags. Any reason we can't support this interface?
Perry
Perry Greenfield wrote:
I'm wondering if the current flags interface couldn't be made a bit easier to use to allow mapping of dictionary keys into attributes. For example:
instead of arr.flags["CONTIGUOUS"]
arr.flags.contiguous
If using a dictionary is considered useful (e.g., for getting the whole state), it is still possible to allow arr.flags to return a dictionary-derived object that can be used wherever dictionaries are accepted, and likewise, assigning to arr.flags should be able to take a dictionary of flags. Any reason we can't support this interface?
Mmh. What happens when you say
    arr.flags.update = 1
and then pass arr.flags to a dict-expecting method which calls
    arr.flags.update(otherdict)
?
I do like the idea of named attribute access, but the issue of name conflicts with the existing dict API needs to be dealt with first. Either certain attribute names are disallowed (in which case you can just steal IPython.Struct for the implementation, which already does all of this), or some other policy must be devised.
Regards,
f
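The clash described above can be reproduced with a plain dict subclass. This is only a sketch: AttrFlags is a hypothetical class made up for illustration, not the object that arr.flags actually returns.

```python
class AttrFlags(dict):
    """Hypothetical dict subclass with no protection against name clashes."""


flags = AttrFlags(CONTIGUOUS=True)

# An ordinary instance attribute now shadows the inherited dict.update method.
flags.update = 1

try:
    # A dict-expecting caller would do exactly this.
    flags.update({"FORTRAN": False})
except TypeError as exc:
    # The call fails because flags.update is now the integer 1.
    clash_error = exc
else:
    clash_error = None

print(type(clash_error).__name__)
```

Any policy for attribute access would have to prevent the assignment in the first place, for instance by rejecting names that already exist on dict.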
On Oct 31, 2005, at 9:34 AM, Fernando Perez wrote:
Mmh. What happens when you say
arr.flags.update = 1
and then pass arr.flags to a dict-expecting method which calls
arr.flags.update(otherdict)
?
I do like the idea of named attribute access, but the issue of name conflicts with the existing dict API needs to be dealt with first. Either certain attribute names are disallowed (in which case you can just steal IPython.Struct for the implementation, which already does all of this), or some other policy must be devised.
Regards,
f
Isn't the flag attribute UPDATEIFCOPY? (admittedly, I'm going by a potentially dated version of the "Guide to SciPy") Perry
Perry Greenfield wrote:
On Oct 31, 2005, at 9:34 AM, Fernando Perez wrote:
Mmh. What happens when you say
arr.flags.update = 1
Isn't the flag attribute UPDATEIFCOPY? (admittedly, I'm going by a potentially dated version of the "Guide to SciPy")
I used 'update' simply as an example to illustrate the issue of potential name clashes. I didn't actually know what the flag names were; what I worry about is that, now or in the future, either the flag names or the dict method list can grow in a direction that causes a clash.
Sorry if a poor choice of example caused confusion. I hope the issue I'm trying to point out is clear now.
Regards,
f
On Oct 31, 2005, at 9:45 AM, Fernando Perez wrote:
I used 'update' simply as an example to illustrate the issue of potential name clashes. I didn't actually know what the flag names were, what I worry about is that now or in the future either the flag names or the dict method list can grow in a direction that causes a clash.
Sorry if a poor choice of example caused confusion, I hope the issue I'm trying to point out is clear now.
Sure, it's a valid concern. (I did look at the list to see if any current one was a conflict; your mail made me think I missed something!) Another possible solution is to have .asdict() and .fromdict() methods, one to return the flags as a dictionary and the other to set them from a dictionary, and so avoid the whole attribute conflict issue.
Perry
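A minimal sketch of the .asdict()/.fromdict() idea, assuming a small fixed set of flag names. The Flags class, its _names tuple, and the lower-case attribute spelling are all assumptions made for illustration; they are not the scipy implementation.

```python
class Flags(object):
    """Hypothetical flags holder: attribute reads plus explicit dict conversion."""

    # Illustrative subset of flag names; the real list may differ.
    _names = ("CONTIGUOUS", "FORTRAN", "UPDATEIFCOPY")

    def __init__(self, **values):
        self._values = dict.fromkeys(self._names, False)
        self.fromdict(values)

    def __getattr__(self, name):
        # Only reached when normal lookup fails, so real methods always win.
        key = name.upper()
        if key in self._names:
            return self._values[key]
        raise AttributeError(name)

    def asdict(self):
        """Return a plain dict copy of the current flag state."""
        return dict(self._values)

    def fromdict(self, mapping):
        """Set flags from a dict, rejecting unknown names."""
        for key, value in mapping.items():
            if key not in self._names:
                raise KeyError(key)
            self._values[key] = bool(value)


f = Flags(CONTIGUOUS=True)
f.fromdict({"FORTRAN": True})
print(f.contiguous, f.fortran, f.asdict())
```

Because the object is not itself a dict, a flag can never shadow a dict method; callers that need a real dictionary ask for one explicitly.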
Fernando Perez wrote:
Perry Greenfield wrote:
I'm wondering if the current flags interface couldn't be made a bit easier to use to allow mapping of dictionary keys into attributes. For example:
instead of arr.flags["CONTIGUOUS"]
arr.flags.contiguous
If using a dictionary is considered useful (e.g., for getting the whole state), it is still possible to allow arr.flags to return a dictionary-derived object that can be used wherever dictionaries are accepted, and likewise, assigning to arr.flags should be able to take a dictionary of flags. Any reason we can't support this interface?
Mmh. What happens when you say
arr.flags.update = 1
and then pass arr.flags to a dict-expecting method which calls
arr.flags.update(otherdict)
I'm not sure I follow what the problem is. Is there an update flag? First of all, arr.flags already returns a special dictionary (so that tests like Fortran but not contiguous are easy). If you want to use attribute access to get and set the dictionary items, I don't see how that couldn't be done. -Travis
Perry Greenfield wrote:
I'm wondering if the current flags interface couldn't be made a bit easier to use to allow mapping of dictionary keys into attributes. For example:
instead of arr.flags["CONTIGUOUS"]
arr.flags.contiguous
If using a dictionary is considered useful (e.g., for getting the whole state), it is still possible to allow arr.flags to return a dictionary-derived object that can be used wherever dictionaries are accepted, and likewise, assigning to arr.flags should be able to take a dictionary of flags. Any reason we can't support this interface?
Actually this would be easy, because arr.flags already returns a special object defined in scipy/base/_internal.py. So yes, it could be done. -Travis
Ed Schofield wrote:
Sparse matrix objects now support adding a sparse matrix to a dense matrix of the same dimensions with the syntax:
newdense = sparse + dense
I'd like to add support for the opposite order too:
newdense = dense + sparse
but this doesn't seem possible currently. This operation calls the __radd__() method of the sparse matrix object multiple times, each time passing a single element from the dense matrix.
Why does it do this? This does not seem to be the way Python would handle it. Can you track exactly what gets called when dense + sparse is performed? -Travis
Travis Oliphant wrote:
Ed Schofield wrote:
Sparse matrix objects now support adding a sparse matrix to a dense matrix of the same dimensions with the syntax:
newdense = sparse + dense
I'd like to add support for the opposite order too:
newdense = dense + sparse
but this doesn't seem possible currently. This operation calls the __radd__() method of the sparse matrix object multiple times, each time passing a single element from the dense matrix.
Why does it do this? This does not seem to be the way Python would handle it.
Can you track exactly what gets called when
dense + sparse
is performed?
Sure. Actually, it looks like scipy 0.3.2 handled it fine. This script:
-----------------------
class A:
    def __radd__(a, b=None, c=None):
        print "A.__radd__() called, with arguments (%s, %s, %s)" % (a, b, c)
        return "A.radd()\n"
    def __add__(a, b=None, c=None):
        print "A.__add__() called, with arguments (%s, %s, %s)" % (a, b, c)
        return "A.add()\n"

class B:
    def __radd__(a, b=None, c=None):
        print "B.__radd__() called, with arguments (%s, %s, %s)" % (a, b, c)
        return "B.radd()\n"

a = A()
b = B()
print "a + b is: " + str(a + b)
print "b + a is: " + str(b + a)

import scipy
c = scipy.array([[1,2,3],[4,5,6]])
print "c + a is: " + str(c + a)
print "a + c is: " + str(a + c)
------------------
produces this output with scipy 0.3.2:
------------------
A.__add__() called, with arguments (<__main__.A instance at 0xb7eaff8c>, <__main__.B instance at 0xb7eaffec>, None)
a + b is: A.add()
A.__radd__() called, with arguments (<__main__.A instance at 0xb7eaff8c>, <__main__.B instance at 0xb7eaffec>, None)
b + a is: A.radd()
A.__radd__() called, with arguments (<__main__.A instance at 0xb7eaff8c>, [[1 2 3]
 [4 5 6]], None)
c + a is: A.radd()
A.__add__() called, with arguments (<__main__.A instance at 0xb7eaff8c>, [[1 2 3]
 [4 5 6]], None)
a + c is: A.add()
---------------------
as expected.
With newcore (0.4.3.1401) I get:
---------------------
A.__add__() called, with arguments (<__main__.A instance at 0xb7e98f8c>, <__main__.B instance at 0xb7e98fec>, None)
a + b is: A.add()
A.__radd__() called, with arguments (<__main__.A instance at 0xb7e98f8c>, <__main__.B instance at 0xb7e98fec>, None)
b + a is: A.radd()
Importing io to scipy
Importing interpolate to scipy
Importing fftpack to scipy
Importing special to scipy
Importing cluster to scipy
Importing sparse to scipy
Importing signal to scipy
Failed to import signal
cannot import name comb
Importing utils to scipy
Importing lib to scipy
Importing integrate to scipy
Importing optimize to scipy
Importing linalg to scipy
Importing stats to scipy
A.__radd__() called, with arguments (<__main__.A instance at 0xb7ecaf8c>, 1, None)
A.__radd__() called, with arguments (<__main__.A instance at 0xb7ecaf8c>, 2, None)
A.__radd__() called, with arguments (<__main__.A instance at 0xb7ecaf8c>, 3, None)
A.__radd__() called, with arguments (<__main__.A instance at 0xb7ecaf8c>, 4, None)
A.__radd__() called, with arguments (<__main__.A instance at 0xb7ecaf8c>, 5, None)
A.__radd__() called, with arguments (<__main__.A instance at 0xb7ecaf8c>, 6, None)
c + a is: [[A.radd() A.radd() A.radd() ]
 [A.radd() A.radd() A.radd() ]]
A.__add__() called, with arguments (<__main__.A instance at 0xb7ecaf8c>, [[1 2 3]
 [4 5 6]], None)
a + c is: A.add()
--------------------
Tracing through the code, it seems the following functions are called:
    array_add()
    PyArray_GenericBinaryFunction(dense, newsparse, n_ops.add)
    ... a wrapper I don't understand ...
    then PyNumber_Add() from __umath_generated.c (which supposedly works elementwise)
    PyArray_FromAny() ... and ...
    array_fromobject()
which queries various attributes like __array__, __array_shape__, and __len__.
I haven't yet checked further than this ... e.g. whether the type == PyArray_NOTYPE succeeds ...
-- Ed
Ed Schofield wrote:
I think this is a result of the fact that arrays are now new-style numbers and the fact that array(B) created a perfectly nice object array.
I've committed a change that special-cases this situation, so the reflected operands will work if they are defined for something that becomes an object array.
-Travis
Travis Oliphant wrote:
Ed Schofield wrote:
I think this is a result of the fact that arrays are now new style numbers and the fact that array(B) created a perfectly nice object array.
I've committed a change that special-cases this situation, so the reflected operands will work if they are defined for something that becomes an object array.
Is it possible, in principle, to always call the sparse matrix's operator method when it enters an operation with a dense array? (With a a dense array and b a sparse matrix: a + b calls b.__radd__, b + a calls b.__add__.)
The reason I would support this is that, IMHO, the higher-level object (a sparse matrix) knows well how to handle the lower-level object (a dense array) in numeric operations, but not vice versa: there are already a number of sparse matrix formats, so the dense array object definitely cannot understand them all.
r.
Robert Cimrman wrote:
Travis Oliphant wrote:
I think this is a result of the fact that arrays are now new style numbers and the fact that array(B) created a perfectly nice object array.
I've committed a change that special-cases this situation, so the reflected operands will work if they are defined for something that becomes an object array.
Is it possible, in principle, to call always the sparse matrix operator method when it enters an operation with a dense array? (a .. dense array, b sparse -- a + b calls b.__radd__, b + a calls b.__add__)
The reason I would support this is, that IMHO the higher level object (a sparse matrix) knows well how to handle the lower level object (a dense array) in numeric operations, but not vice-versa -- there is already a number of sparse matrix formats, so the dense array object definitely cannot understand them all.
Travis's patch has this effect. Python calls the left operand's __op__ method, but a dense array on the left now yields control to the right operand's __rop__ if it can't interpret the right operand as a normal (non-Object) array. So the tests that were previously commented out in test_sparse.py now pass for CSC and CSR matrices.
There is one remaining problem with DOK matrices: it never gets this far, first raising a ValueError while trying to interpret the matrix as a sequence. I committed a patch for more graceful handling of such objects that claim to be sequences but don't allow integer indexing. I then reverted this, thinking we should instead fix dok_matrix so PySequence_Check() doesn't return true. But I'm not sure if this is possible. It seems that since Python 2.2 PySequence_Check() returns true for dictionaries, even though they don't allow integer indexing. Isn't this a blatant violation of the sequence protocol? (cf http://docs.python.org/ref/sequence-types.html) Is there any way to override PySequence_Check() to return false for a subclassed dict like dok_matrix? If not, I suggest we apply my patch after all -- and complain to the Python developers ;)
-- Ed
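The hand-off described above follows Python's general binary-operator protocol, which can be sketched in pure Python. This is not the change committed to the C code: Dense and Sparse are hypothetical stand-ins, and the sketch only shows a left operand declining so that the right operand's __radd__ is called once with the whole dense object.

```python
class Dense(object):
    """Hypothetical stand-in for a dense array (list of rows)."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def __add__(self, other):
        if not isinstance(other, Dense):
            # Decline: Python then tries type(other).__radd__(other, self).
            return NotImplemented
        return Dense([[x + y for x, y in zip(r, s)]
                      for r, s in zip(self.rows, other.rows)])


class Sparse(object):
    """Hypothetical stand-in for a sparse matrix: {(row, col): value}."""

    def __init__(self, entries):
        self.entries = dict(entries)

    def __radd__(self, dense):
        # Receives the whole dense operand, so shape and positions are known.
        rows = [list(r) for r in dense.rows]
        for (i, j), value in self.entries.items():
            rows[i][j] += value
        return Dense(rows)

    # Addition is commutative here, so sparse + dense reuses the same code.
    __add__ = __radd__


dense = Dense([[1, 2, 3], [4, 5, 6]])
sparse = Sparse({(0, 1): 10})
print((dense + sparse).rows)
print((sparse + dense).rows)
```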
Ed Schofield wrote:
Robert Cimrman wrote:
Travis Oliphant wrote:
I think this is a result of the fact that arrays are now new style numbers and the fact that array(B) created a perfectly nice object array.
I've committed a change that special-cases this situation, so the reflected operands will work if they are defined for something that becomes an object array.
Is it possible, in principle, to call always the sparse matrix operator method when it enters an operation with a dense array? (a .. dense array, b sparse -- a + b calls b.__radd__, b + a calls b.__add__)
The reason I would support this is, that IMHO the higher level object (a sparse matrix) knows well how to handle the lower level object (a dense array) in numeric operations, but not vice-versa -- there is already a number of sparse matrix formats, so the dense array object definitely cannot understand them all.
Travis's patch has this effect. Python calls the left operand's __op__ method, but a dense array on the left now yields control to the right operand's __rop__ if it can't interpret the right operand as a normal (non-Object) array. So the tests that were previously commented in test_sparse.py now pass for CSC and CSR matrices.
Cool, Travis obviously works faster than I am able to write e-mails... :-)
There is one remaining problem with DOK matrices: it never gets this far, first raising a ValueError while trying to interpret it as a sequence. I committed a patch for more graceful handling of such objects that claim to be sequences but don't allow integer indexing. I then reverted this, thinking we should instead fix dok_matrix so PySequence_Check() doesn't return true. But I'm not sure if this is possible. It seems that since Python 2.2 PySequence_Check() returns true for dictionaries, even though they don't allow integer indexing. Isn't this a blatant violation of the sequence protocol? (cf http://docs.python.org/ref/sequence-types.html) Is there any way to override PySequence_Check() to return false for a subclassed dict like dok_matrix? If not I suggest we apply my patch after all -- and complain to the Python developers ;)
I have just tested with:
PyObject *isSequence( PyObject *input ) {
  if (PySequence_Check( input )) {
    return( PyBool_FromLong( 1 ) );
  } else {
    return( PyBool_FromLong( 0 ) );
  }
}
and, from Python:
print isSequence( [] )
print isSequence( (1,) )
print isSequence( {} )
print isSequence( scipy.sparse.dok_matrix( scipy.array( [[1,2,3]] ) ) )
and got (Python 2.4.2):
True
True
False
True
... so the problem is not in PySequence_Check(). The DOK matrix inherits not only from dict, but also from spmatrix. Could this cause such behaviour?
r.
Robert Cimrman wrote:
I have just tested with:
PyObject *isSequence( PyObject *input ) {
  if (PySequence_Check( input )) {
    return( PyBool_FromLong( 1 ) );
  } else {
    return( PyBool_FromLong( 0 ) );
  }
}
print isSequence( [] )
print isSequence( (1,) )
print isSequence( {} )
print isSequence( scipy.sparse.dok_matrix( scipy.array( [[1,2,3]] ) ) )
and got (Python 2.4.2):
True
True
False
True
... so the problem is not in PySequence_Check(). The DOK matrix inherits not only from dict, but also from spmatrix. Could this cause such a behaviour??
Ah, well done. So PySequence_Check() returns false for a plain dict. But, according to my tests, it also returns false for an instance of the spmatrix base class. Instead it seems to return true for any class that inherits from dict:
class E(dict):
    pass
d = {}
e = E()
print isSequence(d)
print isSequence(e)
gives
False
True
Very strange. The same is true with a class derived from UserDict. Can someone explain this behaviour?
Meanwhile it seems that any Python class instance that defines the __getitem__ method has PySequence_Check() true by default. Another cruel trick! Can this be overridden without using C?
I've found this (somewhat old) comment by GvR, admitting that the sequence protocol is poorly defined: http://mail.python.org/pipermail/python-checkins/2001-September/021227.html
Perhaps another PEP is in order? Meanwhile I'll reapply my patch to handle sequences more cautiously...
-- Ed
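One way to handle sequences more cautiously is to probe for integer indexing instead of trusting the sequence check alone. The sketch below is an assumption about what such a check could look like in pure Python (written for a current Python 3); it is not the patch mentioned above, and supports_integer_indexing and PairKeyed are hypothetical names.

```python
from collections.abc import Mapping


def supports_integer_indexing(obj):
    """Return True only if obj has a length and accepts obj[0]."""
    if isinstance(obj, Mapping):
        # dicts and dict subclasses index by key, not by position.
        return False
    try:
        if len(obj) > 0:
            obj[0]  # probe: does an integer index work at all?
    except (TypeError, KeyError, IndexError):
        return False
    return True


class E(dict):
    pass


class PairKeyed(object):
    """Defines __getitem__ but only accepts (row, col) pairs."""

    def __len__(self):
        return 1

    def __getitem__(self, key):
        row, col = key  # raises TypeError for a plain integer
        return 0.0


for candidate in ([1, 2], (1,), {}, E(), PairKeyed()):
    print(type(candidate).__name__, supports_integer_indexing(candidate))
```

The probe costs one extra indexing operation, but it rejects objects that merely define __getitem__ without behaving like a positional sequence.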
participants (5): Ed Schofield, Fernando Perez, Perry Greenfield, Robert Cimrman, Travis Oliphant