Asking for opinions: Priops
I just ran across the problem of priorities with ndarrays again, and it keeps biting me. Some time ago I wrote a workaround to get my ``undarray`` class's methods called when an instance is the second operand of e.g. <ndarray> + <undarray>. But ever since I wrote it, Python always crashes on exit with the message:
Python-32(68665) malloc: *** error for object 0x239680: incorrect checksum for freed object - object was probably modified after being freed. *** set a breakpoint in malloc_error_break to debug
(Of course only if I imported the module. Occasionally I also observed bus errors, and even segfaults.) I overloaded the numpy ops via numpy.set_numeric_ops() with self-written classes, which are *not* derived from numpy.ufunc and do not completely resemble numpy ufuncs.
So I want to do it properly this time.
I therefore started by writing a Letter of Intent and put it online at http://github.com/friedrichromstedt/priops .
Opinions?
Friedrich
P.S.: I will start coding anyway, but opinions would be nice.
P.P.S.: The package this originates from is also online, at http://github.com/friedrichromstedt/upy or http://upy.sourceforge.net. I will probably create a small example script demonstrating the crash.
Hello Friedrich,
I have read your proposal. You describe issues that I have also
encountered several times.
I believe that your priops approach would be an improvement over the
current overloading of binary operators.
That being said, I think the issue is not so much numpy but rather the
way Python implements operator overloading using methods like __add__
and __radd__. Hence, your suggestion seems to be a Python Enhancement
Proposal and should be discussed without any references to numpy or
bugs related to numpy.set_numeric_ops.
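The point about __add__ and __radd__ can be illustrated without numpy at all. The following is only a minimal sketch in plain Python; the class names Greedy, Polite, Mine and GreedyChild are made up for illustration. Python asks the left operand first, and the right operand's __radd__ is only tried if the left operand returns NotImplemented (or if the right operand's class is a subclass of the left one and overrides the reflected method).

```python
class Greedy:
    """Stands in for a left operand that accepts any partner."""
    def __add__(self, other):
        return "Greedy.__add__"


class Polite:
    """A left operand that declines partners it does not know."""
    def __add__(self, other):
        if isinstance(other, Polite):
            return "Polite.__add__"
        return NotImplemented  # lets Python try other.__radd__


class Mine:
    """Wants to handle the operation when it is the right operand."""
    def __radd__(self, other):
        return "Mine.__radd__"


class GreedyChild(Greedy):
    """A subclass of the left operand's class that overrides __radd__."""
    def __radd__(self, other):
        return "GreedyChild.__radd__"


greedy_result = Greedy() + Mine()        # left operand wins, __radd__ is never asked
polite_result = Polite() + Mine()        # left operand declines, __radd__ is called
child_result = Greedy() + GreedyChild()  # subclass on the right is asked first

print(greedy_result)  # Greedy.__add__
print(polite_result)  # Mine.__radd__
print(child_result)   # GreedyChild.__radd__
```

The first case is the situation described in the original post: as long as the left operand handles everything itself, the class on the right has no say.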
Maybe you could also have a look at Go's interfaces (Google's
programming language), which seem to be somewhat related to your
approach. Also, have you checked the Python mailing list archives? Issues
like that tend to be discussed from time to time.
On a more practical note: Why exactly do you use set_numeric_ops? You
could also
1) use numpy.ndarrays with dtype=object, or
2) create a new numpy.ndarray-like class and set __array_priority__ > 2.
Both approaches work well for me.
just my 2 cents,
Sebastian
On Thu, Sep 16, 2010 at 2:02 PM, Friedrich Romstedt wrote:
[...]
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Hi Sebastian,
Thanks for your reply!
2010/9/22 Sebastian Walter
[...] I think the issue is not so much numpy but rather the way Python implements operator overloading using methods like __add__ and __radd__. Hence, your suggestion seems to be a Python Enhancement Proposal and should be discussed without any references to numpy or bugs related to numpy.set_numeric_ops.
Yeah, I agree, but I refrained from spamming the list with two separate mails, although I see now that it would have been better and would have brought more eyes to it. For the PEP, I will look into this and check the mailing lists.
For the technical things, I think the implementation of the operations must lie in the classes, and thus __add__ etc. are in principle okay. But as described in the new README, there is a need to organise these functions. priop (note the rename: "priops" turned out to be always spelled "priop" in the source code) could define this new layer.
Maybe you could also have a look at Go's interfaces (Googles programming language) which seems to be somewhat related to your approach.
I will try. Can you provide a URL?
On a more practical note: Why exactly do you use set_numeric_ops? You could also 1) use numpy.ndarrays with dtype=object
This is too slow. And it's eating up memory because of the Python objects stored with all their bells and whistles.
2) or create new numpy.ndarray -like class and set __array_priority__ > 2 both approaches work well for me.
I wanted to avoid exactly this, because I think priop, hooked in via set_numeric_ops(), is the better approach.
The new URL is http://github.com/friedrichromstedt/priop (changed just for the rename, which may be debatable). It now also contains an implementation, which was much less hard than expected ...
Friedrich
Friedrich Romstedt wrote:
[...]
Opinions?
I haven't had time to go into the details, but I love the fact that somebody is about to deal with this problem, it's been bothering me as well.
Something that is slightly related that one might as well test is the Sage coercion model. If you haven't, you may want to see if you get anything useful out of http://wiki.sagemath.org/coercion.
Essentially, perhaps what you have sketched up + an ability to extend the graph with object conversion routes would be perfect for my own uses. So you can define a function with overloads (A, B) and (A, C), but also that objects of type D can be converted to C (and how). For instance, consider:
np.array([1,2,3]) + [1,2,3]
Here, list -> array could be handled through a defined coercion to array, rather than having to add an overload for list for every method taking an array.
Dag Sverre
Dag Sverre Seljebotn wrote:
[...]
Btw, I was just using NumPy as an example, not suggesting that NumPy adopt priops (we can "eliminate" NumPy through __array_priority__, as long as everyone else uses priops?). MyObject() + [1,2,3], with MyObject only knowing about np.ndarray, would have been a better example...
Dag Sverre
2010/9/23 Dag Sverre Seljebotn
Essentially, perhaps what you have sketched up + an ability to extend the graph with object conversion routes would be perfect for my own uses. So you can define a function with overloads (A, B) and (A, C), but also that objects of type D can be converted to C (and how). For instance, consider:
np.array([1,2,3]) + [1,2,3]
Here, list-> array could be handled through a defined coercion to array, rather than having to add an overload for list for every method taking an array.
This seems to be a good thing. Let's reason about this for a moment. At the moment, the relation is *not* transitive. (I.e., (A, B) and (B, C) imply nothing for (A, C), where A, B, C are classes.) But this kind of transitivity is what you mean: if (A, B) is defined and there is an edge (B, C) in the "conversion graph", then (A, C) can take the (A, B) route via a C -> B conversion; here the notation (B, C) in the conversion graph means "conversion from C to B".
I don't see a clear solution that satisfies me in the end. It seems that one really has to conduct a search in the additional conversion graph. Since this is expensive, I believe it would probably be good to derive a "ConversionPriop" from priop.Priop.
What are your thoughts now?
I feel it would be useful to add this conversion graph, since it creates many edges in the resulting effective coercion graph which do not all have to be specified explicitly, but which exist nevertheless.
As it stands, this conversion graph already exists in the sense of subclassing: if, in your example, the second object is an instance of a subclass of `list`, it is sufficient to define the edge with `list`, and the lookup will find that edge. Would this suffice for your needs? It could be that this is even safer than magic conversion paths. Maybe it's even better, more straightforward. I'm really unsure about this. Please give me your feedback.
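To make the subclass matching concrete, here is a minimal sketch of a registry of (left type, right type) edges whose lookup walks the MROs of both operands. It is not the priop implementation; PairRegistry, Vec and MyList are hypothetical names, and preferring the most specific left type first is just one possible tie-break.

```python
import itertools


class PairRegistry:
    """Hypothetical sketch: maps (left type, right type) edges to handlers."""

    def __init__(self):
        self._edges = {}

    def register(self, left, right, handler):
        self._edges[(left, right)] = handler

    def lookup(self, a, b):
        # Try the most specific classes first, then walk up both MROs,
        # so an edge defined for `list` also matches subclasses of `list`.
        for pair in itertools.product(type(a).__mro__, type(b).__mro__):
            handler = self._edges.get(pair)
            if handler is not None:
                return handler
        raise TypeError("no edge for (%s, %s)"
                        % (type(a).__name__, type(b).__name__))

    def __call__(self, a, b):
        return self.lookup(a, b)(a, b)


class Vec:
    """Toy array-like class, standing in for something like ndarray."""
    def __init__(self, items):
        self.items = list(items)


class MyList(list):
    """A list subclass for which no edge is registered explicitly."""


add = PairRegistry()
add.register(Vec, list,
             lambda v, seq: Vec(x + y for x, y in zip(v.items, seq)))

# Only the (Vec, list) edge exists, but it is found for MyList as well.
result = add(Vec([1, 2, 3]), MyList([10, 20, 30]))
print(result.items)  # [11, 22, 33]
```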
np.array([1,2,3]) + [1,2,3]
For this, a) numpy.ndarray could subclass list, if it doesn't already, but then it would look like list concatenation. b) It would be sufficient to add the edge with `list` as outlined before.
So, now I tend towards *not* adding the functionality ... :-/
Maybe the strongest argument in favour of adding it nevertheless is that, in our example, ndarray can be seen as a "view" of a list. So operations with objects that can be seen as something else should be supported, especially if the class they are seen as is deeper in the class hierarchy. This means that the edge (ndarray, ndarray) should in fact match (ndarray, list), although `ndarray` is deeper than `list`. (As opposed to the case where one defines (ndarray, list) and this matches (ndarray, list_subclass), seeing only the `list` functionality in `list_subclass`.)
Would it, having this in mind, suffice to add transitivity not only for subclasses, but also for superclasses? If (A, B) shall be coerced and C is a subclass of B, i.e. B is a superclass of C, then (A, B) should translate to (A, C), using C(b) for the operand? This means that C only *extends* the functionality of B. I'm feeling quite good about this, because it's less arbitrary than allowing *all* conversion routes. Actually it's nothing more than giving the class C as the conversion function, since classes are callable. And since not every subclass can be constructed from a superclass instance alone, the graph still has to be created manually.
Looking further into this, it seems sufficient to support only classes which can be constructed from their superclass instances alone. E.g. class C just needs to be registered as an "extending class", and if in (A, B) it holds that ``isinstance(b, C)`` while (A, B) is not defined but (A, C) is, then the (A, C) edge will be called via the C(b) constructor.
Conversion would just happen using a *take* argument to the registration, optionally a string naming the kwarg used in the C constructor for construction from base class instances. E.g.:
priop.extender(numpy.ndarray, constructor=numpy.asarray, extends=list)
priop.extender(my_class, take='universal')
# *universal* may be some late kwarg in the constructor.
# Automatically applies to all classes `my_class` derives from.
Unfortunately, ndarray isn't a subclass of list. Maybe this would be worth a NEP, if it is technically possible. Otherwise, the first form would suffice.
?
It got a bit longish,
Friedrich
P.S.: I like the fact that I understood that "conversion a->b" means "B extends A". As opposed to "B functionality is a subset of A functionality".
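The first form of the registration above could behave roughly as in the following sketch. This is an assumption about the intended semantics, not code from priop: ExtendingRegistry and Vec are hypothetical, only the right operand is converted, and only the `constructor`/`extends` form is modelled (the *take* kwarg is left out).

```python
class ExtendingRegistry:
    """Hypothetical sketch of edges plus registered "extending classes"."""

    def __init__(self):
        self._edges = {}      # (left type, right type) -> handler
        self._extenders = {}  # base type -> (extending type, constructor)

    def register(self, left, right, handler):
        self._edges[(left, right)] = handler

    def extender(self, cls, extends, constructor=None):
        # `cls` is declared to extend `extends`; instances of `extends`
        # are converted with `constructor` (by default the class itself).
        self._extenders[extends] = (cls, constructor or cls)

    def __call__(self, a, b):
        for right in type(b).__mro__:
            # A directly defined edge always wins.
            handler = self._edges.get((type(a), right))
            if handler is not None:
                return handler(a, b)
            # Otherwise try the edge defined for the extending class.
            if right in self._extenders:
                cls, constructor = self._extenders[right]
                handler = self._edges.get((type(a), cls))
                if handler is not None:
                    return handler(a, constructor(b))
        raise TypeError("no edge for (%s, %s)"
                        % (type(a).__name__, type(b).__name__))


class Vec:
    """Toy stand-in for ndarray; like ndarray, it does not subclass list."""
    def __init__(self, items):
        self.items = list(items)


add = ExtendingRegistry()
add.register(Vec, Vec,
             lambda u, v: Vec(x + y for x, y in zip(u.items, v.items)))
add.extender(Vec, extends=list, constructor=Vec)

# No (Vec, list) edge exists; the list is converted and (Vec, Vec) is used.
result = add(Vec([1, 2]), [10, 20])
print(result.items)  # [11, 22]
```

Because the extending classes are registered by hand, only conversions that somebody declared explicitly are ever applied, which is the "less arbitrary" property argued for above.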
Friedrich Romstedt wrote:
[...]
I don't see a clear solution at the end satisfying me. It seems that one really has to conduct a search in the additional conversion graph. Since this is expensive, I believe it would probably be good to derive a "ConversionPriop" from priop.Priop.
What are your thoughts now?
You can just cache all lookups in the conversion graph, so that after some initialization all lookups are O(1). There's a limited number of types in a Python runtime, and the actual distinct lookups performed are likely to not be many. I don't see that as a problem at all.
I'll try to remember to get back to the rest in a week or so... I'm handing in my MSc on Thursday :-)
Dag Sverre
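A rough sketch of that caching idea, under the assumption that only the right operand is converted and that the graph is searched breadth-first; CachedConversionRegistry, Vec and the `searches` counter are invented for this illustration and are not part of priop.

```python
from collections import deque


class CachedConversionRegistry:
    """Hypothetical sketch: edges, conversion routes and a lookup cache."""

    def __init__(self):
        self._edges = {}        # (left type, right type) -> handler
        self._conversions = {}  # source type -> [(target type, function)]
        self._cache = {}        # (left type, right type) -> (handler, chain)
        self.searches = 0       # counts how often the graph is searched

    def register(self, left, right, handler):
        self._edges[(left, right)] = handler
        self._cache.clear()     # the graph changed, cached routes may be stale

    def conversion(self, source, target, func):
        self._conversions.setdefault(source, []).append((target, func))
        self._cache.clear()

    def _search(self, left, right):
        # Breadth-first search for the shortest chain of conversions of the
        # right operand that ends at a type with a defined edge.
        self.searches += 1
        queue = deque([(right, ())])
        seen = {right}
        while queue:
            current, chain = queue.popleft()
            handler = self._edges.get((left, current))
            if handler is not None:
                return handler, chain
            for target, func in self._conversions.get(current, []):
                if target not in seen:
                    seen.add(target)
                    queue.append((target, chain + (func,)))
        raise TypeError("no route for (%s, %s)"
                        % (left.__name__, right.__name__))

    def __call__(self, a, b):
        key = (type(a), type(b))
        if key not in self._cache:
            self._cache[key] = self._search(*key)
        handler, chain = self._cache[key]  # a dict lookup after the first call
        for func in chain:
            b = func(b)
        return handler(a, b)


class Vec:
    def __init__(self, items):
        self.items = list(items)


add = CachedConversionRegistry()
add.register(Vec, Vec,
             lambda u, v: Vec(x + y for x, y in zip(u.items, v.items)))
add.conversion(tuple, list, list)
add.conversion(list, Vec, Vec)

first = add(Vec([1, 2]), (10, 20))   # searches tuple -> list -> Vec once
second = add(Vec([3, 4]), (30, 40))  # served from the cache
print(first.items, second.items, add.searches)  # [11, 22] [33, 44] 1
```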
participants (4)
- Charles R Harris
- Dag Sverre Seljebotn
- Friedrich Romstedt
- Sebastian Walter