Support for __getitem__ in rpython?
Hi,

I've started to play around with the pypy codebase with the intention to make obj[i] act like obj.__getitem__(i) for rpython objects. The approach I tried was to add:

    class __extend__(pairtype(SomeInstance, SomeObject)):
        def getitem((s_array, s_index)):
            s=SomeString()
            s.const="__getitem__"
            p=s_array.getattr(s)
            return p.simple_call(s_index)

and then do something like:

    class __extend__(pairtype(AbstractInstanceRepr, Repr)):
        def rtype_getitem((r_array, r_key), hop):
            hop2=hop.copy()
            ...
            hop2.forced_opname = 'getattr'
            hop2.dispatch()
            hop3=hop.copy()
            ...
            hop3.forced_opname = 'simple_call'
            hop3.dispatch()

But I am having a hard time understanding the rtyper and if this is the right approach? Is there anything similar in the code/docs I could look at to get a better understanding on how to write this?

Would it be a better solution to add an intermediate step between the annotator and the rtyper that converts getitem(SomeInstance,SomeObject) into vx=getattr(SomeInstance,'__getitem__'); simple_call(vx,SomeObject)?

Any suggestions will be appreciated. Thanx!

-- Håkan Ardö
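The intermediate step proposed in the last question can be pictured with a toy rewrite pass. This is only a sketch under loud assumptions: operations are modelled as plain tuples, and the names `lower_getitem` and `v_tmp0` are hypothetical; none of this is PyPy's flow-graph or rtyper API.

```python
# Toy model only: operations are plain tuples, not PyPy flow-graph objects.
def lower_getitem(ops):
    """Rewrite ('getitem', result, obj, index) into getattr + simple_call."""
    lowered = []
    counter = 0
    for op in ops:
        if op[0] == 'getitem':
            _, result, obj, index = op
            tmp = 'v_tmp%d' % counter  # fresh variable holding the bound method
            counter += 1
            lowered.append(('getattr', tmp, obj, '__getitem__'))
            lowered.append(('simple_call', result, tmp, index))
        else:
            lowered.append(op)  # every other operation is left untouched
    return lowered

ops = [('getitem', 'v1', 'obj', 'i'), ('add', 'v2', 'v1', 'one')]
lowered = lower_getitem(ops)
```

The point of the sketch is that one getitem becomes two operations linked by a fresh intermediate variable, which is the same split the two `hop` copies attempt inside the rtyper.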
On Thu, 4 Dec 2008 10:15:34 +0100 "Hakan Ardo" <hakan@debian.org> wrote:
Hi, I've started to play around with the pypy codebase with the intention to make obj[i] act like obj.__getitem__(i) for rpython objects.
Woohoo!!
The approach I tried was to add:
    class __extend__(pairtype(SomeInstance, SomeObject)):
        def getitem((s_array, s_index)):
            s=SomeString()
            s.const="__getitem__"
            p=s_array.getattr(s)
            return p.simple_call(s_index)
and then do something like:
    class __extend__(pairtype(AbstractInstanceRepr, Repr)):
        def rtype_getitem((r_array, r_key), hop):
            hop2=hop.copy()
            ...
            hop2.forced_opname = 'getattr'
            hop2.dispatch()
            hop3=hop.copy()
            ...
            hop3.forced_opname = 'simple_call'
            hop3.dispatch()
um...
But I am having a hard time understanding the rtyper and if this is the right approach? Is there anything similar in the code/docs I could look at to get a better understanding on how to write this?
I had lots of fun last year writing code in rpython/numpy. There is plenty of getitem goodness there. Unfortunately it is probably impenetrable. Um. Perhaps you could revert back to when the code was sane (but much less functional).

I also had a whack at __str__ for classes but failed horribly.

Keep pestering me and I'm likely to become interested in this stuff again.

Cheering from afar,

Simon.
Hi Hakan, On Thu, Dec 04, 2008 at 10:15:34AM +0100, Hakan Ardo wrote:
I've started to play around with the pypy codebase with the intention to make obj[i] act like obj.__getitem__(i) for rpython objects. The approach I tried was to add:
Calling __getitem__ in this way is not really supported, but here is how I managed to get it to work anyway. It's only slightly more complicated:

    class __extend__(pairtype(SomeInstance, SomeObject)):
        def getitem((s_array, s_index)):
            # first generate a pseudo call to the helper
            bk = getbookkeeper()
            s_callable = bk.immutablevalue(do_getitem)
            args_s = [s_array, s_index]
            bk.emulate_pbc_call(('instance_getitem', s_array.knowntype),
                                s_callable, args_s)
            # then use your own trick to get the correct result
            s=SomeString()
            s.const="__getitem__"
            p=s_array.getattr(s)
            return p.simple_call(s_index)

    # this is the helper
    def do_getitem(array, key):
        return array.__getitem__(key)
    do_getitem._annspecialcase_ = 'specialize:argtype(0)'
    # ^^^ specialization; not sure I have done it right...

and then for the code in rclass:

    from pypy.annotation import binaryop
    from pypy.objspace.flow.model import Constant

    class __extend__(pairtype(AbstractInstanceRepr, Repr)):
        def rtype_getitem((r_array, r_key), hop):
            # call the helper do_getitem...
            hop2 = hop.copy()
            bk = r_array.rtyper.annotator.bookkeeper
            v_func = Constant(binaryop.do_getitem)
            s_func = bk.immutablevalue(v_func.value)
            hop2.v_s_insertfirstarg(v_func, s_func)
            hop2.forced_opname = 'simple_call'
            return hop2.dispatch()

I think the _annspecialcase_ should be able to sort out between multiple unrelated calls to the helper. The code for the annotator is a bit bogus, btw, because it emulates a call to the function but also computes the result explicitly; but I couldn't figure out a better way.

A bientot,

Armin.
Hi, thanks for all the help. For anyone else interested, I've placed a few files on http://hakan.ardoe.net/pypy/ namely:

    getitem_support.py    - The suggested implementation
    getsetitem_support.py - Generalisation to handle __setitem__ as well
    special_methods.py    - Generalisation to handle several __xxx__ methods
    test_getitem.py       - Tests for __getitem__
    test_matrix.py        - Tests using __getitem__, __setitem__ and __add__
do_getitem._annspecialcase_ = 'specialize:argtype(0)' # ^^^ specialization; not sure I have done it right...
I think the _annspecialcase_ should be able to sort out between multiple unrelated calls to the helper.
It seems to be doing its job. But if I try to apply the same trick to the __getitem__ method it does not seem to work, e.g. if I try to compile the code below it only works if I do either a[i] or a[i,j] calls, not a mix of the two.

    class arr2d:
        def __init__(self,w,h):
            self.width=w
            self.height=h
            self.data=[i for i in range(w*h)]
        def __getitem__(self,i):
            if isinstance(i,int):
                return self.data[i]
            elif len(i)==2:
                return self.data[i[1]*self.width + i[0]]
            else:
                raise TypeError
        __getitem__._annspecialcase_ = 'specialize:argtype(0)'

-- Håkan Ardö
Hi Hakan, On Tue, Dec 09, 2008 at 08:14:35PM +0100, Hakan Ardo wrote:
    class arr2d:
        def __init__(self,w,h):
            self.width=w
            self.height=h
            self.data=[i for i in range(w*h)]
        def __getitem__(self,i):
            if isinstance(i,int):
                return self.data[i]
            elif len(i)==2:
                return self.data[i[1]*self.width + i[0]]
            else:
                raise TypeError
        __getitem__._annspecialcase_ = 'specialize:argtype(0)'
That's the wrong annotation. For this case, it should be 'specialize:argtype(1)' in order to get two versions of __getitem__, compiled for the two types that can be seen: integers and tuples. A bientot, Armin.
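To see why the argument index matters, here is a plain-Python toy of "one version per type of argument n". It is a sketch only: `specialize_argtype`, `Arr2d` and `lookup` are hypothetical names, and the decorator merely clones the function per type, whereas the real annotator analyses each specialized version separately.

```python
import types

def specialize_argtype(n):
    """Toy decorator: keep one copy of the function per type of argument n."""
    def decorate(func):
        versions = {}
        def dispatch(*args):
            key = type(args[n])
            if key not in versions:
                # Clone the function to stand in for a separately compiled version.
                versions[key] = types.FunctionType(
                    func.__code__, func.__globals__,
                    '%s__%s' % (func.__name__, key.__name__),
                    func.__defaults__, func.__closure__)
            return versions[key](*args)
        dispatch.versions = versions
        return dispatch
    return decorate

class Arr2d(object):
    def __init__(self, w, h):
        self.width = w
        self.data = list(range(w * h))

def lookup(arr, i):
    if isinstance(i, int):
        return arr.data[i]
    return arr.data[i[1] * arr.width + i[0]]

by_self = specialize_argtype(0)(lookup)   # keyed on the array: always Arr2d
by_index = specialize_argtype(1)(lookup)  # keyed on the index: int or tuple

a = Arr2d(3, 2)
for f in (by_self, by_index):
    f(a, 4)
    f(a, (2, 1))
```

Keying on argument 0 yields a single version, because the first argument is always the array itself; keying on argument 1 yields one version for integer indices and one for tuple indices, which is the split the mixed a[i] / a[i,j] case needs.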
On Sun, Dec 7, 2008 at 6:04 PM, Armin Rigo <arigo@tunes.org> wrote:
    class __extend__(pairtype(SomeInstance, SomeObject)):
        def getitem((s_array, s_index)):
            # first generate a pseudo call to the helper
            bk = getbookkeeper()
            s_callable = bk.immutablevalue(do_getitem)
            args_s = [s_array, s_index]
            bk.emulate_pbc_call(('instance_getitem', s_array.knowntype),
                                s_callable, args_s)
            # then use your own trick to get the correct result
            s=SomeString()
            s.const="__getitem__"
            p=s_array.getattr(s)
            return p.simple_call(s_index)
unrelated calls to the helper. The code for the annotator is a bit bogus, btw, because it emulates a call to the function but also computes the result explicitly; but I couldn't figure out a better way.
How about instead doing:

    class __extend__(pairtype(SomeInstance, SomeObject)):
        def getitem((s_array, s_index)):
            return call_helper('do_getitem', (s_array, s_index))

    def call_helper(name,s_args):
        bk = getbookkeeper()
        s_callable = bk.immutablevalue(eval(name))
        s_ret=bk.emulate_pbc_call(('instance_'+name,)+tuple([s.knowntype for s in s_args]),
                                  s_callable, s_args)
        for graph in bk.annotator.pendingblocks.values():
            if graph.name[0:len(name)]==name:
                bk.annotator.notify[graph.returnblock][bk.position_key]=1
        return s_ret

Is there some way to get hold of the mangled function name of the created graph? The above code might add too many notifies if there are several graphs in pendingblocks with a name starting with the name of the helper. Or is there some better way to get hold of the created graph object if any was created?
__getitem__._annspecialcase_ = 'specialize:argtype(0)'
That's the wrong annotation. For this case, it should be 'specialize:argtype(1)' in order to get two versions of __getitem__,
Right. Sorry about that. -- Håkan Ardö
Hi Hakan, On Fri, Dec 12, 2008 at 04:49:17PM +0100, Hakan Ardo wrote:
How about instead doing:
(...)
Ah, using 'notify' to force a reflow. Obscure :-/
Is there some way to get hold of the mangled function name of the created graph?
Don't look up graphs by name; the name is only there to get information about it when printing the graph. You should probably pass the function object instead of a string giving the name into your helper. Then you can get from the function to the graph(s) with the translator. Armin
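A small plain-Python illustration of why name-based lookup is fragile. The `graphs` table and both lookup functions are hypothetical stand-ins; the translator's actual function-to-graph lookup is not shown in this thread and is not reproduced here.

```python
def make_helper(offset):
    # Two distinct helpers that happen to share the name 'do_getitem'.
    def do_getitem(array, key):
        return array[key + offset]
    return do_getitem

helper_a = make_helper(0)
helper_b = make_helper(1)

# Toy "graph" table keyed by function object (hypothetical structure).
graphs = {helper_a: 'graph for helper_a', helper_b: 'graph for helper_b'}

def graphs_by_name(name):
    """Prefix match on the name: ambiguous when names collide."""
    return [g for f, g in graphs.items() if f.__name__.startswith(name)]

def graph_by_func(func):
    """Lookup by identity of the function object: always unambiguous."""
    return graphs[func]

ambiguous = graphs_by_name('do_getitem')
exact = graph_by_func(helper_b)
```

The prefix match returns both entries, which corresponds to the "too many notifies" worry above, while the function object identifies exactly one entry.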
On Fri, Dec 12, 2008 at 6:11 PM, Armin Rigo <arigo@tunes.org> wrote:
Hi Hakan,
On Fri, Dec 12, 2008 at 04:49:17PM +0100, Hakan Ardo wrote:
How about instead doing:
(...)
Ah, using 'notify' to force a reflow. Obscure :-/
OK, what's the intended use of the notify feature?

The reflow is happening with the previous solution as well. Presumably because p.simple_call(s_index) gets the getitem operation registered as a call site of the __getitem__ method? Maybe a better solution is to register as a call site of the helper? The following (from rpython/controllerentry.py) seems to do the trick:

    def call_helper(func,s_args):
        bk = getbookkeeper()
        s_callable = bk.immutablevalue(func)
        return bk.emulate_pbc_call(bk.position_key, s_callable, s_args,
                                   callback = bk.position_key)

At http://hakan.ardoe.net/pypy/ there is now an implementation of the __add__/__radd__ combination in getsetitem_support.py that calls the correct method in all cases I could come up with (test_add.py). It cannot yet handle methods that return NotImplemented. Would it be possible to handle that in a similar manner to how None is handled? That would remove all unneeded tests if the annotator can prove that a call will always/never return NotImplemented, right?

-- Håkan Ardö
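For reference, the run-time behaviour that a NotImplemented-aware helper would have to reproduce looks roughly like this in ordinary Python. It is a simplified sketch: `binary_add` and `Meters` are hypothetical names, and the sketch ignores the rules that give a subclass's reflected method priority and that skip the reflected call when both operands have the same type.

```python
def binary_add(a, b):
    """Simplified model of how Python resolves a + b."""
    add = getattr(type(a), '__add__', None)
    if add is not None:
        result = add(a, b)
        if result is not NotImplemented:
            return result
    # Left operand declined (or has no __add__): try the reflected method.
    radd = getattr(type(b), '__radd__', None)
    if radd is not None:
        result = radd(b, a)
        if result is not NotImplemented:
            return result
    raise TypeError('unsupported operand types for +')

class Meters(object):
    def __init__(self, value):
        self.value = value
    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        return NotImplemented
    def __radd__(self, other):
        if isinstance(other, int):
            return Meters(other + self.value)
        return NotImplemented

total = binary_add(Meters(1), Meters(2))   # handled by Meters.__add__
shifted = binary_add(5, Meters(2))         # int declines, Meters.__radd__ answers
```

Each `is not NotImplemented` check is one of the tests the question hopes the annotator could prove redundant for a given pair of operand types.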
Hi Hakan, On Mon, Dec 15, 2008 at 08:47:26PM +0100, Hakan Ardo wrote:
cannot yet handle that the methods return NotImplemented. Would it be possible to handle that in a similar manner to how None is handled?
Not easily. The annotation framework of PyPy was never meant to handle the full Python language, but only a subset reasonable for writing interpreters. Anyway, None-or-integer is not supported either, simply because there is no way to represent that in a single machine word. A bientot, Armin.
On Tue, Dec 23, 2008 at 12:23, Armin Rigo <arigo@tunes.org> wrote:
Hi Hakan,
On Mon, Dec 15, 2008 at 08:47:26PM +0100, Hakan Ardo wrote:
cannot yet handle that the methods return NotImplemented. Would it be possible to handle that in a similar manner to how None is handled?
Not easily. The annotation framework of PyPy was never meant to handle the full Python language, but only a subset reasonable for writing interpreters. Anyway, None-or-integer is not supported either, simply because there is no way to represent that in a single machine word.
There are at least two ways, once you have a singleton (maybe static) None object around:
- box all integers and use only pointers - the slow one;
- tagged integers/pointers that you already use elsewhere. So integers of up to 31/63 bits get represented directly, while the other ones are through pointers.

-- Paolo Giarrusso
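The tagging option can be sketched with plain integer arithmetic. The scheme below (low bit set for integers, aligned pointers keeping the low bit clear, word 0 standing in for None) is one common convention chosen for illustration; the constants and function names are assumptions, not a description of what PyPy does.

```python
WORD_BITS = 64
WORD_MASK = (1 << WORD_BITS) - 1
NONE_WORD = 0  # assumed: an aligned pointer value with the low bit clear

def tag_int(n):
    """Encode a 63-bit signed integer as an odd machine word."""
    limit = 1 << (WORD_BITS - 2)
    if not -limit <= n < limit:
        raise OverflowError('value needs a boxed representation')
    return ((n << 1) | 1) & WORD_MASK

def is_tagged_int(word):
    """Integers have the low bit set; pointers (and NONE_WORD) do not."""
    return (word & 1) == 1

def untag_int(word):
    """Reinterpret the word as signed, then shift the tag bit away."""
    if word >= 1 << (WORD_BITS - 1):
        word -= 1 << WORD_BITS
    return word >> 1

words = [tag_int(5), tag_int(-3), NONE_WORD]
decoded = [untag_int(w) if is_tagged_int(w) else None for w in words]
```

Every use of a tagged value pays for a bit test and a shift, which is the cost at stake when the same idea is considered for low-level integers.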
Hi Paolo, On Tue, Dec 23, 2008 at 12:29:01PM +0100, Paolo Giarrusso wrote:
There are at least two ways, once you have a singleton (maybe static) None object around:
- box all integers and use only pointers - the slow one;
- tagged integers/pointers that you already use elsewhere. So integers of up to 31/63 bits get represented directly, while the other ones are through pointers.
Yes, we're using both ways, but for app-level integers, not for regular RPython-level integers. That would be a major slow-down. A bientot, Armin.
Paolo Giarrusso wrote:
There are at least two ways, once you have a singleton (maybe static) None object around:
- box all integers and use only pointers - the slow one;
- tagged integers/pointers that you already use elsewhere. So integers of up to 31/63 bits get represented directly, while the other ones are through pointers.
I think you are confusing levels: here we are talking about RPython, i.e. the language which our Python interpreter is implemented in. Hence, RPython ints are really like C ints, and you don't want to manipulate C ints as tagged pointers, do you?

ciao, Anto
On Tue, Dec 23, 2008 at 15:59, Antonio Cuni <anto.cuni@gmail.com> wrote:
Paolo Giarrusso wrote:
There are at least two ways, once you have a singleton (maybe static) None object around:
- box all integers and use only pointers - the slow one;
- tagged integers/pointers that you already use elsewhere. So integers of up to 31/63 bits get represented directly, while the other ones are through pointers.
I think you are confusing levels: here we are talking about RPython, i.e. the language which our Python interpreter is implemented in. Hence, RPython ints are really like C ints, and you don't want to manipulate C ints as tagged pointers, do you?
I understood the difference, but the statement "there's no way to represent both of them in a machine word" prompted me to write something. Actually, I was thinking of just the return convention of __add__ and __radd__. If those methods start returning NotImplemented or None, any _sound_ static type analysis won't assign type "int" to them, so it looks (to me, who am ignorant of the details of RPython, I'm aware of that) as if it may be possible to do this without tagging _all_ integers. And there are examples of compiled languages with tagged integers (I know at least of OCaml).

But can you currently live in RPython without anything which could be a pointer or an integer? Can you have a list like [1, None] in RPython?

Then I wonder how you get a homogeneous call interface for all __add__ methods (i.e. how to force the one returning just integers to also have type NotImplementedOrInteger). And I also wonder if the RPython compiler can inline the __add__ call and optimize the tagging away.

That said, I do not know if what I'm suggesting is implementable in RPython, or if it would be a good idea. Just my 2 cents, since this might be what Hakan is looking for.

Regards
-- Paolo Giarrusso
participants (5)
- Antonio Cuni
- Armin Rigo
- Hakan Ardo
- Paolo Giarrusso
- Simon Burton