
The attached script is an attempt to get __str__ to work in RPython. After annotation, it looks for str(someinstance) and rewrites the block to call __str__. The modified blocks are then fed back into the annotator. It seems to work. (And now I know a lot more about the internals of the translation process...)

I had a think about overriding consider_op_str. It would seem natural for this code to go there, but the annotator is not really set up to handle rewriting.

Why do I not need to manually call the flow operation on the __str__? (See the flow_str function.) It seems the annotator must somehow re-trigger the flowing operation when it finds that __str__ is being called.

thanks, Simon.
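
To make the rewriting step concrete, here is a minimal toy sketch of the idea. It is not the attached script and does not use PyPy's real flow-graph classes: `Op`, `rewrite_str_calls` and the `_meth` temporary name are all hypothetical, and the set of variables with a custom __str__ is passed in by hand, where the real pass would get it from the annotator.

```python
from collections import namedtuple

# Toy stand-in for a flow-graph operation: result = opname(*args).
# This only mimics the general shape; it is NOT PyPy's own class.
Op = namedtuple("Op", "opname args result")


def rewrite_str_calls(block, has_custom_str):
    """Return a new operation list in which str(v), for every variable v
    listed in has_custom_str, becomes a getattr of '__str__' plus a call."""
    new_ops = []
    for op in block:
        if op.opname == "str" and op.args[0] in has_custom_str:
            bound = op.result + "_meth"  # hypothetical temporary variable
            new_ops.append(Op("getattr", (op.args[0], "__str__"), bound))
            new_ops.append(Op("simple_call", (bound,), op.result))
        else:
            # Anything else is left untouched.
            new_ops.append(op)
    return new_ops


block = [Op("str", ("v0",), "v1"), Op("str", ("v2",), "v3")]
rewritten = rewrite_str_calls(block, has_custom_str={"v0"})
for op in rewritten:
    print(op)
```

Only the operation on `v0` is expanded; the `str` on `v2` stays as it was, which is the "direct calls only" behaviour discussed in the replies.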

On 26.03.2007, at 23:30, Simon Burton wrote:
This looks good, congrats! I had a different approach in mind which would work without explicitly adding a pass to the annotator; I have no idea which is more practical. What I wanted to do is a tiny patch to the flow space that allows adding little plugins for extension. My plugin would intercept things like str(...) during flowing and rewrite accordingly. This is quite similar to what you did, but does things earlier.

I'm not sure about this yet, but I think that doing these things after annotation might cause trouble with annotating certain things, especially when I think of supporting __add__ and friends. It might be necessary to do this expansion earlier to make annotation work. As I'm writing this, I'm now pretty sure that this is true.

Conclusion (should think before writing): my idea (and plan) in general is to add an interceptor plugin to the flow space (or a subclass, trying to make only tiny additions that don't break anything). Then all special methods are checked for all operations and the code is inserted, like a preprocessor during flowing.

cheers - chris

On 28.03.2007, at 20:19, Simon Burton wrote:
Yes you are right. Well maybe not so right. Is there any object that has no __str__ method? If yes, then this is an RPython syntax error. I think we can *always* call the object's __str__ method if it is RPython code. What needs to be added is a handler routine. It must be registered as a function that specializes on the type of its argument. This way it will be the right __str__ call for every type. ciao -- chris
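
As a rough plain-Python analogy for "a function that specializes on the type of its argument", the sketch below uses `functools.singledispatch`. This is an assumption made for illustration only: RPython's specialization happens at translation time and works differently, and the names `to_str` and `Point` are made up.

```python
from functools import singledispatch


@singledispatch
def to_str(obj):
    # Fallback handler for types with nothing registered.
    return "<instance>"


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


@to_str.register(int)
def _(obj):
    return "int:%d" % obj


@to_str.register(Point)
def _(obj):
    return "Point(%d, %d)" % (obj.x, obj.y)


print(to_str(3))
print(to_str(Point(1, 2)))
print(to_str(2.5))
```

Each argument type gets its own handler, and types without one fall through to the generic version.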

Hi Simon, On Mon, Mar 26, 2007 at 02:30:15PM -0700, Simon Burton wrote:
The "official" approach is quite different. It would involve a consider_op_str() on SomeInstance, as you also thought. It is in some sense harder, but more robust - I certainly wouldn't be happy to check in code in PyPy that adds a rewriting pass in the middle of annotation... For example, your approach only supports direct 'str(x)' calls, which is somehow the easy case - because they can be manually replaced by 'x.__str__()' in the source code anyway - but not indirect cases like 'str([x, y, z])' where the x, y and z have a custom __str__() method. To do this properly you need a consider_op_str() using bookkeeper.emulate_pbc_call(), a lot of patience understanding what rpbc.py is doing, and probably a call to hlinvoke() in the ll_str() of rclass.py... and then the same for the oo type system, if you want to be complete. Argh. All in all I'll stick to the point of view that adding support for special methods in RPython is a very dangerous direction to go: where do you stop? Is __add__() RPython? Is the full logic of __add__() versus __radd__() RPython? A bientot, Armin.

On 29.03.2007, at 18:00, Armin Rigo wrote:
Agreed, that's the hard but complete way.
I think the idea was exactly to only support the simple cases, where you can manually use x.__str__. I would certainly not go down the path of adding full support for these things to the annotator. Instead, I would just expand things in a way that supports the simple cases, and not as an addition to PyPy's core, but as an add-on.
Simple, just a little nicer. I would not support __radd__ at all, but just __add__, and always enforce the same types. This idea is really not about a huge extension, but about some simple optional additions that let code look a little more like Python. In short: anything that needs to seriously change the annotator should not be considered. Some syntactic sugar does not hurt. You think even that makes no sense, right? At least it should not hurt... ciao - chris
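
A minimal sketch of what "just __add__, always the same types" could mean when written out by hand. `expand_add` and `Vec` are hypothetical names, and the check is done at run time here only because this is plain Python; in the proposal it would be enforced during translation.

```python
class Vec:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        return Vec(self.x + other.x, self.y + other.y)


def expand_add(a, b):
    """Hypothetical expansion of 'a + b': call a.__add__(b) directly,
    never consult __radd__, and reject operands of different types."""
    if type(a) is not type(b):
        raise TypeError("operands must have the same type")
    return a.__add__(b)


total = expand_add(Vec(1, 2), Vec(3, 4))
print(total.x, total.y)
```

With identical operand types there is no question of which method to try first, which is what keeps this restricted form simple.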

Hi Christian, On Thu, Mar 29, 2007 at 07:16:36PM +0200, Christian Tismer wrote:
I think it hurts because it's obscure to describe: "you can write str(x) and x.__str__() will be called, but if you write str([x]) then x.__str__() will not be called"... I'm always open to the possibility that there are use cases where such a hack would be enough, though. A bientot, Armin

Hi Michael, On Thu, Mar 29, 2007 at 06:26:37PM +0100, Michael Hudson wrote:
Well, if you write str([x]) then x.__str__() will certainly not be called :-)
Ah right. Then '%s' % (x,). So far the rtyper has no notion of the "repr" of an object being something other than its "str", so str([x]) is really the same as '[' + str(x) + ']'. All this stuff would need clean-ups before we go further in the direction of fully supporting custom __str__() methods... A bientot, Armin.
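
For reference, this is how regular Python (not RPython) behaves in the three cases mentioned; the class `Tag` is made up for the illustration.

```python
class Tag:
    def __str__(self):
        return "str-form"

    def __repr__(self):
        return "repr-form"


x = Tag()
direct = str(x)           # calls x.__str__()
in_list = str([x])        # a list formats its items with repr()
formatted = "%s" % (x,)   # %s applies str() to the argument

print(direct)
print(in_list)
print(formatted)
```

So in Python itself str([x]) goes through __repr__, while '%s' % (x,) is an indirect case that does reach __str__.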

On Thu, 29 Mar 2007 18:00:47 +0200 Armin Rigo <arigo@tunes.org> wrote:
Yes, it does seem to lack delicacy.
Yes, I see. String comprehensions don't work either.
Well, thanks for the keywords. I already understand the codebase much better from hacking around with this, so it's not entirely a waste of time if you end up saying "nah that sucks". :)
What is your concern here? Does it screw up the JIT, or some other aspect I am missing? I guess the full __add__ / __radd__ semantics are a little tricky, and implementing them statically would likely (at least if I tried) produce the kind of brutal code that is perhaps difficult to maintain. However, just having __str__ support would be really handy. bye for now, Simon.

Hi Simon, On Thu, Mar 29, 2007 at 10:18:06AM -0700, Simon Burton wrote:
What is your concern here ? Does it screw up the JIT, or some other aspect I am missing ?
No, just the obscurity of these methods: the full Python __add__/__radd__ semantics are more than a little tricky. They are impossible to implement statically, because in order to know which one to call first you need to know the two exact subclasses of the arguments, which you only know at run-time. The RPython approach so far has at least a clear message: no special methods, apart from __init__() and __del__(). I'm not against adding a few of them, to be honest; e.g. __getitem__() would be my favorite. But then they should be fully implemented. For example, I just realized that without the full rtyper solution, your patch can work for str(x) but not for '%s' % (x,), which looks rather inconsistent. A bientot, Armin.
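
The run-time dependence can be seen in regular Python with a small example (the class names are made up). The same expression `a + b` dispatches to a different method depending on the exact classes of both operands.

```python
class Base:
    def __add__(self, other):
        return "Base.__add__"

    def __radd__(self, other):
        return "Base.__radd__"


class Derived(Base):
    # Overriding the reflected method in a subclass gives it priority
    # when a Derived instance is the right-hand operand of a Base.
    def __radd__(self, other):
        return "Derived.__radd__"


def add(a, b):
    return a + b


same = add(Base(), Base())
mixed = add(Base(), Derived())
print(same)
print(mixed)
```

A static analysis that only knows both arguments are "some Base" cannot tell these two calls apart.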

On Thu, 29 Mar 2007 18:00:47 +0200 Armin Rigo <arigo@tunes.org> wrote:
I am very curious about this suggestively named hlinvoke. It seems that if we can get the ll_str of rclass to call the low-level-ized version of the __str__ method then we are done. Is this the idea behind hlinvoke? (... looking now at test_rpbc...) Simon.

Hi Simon, On Thu, Mar 29, 2007 at 06:37:44PM -0700, Simon Burton wrote:
Yes. It's used to call back RPython functions from low-level helpers, where it's normally not possible. It's used by objectmodel.rdict() to call the RPython functions that implement the custom key equality and hash. See lltypesystem/rdict.py for an example... You don't get method dispatch for free - you have to call an RPython function, not a method, so you'd need to manually add in the vtable of the class a field for the low-level function pointer to the RPython function. Note also that hlinvoke() is not yet implemented for the ootypesystem. A bientot, Armin.
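
The vtable remark can be pictured with a toy model in plain Python. Nothing here is PyPy code: `VTable`, `Instance`, the `str_func` slot and this `ll_str` are hypothetical stand-ins, and the indirect call through the slot only marks the place where hlinvoke() would be involved in the real rtyper.

```python
class VTable:
    """Toy per-class table. str_func is a hypothetical slot that holds a
    plain function (not a method) taking the instance as its argument."""

    def __init__(self, name, parent=None, str_func=None):
        self.name = name
        # A class without its own function reuses its parent's slot.
        if str_func is None and parent is not None:
            str_func = parent.str_func
        self.str_func = str_func


class Instance:
    def __init__(self, vtable, **fields):
        self.vtable = vtable
        self.fields = fields


def default_str(inst):
    return "<%s object>" % inst.vtable.name


def point_str(inst):
    return "Point(%d, %d)" % (inst.fields["x"], inst.fields["y"])


def ll_str(inst):
    # Stand-in for the low-level helper: dispatch goes through the slot.
    return inst.vtable.str_func(inst)


OBJECT = VTable("object", str_func=default_str)
POINT = VTable("Point", parent=OBJECT, str_func=point_str)
POINT3D = VTable("Point3D", parent=POINT)  # inherits point_str
PLAIN = VTable("Plain", parent=OBJECT)     # inherits default_str

print(ll_str(Instance(POINT3D, x=1, y=2)))
print(ll_str(Instance(PLAIN)))
```

The dispatch has to be built by hand through the extra field, which is the "you don't get method dispatch for free" part.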

participants (4): Armin Rigo, Christian Tismer, Michael Hudson, Simon Burton