The JVM backend and Jython

Hi all,
It was nice meeting up with many of you at PyCon! I've been thinking about the first steps towards collaboration between the Jython project and the PyPy project. It looks like it isn't going to be too long before we are all (CPython, PyPy, IronPython, Jython, etc.) working on a single shared repository for all of our standard library .py code.
In my ideal world there would come a day when there is also no standalone Java code in the Jython project: that is, the shared standard library would contain all of Jython's .py files, all of the Java would be generated from PyPy, and Jython as a standalone project would disappear. It is possible that this is too ambitious, but big goals are more fun, right? In reality, even if this were to get going, I imagine it would be a 10+ year plan :)
So to my question - just how broken is the JVM backend? Are there workarounds that would allow the Java code to get generated? I ask because I would like to evaluate the generated Java parser as a potential replacement for our current ANTLR-based parser. It seems like a nice baby step towards real collaboration, since it looks like a relatively easy place to start. Clearly it would need adjustments to actually work for Jython - but I'd be able to look into that part. I don't think I have the time to try to unbreak the translation though...
-Frank

Hi Frank, On 30/03/11 04:40, fwierzbicki@gmail.com wrote:
wow, that's definitely a nice (and big) plan :-)
So to my question - just how broken is the JVM backend? Are there workarounds that would allow the Java code to get generated?
"not much broken". Last time I tried, the only broken thing was "virtualrefs", which is something needed for the JIT but which at the moment is not supported at all by ootype, and thus it blocks the translation. However, I think that fixing it is probably very easy: there is a branch for this which was started by Ademan: https://bitbucket.org/pypy/pypy/src/ootype-virtualrefs Dan, do you plan to finish the work on it? Otherwise, I can probably just do it.
I have to warn you that at the moment you cannot invoke any Java code from RPython. Implementing it has been on my todo list for years now :-(, but I never managed to find the time and the motivation to do it. However, for using the PyPy parser inside Jython it should be enough to go the other way around, i.e. call RPython code from Java, which should be possible. ciao, Anto

On Wed, Mar 30, 2011 at 2:37 PM, fwierzbicki@gmail.com <fwierzbicki@gmail.com> wrote:
IIRC the JVM backend generates Java bytecode directly in text form for a Java assembler (I forgot its name). Maybe a first step would be to see if there is any way to import the resulting .class files back into a Java program. -- Leonardo Santagada

On 30/03/11 19:37, fwierzbicki@gmail.com wrote:
yes, I think it makes sense. Actually, as Leonardo says, we don't generate Java source code but assembler, which is converted to .class files by jasmin. However, that should not change anything.
that would be extremely cool :-) Ok, so if Ademan tells me that he's not going to work on the ootype-virtualref branch, I'll try to finish the work so you can start playing with it. ciao, Anto

On 31/03/11 21:57, Maciej Fijalkowski wrote:
well, no. Virtualrefs were introduced for the JIT, but they also need to be supported by normal backends. This is why translation is broken at the moment. It is true that the implementation is straightforward, though (I suppose this is what you meant originally :-)) ciao, Anto

On Mar 30, 2011, at 12:18 AM, Antonio Cuni wrote:
FYI there's an interesting solution on how to call into arbitrary Java code from an invokedynamic-enabled language via Attila Szegedi's somewhat experimental Meta Object Protocol: https://github.com/szegedi/dynalink
Basically, with Java 7 invokedynamic, a dynamic language invocation instruction is something along the lines of:
obj.someattr = someobj -> invokedynamic "__setattr__"(Ljava/lang/String;Ljava/lang/Object;I)V;
which might dispatch to a PyObject.__setattr__(String name, PyObject value) method (or easily something completely different in invokedynamic land).
The MOP adds another layer between the invocation and the call site. So as the language implementor you'd use the MOP library to 'relink' your __setattr__ call site to the meta object protocol's more generic version (it only supports a few features, but one of them is generic property access). Then you can do the invocation via the MOP, something like:
invokedynamic "dyn:setProp:someattr"(Ljava/lang/Object;I)V;
The point being that other JVM languages will eventually support the MOP, and then you'd get property access, invocation, etc. to those languages for free. More importantly, out of the box the MOP lib implements the protocol for plain old Java objects. If you're already using invokedynamic, the library seems simple to hook into and there's basically no call overhead added.
I'm not sure this would even be applicable to RPython, as it's more static in nature. But it will certainly help in calling Java from regular Python.
-- Philip Jenvey
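To make the relinking idea above a bit more concrete, here is a toy model in plain Python. It is only a sketch of the concept: it uses neither real invokedynamic nor the dynalink API, and every name in it (CallSite, PyObj, PlainObject, language_setattr, link_generic) is hypothetical and invented for illustration. The only thing taken from the message is the shape of the operation string, "dyn:setProp:someattr".

```python
class CallSite:
    """Toy stand-in for a call site: it just holds a replaceable target."""

    def __init__(self, target):
        self.target = target

    def relink(self, target):
        # Swap the target; later invocations go to the new one.
        self.target = target

    def invoke(self, *args):
        return self.target(*args)


class PyObj:
    """Hypothetical language-level object with its own setattr protocol."""

    def __init__(self):
        self.attrs = {}

    def py_setattr(self, name, value):
        self.attrs[name] = value


class PlainObject:
    """Stands in for a 'plain old' object that knows nothing about PyObj."""


def language_setattr(receiver, name, value):
    # Language-specific target: the attribute name travels as an argument.
    receiver.py_setattr(name, value)


def link_generic(operation):
    """Hypothetical linker: turn 'dyn:setProp:<name>' into a property setter."""
    namespace, action, prop = operation.split(":")
    if (namespace, action) != ("dyn", "setProp"):
        raise ValueError("unsupported operation: %r" % operation)

    def set_prop(receiver, value):
        # The name is baked into the call site, so it is no longer an argument.
        if hasattr(receiver, "py_setattr"):
            receiver.py_setattr(prop, value)
        else:
            setattr(receiver, prop, value)

    return set_prop


if __name__ == "__main__":
    site = CallSite(language_setattr)
    obj = PyObj()
    site.invoke(obj, "someattr", 1)

    # Relink to the generic version; it now also accepts plain objects.
    site.relink(link_generic("dyn:setProp:someattr"))
    plain = PlainObject()
    site.invoke(plain, 2)
    print(obj.attrs, plain.someattr)
```

Note how the argument list shrinks after relinking, which mirrors the change between the two descriptors quoted in the message: the property name moves out of the arguments and into the operation string.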

fwierzbicki@gmail.com, 30.03.2011 04:40:
On a somewhat related note, the Cython project is pushing towards reimplementing parts of CPython's stdlib C modules in Cython. That would make it easier for other projects to use the implementation in one way or another, rather than having to reimplement and maintain it separately by following the C code.
http://thread.gmane.org/gmane.comp.python.devel/122273/focus=122716
The advantage for other-than-CPython Pythons obviously depends on the module. If it's just implemented in C for performance reasons (e.g. itertools), it would likely end up as a Python module with additional static typing, which would make it easy to adapt. If it's using lots of stuff from libc and C I/O, or even from external libraries, the code itself would obviously be less useful, although it would likely still be easier to port changes/fixes. Stefan

Hi Frank, On 30/03/11 04:40, fwierzbicki@gmail.com wrote: [cut]
So to my question - just how broken is the JVM backend? Are there workarounds that would allow the Java code to get generated?
so, now the jvm (and cli) translation works again. You can just type ./translate.py -b jvm, and then fish the relevant .class/.j files out of /tmp/usession-default-*/pypy. ciao, Anto
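For anyone who wants to script the "fishing" step, a small helper along these lines might do. It is a minimal sketch that assumes only what the message says, namely that the generated .class and .j files end up somewhere under /tmp/usession-default-*/pypy; the function name find_generated_files is made up here, and the exact layout below that directory is not assumed.

```python
import glob
import os


def find_generated_files(pattern="/tmp/usession-default-*/pypy"):
    """Collect .class and .j files under every directory matching pattern."""
    found = {".class": [], ".j": []}
    for session_dir in sorted(glob.glob(pattern)):
        # Walk the whole tree, since the nesting below the directory is unknown.
        for root, _dirs, files in os.walk(session_dir):
            for name in sorted(files):
                ext = os.path.splitext(name)[1]
                if ext in found:
                    found[ext].append(os.path.join(root, name))
    return found


if __name__ == "__main__":
    for ext, paths in find_generated_files().items():
        print("%d %s files" % (len(paths), ext))
```

If no translation has been run, the glob simply matches nothing and both lists stay empty.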

Participants (6): Antonio Cuni, fwierzbicki@gmail.com, Leonardo Santagada, Maciej Fijalkowski, Philip Jenvey, Stefan Behnel