It turns out there is some work in progress in the Spark project to share its memory with non-JVM programs. See https://issues.apache.org/jira/browse/SPARK-10399. Once this is completed it should be fairly trivial to expose it to Python, and JIT integration could perhaps be discussed at that time. This is a huge step forward over sharing Java objects. From the title of the ticket it appears it would be a C++ interface, but looking at the pull request it looks like it will be a C interface.
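
To make the "no copying" point concrete, here is a rough sketch of what zero-copy access looks like from the Python side. It uses only the standard library (mmap, struct, memoryview) as a stand-in; it is not the SPARK-10399 interface, which is not finished, and the buffer here is an anonymous mapping I made up for illustration rather than memory owned by Spark.

```python
import mmap
import struct

# Hypothetical stand-in for a region owned by another runtime:
# an anonymous mapping big enough for four C doubles.
N = 4
buf = mmap.mmap(-1, N * 8)

# The "producer" side writes four doubles straight into the region.
struct.pack_into("4d", buf, 0, 1.0, 2.0, 3.0, 4.0)

# The "consumer" side wraps the same bytes as doubles; nothing is
# copied or deserialized here.
base = memoryview(buf)
view = base.cast("d")
total = sum(view)

# A write through the view lands in the underlying region.
view[0] = 10.0
first = struct.unpack_from("d", buf, 0)[0]

# Release the views explicitly before closing the mapping.
view.release()
base.release()
buf.close()
```

With a real C interface the mapping would come from the other process or runtime instead of mmap.mmap(-1, ...), but the Python side would ideally look about this simple.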

In the end the blocker may just come down to PyPy having complete support for NumPy. Without NumPy the success of this would be limited, given what users expect, and without PyPy it may be too slow for many applications.

On Thu, Mar 24, 2016 at 1:11 PM, John Camara <john.m.camara@gmail.com> wrote:
Hi Armin,

At a minimum, tighter execution is required, as well as shared memory. On the other hand, you have raised the bar so high with cffi, with its clean and unbloated interface, that it would be nice if a library in a similar spirit existed for Java. Support in PyPy's JIT for removing all the marshalling of types would be a big plus on top of the shared memory, and some integration between the two GCs would likely be required.

Maybe the best approach would be a combination of existing libraries and a new interface that allows memory to be shared, perhaps something similar to NumPy arrays but with a better API that avoids the pitfalls of NumPy relying on CPython semantics and implementation details. After all, the only thing that needs to be eliminated is the copying and serialization of large data arrays and structures.

John