[pypy-dev] connecting multiple interpreters (was: Re: change of strategy for the py3k branch?)

Thu May 31 23:18:09 CEST 2012

On Thu, May 31, 2012 at 11:27 +0200, Martijn Faassen wrote:
> On Thu, May 31, 2012 at 11:12 AM, holger krekel <holger at merlinux.eu> wrote:
> > On Wed, May 30, 2012 at 19:24 +0200, Martijn Faassen wrote:
> >> Just throwing in my little bit: any change that is made that would
> >> make it easier to run Python 2 and Python 3 interpreters in the same
> >> process would be interesting, as I'm still vaguely dreaming (nothing
> >> more) of a combined interpreter that can run both Python 2 and Python
> >> 3 code.
> >
> > Is there a strong reason you want this in the same process?
> >
> > If not you might look into using execnet [1] for connecting python2 and
> > python3 interpreters which then run in two separate processes.  One can
> > build something higher level on top of the base execnet communication
> > along the lines of your "python3_import" suggestion.  It seems you anyway
> > need largely disconnected interpreter states.
> 
> That's an interesting idea. I don't think it would accomplish exactly
> the same goals though.
> 
> I can see two reasons not to do so:
> 
> * developer simplicity: you just start up your Python 3 as usual, you
> can now use Python 2 modules in your project. No need to come up with
> a networked system.

not sure i understand what you mean by "networked" system here.  With

    gw = execnet.makegateway("popen//python=python3")

a subprocess running python3 is created.  It's true that the
channel send/receive API uses a network metaphor, but underneath it is
plain process-to-process communication, no network involved.

> * efficiency: if you do a lot of calls using a Python 2 library in a
> Python 3 project, you'd like this to work pretty quickly. I think
> using in-process proxies can be made to be quite inexpensive.

it all depends i guess.  It seems that for Quora it was fast enough
to call from PyPy into several libraries deployed on CPython.

Moreover, if you need to munge data coming out of library function calls,
the proxy approach may require a lot of communication between
the two interpreters, and even if this happens in-process it is overhead.
With execnet you can execute the munging code in the interpreter
running the library and only send back the result you need.  (On a
side note, with a proxy approach you also need to carefully handle
lifecycle/GC issues for out-of-interpreter references).
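
To make the "munge on the other side, send back only the result" idea
concrete, here is a minimal sketch.  It does not use execnet; it is a
stdlib-only stand-in built on `subprocess` and JSON over pipes, and the
names `WORKER_SOURCE` and `remote_sum_of_even_squares` are made up for
illustration.  The list built in the worker stands in for whatever a
real library call would return.

```python
import json
import subprocess
import sys

# Source code shipped to the child interpreter. It reads one JSON request
# from stdin, does the munging locally, and writes back a small summary.
WORKER_SOURCE = r"""
import json, sys
request = json.loads(sys.stdin.readline())
# Stand-in for a large intermediate result produced by a library call.
rows = [{"id": i, "value": i * i} for i in range(request["n"])]
# Munge here so that only the summary crosses the process boundary.
total = sum(row["value"] for row in rows if row["id"] % 2 == 0)
sys.stdout.write(json.dumps({"total": total}) + "\n")
"""


def remote_sum_of_even_squares(n, python=sys.executable):
    """Run WORKER_SOURCE in a child interpreter and return its summary.

    `python` is the interpreter to start; it defaults to the current one,
    but could name a different executable on PATH (an assumption about
    the local setup).
    """
    proc = subprocess.run(
        [python, "-c", WORKER_SOURCE],
        input=json.dumps({"n": n}) + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(proc.stdout)["total"]


if __name__ == "__main__":
    print(remote_sum_of_even_squares(10))
```

Only the request and the one-number summary travel over the pipes; the
intermediate list never leaves the child process, and no socket is
opened at any point.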

> I think a networked approach is useful if you want communicating
> applications, but I'm talking more about a single application that
> uses libraries that might be written in another language. That is why
> I like the FFI analogy; when you interface with a C library from
> Python you generally also wouldn't want to use a networked approach.
> You *would* do this to interface with a networked application written
> in C. To use libxml2 in Python I'd use lxml. If there's already a
> C-based web service that uses libxml2 I'd use that service, but that's
> a different situation. I wouldn't want to have to *have* to do this
> just to be able to use libxml2.
> 
> I think execnet blurs the line between library and application
> integration somewhat, as it allows very intimately communicating
> applications, but isn't the line still there?

There is a line, sure.  It is blurred because one side can send
code to the other.  This makes a difference in much the same way that
sending JavaScript to the client does: it reduces communication
overhead and makes things faster on the client side.

To conclude, i wouldn't be overly concerned by process-to-subprocess
communication costs.  If i had to combine py3 and py2 code (throw
PyPy in to your liking) i'd go down the Quora route and see how far
it carries.  After all, this is only an intermediate solution until
everything happily runs on Python 3 anyway, right? ;)

best,
holger