[Python-Dev] Free threading

Guido van Rossum guido@python.org
Wed, 08 Aug 2001 00:09:55 -0400


[me]
> > Anyway, I don't see the point: if you want two 100% independent
> > interpreters, you can get that easily by forking two separate
> > processes.  That's what separate processes are *for*...

[PP]
> I am 95% in agreement with you and that's what I told the Perl guys.
> That's why I see their "ithreads" solution as a little weird. But the
> other 5% is when you are running in an already-threaded environment and
> it is only the Python stuff that you want to be separate (for
> performance). e.g. Apache, IIS, Sendmail, COM etc.

Yeah, but it's just not worth rewriting all of Python to get rid of
the GIL for *that*.  There are plenty of other ways to deal with it,
including pre-forked longer-running processes -- and remember, it's
only necessary if you need more than one CPU just for running your
Python code!  (E.g. Zope, which has a perfectly good Zope-specific
solution if you need multiple CPUs: ZEO.)

> Also, Windows doesn't have "fork" so spawning a process means actually
> running an executable from disk AFAIK. That's somewhat inefficient. But
> I'll repeat that in the big picture I agree with you. The combination of
> shared-data threads and independent-data processes is good enough for
> most real-world apps.

> >...
> > I bet the level of sharing exposed here can be implemented easily on
> > top of the mmap module.
> 
> Please excuse my ignorance: how do you do locking in an mmap solution?

Good question.  I've never used mmap myself. :-)  I know Unix shared
memory has locks and semaphores; the mmap module apparently doesn't
(possibly because Windows has a different philosophy there).

I would probably hide mmap and all other gory details in a module
whose interface is similar to that of Queue.  It could limit the data
type for items stored in it to strings, or it could use marshal or
pickle (or even XML :-) for more flexibility.
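
A rough sketch of what such a module might look like, written against
present-day Python 3.  Everything here is an assumption for
illustration: the name MmapQueue, the fixed region size, and the
record layout (a two-integer header followed by length-prefixed
pickles) are all hypothetical.  It deliberately leaves out the locking
asked about above, so it is only safe with one reader and one writer
that take turns.

```python
import mmap
import pickle
import struct

HEADER = struct.Struct("<II")  # read offset, write offset
LENGTH = struct.Struct("<I")   # size prefix stored before each pickled item


class MmapQueue:
    """Hypothetical FIFO of pickled items kept in an anonymous mmap region.

    No locking is done here; real cross-process use would need a lock
    or semaphore around put() and get().
    """

    def __init__(self, size=4096):
        self.buf = mmap.mmap(-1, size)  # anonymous mapping, not backed by a file
        self.size = size
        self._set_offsets(HEADER.size, HEADER.size)

    def _offsets(self):
        return HEADER.unpack_from(self.buf, 0)

    def _set_offsets(self, read, write):
        HEADER.pack_into(self.buf, 0, read, write)

    def empty(self):
        read, write = self._offsets()
        return read == write

    def put(self, item):
        data = pickle.dumps(item)
        read, write = self._offsets()
        end = write + LENGTH.size + len(data)
        if end > self.size:
            raise OverflowError("queue region is full")
        LENGTH.pack_into(self.buf, write, len(data))
        self.buf[write + LENGTH.size:end] = data
        self._set_offsets(read, end)

    def get(self):
        read, write = self._offsets()
        if read == write:
            raise IndexError("queue is empty")
        (n,) = LENGTH.unpack_from(self.buf, read)
        start = read + LENGTH.size
        item = pickle.loads(self.buf[start:start + n])
        read = start + n
        if read == write:
            # Queue drained: rewind both offsets to reuse the region.
            read = write = HEADER.size
        self._set_offsets(read, write)
        return item
```

Restricting items to strings would drop the pickle calls; swapping in
marshal would only change the two serialization lines.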

> >...
> > That sounds to me a quicker road to an efficient SMP solution than
> > trying to get rid of the GIL.
> 
> That's why I mentioned it. For whatever reason, the GIL project seems to
> only be doable by Greg S. (or he's the only one with interest?) so I
> thought a new approach might catch someone's interest.

IMO there's nothing about Greg here.  He did it once, that's all. :-)

> Does your mild statement of approval indicate that you see the benefit
> of having independent in-process interpreters?

My approval of what?  I like the independence of in-process
interpreters as far as it goes (and I would like to see the exception
bug fixed) but I don't think we need something more independent.  I
think I was arguing that for full independence from the GIL you should
fork a new process (Windows notwithstanding -- hey, it's not *that*
bad there unless you are trying to fork several per second, in which
case I would question your sanity anyway :-).

> Or are you still thinking about using real processes?

Yes.

> If the former, it still seems that the
> interpreter itself needs a little work so that it is safely reentrant in
> the sense of any other C library. There especially needs to be a way to
> NOT share a GIL and NOT share any other mutable data.

That would still be very hard to fix, and I don't think it's worth it.

> If you are talking about real processes, then "all" that's needed is an
> efficient RPC mechanism.

That's right, that's what my Queue-like idea above intended to be.
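
To make the "separate process plus a small RPC layer" shape concrete,
here is a minimal sketch in present-day Python 3.  It is not the
mmap-based design above: it swaps in pipes and pickle because those
need no extra locking.  The helper name call_in_child and the toy
"sum a list" request are hypothetical.

```python
import pickle
import subprocess
import sys

# Source for the child interpreter: read one pickled request from stdin,
# compute a reply, and write it back pickled on stdout.
CHILD = r"""
import pickle, sys
request = pickle.load(sys.stdin.buffer)
reply = sum(request)
pickle.dump(reply, sys.stdout.buffer)
sys.stdout.buffer.flush()
"""


def call_in_child(numbers):
    """Run the request in a fully separate interpreter process."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD],
        input=pickle.dumps(numbers),
        stdout=subprocess.PIPE,
        check=True,
    )
    return pickle.loads(proc.stdout)
```

The child has its own GIL and shares no mutable state with the
parent; the cost is one process start per call, which a longer-running
worker process would amortize.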

--Guido van Rossum (home page: http://www.python.org/~guido/)