[Python-Dev] "Fixing" the new GIL

Gregory P. Smith greg at krypto.org
Mon Mar 15 00:29:48 CET 2010

On Sun, Mar 14, 2010 at 4:31 AM, Nir Aides <nir at winpdb.org> wrote:
> There are two possible problems with Dave's benchmark:
> 1) On my system setting TCP_NODELAY option on the accepted server socket
> changes results dramatically.
> 2) What category of socket servers is Dave's spin() function intended to
> simulate?
> In a server which involves CPU intensive work in response to a socket
> request the behavior may be significantly different.
> In such a system, high CPU load will significantly reduce socket
> responsiveness which in turn will reduce CPU load and increase socket
> responsiveness.
> Testing with a modified server that reflects the above indicates the new GIL
> behaves just fine in terms of throughput.
> So a change to the GIL may not be required at all.
> There is still the question of latency - a single request which takes a long
> time to process will affect the latency of other "small" requests. However,
> it is debatable whether such a scenario is practical, or whether modifying
> the GIL is the solution.
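Nir's first point above can be reproduced with a minimal echo server: the key line is setting TCP_NODELAY on the *accepted* socket, which disables Nagle's algorithm so small replies are sent immediately instead of being coalesced. This is a hypothetical sketch, not Dave's actual benchmark code:

```python
import socket
import threading

def serve_once(srv):
    """Accept one connection, disable Nagle on it, and echo one message."""
    conn, _ = srv.accept()
    # Per Nir's point 1: set TCP_NODELAY on the accepted server socket so
    # small responses go out right away rather than waiting to be batched.
    conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    data = conn.recv(1024)
    conn.sendall(data)
    conn.close()

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
srv.listen(1)

t = threading.Thread(target=serve_once, args=(srv,))
t.start()

cli = socket.create_connection(srv.getsockname())
cli.sendall(b"ping")
reply = cli.recv(1024)
cli.close()
t.join()
srv.close()
```

Whether this option helps depends on the request/response sizes in the benchmark; the point is only that it can change throughput numbers dramatically.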

Such a scenario is practical and is an area that we really should not
fall flat on our face in.  Python should not regress to performing
worse when an application adds a cpu intensive thread that isn't
tempered by the activity of its IO threads.

As for the argument that an application whose cpu intensive work is driven
by the IO itself will work itself out... No it won't. It can get into beat
patterns where it handles requests quite rapidly right up until a request
arrives that kicks off a long computation. At that point it stops
performing well on other requests for as long as the cpu intensive request
threads are running (which could be a significant amount of time). That is
not the graceful degradation in serving capacity / latency one would
normally expect. It is a sudden drop off, followed by a sudden ramp up,
another sudden drop off, another ramp up, and so on. Not pretty. Not
predictable. Not easy to deploy and manage as a service.
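The effect is easy to observe on your own build: time a round trip to a threaded echo server once on its own, and once while a pure-Python thread is spinning on the CPU. This is an illustrative sketch (the `spin` and `echo_server` helpers are hypothetical, not the benchmark under discussion); the size of the gap between the two numbers depends entirely on which GIL implementation you run it under:

```python
import socket
import threading
import time

def spin(seconds):
    # Pure-Python CPU-bound loop: holds the GIL except when the
    # interpreter periodically forces a release.
    deadline = time.monotonic() + seconds
    n = 0
    while time.monotonic() < deadline:
        n += 1

def echo_server(srv, stop):
    # Accept with a short timeout so the loop can notice the stop flag.
    while not stop.is_set():
        try:
            conn, _ = srv.accept()
        except socket.timeout:
            continue
        conn.sendall(conn.recv(64))
        conn.close()

def measure_rtt(addr):
    # One connect/send/recv round trip, in seconds.
    t0 = time.monotonic()
    cli = socket.create_connection(addr)
    cli.sendall(b"x")
    cli.recv(64)
    cli.close()
    return time.monotonic() - t0

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.settimeout(0.1)
srv.listen(5)
stop = threading.Event()
threading.Thread(target=echo_server, args=(srv, stop), daemon=True).start()

baseline = measure_rtt(srv.getsockname())   # IO thread alone

cpu = threading.Thread(target=spin, args=(1.0,))
cpu.start()
loaded = measure_rtt(srv.getsockname())     # IO thread competing with CPU thread
cpu.join()
stop.set()

print(f"baseline {baseline * 1000:.2f} ms, under CPU load {loaded * 1000:.2f} ms")
```

Under the old GIL the loaded round trip can be orders of magnitude slower than the baseline; run it a few times, since a single measurement is noisy.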

If we don't fix this issue, we need to document, as an official part of
the Python threading module docs, that people should not mix computation
with IO in threaded applications under CPython if they care about IO
performance.  That will blow people's minds.

> If a change is still required, then I vote for the simpler approach - that
> of having a special macro for socket code.
> I remember there was reluctance in the past to repeat the OS scheduling
> functionality and for a good reason.

I don't consider needing to modify code specifically for it to
indicate if it is IO bound or CPU bound to be the simpler approach.

So +1 on the interactiveness calculation for me.

+0.5 on annotating the IO lock release/acquisitions.

Basically: I want to see this fixed. Even if it's not my preferred
approach, I still think it is important.
