[Python-Dev] Summing up

Wed May 19 01:10:24 CEST 2010

On Tue, 18 May 2010 21:43:30 +0200
"Martin v. Löwis" <martin at v.loewis.de> wrote:
> 
> I can understand why Antoine is being offended: it's his implementation
> that you attacked. You literally said "At has been shown, it also in
> certain cases will race with the OS scheduler, so this is not already
> fixed", claiming that it is not fixed
> 
> I believe Antoine does consider it fixed, on the grounds that all
> counter-examples provided so far are made-up toy examples, rather than
> actual applications that still suffer from the original problems.

Ok, this is a good opportunity to try to sum up, from my point of view.

The main problem of the old GIL, which was evidenced in Dave's original
study (not this year's, but the previous one) *is* fixed unless someone
demonstrates otherwise.

It should be noted that witnessing a slight performance degradation on
a multi-core machine is not enough to demonstrate such a thing. The
degradation could be caused by other factors, such as thread migration,
bad OS behaviour, or even locking peculiarities in your own
application, which are not related to the GIL. A good test is whether
performance improves if you play with sys.setswitchinterval().
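
As a rough illustration of that test, here is a minimal sketch that times a
CPU-bound workload at several switch intervals. The helper names (spin,
timed_run) and the workload are hypothetical stand-ins for your own
application code; only sys.getswitchinterval() and sys.setswitchinterval()
come from the interpreter. If the timings barely change with the interval,
the degradation is probably not GIL-related.

```python
import sys
import threading
import time


def spin(n):
    # Hypothetical pure-Python CPU-bound workload; replace with real work.
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed_run(interval, nthreads=2, n=200_000):
    """Wall-clock seconds for nthreads CPU-bound threads at a given switch interval."""
    old = sys.getswitchinterval()
    sys.setswitchinterval(interval)
    try:
        threads = [threading.Thread(target=spin, args=(n,))
                   for _ in range(nthreads)]
        start = time.perf_counter()
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return time.perf_counter() - start
    finally:
        # Always restore the previous interval so other code is unaffected.
        sys.setswitchinterval(old)


if __name__ == "__main__":
    for interval in (0.0005, 0.005, 0.05):
        print("switch interval %.4fs: %.3fs elapsed"
              % (interval, timed_run(interval)))
```

Absolute numbers will vary between machines and runs, so compare the trend
across intervals rather than any single figure.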

Dave's newer study concerns a different issue, which I must stress is also
present in the old GIL algorithm, and therefore must have affected, if
it is serious, real-world applications in 2.x. And indeed, the test I
recently added to ccbench evidences the huge drop in socket I/Os per
second when there's a background CPU thread; this test exercises the
same situation as Dave's demos, only with a less trivial CPU workload:

== CPython 2.7b2+.0 (trunk:81274M) ==
== x86_64 Linux on 'x86_64' ==

--- I/O bandwidth ---

Background CPU task: Pi calculation (Python)

CPU threads=0: 23034.5 packets/s.
CPU threads=1: 6.4 ( 0 %)
CPU threads=2: 15.7 ( 0 %)
CPU threads=3: 13.9 ( 0 %)
CPU threads=4: 20.8 ( 0 %)

(note: I've just changed my desktop machine, so these figures are
different from what I've posted weeks or months ago)
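
For readers who want to try this kind of measurement themselves, here is a
simplified sketch of the idea. It is not the ccbench code: the function
names, the UDP echo setup on the loopback interface, and the trivial
arithmetic loop used as the CPU task are all assumptions made for
illustration (ccbench uses a Pi calculation).

```python
import socket
import threading
import time


def cpu_task(stop):
    # Stand-in CPU-bound workload, running until asked to stop.
    x = 0
    while not stop.is_set():
        x = (x * 31 + 7) % 1000003


def echo_server(sock, stop):
    # Echo every datagram back to its sender.
    while not stop.is_set():
        try:
            data, addr = sock.recvfrom(1024)
        except socket.timeout:
            continue
        sock.sendto(data, addr)


def packets_per_second(n_cpu_threads, duration=0.5):
    """Count UDP round trips per second with n_cpu_threads busy threads."""
    stop = threading.Event()
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))       # let the OS pick a free port
    server.settimeout(0.1)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(1.0)
    threads = [threading.Thread(target=echo_server, args=(server, stop))]
    threads += [threading.Thread(target=cpu_task, args=(stop,))
                for _ in range(n_cpu_threads)]
    for t in threads:
        t.start()
    count = 0
    deadline = time.perf_counter() + duration
    try:
        while time.perf_counter() < deadline:
            client.sendto(b"x", server.getsockname())
            try:
                client.recvfrom(1024)
            except socket.timeout:
                continue                # lost or late reply: do not count it
            count += 1
    finally:
        stop.set()
        for t in threads:
            t.join()
        server.close()
        client.close()
    return count / duration


if __name__ == "__main__":
    for n in (0, 1, 2):
        print("CPU threads=%d: %.1f packets/s." % (n, packets_per_second(n)))
```

The figures it prints are not comparable with the ccbench output above; the
point is only to see whether the rate drops once CPU threads are added.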

Regardless of the fact that apparently no one reported it in real-world
conditions, we *could* decide that the issue needs fixing. If we
decide so, Nir's approach is the most rigorous one: it tries to fix
the problem thoroughly, rather than graft an additional heuristic. Nir
has also tested his patch on a variety of machines, more so than Dave
and I did with our own patches; he is obviously willing to go forward.

Right now, there are two problems with Nir's proposal:

- first, what Nick said: the difficulty of having reliable
  high-precision cross-platform time sources, which are necessary for
  the BFS algorithm. Ironically, timestamp counters have their own
  problems on multi-core machines (they can go out of sync between
  CPUs). gettimeofday() and clock_gettime() may be precise enough on
  most Unices, though.

- second, the BFS algorithm is not that well-studied, since AFAIK it
  was refused for inclusion in the Linux kernel; someone in the
  python-dev community would therefore have to make sense of, and
  evaluate, its heuristic.
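
On the first point, one way to get a feel for what a given platform offers
is to inspect the clocks from Python itself. The sketch below is only an
approximation of the question: it looks at the clocks exposed by the time
module (via time.get_clock_info(), available since Python 3.3), not at what
a C-level scheduler inside the interpreter would call. The helper names
clock_report and smallest_tick are hypothetical.

```python
import time


def clock_report(name):
    """Summarize what the platform reports for one of Python's clocks."""
    info = time.get_clock_info(name)
    return {
        "implementation": info.implementation,  # e.g. the underlying C call
        "resolution": info.resolution,          # advertised, in seconds
        "monotonic": info.monotonic,
        "adjustable": info.adjustable,
    }


def smallest_tick(clock=time.perf_counter, samples=1000):
    """Smallest nonzero step observed between consecutive clock readings."""
    best = float("inf")
    for _ in range(samples):
        t0 = clock()
        t1 = clock()
        while t1 == t0:          # spin until the clock visibly advances
            t1 = clock()
        best = min(best, t1 - t0)
    return best


if __name__ == "__main__":
    for name in ("time", "monotonic", "perf_counter"):
        print(name, clock_report(name))
    print("observed perf_counter tick: %.9fs" % smallest_tick())
```

The advertised resolution and the observed tick can differ, and the observed
tick includes the overhead of the call itself, so treat both as rough
indications only.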

I also don't consider my own patch a very satisfactory "solution",
although it has the reassuring quality of being simple and short (and
easy to revert!).

That said, most of us are programmers and we love to invent ways
of fixing technical issues. It sometimes leads us to consider some
things issues even when they are mostly theoretical. This is why I
am lukewarm on this. I think interested people should focus on
real-world testing (rather than Dave's and my synthetic tests) of the new
GIL, with or without the various patches, and share the results.

Otherwise, Dj Gilcrease's suggestion of waiting for third-party reports
is also a very good one.

Regards

Antoine.