[Python-Dev] Summing up

David Beazley dave at dabeaz.com
Wed May 19 02:35:48 CEST 2010


Antoine,

This is a pretty good summary that mirrors my thoughts on the GIL matter as well. In the big picture, I do think it's desirable for Python to address the multicore performance issue, namely to keep performance from being needlessly degraded by thrashing on multicore machines. The original new GIL addressed this.

The I/O convoy effect problem is more subtle. Personally, I think it's an issue that at least merits further study, because overlapping I/O with computation is a known programming technique that might be useful for people using Python to do message passing, distributed computation, etc. As an example, the multiprocessing module uses threads as part of its queue implementation. Is it impacted by convoying? I honestly don't know. I agree that getting some more real-world experience would be useful.
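
A minimal sketch of how one might measure this kind of interaction, assuming a UDP echo over loopback as the I/O workload and a pure-Python loop as the CPU workload. It is not the ccbench test and not the demos discussed in this thread; the names cpu_task, echo_server and measure are hypothetical, and the numbers it prints will depend heavily on the interpreter version and the machine.

```python
import socket
import threading
import time


def cpu_task(stop):
    # Pure-Python busy loop: it only gives up the GIL at forced switches.
    x = 0
    while not stop.is_set():
        for i in range(10000):
            x += i * i


def echo_server(sock, stop):
    # Echo every datagram back to its sender until asked to stop.
    while not stop.is_set():
        try:
            data, addr = sock.recvfrom(64)
        except socket.timeout:
            continue
        sock.sendto(data, addr)


def measure(cpu_threads, duration=1.0):
    """Return UDP round trips per second with N background CPU threads."""
    stop = threading.Event()
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    server.settimeout(0.1)          # lets the server notice the stop flag
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(1.0)          # never block forever on a lost packet
    addr = server.getsockname()

    threads = [threading.Thread(target=echo_server, args=(server, stop))]
    threads += [threading.Thread(target=cpu_task, args=(stop,))
                for _ in range(cpu_threads)]
    for t in threads:
        t.start()

    count = 0
    start = time.perf_counter()
    try:
        while time.perf_counter() - start < duration:
            client.sendto(b"x", addr)
            try:
                client.recvfrom(64)
            except socket.timeout:
                continue            # reply too slow; do not count it
            count += 1
    finally:
        elapsed = time.perf_counter() - start
        stop.set()
        for t in threads:
            t.join()
        server.close()
        client.close()
    return count / elapsed


if __name__ == "__main__":
    for n in range(3):
        print("CPU threads=%d: %.1f round trips/s" % (n, measure(n)))
```

Comparing the figure for zero CPU threads against the others gives a rough idea of how much the I/O thread is being held up. On interpreters that provide it, rerunning after a call to sys.setswitchinterval() with a different value is one way to check whether the GIL is involved at all.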

Cheers,
Dave


> From: Antoine Pitrou <solipsis at pitrou.net>
> 
> Ok, this is a good opportunity to try to sum up, from my point of view.
> 
> The main problem of the old GIL, which was evidenced in Dave's original
> study (not this year's, but the previous one) *is* fixed unless someone
> demonstrates otherwise.
> 
> It should be noted that witnessing a slight performance degradation on
> a multi-core machine is not enough to demonstrate such a thing. The
> degradation could be caused by other factors, such as thread migration,
> bad OS behaviour, or even locking peculiarities in your own
> application, which are not related to the GIL. A good test is whether
> performance improves if you play with sys.setswitchinterval().
> 
> 
> Dave's newer study concerns another issue, which I must stress is also
> present in the old GIL algorithm, and therefore must have affected, if
> it is serious, real-world applications in 2.x. And indeed, the test I
> recently added to ccbench evidences the huge drop in socket I/Os per
> second when there's a background CPU thread; this test exercises the
> same situation as Dave's demos, only with a less trivial CPU workload:
> 
> == CPython 2.7b2+.0 (trunk:81274M) ==
> == x86_64 Linux on 'x86_64' ==
> 
> --- I/O bandwidth ---
> 
> Background CPU task: Pi calculation (Python)
> 
> CPU threads=0: 23034.5 packets/s.
> CPU threads=1: 6.4 ( 0 %)
> CPU threads=2: 15.7 ( 0 %)
> CPU threads=3: 13.9 ( 0 %)
> CPU threads=4: 20.8 ( 0 %)
> 
> (note: I've just changed my desktop machine, so these figures are
> different from what I've posted weeks or months ago)
> 
> 
> Regardless of the fact that apparently no one reported it in real-world
> conditions, we *could* decide that the issue needs fixing. If we
> decide so, Nir's approach is the most rigorous one: it tries to fix
> the problem thoroughly, rather than graft an additional heuristic. Nir
> also has tested his patch on a variety of machines, more so than Dave
> and I did with our own patches; he is obviously willing to go forward.
> 
> Right now, there are two problems with Nir's proposal:
> 
> - first, what Nick said: the difficulty of having reliable
>  high-precision cross-platform time sources, which are necessary for
>  the BFS algorithm. Ironically, timestamp counters have their own
>  problems on multi-core machines (they can go out of sync between
>  CPUs). gettimeofday() and clock_gettime() may be precise enough on
>  most Unices, though.
> 
> - second, the BFS algorithm is not that well-studied, since AFAIK it
>  was refused for inclusion in the Linux kernel; someone in the
>  python-dev community would therefore have to make sense of, and
>  evaluate, its heuristic.
> 
> I also don't consider my own patch a very satisfactory "solution",
> although it has the reassuring quality of being simple and short (and
> easy to revert!).
> 
> 
> That said, most of us are programmers and we love to invent ways
> of fixing technical issues. It sometimes leads us to consider some
> things issues even when they are mostly theoretical. This is why I
> am lukewarm on this. I think interested people should focus on
> real-world testing (rather than Dave's and my synthetic tests) of the new
> GIL, with or without the various patches, and share the results.
> 
> Otherwise, Dj Gilcrease's suggestion of waiting for third-party reports
> is also a very good one.
> 
> Regards
> 
> Antoine.

